For this project, I have used the Twitter_volume_FB dataset from the NAB collection on Kaggle.
Dataset link: https://www.kaggle.com/boltzmannbrain/nab
Steps to execute:
- Download the files from the GitHub repository.
- Unzip the downloaded .zip file to obtain Twitter_volume_FB.csv.
- Place the CSV file in a datasets folder, and place the datasets folder inside the notebooks folder alongside the .ipynb file (a quick path check is sketched after this list).
- Open a terminal and run "jupyter notebook".
- Navigate to the folder where the notebook is placed.
- From the Cell menu, click Run All, which runs the whole notebook from the first cell, then verify the results.
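
As an optional sanity check before running the notebook, the minimal sketch below (not part of the original steps) confirms the CSV sits where the notebook expects it, assuming the layout described above and that it is run from inside the notebooks folder:

```python
from pathlib import Path

# Expected layout, per the steps above: notebooks/datasets/Twitter_volume_FB.csv
csv_path = Path("datasets") / "Twitter_volume_FB.csv"

# Should print True when run from inside the notebooks folder.
print(csv_path.exists())
```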
The goal of the project is to detect the anomalies in the dataset and show how they can be identified.
Steps to follow:
- Set up a data science project structure in a new git repository in your GitHub account
- Download the benchmark data set from https://www.kaggle.com/boltzmannbrain/nab or https://github.com/numenta/NAB/tree/master/data
- Load one of the data sets into a pandas DataFrame (see the loading sketch after this list)
- Using exploratory data analysis, formulate one or two ideas on how feature engineering could add value to the data set (a feature-engineering sketch follows this list)
- Build one or more anomaly detection models to detect the anomalies, using the other columns as features (a minimal model sketch follows this list)
- Document your process and results
- Commit your notebook, source code, visualizations and other supporting files to the git repository in GitHub
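
For the loading step, a minimal sketch assuming the CSV is placed under datasets/ as described above; NAB data files contain a timestamp column and a value column:

```python
import pandas as pd

# Load the NAB Twitter_volume_FB series; NAB files have "timestamp" and "value" columns.
df = pd.read_csv(
    "datasets/Twitter_volume_FB.csv",
    parse_dates=["timestamp"],
    index_col="timestamp",
)

print(df.head())
print(df["value"].describe())
```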
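One possible direction for the feature-engineering step, sketched below, is to derive calendar and rolling-window features from the raw value column; the feature names and window size are illustrative assumptions, not part of the original project:

```python
import pandas as pd

# df is the DataFrame loaded above, indexed by timestamp with a "value" column.
features = pd.DataFrame(index=df.index)
features["value"] = df["value"]

# Calendar features: tweet volume typically has hour-of-day / day-of-week patterns.
features["hour"] = df.index.hour
features["dayofweek"] = df.index.dayofweek

# Rolling statistics capture the local level and variability of the series.
features["rolling_mean"] = df["value"].rolling(window=12, min_periods=1).mean()
features["rolling_std"] = df["value"].rolling(window=12, min_periods=1).std().fillna(0)

# Deviation from the local mean is a simple anomaly-oriented feature.
features["diff_from_mean"] = features["value"] - features["rolling_mean"]
```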
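For the modelling step, a minimal sketch using scikit-learn's IsolationForest on the engineered features above; the contamination value is an assumption and would need tuning against the known NAB anomaly windows:

```python
from sklearn.ensemble import IsolationForest

# Fit an Isolation Forest on the engineered features; contamination=0.01 is a rough guess.
model = IsolationForest(n_estimators=100, contamination=0.01, random_state=42)
features["anomaly"] = model.fit_predict(features)  # -1 = anomaly, 1 = normal

anomalies = features[features["anomaly"] == -1]
print(f"Flagged {len(anomalies)} potential anomalies out of {len(features)} points")
```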