


Timeseries Forecasting with Deep Learning

This Python project uses recurrent neural networks built from LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit) layers to forecast time series, using Keras with Theano. The results of each network are compared with those of a linear regression model.

Dataset: Number of daily births in Quebec, January 1, 1977 to December 31, 1990 (Hipel & McLeod, 1994)
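
The forecasting setup and the linear regression baseline can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `make_windows`, `fit_linear_baseline`, `predict_linear` and the synthetic series are hypothetical names and data, and the look-back length of 3 is arbitrary. Each sample holds the previous `look_back` values, the target is the value that follows them, and the baseline is an ordinary least squares fit scored with mean squared error.

```python
import numpy as np


def make_windows(series, look_back):
    """Turn a 1-D series into (X, y): X[i] holds look_back consecutive
    values and y[i] is the value that immediately follows them."""
    series = np.asarray(series, dtype=float)
    n = len(series) - look_back
    X = np.stack([series[i:i + look_back] for i in range(n)])
    y = series[look_back:]
    return X, y


def fit_linear_baseline(X, y):
    """Ordinary least squares with an intercept column appended to X."""
    A = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Apply the fitted coefficients (last entry is the intercept)."""
    A = np.hstack([X, np.ones((len(X), 1))])
    return A @ coef


def mean_squared_error(y_true, y_pred):
    """Average of the squared prediction errors."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))


# Synthetic stand-in for the births series (NOT the real data).
series = np.arange(20, dtype=float) * 2.0 + 5.0
X, y = make_windows(series, look_back=3)
coef = fit_linear_baseline(X, y)
mse = mean_squared_error(y, predict_linear(coef, X))
print(X.shape, y.shape, mse)
```

On the real dataset the windows would be built from the births column, and the model would be scored on a held-out tail of the series rather than on the data it was fitted to.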

##Usage

I suggest you install Virtualenv before trying this out.

git clone
cd deeplearning-timeseries
virtualenv ENV
source ENV/bin/activate
pip install --upgrade pip
pip install keras h5py pandas scikit-learn

##Architecture & Model Properties

We use Keras' Sequential model to construct recurrent neural networks. There are 3 layers:

  • Layer 1: Either an LSTM layer (output dimension 10, statefulness enabled) or a GRU layer (output dimension 4).
  • Layer 2: A Dropout layer with dropout probability 0.2, to reduce overfitting.
  • Layer 3: A fully-connected Dense layer with output dimension 1.
  • Default optimizer: rmsprop; default number of epochs: 150.
  • Evaluation metric: Mean Squared Error (MSE).
This architecture can certainly be optimized further; I just haven't had the chance to experiment much because of my laptop's constraints!
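
The three-layer stack described above might look like this in a recent Keras release. Treat it as a sketch under assumptions rather than the repository's code: `build_rnn` is a hypothetical helper, the input layout (`look_back` timesteps with one feature each) and the batch size of 1 are guesses, and older Keras versions such as the Theano-era one this project targeted use different argument names (`output_dim`, for example).

```python
from keras import Input
from keras.layers import GRU, LSTM, Dense, Dropout
from keras.models import Sequential


def build_rnn(cell="lstm", look_back=37, batch_size=1):
    """Recurrent layer -> Dropout(0.2) -> Dense(1), compiled with rmsprop and MSE."""
    if cell == "lstm":
        # Stateful layers need a fixed batch size, hence batch_shape.
        inputs = Input(batch_shape=(batch_size, look_back, 1))
        recurrent = LSTM(10, stateful=True)
    else:
        inputs = Input(shape=(look_back, 1))
        recurrent = GRU(4)
    model = Sequential([inputs, recurrent, Dropout(0.2), Dense(1)])
    model.compile(optimizer="rmsprop", loss="mean_squared_error")
    return model


lstm_model = build_rnn("lstm")
gru_model = build_rnn("gru")
print(lstm_model.output_shape, gru_model.output_shape)
```

When training the stateful variant, samples are usually fed in order (`shuffle=False` in `fit`) with the same batch size the model was built with, so that the carried-over state lines up with the next batch.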

##Results & Observations

  1. The LSTM-RNN model performed best, with an MSE of 1464.78 (look back = 37).
  2. Naively making the RNN "deeper" did not yield an immediate improvement; I didn't fine-tune the parameters (output_dim, for example), though.
  3. Making the LSTM network stateful (setting stateful=True when initializing the LSTM layer) did yield a significant performance improvement. In stateless LSTM layers, the cell states are reset after each sequence. With stateful=True, however, the states are propagated to the next batch, i.e. the state computed for the sample trainX[i] is used as the initial state for the sample trainX[i+k] in the next batch, where k is the batch size. You can read more about this in the Keras docs.
  4. Using Glorot initialization yielded a performance improvement. However, He uniform initialization (uniform initialization scaled by fan_in, with limit sqrt(6 / fan_in)) yielded even better results than Glorot.
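
To make the stateful behaviour in point 3 concrete, the sketch below lists which samples end up sharing a state chain when batches are fed in order. `state_chains` is a hypothetical helper written for this illustration only; it involves no Keras calls and simply reproduces the index arithmetic (sample i hands its state to sample i+k).

```python
def state_chains(n_samples, batch_size):
    """Group sample indices by batch slot: with stateful=True and ordered
    batches, the state left by sample i seeds sample i + batch_size."""
    return [list(range(slot, n_samples, batch_size)) for slot in range(batch_size)]


chains = state_chains(n_samples=8, batch_size=4)
for slot, chain in enumerate(chains):
    print(f"slot {slot}: " + " -> ".join(f"trainX[{i}]" for i in chain))
```

With a batch size of 1 there is a single chain covering every sample in order, which is the case where state flows through the whole series.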

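The difference between the two initializers in point 4 comes down to how the sampling range is scaled. As defined in the Keras documentation, Glorot uniform draws weights from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)), while He uniform uses limit = sqrt(6 / fan_in). The helper names and layer sizes below are hypothetical and only illustrate the arithmetic.

```python
import math
import random


def glorot_uniform_limit(fan_in, fan_out):
    """Half-width of the Glorot (Xavier) uniform sampling range."""
    return math.sqrt(6.0 / (fan_in + fan_out))


def he_uniform_limit(fan_in):
    """Half-width of the He uniform sampling range."""
    return math.sqrt(6.0 / fan_in)


def sample_uniform(limit, fan_in, fan_out, seed=0):
    """Draw a fan_in x fan_out weight matrix uniformly from [-limit, limit]."""
    rng = random.Random(seed)
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


fan_in, fan_out = 37, 10  # hypothetical layer sizes, for illustration only
glorot = glorot_uniform_limit(fan_in, fan_out)
he = he_uniform_limit(fan_in)
weights = sample_uniform(he, fan_in, fan_out)
print(f"Glorot limit: {glorot:.4f}, He limit: {he:.4f}")
```

Because He uniform ignores fan_out, its range is always at least as wide as Glorot's for the same layer.
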

  • ```data/number-of-daily-births-in-quebec.csv``` : Dataset.
  • ``````: Contains the class ```LSTM_RNN``` for LSTM-based Recurrent Neural Networks.
  • ``````: Contains the class ```GRU_RNN``` for GRU-based Recurrent Neural Networks.
  • ``````: Loads and preprocesses the dataset, creates LSTM-RNN, GRU-RNN and Linear Regression models, and outputs results.
##To-do

  • K-fold cross validation.
  • Add plots to aid in visualization.
##References

  1. On the use of 'Long-Short Term Memory' neural networks for time series prediction, Gomez-Gil et al., 2014.
  2. Dropout: A Simple Way to Prevent Neural Networks from Overfitting, Srivastava et al., 2014.
  3. Learning to forget, Gers, Schmidhuber & Cummins, 2000.
  4. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling, Chung et al., 2014.