Case Studies

If you're interested in time series analysis and forecasting, this is the right place to be. The Time Series Lab (TSL) software platform makes time series analysis available to anyone with a basic knowledge of statistics. Future versions will remove even that requirement by providing fully automated forecasting systems. The platform is designed so that results can be obtained quickly and verified easily. At the same time, many advanced time series and forecasting operations are available for experts. In our case studies, we often present screenshots of the program so that you can easily replicate the results.

Did you know you can take a screenshot of a TSL program window? Press Ctrl + p to open a window that allows you to save a screenshot of the program. The TSL window should be located on your main monitor.

Click on the buttons below to go to our case studies. At the beginning of each case study, the required TSL package is mentioned. Our first case study, about the Nile data, is meant to illustrate the basic workings of the program and we advise you to start with that one.

Gasoline

Author: Rutger Lit
Date: June 30, 2022
Software: Time Series Lab - Home Edition
Topics: fractional seasonal periods and comparison of forecasting performance
Batch program: gasoline.txt

Gasoline consumption

This case study uses weekly data on US finished motor gasoline products supplied (in thousands of barrels per day) from February 1991 to May 2005. The dataset is part of the R package fpp2 and is available from the EIA website. It is also bundled with the installation file of TSL. The dataset is used in the TBATS paper of De Livera, A.M., R.J. Hyndman, and R.D. Snyder (2011). Furthermore, the dataset is analysed by R.J. Hyndman on his blog. We quote from this blog post:

The TBATS model is preferable when the seasonality changes over time. The ARIMA approach is preferable if there are covariates that are useful predictors as these can be added as additional regressors.

This gasoline case study illustrates that you don't need to choose between the two methods when you work with Time Series Lab. TSL offers a modelling framework that handles complex seasonal patterns and, at the same time, allows the inclusion of covariates (explanatory variables). We show that TSL can produce more accurate forecasts than the TBATS package. We deliberately compare with TBATS because this package produces accurate forecasts when complex seasonal patterns are present in the data.

In the figure below, the gasoline dataset is loaded into TSL and plotted. The upward trend and seasonal pattern are clearly visible in the data. The Data characteristics area shows T = 745 observations. At a later stage (Estimation step) we will split the time series into a Training sample and a Validation sample.


Information: On the Database page, you can copy the contents of the blue Data characteristics pane to the clipboard by right-clicking the area and selecting Copy contents, or by selecting the text and pressing Ctrl + c.

TSL Database page with Gasoline dataset loaded

Data inspection and preparation page
Time Series Lab Database page. The upward trend and seasonality are clearly visible in the data.

Local Linear Trend model

The seasonal pattern of this time series is important but, for illustrative purposes, we start our analysis without a seasonal component and select the Local Linear Trend model, which has a trend component but no seasonal component. Select the Local Linear Trend model on the Pre-built models page. Alternatively, you can go to the Build your own model page and select a time-varying level and a time-varying slope.
Our time series consists of a total of 745 observations (February 1991 to May 2005). For this case study we select the first 484 observations as Training sample and leave the remaining 261 observations as Validation sample. Drag (and/or click) the sample bar on the Pre-built models page to set a Training sample of size 484 or, alternatively, set the start and end of the Training sample to 1 and 484 on the Estimation page. Click the Process Dashboard button on the Pre-built models page or the Estimate button on the Estimation page. TSL estimates the model, and if you go to the Text output page you see a green message informing you that:

All selected models and series were estimated successfully
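
The sample split described above can be mirrored outside TSL, for example to check results independently. The following is a minimal sketch, assuming the series is held in a NumPy array. The placeholder array `y` and the names `n_train`, `train`, and `valid` are illustrative and not part of TSL.

```python
import numpy as np

# Placeholder standing in for the 745 weekly observations (not the real data).
y = np.arange(745, dtype=float)

n_train = 484            # Training sample: observations 1 to 484
train = y[:n_train]
valid = y[n_train:]      # Validation sample: the remaining 261 observations

print(len(train), len(valid))
```

Slicing by position keeps the time order intact, which matters for time series: the Validation sample must always follow the Training sample.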

Furthermore, at the bottom of the Text output page we find the Model fit. For the current model this is:


Variable: gasoline
Model: TSL003 Local Linear Trend
                                               TSL003
Log likelihood                              -3485.355   
Akaike Information Criterion (AIC)           6978.710   
Bias corrected AIC (AICc)                    6978.794   
Bayesian Information Criterion (BIC)         6995.439   
in-sample MSE                              1.1177e+05   
... RMSE                                      334.317   
... MAE                                       268.147   
... MAPE                                        3.480   
Sample size                                       484   
Effective sample size                             482   
* based on one-step-ahead forecast errors
                            

We report these numbers here to show the improvement of adding a seasonal component later. The graphical output of the current model is shown in the figure below.
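
The entries of the Model fit table are related to each other through standard formulas. The sketch below uses the textbook definitions of AIC, AICc, and BIC; that TSL uses exactly these definitions is an assumption, although the numbers agree with the table to within rounding. The parameter count `k = 4` is also an assumption, inferred from the reported AIC rather than shown by TSL.

```python
import math

def information_criteria(loglik, k, n):
    """Return (AIC, AICc, BIC) for a log likelihood, k parameters, n observations."""
    aic = -2.0 * loglik + 2.0 * k
    aicc = aic + 2.0 * k * (k + 1) / (n - k - 1)
    bic = -2.0 * loglik + k * math.log(n)
    return aic, aicc, bic

# Values taken from the Local Linear Trend table above; k = 4 is inferred, not reported.
aic, aicc, bic = information_criteria(loglik=-3485.355, k=4, n=484)

# RMSE is the square root of the in-sample MSE.
rmse = math.sqrt(1.1177e5)

print(round(aic, 3), round(aicc, 3), round(bic, 3), round(rmse, 1))
```

Because all three criteria penalise the number of parameters, a model with more components only scores better if the gain in log likelihood outweighs the penalty.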

Graph page of TSL with Gasoline dataset

With a smoothed level through the data, the seasonal pattern is even more clearly visible. The triangular pattern in the level will appear later as well when we plot the forecasting performance of the model.
Let's assess the forecast performance of the model by going to the Model comparison page. This page can be viewed by clicking the Model comparison button in the button bar on the left of your screen. Note that this button is only visible when a Validation sample is specified. Click on the green Start loss calculation button in the top right of the window. Under User defined models, a new check-button appears which you should tick. The resulting TSL screen is shown below.

RMSE loss Local Linear Trend model and Gasoline dataset

RMSE loss for the Local Linear Trend model and the Gasoline dataset for a forecast horizon of h=50.

The pyramid-shaped loss line can be explained by the fact that a forecast from the Local Linear Trend model is a straight line, which is upward sloping for our dataset. The forecasts do not take the seasonal pattern of the data into account, so the loss is highest when the data is at the highest or lowest point in the seasonal cycle. Let's verify this. Go to the Forecasting page and select multi-step-ahead in the top left corner. Navigate to Plot options and set Show forecast to 150 periods ahead. The resulting window should look like the one presented in the figure below.

Forecasts for h=150 time points ahead

Basic Structural Time Series model

It is time to introduce a seasonal component. We go to the Build your own model page and select a time-varying level, a time-varying slope, and a time-varying seasonal. The resulting model is called the Basic Structural Time Series model by Harvey (1989). Set the Seasonal period length to 365.25/7 ≈ 52.179 (weekly data taking leap years into account) and the Number of factors to 22.

Information: Seasonal period length is the number of time points after which the seasonal repeats. This can be a fractional number. For example, with daily data, specify a period of 365.25 for a seasonal that repeats each year, taking leap years into account. Number of factors specifies the seasonal flexibility. Note that a higher number is not always better and parsimonious models often perform better in forecasting.
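
One common way to represent a seasonal with a fractional period is a sum of cosine and sine waves at multiples of the base frequency 2π/s, where s is the period. The sketch below builds such terms for the weekly period used here. It assumes that Number of factors corresponds to the number of cosine/sine pairs, which is an interpretation and not something stated above, and it uses fixed regressors, whereas the seasonal component in TSL is time-varying. The function name `fourier_terms` is hypothetical.

```python
import numpy as np

def fourier_terms(n_obs, period, n_factors):
    """Return an (n_obs, 2 * n_factors) matrix of cosine/sine seasonal regressors."""
    t = np.arange(1, n_obs + 1)
    columns = []
    for j in range(1, n_factors + 1):
        angle = 2.0 * np.pi * j * t / period   # j-th harmonic of the base frequency
        columns.append(np.cos(angle))
        columns.append(np.sin(angle))
    return np.column_stack(columns)

period = 365.25 / 7     # weeks per year, taking leap years into account
X = fourier_terms(n_obs=484, period=period, n_factors=7)

print(round(period, 3), X.shape)
```

Each additional factor adds two columns, so a larger number allows a more flexible seasonal shape at the cost of more quantities to estimate. This is the trade-off behind the remark that parsimonious models often forecast better.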

The Build your own model page should look like the one in the figure below.

Component selection page

Estimate the model and go to the Text output page. We see that the model fit is improved by adding the seasonal component.


Variable: gasoline
Model: TSL004
                                               TSL004
Log likelihood                              -3218.400   
Akaike Information Criterion (AIC)           6532.799   
Bias corrected AIC (AICc)                    6543.613   
Bayesian Information Criterion (BIC)         6733.539   
in-sample MSE                              1.0083e+05   
... RMSE                                      317.539   
... MAE                                       252.502   
... MAPE                                        3.244   
Sample size                                       484   
Effective sample size                             430   
* based on one-step-ahead forecast errors
                            

A large improvement in forecasting performance over the Local Linear Trend model is visible if we start (and plot) the loss calculation on the Model comparison page. This shows how important it can be to model the seasonal pattern of a time series. We will see an example of multiple seasonal patterns in a time series, which makes correct handling even more important. Plotting both RMSE losses leads to the following figure.

Forecast performance of LLT and Basic Structural model

Next, go back to the Build your own model page and lower the number of factors of the seasonal component. Parsimonious models often perform better in forecasting. A better-performing number of factors is 7, although other values might be even better for forecasting. A model comparison between 22 and 7 factors is made in the figure below. The loss corresponding to the model with 7 factors is lower than the one presented in Figure 2 of De Livera, A.M., R.J. Hyndman, and R.D. Snyder (2011), which is obtained with the TBATS package.

Forecast performance of LLT and Basic Structural model

A figure with the extracted trend and seasonal pattern is obtained from the Graphics and Diagnostics page.

Gasoline data extracted trend and seasonal pattern

Gasoline data extracted trend and seasonal pattern for the time series model Level + Slope + Seasonal(52.179 / 7).

Further exploration

  • Estimate the model with a Level, Slope, and Seasonal with period length 52. Verify that forecasts become worse when leap years are not taken into account (52 instead of 52.179).
  • Forecasts can further be improved by adding explanatory variables. In TSL you can do this with the click of a couple of buttons on the Model setup page. Let us know which variables you have used to boost the forecast precision for the gasoline dataset!
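
To see why the first exercise above matters, consider how far a 52-week cycle drifts away from the calendar year over the full sample. The sketch below is plain arithmetic on numbers from this case study; the variable names are illustrative.

```python
period_exact = 365.25 / 7    # fractional period in weeks
period_rounded = 52          # period that ignores the extra days
n_obs = 745                  # length of the gasoline series

# Number of calendar years covered by the sample.
years = n_obs / period_exact

# Each year the rounded cycle ends slightly early; the gap accumulates.
drift_weeks = years * (period_exact - period_rounded)

print(round(drift_weeks, 2))
```

Over the whole sample the rounded seasonal ends up roughly two and a half weeks out of phase with the calendar, so its peaks and troughs no longer line up with those in the data.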

Bibliography

Durbin, J. and Koopman, S. J. (2012). Time Series Analysis by State Space Methods. Oxford: Oxford University Press.

Harvey, A. (1989). Forecasting, Structural Time Series Models and the Kalman Filter. Cambridge: Cambridge University Press. doi:10.1017/CBO9781107049994

De Livera, A.M., R.J. Hyndman, and R.D. Snyder (2011). Forecasting Time Series With Complex Seasonal Patterns Using Exponential Smoothing. Journal of the American Statistical Association 106:496, 1513-1527.