From January to March 2019, Australia has had an extremely high record of summer flu cases . If the trend continues, this could turn into one of the most deadly flu seasons for decades.
To estimate just how bad the flu could be, I created a Holt-Winters forecast model in MS Excel.
If you would like to download the Excel model, please fill out the below.
what does the data look like?
The data comes from Australia’s laboratory confirmed flu cases, recorded by the immunisation coalition. This data originates from the Australian Government Department of Health, National Notifiable Diseases Surveillance System. Since this data is laboratory confirmed flu cases, the true numbers are likely higher.
In March 2019, there were 10.6k confirmed flu cases. This March figure is 4 times more than the average for that month over the last four years. From chart 1, you will see that summer months in Australia have low flu numbers and peak in August / September.
The Holt-Winters method is a forecasting technique that uses three key smoothing variables – and is sometimes called triple exponential smoothing. These three variables are:
- Level (the height of the pattern)
- Trend (the slope of the pattern)
- Seasonal (the repeating pattern within a calendar)
These three variables are usually given the Greek letters alpha (level), beta (trend) and gamma (seasonal) respectively.
The Holt-Winters method requires some initialization. There are a lot of different ways to do this. Note that the initialization of the model is arbitrary and that if there is sufficient data, it should have little impact on the model. In the below, one of the initilization techniques for monthly or quarterly data is shown. In the below, S is the seasonality, L is the level, T is the trend and Y is the actual observed value. As we have monthly data we will use the M=12 variation for the flu analysis.
Once the model has been initialized in the first year, we can start generating a forecast using the below terms. As the forecast moves through the observed data, we improve our level, trend and seasonal factors. Don’t get too bogged down in the math below, you can see how the formula works in the example in Excel.
The excel model
The best way to understand the model is to download the excel and play with the actual values. I’ll walk you through the highlights of the forecast model.
The first step of building the model is to set up the initialization. In chart 4, the grey section initializes the model using the first year of data, using the formulas from chart 2. We can also enter initial values of the red parameters in cells B9 to B11 for level, trend and seasonal. You may start with 0.5 for all 3 values. We will use solver to get the optimal values of these cells shortly, so any value between 0 and 1 will do.
Next, we add in the equations for level (cell P9), trend (cell P10), seasonal (cell O11) and lastly the forecast (cell O13.) These equations get repeated across the data set and can be dragged across.
Now that we have a forecast to check our actual values, we calculate the error. The error is the difference between the actual flu cases and the forecast flu cases. We will use the error to find the root mean square of errors (RMSE) of the entire forecast.
We can use solver to help us find the optimal level, trend and seasonal factors that minimzes the RMSE. If you open solver and set the objective to minimize the root mean square error in cell B15 by changing values in B9 to B11. We also need to add the constraints that level, trend and seasonal are between 0 and 1. Hit optimize and you will get the optimal smoothing values for level, trend and seasonal.
Now we have an optimized model, we can view a graph of our forecast to see how well we have trained the model against the real data. Chart 6 shows the plot of the actual in blue and the forecast in orange. Overall the fit is pretty good, so we can have a go at a forecast for the next 9 months.
Let’s go back into our Excel model. We will forecast columns BN to BV in the shaded yellow. You will notice that we fix the trend and level values to the last real values we have. For the seasonal values we use the previous year’s values. Lastly, you may notice in the horizon row that we increment the cells by 1 for each forecast. In column BN the horizon is 2, in column BO the horizon is 3, in row BP the horizon is 4 and so on.
so what does the forecast look like?
Using our Holt-Winters forecast model, the expected peak flu cases are 180k. This is 1.8 times more than the peak in 2017, so potentially a very dangerous flu season.
The spread of a flu virus is dependant on many variables including climate, the number of people that get immunized, how many people travel, the community resistance to the strain etc so it is very difficult to say how accurate any model will be. However, what is clear is that this flu season could be severe.
There have been a few explanations in the media as to why the current flu numbers are so high. Some have suggested it is an anomaly from an extended flu season in North America, or because of a lower community resistance from a quiet flu season last year. However, there is a possibility that we are witnessing a more severe viral mutation that we have yet to develop a resistance. Either way, the best advice is to get a flu shot and to be vigilant this coming season.