COVID-19 forecasting is a hot mess without the right tech

CO-EDP, VisionRI | Updated: 04-06-2025 09:56 IST | Created: 04-06-2025 09:56 IST

A new peer-reviewed study has revealed that smoothing techniques can significantly enhance short-term COVID-19 forecast accuracy, particularly for neural networks, though model architecture remains the most critical determinant of performance. The study, titled "Smoothing Techniques for Improving COVID-19 Time Series Forecasting Across Countries," was published in the journal Computation.

The research systematically analyzed the effects of four smoothing strategies on the predictive performance of four major forecasting models: LSTM, Temporal Fusion Transformer (TFT), XGBoost, and LightGBM. Using weekly COVID-19 case data from Ukraine, Bulgaria, Slovenia, and Greece, the authors evaluated forecast performance over 3-month and 6-month horizons, employing metrics including Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE).
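
For readers unfamiliar with these metrics, the following minimal Python sketch shows how RMSE, MAE, and MAPE are typically computed on weekly case counts; the toy numbers are illustrative, not values from the paper:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Squared Error: penalizes large misses quadratically."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def mae(y_true, y_pred):
    """Mean Absolute Error: average absolute miss, in case-count units."""
    return np.mean(np.abs(y_true - y_pred))

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error: scale-free, but unstable when
    actual counts are near zero (the denominator effect discussed below)."""
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

# Toy weekly case counts (illustrative values only)
actual = np.array([120.0, 95.0, 80.0, 60.0])
forecast = np.array([110.0, 100.0, 70.0, 65.0])

print(rmse(actual, forecast), mae(actual, forecast), mape(actual, forecast))
```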

How do smoothing methods interact with model architectures?

The study explored the impact of four widely used smoothing techniques - rolling mean, exponentially weighted moving average (EWMA), Kalman filter, and seasonal-trend decomposition using Loess (STL) - on the performance of each forecasting model. While past research often treats smoothing as an isolated preprocessing step, this study offered a joint evaluation of model and smoother combinations across varied national contexts.
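
As an illustration of how these four smoothers might be applied to a weekly case series, here is a pandas/statsmodels sketch. The window, span, period, and Kalman noise settings are assumptions, since the paper's exact parameters are not given in this summary, and the scalar random-walk Kalman filter below is a simple stand-in for whatever filter specification the authors used:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL

# Illustrative weekly case series (values are made up)
rng = np.random.default_rng(0)
weeks = pd.date_range("2021-01-03", periods=156, freq="W")
cases = pd.Series(
    200 + 50 * np.sin(np.arange(156) / 8) + rng.normal(0, 20, 156),
    index=weeks,
)

# 1. Rolling mean (4-week window is an assumption)
rolling = cases.rolling(window=4, min_periods=1).mean()

# 2. Exponentially weighted moving average (span is an assumption)
ewma = cases.ewm(span=4, adjust=False).mean()

# 3. Simple 1-D Kalman filter with a random-walk state model
def kalman_smooth(y, process_var=1.0, obs_var=100.0):
    x, p, out = y.iloc[0], 1.0, []
    for z in y:
        p += process_var               # predict: uncertainty grows
        k = p / (p + obs_var)          # Kalman gain
        x += k * (z - x)               # update state estimate
        p *= (1 - k)                   # update uncertainty
        out.append(x)
    return pd.Series(out, index=y.index)

kalman = kalman_smooth(cases)

# 4. STL: keep trend + seasonal, drop the residual (period of 52 assumed)
stl_fit = STL(cases, period=52).fit()
stl_smooth = stl_fit.trend + stl_fit.seasonal
```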

The key finding was that smoothing improved model stability and predictive performance mainly for deep learning models, especially under short-term forecasting horizons. For instance, the combination of LSTM with STL or rolling mean achieved the lowest forecast errors in several settings. In Ukraine’s 3-month forecasts, LSTM with rolling mean produced an RMSE of 33.89 and an MAE of 27.6, outperforming all other combinations.
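
A minimal sketch of such a smooth-then-forecast pipeline, assuming a small Keras LSTM, a 4-week rolling mean, and an 8-week lookback (all assumptions; the paper's architecture and hyperparameters are not detailed in this summary):

```python
import numpy as np
import pandas as pd
from tensorflow import keras

# Illustrative weekly case series (values are made up)
rng = np.random.default_rng(1)
raw_cases = pd.Series(200 + 60 * np.sin(np.arange(120) / 9) + rng.normal(0, 25, 120))

# Smooth first: 4-week rolling mean (window size is an assumption) ...
smoothed = raw_cases.rolling(window=4, min_periods=1).mean().to_numpy()
smoothed /= smoothed.max()  # scale to [0, 1] for stable training

def make_windows(series, lookback=8):
    """Slice a 1-D series into (samples, lookback, 1) inputs and next-step targets."""
    X, y = [], []
    for i in range(len(series) - lookback):
        X.append(series[i:i + lookback])
        y.append(series[i + lookback])
    return np.array(X)[..., None], np.array(y)

X, y = make_windows(smoothed)

# ... then fit a small LSTM on the smoothed, scaled series
model = keras.Sequential([
    keras.layers.Input(shape=(8, 1)),
    keras.layers.LSTM(32),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=50, verbose=0)
```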

In contrast, XGBoost, a tree-based model, showed robust accuracy over longer horizons regardless of smoothing technique. For example, in Greece’s 6-month forecast, XGBoost with STL achieved the lowest RMSE (434.03) and MAE (333.09), outperforming neural models that struggled with volatility in the data.
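
For comparison, a tree-based pipeline is far lighter to set up and run. The sketch below, with an assumed lag count and hyperparameters, fits XGBoost on simple lag features of the case series and holds out roughly six months of weekly data for evaluation:

```python
import numpy as np
import pandas as pd
import xgboost as xgb

def lag_features(series, n_lags=8):
    """Build a design matrix whose columns are the previous n_lags values."""
    df = pd.DataFrame({f"lag_{k}": series.shift(k) for k in range(1, n_lags + 1)})
    df["target"] = series
    df = df.dropna()
    return df.drop(columns="target").to_numpy(), df["target"].to_numpy()

# Illustrative weekly case series (values are made up)
rng = np.random.default_rng(2)
cases = pd.Series(300 + 80 * np.sin(np.arange(150) / 10) + rng.normal(0, 25, 150))

X, y = lag_features(cases)
model = xgb.XGBRegressor(n_estimators=200, max_depth=4, learning_rate=0.05)
model.fit(X[:-26], y[:-26])        # hold out ~6 months of weekly data
pred = model.predict(X[-26:])
```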

Statistical validation using a two-way ANOVA confirmed that model architecture had a statistically significant effect on MAPE (F = 4.13, p = 0.008), whereas the smoothing method alone did not. This underscores that while smoothing is useful, it is not a substitute for careful model selection.
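
A two-way ANOVA of this kind can be reproduced with statsmodels. The sketch below assumes a long-format table of MAPE scores with one row per model–smoother run; the column names and values are illustrative, not the study's data:

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Assumed layout: one row per (model, smoother) experiment with its MAPE
results_df = pd.DataFrame({
    "mape":     [12.1, 9.8, 15.3, 11.0, 14.2, 10.5, 13.7, 12.9],
    "model":    ["LSTM", "LSTM", "TFT", "TFT",
                 "XGBoost", "XGBoost", "LightGBM", "LightGBM"],
    "smoother": ["STL", "rolling", "STL", "rolling",
                 "STL", "rolling", "STL", "rolling"],
})

# Two-way ANOVA: main effects of model architecture and smoothing method
fit = ols("mape ~ C(model) + C(smoother)", data=results_df).fit()
print(sm.stats.anova_lm(fit, typ=2))
```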

Which models performed best across countries and timeframes?

No single model–smoothing combination dominated across all contexts, but clear trends emerged:

  • Short-term (3-month) forecasts: Neural models, especially LSTM and TFT, delivered superior accuracy when paired with STL or the rolling mean. For instance, in Bulgaria, TFT with rolling mean yielded an RMSE of 41.68 and an MAE of 28.97.
  • Medium-term (6-month) forecasts: XGBoost consistently outperformed others across most countries and metrics. In Ukraine, TFT with STL still performed well (RMSE: 144.49, MAE: 91.01), but XGBoost’s resilience to data noise made it more suitable for longer horizons.
  • MAPE variance: The MAPE values fluctuated widely across countries due to low actual case counts (denominator effects), especially in Slovenia and Greece. In these settings, STL notably helped stabilize the relative errors for neural models; a worked example of the denominator effect follows this list.
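
The denominator effect is easy to see with two toy numbers (illustrative only): the same absolute miss of 10 cases yields a tiny percentage error in a high-incidence week and a huge one in a low-incidence week.

```python
# Same absolute forecast error of 10 cases in both scenarios
miss = 10.0

# High-incidence week: 10 off out of 1,000 actual cases -> 1% error
print(abs(miss / 1000.0) * 100)   # 1.0

# Low-incidence week: 10 off out of 20 actual cases -> 50% error
print(abs(miss / 20.0) * 100)     # 50.0
```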

The heatmaps and forecast trajectory comparisons across pages 8–15 of the report showed consistent patterns: STL and rolling mean provided the smoothest, most accurate trend predictions, particularly for LSTM and TFT models. However, the performance of these models degraded when paired with Kalman filters or EWMA, especially over longer timeframes.

What are the implications for public health forecasting pipelines?

The findings carry immediate practical significance for health authorities designing epidemic forecasting systems:

  1. Short-term planning (up to 3 months): Neural networks such as LSTM or TFT should be prioritized, but only when preceded by smoothing methods like STL or rolling mean. These combinations delivered over 60% improvements in RMSE in most cases.
  2. Medium-term outlooks (3–6 months): Tree-based models like XGBoost offer greater robustness and lower computational costs. Their performance was less sensitive to the choice of smoothing method, making them ideal for deployment in resource-constrained settings.
  3. Smoothing as a supportive, not decisive, step: While smoothing can improve model consistency, especially in noisy datasets, its effect was not statistically significant alone. Public health agencies should prioritize model architecture and tailor smoothing choices to the specific context.

The study also pointed to operational considerations. Neural models required GPU acceleration and training times roughly 3–4 times those of tree-based models, posing constraints for real-time deployment in under-resourced settings. The authors recommended combining model–smoother pairs into ensemble systems, citing prior success from platforms like the U.S. Forecast Hub.
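
One simple realization of that recommendation is an average of the individual model–smoother forecasts, optionally weighted by each pipeline's validation error. The weighting scheme below is an assumption for illustration, not the Forecast Hub's actual method (hub-style ensembles often use a mean or median of member forecasts):

```python
import numpy as np

def ensemble_forecast(forecasts, val_rmses=None):
    """Combine per-pipeline forecasts (shape: n_pipelines x horizon).
    If validation RMSEs are given, weight each pipeline by 1/RMSE;
    otherwise take a plain mean of the member forecasts."""
    forecasts = np.asarray(forecasts, dtype=float)
    if val_rmses is None:
        return forecasts.mean(axis=0)
    w = 1.0 / np.asarray(val_rmses, dtype=float)
    w /= w.sum()
    return w @ forecasts

# Illustrative: three model-smoother pipelines forecasting 4 weeks ahead
preds = [[110, 105, 98, 90],    # e.g. LSTM + rolling mean
         [120, 108, 101, 95],   # e.g. TFT + STL
         [115, 110, 99, 93]]    # e.g. XGBoost + STL
print(ensemble_forecast(preds, val_rmses=[33.9, 41.7, 43.4]))
```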

Looking forward, the authors propose several enhancements, including adaptive smoothing parameter tuning by country, ensemble modeling for variance reduction, and anomaly detection to catch real-time data aberrations. They also recommend exploring advanced architectures like diffusion models and N-BEATS for greater forecasting precision.

First published in: Devdiscourse