Providing forecasts for ultra-long time series plays a vital role in various activities, such as investment decisions, industrial production arrangements, and farm management. This paper develops a novel distributed forecasting framework to tackle the challenges of forecasting ultra-long time series by utilizing the industry-standard MapReduce framework. The proposed model combination approach facilitates distributed time series forecasting by combining the local estimators of ARIMA (AutoRegressive Integrated Moving Average) models delivered by worker nodes and minimizing a global loss function. In this way, instead of unrealistically assuming that the data generating process (DGP) of an ultra-long time series stays invariant, we make assumptions only on the DGP of subseries spanning shorter time periods. We investigate the performance of the proposed distributed ARIMA models on an electricity demand dataset. Compared to ARIMA models, our approach significantly improves computational efficiency and forecasting accuracy, for both point forecasts and prediction intervals, especially at longer forecast horizons. Moreover, we explore some potential factors that may affect the forecasting performance of our approach.
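The split, fit, and combine idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes a plain AR(1) model in place of a full ARIMA model, ordinary least squares for the local fits, and an inverse-variance weighted average as the combination rule (the weighted average is the minimizer of a weighted sum of squared distances between the global and local estimators). All function names (`simulate_ar1`, `split`, `fit_ar1`, `combine`, `forecast`) are hypothetical, and the sketch runs in a single process instead of on a MapReduce cluster.

```python
import random
from statistics import fmean


def simulate_ar1(n, phi, c=0.0, sigma=1.0, seed=0):
    """Simulate y_t = c + phi * y_{t-1} + e_t, started at the process mean."""
    rng = random.Random(seed)
    y = [c / (1.0 - phi)]
    for _ in range(n - 1):
        y.append(c + phi * y[-1] + rng.gauss(0.0, sigma))
    return y


def split(series, k):
    """Cut the series into k contiguous subseries of equal length.

    Any remainder at the end of the series is dropped in this sketch.
    """
    size = len(series) // k
    return [series[i * size:(i + 1) * size] for i in range(k)]


def fit_ar1(sub):
    """Map step (assumed): OLS fit of an AR(1) model on one subseries.

    Returns (c_hat, phi_hat, estimated variance of phi_hat).
    """
    x, y = sub[:-1], sub[1:]
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    phi = sxy / sxx
    c = my - phi * mx
    rss = sum((b - c - phi * a) ** 2 for a, b in zip(x, y))
    sigma2 = rss / (n - 2)
    return c, phi, sigma2 / sxx


def combine(local):
    """Reduce step (assumed): inverse-variance weighted average.

    For scalar weights w_k, the weighted mean minimizes
    sum_k w_k * ||theta - theta_k||^2 over theta = (c, phi).
    Using the variance of phi_hat as the weight for both
    parameters is a simplification made for this sketch.
    """
    weights = [1.0 / var for _, _, var in local]
    total = sum(weights)
    c = sum(w * est[0] for w, est in zip(weights, local)) / total
    phi = sum(w * est[1] for w, est in zip(weights, local)) / total
    return c, phi


def forecast(last, c, phi, h):
    """Iterate the combined AR(1) recursion h steps ahead."""
    out = []
    for _ in range(h):
        last = c + phi * last
        out.append(last)
    return out


if __name__ == "__main__":
    series = simulate_ar1(10_000, phi=0.6, c=1.0, seed=42)
    local = [fit_ar1(sub) for sub in split(series, 10)]  # "map"
    c_hat, phi_hat = combine(local)                      # "reduce"
    print("combined estimates:", c_hat, phi_hat)
    print("5-step forecast:", forecast(series[-1], c_hat, phi_hat, 5))
```

Each call to `fit_ar1` only sees its own subseries, so the local fits could run on separate worker nodes, with only the small tuples of estimates sent back for the combination step.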
By Feng Li
Dr. Feng Li is an Associate Professor of Statistics in the School of Statistics and Mathematics at Central University of Finance and Economics in Beijing, China. Feng obtained his Ph.D. degree in Statistics from Stockholm University, Sweden in 2013. His research interests include Bayesian computation, econometrics and forecasting, and distributed learning. His recent research has appeared in statistics and forecasting journals such as the International Journal of Forecasting and Statistical Analysis and Data Mining, AI journals such as Expert Systems with Applications, and medical journals such as BMJ Open.