Abstract: In this work, we develop a distributed least squares approximation (DLSA) method that can solve a large family of regression problems (e.g., linear regression, logistic regression, Cox’s model) on a distributed system. By approximating the local objective function with a local quadratic form, we obtain a combined estimator as a weighted average of the local estimators. The resulting estimator is proved to be statistically as efficient as the global estimator, while requiring only one round of communication. We further conduct shrinkage estimation based on the DLSA estimator using an adaptive Lasso approach. The solution can be obtained easily with the LARS algorithm on the master node. We show theoretically that the resulting estimator enjoys the oracle property and is selection consistent when a newly designed distributed Bayesian information criterion (DBIC) is used. The finite-sample performance and the computational efficiency are further illustrated by an extensive numerical study and an airline dataset, which is 52GB in memory size. The entire methodology has been implemented in Python for the de facto standard Spark system. Using the proposed DLSA algorithm on the Spark system, it takes 26 minutes to obtain a logistic regression estimator, whereas a full likelihood algorithm takes 15 hours to reach an inferior result.
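The one-round weighted-average idea described in the abstract above can be illustrated with a minimal sketch for the linear regression case. This is not the authors' Spark implementation: the function names `local_fit` and `combine` are hypothetical, the data are simulated, and the choice of weights (each worker's sample share times its local second-moment matrix) is an assumption made for illustration. Each worker sends only its local estimate, its local matrix, and its sample size to the master, which then forms the matrix-weighted average.

```python
import numpy as np

def local_fit(X, y):
    """Worker step (hypothetical helper): local OLS estimate plus the
    local quadratic-form matrix X'X / n and the local sample size."""
    n = X.shape[0]
    gram = X.T @ X
    theta = np.linalg.solve(gram, X.T @ y)
    return theta, gram / n, n

def combine(local_results):
    """Master step (hypothetical helper): matrix-weighted average of the
    local estimates, weighted by sample share times the local matrix."""
    total_n = sum(n for _, _, n in local_results)
    p = local_results[0][0].shape[0]
    weight_sum = np.zeros((p, p))
    weighted_theta = np.zeros(p)
    for theta, H, n in local_results:
        share = n / total_n
        weight_sum += share * H
        weighted_theta += share * (H @ theta)
    return np.linalg.solve(weight_sum, weighted_theta)

# Simulated data, split across four "workers" (an assumption for the demo).
rng = np.random.default_rng(0)
beta = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(3000, 3))
y = X @ beta + rng.normal(size=3000)
parts = np.array_split(np.arange(3000), 4)

# One round of communication: each worker reports (theta, H, n) once.
local_results = [local_fit(X[idx], y[idx]) for idx in parts]
theta_dlsa = combine(local_results)

# Global estimator computed on the pooled data, for comparison.
theta_global = np.linalg.solve(X.T @ X, X.T @ y)
print(theta_dlsa, theta_global)
```

In this linear setting the combined estimate coincides with the pooled least squares estimate up to floating-point error, because the weighted sums reproduce the pooled normal equations. For models such as logistic regression the local quadratic form is only an approximation, so agreement with the global estimator would hold asymptotically rather than exactly.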
Authors: Matthias Anderer and Feng Li
Abstract: Hierarchical forecasting with intermittent time series is a challenge in both research and empirical studies. The overall forecasting performance is heavily affected by the forecasting accuracy of the intermittent time series at the bottom levels. In this paper, we present a forecast reconciliation approach that treats the bottom-level forecast as latent to ensure higher forecasting accuracy on the upper levels of the hierarchy. We employ a pure deep learning forecasting approach, N-BEATS, for the continuous time series at the top levels and a widely used tree-based algorithm, LightGBM, for the intermittent time series at the bottom level. The hierarchical forecasting with alignment approach is simple and straightforward to implement in practice. It sheds light on an orthogonal direction for forecast reconciliation: when an optimal reconciliation is difficult to find, allowing suboptimal forecasts at a lower level could retain a high overall performance. The approach in this empirical study was developed by the first author during the M5 Forecasting Accuracy competition, where it ranked second. The approach is business-oriented and could be beneficial for strategic business planning.