New Paper: Distributed ARIMA Models for Ultra-long Time Series

Authors: Xiaoqian Wang, Yanfei Kang, Rob J Hyndman and Feng Li

Providing forecasts for ultra-long time series plays a vital role in various activities, such as investment decisions, industrial production arrangements, and farm management. This paper develops a novel distributed forecasting framework to tackle challenges associated with forecasting ultra-long time series by utilizing the industry-standard MapReduce framework. The proposed model combination approach facilitates distributed time series forecasting by combining the local estimators of ARIMA (AutoRegressive Integrated Moving Average) models delivered from worker nodes and minimizing a global loss function. In this way, instead of unrealistically assuming the data generating process (DGP) of an ultra-long time series stays invariant, we make assumptions only on the DGP of subseries spanning shorter time periods. We investigate the performance of the proposed distributed ARIMA models on an electricity demand dataset. Compared to ARIMA models, our approach results in significantly improved forecasting accuracy and computational efficiency both in point forecasts and prediction intervals, especially for longer forecast horizons. Moreover, we explore some potential factors that may affect the forecasting performance of our approach.

Links: Working Paper | Spark Implementation
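To make the map-and-combine idea in the abstract concrete, here is a minimal sketch in Python. It is not the authors' Spark implementation. It makes several simplifying assumptions: each "worker" fits a plain AR(p) model by ordinary least squares rather than a full ARIMA model, the workers are simulated by a list comprehension rather than MapReduce, and the global loss is assumed to be a weighted least-squares distance between the global and local estimators. The function names `fit_ar_ols` and `combine_local_estimators` are hypothetical.

```python
import numpy as np

def fit_ar_ols(y, p):
    """Map step (assumed form): fit AR(p) with intercept by OLS on one subseries.

    Returns the local coefficient vector and its estimated covariance matrix.
    """
    n = len(y)
    # Design matrix: intercept column followed by lags 1..p.
    X = np.column_stack([np.ones(n - p)] + [y[p - j:n - j] for j in range(1, p + 1)])
    target = y[p:]
    XtX = X.T @ X
    beta = np.linalg.solve(XtX, X.T @ target)
    resid = target - X @ beta
    sigma2 = resid @ resid / (len(target) - X.shape[1])
    cov = sigma2 * np.linalg.inv(XtX)
    return beta, cov

def combine_local_estimators(local_fits):
    """Reduce step (assumed form): minimize sum_k (b - b_k)' S_k^{-1} (b - b_k).

    The minimizer is the precision-weighted average of the local estimators.
    """
    precisions = [np.linalg.inv(cov) for _, cov in local_fits]
    total_precision = sum(precisions)
    weighted_sum = sum(P @ beta for P, (beta, _) in zip(precisions, local_fits))
    return np.linalg.solve(total_precision, weighted_sum)

# Simulate a long AR(2) series as stand-in data (not the electricity dataset).
rng = np.random.default_rng(0)
n = 20000
phi = np.array([0.6, -0.3])
eps = rng.normal(size=n)
y = np.zeros(n)
for t in range(2, n):
    y[t] = phi[0] * y[t - 1] + phi[1] * y[t - 2] + eps[t]

# Split into subseries, fit locally, then combine into one global estimator.
subseries = np.array_split(y, 10)
local_fits = [fit_ar_ols(block, p=2) for block in subseries]
global_coef = combine_local_estimators(local_fits)  # [intercept, phi_1, phi_2]
```

In this sketch each subseries is modelled on its own, so no assumption is placed on the whole series sharing one data generating process. Only the small coefficient vectors and covariance matrices need to travel from the workers to the combining step.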

Categorized as Paper

Paper: Credit Risk Clustering with Covariate-dependent Copula Models

With Zhuojing He of the Business School, Central University of Finance and Economics.


Understanding how corporate defaults cluster is particularly important for managing the risk of corporate debt portfolios. In this paper, we discuss the dynamic nature of credit risk clustering between pairs of firms that belong to the same corporate family in China. We reparameterize the Joe-Clayton copula so that the tail-dependence coefficient enters the model directly, which lets us estimate the tail-dependence structure of credit risk. We then use both macroeconomic and firm-specific covariates to study the dynamics of the lower tail-dependence coefficient of distance-to-default, which measures credit risk clustering, and to identify the driving forces behind that clustering. Empirical results indicate that both macroeconomic and firm-specific covariates play important roles in the time-varying features of credit risk clustering. However, the effects of these covariates differ across pairwise portfolios.

Keywords: Credit risk clustering; Covariate-dependent copulas; Tail-dependence; Distance-to-default; MCMC.
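The reparameterization mentioned in the abstract can be illustrated with a short Python sketch. It uses the standard Joe-Clayton (BB7) copula, whose lower and upper tail-dependence coefficients are 2^(-1/delta) and 2 - 2^(1/theta), and inverts those relations so the copula is specified directly by its tail-dependence coefficients. The logit link between covariates and the lower tail-dependence coefficient is an assumption made for illustration and may differ from the paper's specification. All function names are hypothetical.

```python
import numpy as np

def jc_params_from_tails(lam_lower, lam_upper):
    """Map tail-dependence coefficients in (0, 1) to Joe-Clayton (theta, delta)."""
    delta = -np.log(2.0) / np.log(lam_lower)        # inverts lam_lower = 2**(-1/delta)
    theta = np.log(2.0) / np.log(2.0 - lam_upper)   # inverts lam_upper = 2 - 2**(1/theta)
    return theta, delta

def jc_cdf(u, v, lam_lower, lam_upper):
    """Joe-Clayton copula CDF, parameterized by its tail-dependence coefficients."""
    theta, delta = jc_params_from_tails(lam_lower, lam_upper)
    a = (1.0 - (1.0 - u) ** theta) ** (-delta)
    b = (1.0 - (1.0 - v) ** theta) ** (-delta)
    return 1.0 - (1.0 - (a + b - 1.0) ** (-1.0 / delta)) ** (1.0 / theta)

def lower_tail_from_covariates(x, beta):
    """Assumed logit link: covariates x and coefficients beta give a value in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-(x @ beta)))

# Illustrative values only; these are not estimates from the paper.
x_t = np.array([1.0, 0.5, -1.2])      # intercept plus two hypothetical covariates
beta = np.array([-0.4, 0.8, 0.3])     # hypothetical coefficients
lam_lower_t = lower_tail_from_covariates(x_t, beta)
theta_t, delta_t = jc_params_from_tails(lam_lower_t, lam_upper=0.2)
joint_prob = jc_cdf(0.05, 0.05, lam_lower_t, 0.2)   # P(U <= 0.05, V <= 0.05)
```

Working with the tail-dependence coefficient rather than the raw copula parameters means a covariate effect can be read directly as a change in the strength of joint extreme co-movement.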

Supplementary Material

In our study, we apply our method to a total of 45 firm pairs. Of these, 39 pairs show significant results, three of which are already discussed in the paper. The supplementary material reports the empirical credit risk clustering results for the remaining 36 significant pairs not listed in the paper.
