My team and I develop open-source statistical and machine learning software for large-scale data running on the Apache Spark distributed computing
platform. Detailed software information is available at my code repository and my KLLAB repository.

Statistical packages

| Package | Description | Language | Environment | Link |
|---|---|---|---|---|
| gratis | Efficient algorithms for generating time series with diverse and controllable characteristics | R | All | CRAN |
| febama | Feature-based Bayesian Forecasting Model Averaging | R | All | GitHub |
| fide | Feature-based Intermittent DEmand forecasting | R | All | GitHub |
| fuma | Forecast uncertainty based on model averaging | R | All | GitHub |
| fformpp | FFORMPP: Feature-based FORecast Model Performance Prediction | R | All | GitHub |
| dng | Distribution and Gradients for Skewed Distributions | R | All | CRAN |
| pyhts | A Python package for hierarchical forecasting, inspired by the hts package in R | Python | All | GitHub |
| dlsa | Distributed Least Squares Approximation implemented with Apache Spark | Python | Spark | GitHub |
| darima | Distributed ARIMA models implemented with Apache Spark | Python | Spark | GitHub |
| dqr | Distributed Quantile Regression by Pilot Sampling and One-Step Updating | Python | Spark | GitHub |
| cdcopula | Covariate-dependent copula models | R | All | GitHub |
| movingknots | Efficient Bayesian Multivariate Surface Regression | R | All | GitHub |
| flutils | A collection of R functions required by my other packages | R | All | GitHub |
| GSM | Matlab code for Flexible Modeling of Conditional Distributions using Smooth Mixtures of Asymmetric Student T Densities | Matlab | All | GitHub |

Miscellaneous 🔓

I have some fun stuff as well.