My team and I develop open-source statistical and machine learning software for large-scale data on the Apache Spark distributed computing
platform. Detailed software information is available at my code repository and my KLLAB repository.
Statistical packages
Package | Description | Language | Environment | Link |
gratis | Efficient algorithms for generating time series with diverse and controllable characteristics | R | All | CRAN GitHub |
febama | Feature-based Bayesian Forecasting Model Averaging | R | All | GitHub |
fide | Feature-based Intermittent DEmand forecasting | R | All | GitHub |
fuma | Forecast uncertainty based on model averaging | R | All | GitHub |
fformpp | FFORMPP: Feature-based FORecast Model Performance Prediction | R | All | GitHub |
dng | Distribution and Gradients for Skewed Distributions | R | All | CRAN GitHub |
pyhts | A Python package for hierarchical forecasting, inspired by the hts package in R | Python | All | GitHub PyPI |
dlsa | Distributed Least Squares Approximation implemented with Apache Spark | Python | Spark | GitHub |
darima | Distributed ARIMA models implemented with Apache Spark | Python | Spark | GitHub |
dqr | Distributed Quantile Regression by Pilot Sampling and One-Step Updating | Python | Spark | GitHub |
cdcopula | Covariate-dependent copula models | R | All | GitHub |
movingknots | Efficient Bayesian Multivariate Surface Regression | R | All | GitHub |
flutils | A collection of R functions required by my other packages | R | All | GitHub |
GSM | MATLAB code for Flexible Modeling of Conditional Distributions using Smooth Mixtures of Asymmetric Student T Densities | MATLAB | All | GitHub |
Miscellaneous 🔓
I have some fun stuff as well.