My team and I develop open-source statistical and machine learning software for large-scale data running on the Apache Spark distributed computing
platform. Detailed software information is available at my code repository and my KLLAB repository.
Statistical packages
| Package | Description | Language | Environment | Link |
| --- | --- | --- | --- | --- |
| gratis | Efficient algorithms for generating time series with diverse and controllable characteristics | R | All | CRAN GitHub |
| febama | Feature-based Bayesian Forecasting Model Averaging | R | All | GitHub |
| fide | Feature-based Intermittent DEmand forecasting | R | All | GitHub |
| fuma | Forecast uncertainty based on model averaging | R | All | GitHub |
| fformpp | FFORMPP: Feature-based FORecast Model Performance Prediction | R | All | GitHub |
| dng | Distribution and Gradients for Skewed Distributions | R | All | CRAN GitHub |
| pyhts | A Python package for hierarchical forecasting, inspired by the hts package in R | Python | All | GitHub PyPI |
| dlsa | Distributed Least Squares Approximation implemented with Apache Spark | Python | Spark | GitHub |
| darima | Distributed ARIMA models implemented with Apache Spark | Python | Spark | GitHub |
| dqr | Distributed Quantile Regression by Pilot Sampling and One-Step Updating | Python | Spark | GitHub |
| cdcopula | Covariate-dependent copula models | R | All | GitHub |
| movingknots | Efficient Bayesian Multivariate Surface Regression | R | All | GitHub |
| flutils | A collection of R functions required by my other packages | R | All | GitHub |
| GSM | Matlab code for Flexible Modeling of Conditional Distributions using Smooth Mixtures of Asymmetric Student T Densities | Matlab | All | GitHub |
Miscellaneous 🔓
I have some fun stuff as well.
