My team and I develop open-source statistical and machine learning software for large-scale data running on the Apache Spark distributed computing
platform. Detailed software information is available at my code repository and my KLLAB repository.
Statistical packages
| Package | Description | Language | Environment | Link |
| --- | --- | --- | --- | --- |
| riseforecast | General Recovery Forecasting with RISE | Python | All | GitHub |
| gratis | Efficient algorithms for generating time series with diverse and controllable characteristics | R, Python | All | CRAN, GitHub |
| febama | Feature-based Bayesian Forecasting Model Averaging | R, Python | All | GitHub |
| fide | Feature-based Intermittent DEmand forecasting | R | All | GitHub |
| fuma | Forecast uncertainty based on model averaging | R | All | GitHub |
| fformpp | FFORMPP: Feature-based FORecast Model Performance Prediction | R, Python | All | GitHub |
| dng | Distribution and Gradients for Skewed Distributions | R, Python | All | CRAN, GitHub |
| pyhts | A Python package for hierarchical forecasting, inspired by the hts package in R | Python | All | GitHub, PyPI |
| dlsa | Distributed Least Squares Approximation implemented with Apache Spark | Python | Spark | GitHub |
| darima | Distributed ARIMA models implemented with Apache Spark | Python | Spark | GitHub |
| dqr | Distributed Quantile Regression by Pilot Sampling and One-Step Updating | Python | Spark | GitHub |
| cdcopula | Covariate-dependent copula models | R | All | GitHub |
| movingknots | Efficient Bayesian Multivariate Surface Regression | Python | All | GitHub |
| GSM | Flexible Modeling of Conditional Distributions using Smooth Mixtures | Python | All | GitHub |
Miscellaneous 🔓
I have some fun stuff as well.
