Background
The Distributed Statistical Computing course was developed and taught by Dr. Feng Li in 2014 for a joint master’s program in statistics among Peking University, Renmin University of China, Central University of Finance and Economics, University of Chinese Academy of Sciences, and Capital University of Economics and Business.
Dr. Feng Li has also offered this course in the Business Analytics program at Peking University since 2020.
Prerequisites
- Basic knowledge of statistics
- Basic knowledge of computing
Literature
- Distributed statistical computing [New online book | Print version]
- Lecture notes
- Demo Hadoop/Spark configurations
- Basic Hadoop/Spark Tutorial for Statisticians
Teaching Videos
- Chinese-language teaching videos are available at https://space.bilibili.com/509963672
Slides and lecture notes
Read with the online Jupyter Notebook viewer
- Download all Jupyter Notebooks in a zip file.
- Download all data in a zip file.
Part I: Distributed Systems and Distributed Computing
Part II: Advanced Distributed Statistical Computing
Topic | Material |
---|---|
L10.1: Big Data Visualization: Challenges and Viabilities | HTML |
L10.2: Statistical Elements of Big Data Visualization | HTML |
L10.3: Computational Aspects of Big Data Visualization | |
L11: Distributed Statistical Computing: State of the Art | |
L11: Least-Square Approximation for a Distributed System | Paper, Code |
L12: Distributed ARIMA models for ultra-long time series | Paper, Code |
L13: Distributed Quantile Regression by Pilot Sampling and One-Step Updating | Paper, Code |
L14: Bayesian Forecasting with Distributed VAR models | |