Distributed Statistical Computing

Background

The course was developed and taught by Dr Feng Li in 2014 for a joint master’s program in statistics involving several prestigious universities: Peking University, Renmin University of China, Central University of Finance and Economics, University of Chinese Academy of Sciences, and Capital University of Economics and Business.

Since 2020, Feng Li has also offered this course in the Business Analytics program at Peking University.

Prerequisite

  • Basic knowledge of statistics
  • Basic knowledge of computing

Literature

Lecture notes

L01-Introduction
L02-MapReduce
L03-Statistical-Modeling-with-MapReduce
L04-Hive
L05-Introduction-to-Spark
L06-Data-Processing-with-Spark
L07-Machine-Learning-with-Spark
L08-Advanced-Statistical-Modelling-with-Spark
L09-Modelling-Streaming-Data-with-Spark
L10-Spark-with-Scala