Distributions and Random Numbers

Feng Li

School of Statistics and Mathematics

Central University of Finance and Economics

feng.li@cufe.edu.cn

https://feng.li/statcomp

Basic concepts of random numbers

Preliminary

  • Pseudo random numbers

    • A pseudo random number generator (PRNG) is an algorithm for generating a sequence of numbers that approximates the properties of random numbers.

    • The sequence is not truly random in that it is completely determined by a relatively small set of initial values, called the PRNG’s state.

    • Pseudo random numbers are important in practice for their speed in number generation and their reproducibility.

  • Random seed

    A random seed (or seed state, or just seed) is a number (or vector) used to initialize a pseudo random number generator.

  • The most important random numbers are uniformly distributed numbers, because random numbers from other distributions are built from them.

    > runif(n,a,b)

  • Numbers selected from a non-uniform probability distribution can be generated using a uniform distribution PRNG and a function that relates the two distributions.

  • Assume you have uniformly distributed random numbers on [0, 1]. How do you extend them to [a, b]?
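
The points above on seeds, rescaling, and transforming uniform numbers can be sketched in a few lines of R. This is a minimal illustration, not the only approach: the seed value 42, the endpoints a and b, and the rate are arbitrary choices made here, and the exponential distribution is picked only as one convenient example of a function relating a non-uniform distribution to the uniform one.

```r
# Re-seeding the generator reproduces exactly the same sequence.
set.seed(42)                 # 42 is an arbitrary seed chosen for illustration
u1 <- runif(5)
set.seed(42)
u2 <- runif(5)
identical(u1, u2)            # TRUE

# Rescale U ~ Uniform(0, 1) to Uniform(a, b) with a + (b - a) * U.
a <- 2; b <- 5               # hypothetical interval endpoints
set.seed(42)
x_scaled <- a + (b - a) * runif(5)
set.seed(42)
x_direct <- runif(5, min = a, max = b)
all.equal(x_scaled, x_direct)

# Inverse transform: if U ~ Uniform(0, 1), then -log(1 - U) / rate
# follows an exponential distribution with the given rate.
rate <- 2                    # hypothetical rate parameter
set.seed(42)
u <- runif(10000)
x_exp <- -log(1 - u) / rate
mean(x_exp)                  # should be close to 1 / rate
```

The last step works because the exponential CDF, 1 - exp(-rate * x), can be inverted in closed form. Distributions without a closed-form inverse CDF generally need other techniques.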

Continuous random variables

Normal Distribution

  • The normal density function $$f(x, \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2} }$$

    > dnorm(x,mu,sigma)

    > dnorm(x,mu,sigma, log=TRUE)

    • In theory, dnorm(x,mu,sigma, log=TRUE)==log(dnorm(x,mu,sigma))

    • but dnorm(x,mu,sigma, log=TRUE) is more stable for very large values. Why?

    • We love logs.

In [6]:
dnorm(100, mean=0, sd=1)             # 0 (the density underflows)
log(dnorm(100, mean=0, sd=1))        # -Inf (log of the underflowed 0)
dnorm(100, mean=0, sd=1, log=TRUE)   # -5000.9189385332
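
One place where the log scale matters in practice is the likelihood of a sample. The sketch below assumes a simulated sample of 2000 standard normal draws (the sample size and seed are arbitrary choices for illustration) and compares the product of densities with the sum of log densities. It also checks the log density at x = 100 against the formula written out by hand, which never has to evaluate the tiny exponential.

```r
set.seed(1)                                # arbitrary seed for reproducibility
x <- rnorm(2000, mean = 0, sd = 1)         # hypothetical sample

# Multiplying 2000 densities underflows to 0 in double precision ...
lik <- prod(dnorm(x, mean = 0, sd = 1))
log(lik)                                   # -Inf

# ... while summing the log densities stays finite.
loglik <- sum(dnorm(x, mean = 0, sd = 1, log = TRUE))
loglik

# The log density can be computed directly from the formula,
# -(x - mu)^2 / (2 * sigma^2) - log(sigma * sqrt(2 * pi)).
manual <- -(100 - 0)^2 / (2 * 1^2) - log(1 * sqrt(2 * pi))
all.equal(manual, dnorm(100, mean = 0, sd = 1, log = TRUE))
```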
  • The CDF (Cumulative Distribution Function) $$\Phi(x)\;=\int_{-\infty}^x f(t, \mu, \sigma) d t$$

    > pnorm(q,mu,sigma)

  • The quantile function (given a CDF value p, what is x?), i.e. $\Phi^{-1}(p)$

    > qnorm(p,mu,sigma)

  • Random numbers from normal distribution

    > rnorm(n,mu,sigma)

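In [5]:
# Draw 100 standard normal variates and plot their histogram
z = rnorm(100, mean = 0, sd = 1)
hist(z)
In [3]:
# Evaluate the standard normal density on a grid and plot it
x = seq(-5, 5, 0.1)
y = dnorm(x = x, mean = 0, sd = 1)
plot(x, y, col = "blue", type = "l", lwd = 4)
In [4]:
# Plot the standard normal CDF on the same grid (uses x from the previous cell)
y1 = pnorm(q = x, mean = 0, sd = 1)
plot(x, y1, col = "blue", type = "l", lwd = 4)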
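
The density, CDF, quantile, and random number functions are tied together, and the links can be checked numerically. The following is a rough sketch: the evaluation point 1.96, the probabilities in p, the seed, and the sample size are all arbitrary choices for illustration.

```r
# pnorm() should agree with the numerical integral of dnorm().
area <- integrate(dnorm, lower = -Inf, upper = 1.96, mean = 0, sd = 1)
area$value
pnorm(1.96, mean = 0, sd = 1)

# qnorm() inverts pnorm(): the round trip should recover p.
p <- c(0.025, 0.5, 0.975)
q <- qnorm(p, mean = 0, sd = 1)
pnorm(q, mean = 0, sd = 1)

# The empirical CDF of simulated draws should approximate pnorm().
set.seed(123)                          # arbitrary seed
z_sim <- rnorm(10000, mean = 0, sd = 1)
mean(z_sim <= 1.96)
```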

Other types of continuous distribution

  • Distribution functions in R

    Student t: {d,p,q,r}t

    Chi squared: {d,p,q,r}chisq

    Gamma: {d,p,q,r}gamma

    Exponential: {d,p,q,r}exp

  • For a significance test, what distribution do you use?
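
As one possible answer, a test statistic that follows a Student t distribution under the null hypothesis can be turned into a p-value with pt() and compared against a critical value from qt(). The statistic 2.3 and the 15 degrees of freedom below are made-up numbers for illustration, not results from any data set. Which distribution is appropriate depends on the test being carried out.

```r
# Hypothetical test statistic and degrees of freedom.
t_stat <- 2.3
df <- 15

# Two-sided p-value from the Student t distribution.
p_t <- 2 * pt(-abs(t_stat), df = df)

# The same statistic referred to the standard normal gives a smaller
# p-value, because the t distribution has heavier tails.
p_z <- 2 * pnorm(-abs(t_stat))

# Critical value for a two-sided test at the 5% level.
crit <- qt(0.975, df = df)

c(p_t = p_t, p_z = p_z, crit = crit)
```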

Discrete random variables

  • Distribution functions in R

    Binomial: {d,p,q,r}binom

    Negative binomial: {d,p,q,r}nbinom

    Poisson: {d,p,q,r}pois

    Geometric: {d,p,q,r}geom

  • The Bernoulli distribution is a special case of the binomial distribution (a single trial, i.e. size = 1).
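
The Bernoulli case can be sketched with the binomial functions by setting size = 1. The success probability 0.3, the seed, and the sample sizes below are arbitrary values chosen for illustration.

```r
set.seed(2024)                         # arbitrary seed
p <- 0.3                               # hypothetical success probability

# Bernoulli draws are binomial draws with size = 1.
bern <- rbinom(10000, size = 1, prob = p)
mean(bern)                             # should be close to p

# The Bernoulli pmf: P(X = 0) = 1 - p and P(X = 1) = p.
pmf <- dbinom(c(0, 1), size = 1, prob = p)
pmf

# The sum of 10 independent Bernoulli(p) trials is Binomial(10, p),
# so the simulated mean of the sums should be close to 10 * p.
trials <- matrix(rbinom(10 * 5000, size = 1, prob = p), nrow = 5000)
mean(rowSums(trials))
```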

Suggested reading

  • Jones (2009): Chapters 14, 15, 16