Discrete Distributions

Bernoulli Distribution

  • Models: Single binary outcome (success/failure).
  • Example: Flipping a coin.
  • PMF: $P(X = x) = p^x (1 - p)^{1 - x}$ for $x \in \{0, 1\}$, where $p$ is the probability of success.

Binomial Distribution

X ~ Bin(n, p) is the sum of n i.i.d. Bern(p) RVs.

  • Models: Number of successes in n independent Bernoulli trials.
  • Example: Number of heads in n coin flips.
  • PMF: $P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$ for $k = 0, 1, \dots, n$.
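
The binomial PMF, $\binom{n}{k} p^k (1 - p)^{n - k}$, can be evaluated directly with the standard library. The following is a minimal sketch; the function name `binomial_pmf` and the choice of 10 fair coin flips are illustrative assumptions, not part of the notes.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Bin(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    if k < 0 or k > n:
        return 0.0  # outside the support {0, ..., n}
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Illustrative choice: exactly 5 heads in 10 fair coin flips.
p_five_heads = binomial_pmf(5, 10, 0.5)

# Sanity check: the PMF summed over the whole support should be 1.
total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))

print(p_five_heads, total)
```

Summing the PMF over the support is a cheap way to catch an off-by-one error in the exponents.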

Multinomial Distribution

  • Models: Generalization of the binomial distribution for more than two outcomes.
  • Example: Rolling a die 10 times and counting the occurrences of each face.
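  • PMF: $P(X_1 = x_1, \dots, X_k = x_k) = \frac{n!}{x_1! \cdots x_k!} \, p_1^{x_1} \cdots p_k^{x_k}$
    where $\sum_{i=1}^{k} x_i = n$ and $\sum_{i=1}^{k} p_i = 1$.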
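
As a rough sketch of the die example, the multinomial PMF $\frac{n!}{x_1! \cdots x_k!} p_1^{x_1} \cdots p_k^{x_k}$ can be computed as below. The helper name `multinomial_pmf` and the particular face counts are assumptions chosen for illustration.

```python
from math import factorial


def multinomial_pmf(counts, probs):
    """P(X_1 = x_1, ..., X_k = x_k) = n! / (x_1! ... x_k!) * p_1^x_1 ... p_k^x_k."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    n = sum(counts)
    coefficient = factorial(n)
    for x in counts:
        coefficient //= factorial(x)  # exact: each running quotient is an integer
    probability = float(coefficient)
    for x, p in zip(counts, probs):
        probability *= p**x
    return probability


# Illustrative outcome of 10 rolls of a fair die: faces 1-4 twice, faces 5-6 once.
fair_die = [1 / 6] * 6
p_example = multinomial_pmf([2, 2, 2, 2, 1, 1], fair_die)

# With only two categories the multinomial reduces to the binomial.
p_binomial_case = multinomial_pmf([5, 5], [0.5, 0.5])

print(p_example, p_binomial_case)
```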

Hypergeometric Distribution

  • Models: Number of successes in n draws without replacement from a finite population.
  • Example: Drawing 5 cards from a deck and counting the number of aces.
  • PMF: $P(X = k) = \frac{\binom{K}{k} \binom{N - K}{n - k}}{\binom{N}{n}}$
    where $N$ is the population size, $K$ is the number of successes in the population, and $n$ is the sample size.
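
The card example can be checked numerically. This is a minimal sketch assuming a standard 52-card deck with 4 aces; the function name `hypergeom_pmf` and its argument order are hypothetical.

```python
from math import comb


def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """P(X = k): k successes in n draws without replacement,
    from a population of size N containing K successes."""
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0  # outside the support
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


# Distribution of the number of aces in a 5-card hand (N = 52, K = 4, n = 5).
ace_distribution = [hypergeom_pmf(k, 52, 4, 5) for k in range(5)]
p_at_least_one_ace = 1.0 - ace_distribution[0]

print(ace_distribution, p_at_least_one_ace)
```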

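Negative Hypergeometric Distribution

  • Models: Number of draws (without replacement) needed to achieve r successes.
  • Example: Number of cards that must be drawn to collect all 4 aces.
  • PMF: $P(X = n) = \frac{\binom{n - 1}{r - 1} \binom{N - n}{K - r}}{\binom{N}{K}}$ for $n = r, \dots, N - K + r$
    where $N$ is the population size and $K$ is the number of successes in the population.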
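
A sketch of the ace-collecting example follows. It assumes the convention in which X counts the total number of draws up to and including the r-th success, so that $P(X = n) = \binom{n-1}{r-1}\binom{N-n}{K-r} / \binom{N}{K}$. Other references count failures instead, and the name `neg_hypergeom_pmf` is hypothetical.

```python
from math import comb


def neg_hypergeom_pmf(n: int, N: int, K: int, r: int) -> float:
    """P(the r-th success occurs on draw n), drawing without replacement
    from a population of size N containing K successes."""
    if n < r or n > N - K + r:
        return 0.0  # outside the support
    return comb(n - 1, r - 1) * comb(N - n, K - r) / comb(N, K)


# Draws needed to collect all 4 aces from a 52-card deck (N = 52, K = 4, r = 4).
support = range(4, 53)
total = sum(neg_hypergeom_pmf(n, 52, 4, 4) for n in support)
expected_draws = sum(n * neg_hypergeom_pmf(n, 52, 4, 4) for n in support)

print(total, expected_draws)
```

Under this convention the mean is $r(N + 1)/(K + 1)$, which the computed `expected_draws` can be compared against.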

Geometric Distribution

  • Models: Number of trials until the first success.
  • Example: Number of flips until first heads.
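  • PMF: $P(X = k) = (1 - p)^{k - 1} p$ for $k = 1, 2, \dots$ (counting the trial on which the first success occurs).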

Negative Binomial Distribution

X ~ NBin(r, p) is the sum of r i.i.d. Geom(p) RVs.

  • Models: Number of trials needed to achieve r successes (inclusive of the r-th successful trial).
  • Example: Number of coin flips required to get 3 heads.
  • PMF: $P(X = n) = \binom{n - 1}{r - 1} p^r (1 - p)^{n - r}$ for $n = r, r + 1, \dots$
    where $r$ is the number of successes.
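
The coin-flip example can be sketched as follows, assuming the "number of trials" convention, $P(X = n) = \binom{n-1}{r-1} p^r (1-p)^{n-r}$. The function name `neg_binomial_pmf` and the truncation point of 200 trials are illustrative assumptions.

```python
from math import comb


def neg_binomial_pmf(n: int, r: int, p: float) -> float:
    """P(the r-th success occurs on trial n) for independent Bernoulli(p) trials."""
    if n < r:
        return 0.0  # at least r trials are needed for r successes
    return comb(n - 1, r - 1) * p**r * (1 - p) ** (n - r)


# Number of fair coin flips needed to see 3 heads.
p_three_flips = neg_binomial_pmf(3, 3, 0.5)  # all of the first three flips are heads
p_four_flips = neg_binomial_pmf(4, 3, 0.5)

# Truncated mean; the tail beyond 200 trials is negligible for p = 0.5.
approx_mean = sum(n * neg_binomial_pmf(n, 3, 0.5) for n in range(3, 201))

print(p_three_flips, p_four_flips, approx_mean)
```

Because X is a sum of r Geom(p) variables, the mean should come out close to $r/p$.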

Poisson Distribution

X ~ Pois(λ) models the number of events that occur in a unit of space or time. λ is the expected number of events.

Derivation

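Suppose you have n trials, each with probability $\lambda / n$ of success. Then the probability of r successes can be modelled by the binomial distribution:

$P(X = r) = \binom{n}{r} \left(\frac{\lambda}{n}\right)^r \left(1 - \frac{\lambda}{n}\right)^{n - r}$

As n tends to infinity, we have

$\binom{n}{r} \frac{1}{n^r} \to \frac{1}{r!}$ and $\left(1 - \frac{\lambda}{n}\right)^{n - r} \to e^{-\lambda}$

This gives rise to the Poisson pmf:

$P(X = r) = \frac{\lambda^r e^{-\lambda}}{r!}$ for $r = 0, 1, 2, \dots$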
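
The limit can be checked numerically by comparing Bin(n, λ/n) with the Poisson probability $\lambda^r e^{-\lambda} / r!$ as n grows. This is a small sketch; the values λ = 2 and r = 3 and the function names are arbitrary illustrative choices.

```python
from math import comb, exp, factorial


def binomial_pmf(r: int, n: int, p: float) -> float:
    """P(X = r) for X ~ Bin(n, p)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


def poisson_pmf(r: int, lam: float) -> float:
    """P(X = r) for X ~ Pois(lam): lam^r * e^(-lam) / r!."""
    return lam**r * exp(-lam) / factorial(r)


lam, r = 2.0, 3  # illustrative values
limit = poisson_pmf(r, lam)

# Absolute gap between Bin(n, lam / n) and the Poisson limit for growing n.
errors = {n: abs(binomial_pmf(r, n, lam / n) - limit) for n in (10, 100, 1000, 10000)}

for n, err in errors.items():
    print(n, err)
```

The gap shrinks as n increases, which is the convergence the derivation describes.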

Discrete Uniform Distribution

  • Models: All outcomes in a finite set are equally likely.
  • Example: Rolling a fair die.
  • PMF: $P(X = x) = \frac{1}{n}$ for each of the $n$ outcomes in the set.

Continuous Distributions

Continuous Uniform Distribution

Exponential Distribution

  • Models: Time between events in a Poisson process.
  • Example: Time between incoming calls.
  • PDF: $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$, where $\lambda$ is the rate of the Poisson process.

Gamma Distribution

X ~ Gamma(n, λ) models the amount of time until n events occur. E.g. time until the n-th earthquake.

  • Gamma(n, λ) = sum of n i.i.d. Expo(λ) RVs
  • Gamma(1, λ) ∼ Expo(λ)
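
The sum-of-exponentials view can be illustrated by simulation. This is a sketch under assumed values (n = 3 events, rate λ = 2, a fixed seed, 20,000 samples); the helper name `sample_gamma_as_sum` is hypothetical. The sample mean and variance should land near n/λ and n/λ².

```python
import random


def sample_gamma_as_sum(n: int, lam: float, rng: random.Random) -> float:
    """Draw one Gamma(n, lam) sample as the sum of n i.i.d. Expo(lam) waiting times."""
    return sum(rng.expovariate(lam) for _ in range(n))


rng = random.Random(0)  # fixed seed so the run is reproducible
n_events, rate = 3, 2.0  # illustrative values

samples = [sample_gamma_as_sum(n_events, rate, rng) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / (len(samples) - 1)

print(sample_mean, n_events / rate)      # empirical vs. theoretical mean
print(sample_var, n_events / rate**2)    # empirical vs. theoretical variance
```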

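Shape-Rate Parameterization: the preferred parameterization for Bayesian stats

$f(x; \alpha, \beta) = \frac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha - 1} e^{-\beta x}$ for $x > 0$

where $\Gamma(\alpha) = \int_0^\infty t^{\alpha - 1} e^{-t} \, dt$ is the Gamma function

Shape-Scale Parameterization: models the waiting time until the $k$-th event when each event occurs on average every $\theta$ units of time.

$f(x; k, \theta) = \frac{x^{k - 1} e^{-x / \theta}}{\Gamma(k) \, \theta^k}$ for $x > 0$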

Weibull Distribution

  • Models: Lifetimes of objects.
  • Example: Time to failure of a machine.
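  • PDF: $f(x; \lambda, k) = \frac{k}{\lambda} \left(\frac{x}{\lambda}\right)^{k - 1} e^{-(x / \lambda)^k}$ for $x \ge 0$, where $\lambda$ is the scale parameter and $k$ is the shape parameter.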

Pareto Distribution

  • Models: Heavy-tailed distributions, often used to model situations where a small number of occurrences account for the majority of the effect.
  • Example: The distribution of wealth in a population, where a small percentage of people hold most of the wealth.
  • PDF: $f(x) = \frac{\alpha x_m^\alpha}{x^{\alpha + 1}}$ for $x \ge x_m$
    where $x_m$ is the scale parameter (minimum value) and $\alpha$ is the shape parameter.

Normal (Gaussian) Distribution

Bivariate Normal Distribution

Multivariate Normal Distribution

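$f(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^k |\Sigma|}} \exp\left(-\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^\top \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})\right)$

where $\mathbf{x}$ is a k-dimensional vector, $\boldsymbol{\mu}$ is the mean vector, and $\Sigma$ is the covariance matrix.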

Log-Normal Distribution

  • Models: Multiplicative processes.
  • Example: Stock prices.
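  • PDF: $f(x) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right)$ for $x > 0$, where $\mu$ and $\sigma$ are the mean and standard deviation of $\ln X$.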

Chi-Square Distribution

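  • Models: Sum of squares of k independent standard normal variables.
  • Example: Goodness-of-fit tests.
  • PDF: $f(x; k) = \frac{x^{k/2 - 1} e^{-x/2}}{2^{k/2} \, \Gamma(k/2)}$ for $x > 0$, where $k$ is the degrees of freedom.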

F-Distribution

  • Models: Ratio of two scaled chi-square distributions.
  • Example: ANOVA testing.
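  • PDF: $f(x; d_1, d_2) = \frac{1}{x \, B(d_1/2, \, d_2/2)} \sqrt{\frac{(d_1 x)^{d_1} \, d_2^{d_2}}{(d_1 x + d_2)^{d_1 + d_2}}}$ for $x > 0$
    where $d_1$ and $d_2$ are the degrees of freedom.

For $d_2 > 2$: $E[X] = \frac{d_2}{d_2 - 2}$

For $d_2 > 4$: $\mathrm{Var}(X) = \frac{2 d_2^2 (d_1 + d_2 - 2)}{d_1 (d_2 - 2)^2 (d_2 - 4)}$

The variance is undefined for $d_2 \le 2$ (and infinite for $2 < d_2 \le 4$).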

Beta Distribution

  • Models: Distribution of probabilities.
  • Example: Distribution of success rates.
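  • PDF: $f(x; \alpha, \beta) = \frac{x^{\alpha - 1} (1 - x)^{\beta - 1}}{B(\alpha, \beta)}$ for $0 \le x \le 1$

where $B(\alpha, \beta) = \frac{\Gamma(\alpha) \Gamma(\beta)}{\Gamma(\alpha + \beta)}$ or $B(\alpha, \beta) = \int_0^1 t^{\alpha - 1} (1 - t)^{\beta - 1} \, dt$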

Dirichlet Distribution

  • Models: Probabilities of outcomes in a multinomial distribution.
  • Example: Proportion of time spent on different activities during a day.
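  • PDF: $f(x_1, \dots, x_K; \alpha_1, \dots, \alpha_K) = \frac{1}{B(\boldsymbol{\alpha})} \prod_{i=1}^{K} x_i^{\alpha_i - 1}$
    where $B(\boldsymbol{\alpha}) = \frac{\prod_{i=1}^{K} \Gamma(\alpha_i)}{\Gamma\left(\sum_{i=1}^{K} \alpha_i\right)}$ is the multinomial Beta function.
    where $x_i \ge 0$ and $\sum_{i=1}^{K} x_i = 1$.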

t-Distribution

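  • Models: Distribution of the standardized sample mean when the population variance is unknown and estimated from the sample.
  • Example: Testing hypotheses about means.
  • PDF: $f(t) = \frac{\Gamma\left(\frac{\nu + 1}{2}\right)}{\sqrt{\nu \pi} \, \Gamma\left(\frac{\nu}{2}\right)} \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu + 1}{2}}$
    where $\nu$ is the degrees of freedom.

For $\nu > 2$: $\mathrm{Var}(T) = \frac{\nu}{\nu - 2}$

For $1 < \nu \le 2$: $\mathrm{Var}(T) = \infty$

The variance is undefined for $\nu \le 1$.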

Cauchy Distribution

  • Models: Distributions with heavy tails.
  • Example: Resonance behavior.
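  • PDF: $f(x; x_0, \gamma) = \frac{1}{\pi \gamma \left[1 + \left(\frac{x - x_0}{\gamma}\right)^2\right]}$, where $x_0$ is the location parameter and $\gamma > 0$ is the scale parameter.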

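Statistical Distance

A statistical distance measures how different two probability distributions P and Q are from each other.

  • asymmetric measure:
    • Kullback-Leibler Divergence: $D_{KL}(P \| Q) = \sum_x P(x) \log \frac{P(x)}{Q(x)}$
      • MLE can be seen as minimizing the KL divergence from the empirical data distribution to the model distribution
  • symmetric measures:
    • total variation distance: $\delta(P, Q) = \frac{1}{2} \sum_x |P(x) - Q(x)|$
    • Hellinger distance: $H(P, Q) = \frac{1}{\sqrt{2}} \sqrt{\sum_x \left(\sqrt{P(x)} - \sqrt{Q(x)}\right)^2}$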
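
The three measures can be compared on a small discrete example. This is a minimal sketch assuming P and Q are given as probability lists over the same finite set of outcomes; the function names and the two example distributions are illustrative.

```python
from math import inf, log, sqrt


def kl_divergence(p, q):
    """D_KL(P || Q) = sum_x P(x) * log(P(x) / Q(x)), using the natural log."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # terms with P(x) = 0 contribute nothing
        if qi == 0:
            return inf  # P puts mass where Q has none
        total += pi * log(pi / qi)
    return total


def total_variation(p, q):
    """Half the L1 distance between the two probability vectors."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def hellinger(p, q):
    """(1 / sqrt(2)) times the L2 distance between the square-root vectors."""
    return sqrt(sum((sqrt(pi) - sqrt(qi)) ** 2 for pi, qi in zip(p, q))) / sqrt(2)


P = [0.5, 0.5]  # fair coin
Q = [0.9, 0.1]  # heavily biased coin

print(kl_divergence(P, Q), kl_divergence(Q, P))  # differ: KL is asymmetric
print(total_variation(P, Q), total_variation(Q, P))
print(hellinger(P, Q), hellinger(Q, P))
```

Swapping the arguments changes the KL divergence but leaves the total variation and Hellinger distances unchanged.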