the language of statistics. See distributions.

convergence

3 types

from strongest to weakest:

  • almost sure: convergence happens with probability 1, i.e. $P(\lim_{n \to \infty} X_n = X) = 1$
  • in probability: the probability of being close to the limit goes to 1, i.e. for every $\varepsilon > 0$, $P(|X_n - X| > \varepsilon) \to 0$ (simulated in the sketch below)
  • in distribution: the distribution of the variable approximates the distribution of the limit
    • $F_{X_n}(x) \to F_X(x)$ for all $x$ where $F_X$ is continuous
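
A quick way to see convergence in probability: estimate $P(|\bar{X}_n - 0.5| > \varepsilon)$ for coin flips at increasing $n$. A minimal sketch (assuming numpy; the tolerance and sample sizes are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
eps = 0.05       # tolerance around the limit (here the mean, 0.5)
reps = 5_000     # independent replications per sample size

# For each n, estimate P(|X̄_n - 0.5| > eps) by Monte Carlo.
for n in (10, 100, 2_000):
    flips = rng.integers(0, 2, size=(reps, n))    # reps runs of n fair coin flips
    means = flips.mean(axis=1)                    # sample mean X̄_n per run
    print(n, np.mean(np.abs(means - 0.5) > eps))  # fraction outside tolerance → 0
```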

sequences

If $X_n$ and $Y_n$ are sequences of random variables (RVs) such that $X_n \to X$ and $Y_n \to Y$ in probability, then:

  • $X_n + Y_n \to X + Y$ in probability
  • $X_n Y_n \to XY$ in probability
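
A minimal numeric spot check of the sum and product rules (an assumed setup: $X_n$ is the mean of $n$ coin flips, so $X = 0.5$; $Y_n$ is the mean of $n$ standard exponentials, so $Y = 1$):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
x_n = rng.integers(0, 2, size=n).mean()    # X_n → 0.5 in probability (coin flips)
y_n = rng.exponential(1.0, size=n).mean()  # Y_n → 1.0 in probability (exponentials)
print(x_n + y_n, x_n * y_n)                # ≈ 1.5 and ≈ 0.5
```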

continuous mapping theorem

If $g$ is a continuous function and $X_n \to X$ (almost surely, in probability, or in distribution), then $g(X_n) \to g(X)$ in the same mode of convergence.
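
Sketch: with $X_n = \bar{X}_n \to 0.5$ in probability and $g = \exp$, the theorem gives $\exp(\bar{X}_n) \to \exp(0.5) \approx 1.6487$ (assumed coin-flip setup, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(2)
for n in (100, 10_000, 1_000_000):
    xbar = rng.integers(0, 2, size=n).mean()  # X̄_n → 0.5 in probability
    print(n, np.exp(xbar))                    # g(X̄_n) → g(0.5) ≈ 1.6487
```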

LLN

For i.i.d. $X_1, X_2, \ldots$ with mean $\mu$, the sample mean $\bar{X}_n = \frac{1}{n} \sum_{i=1}^n X_i$ converges to $\mu$:

strong: $\bar{X}_n \to \mu$ almost surely

weak: $\bar{X}_n \to \mu$ in probability
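
A sketch of the LLN in action, assuming fair die rolls (mean 3.5); the running mean settles on the true mean:

```python
import numpy as np

rng = np.random.default_rng(3)
rolls = rng.integers(1, 7, size=1_000_000)                 # fair die, mean 3.5
running = np.cumsum(rolls) / np.arange(1, rolls.size + 1)  # X̄_n for every n
for n in (10, 1_000, 100_000, 1_000_000):
    print(n, running[n - 1])                               # settles toward 3.5
```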

CLT

For i.i.d. $X_1, X_2, \ldots$ with mean $\mu$ and finite variance $\sigma^2$:

$\sqrt{n} (\bar{X}_n - \mu) \to N(0, \sigma^2)$ in distribution

Equivalently, $\frac{\bar{X}_n - \mu}{\sigma / \sqrt{n}} \to N(0, 1)$.
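
A quick sanity check: standardize means of exponential samples (heavily skewed, so far from normal) and see that about 95% land within $\pm 1.96$, as $N(0, 1)$ predicts. A sketch, with arbitrary $n$ and replication counts:

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps = 500, 20_000
samples = rng.exponential(1.0, size=(reps, n))  # mean 1, variance 1, skewed
z = (samples.mean(axis=1) - 1.0) * np.sqrt(n)   # standardized: sqrt(n)(X̄_n - mu)/sigma
print(np.mean(np.abs(z) < 1.96))                # ≈ 0.95 if z is ~N(0, 1)
```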

Law of total variance

$\mathrm{Var}(X) = E[\mathrm{Var}(X \mid Y)] + \mathrm{Var}(E[X \mid Y])$

This is useful for separating uncertainty into systematic variation (2nd term: variability explained by Y) and random noise (1st term: variability unexplained by Y).

Intuition:

  • 1st term: average variance within groups created by Y
    • given Y, what is the residual variance of X?
  • 2nd term: variance between group means
    • how much does the expected value of X change as Y varies?
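
A numeric check of the decomposition on assumed toy data: two roughly equal-sized groups (the equal-weight averages below lean on that):

```python
import numpy as np

rng = np.random.default_rng(5)
y = rng.integers(0, 2, size=1_000_000)                 # group label Y, ~equal sizes
x = np.where(y == 0,
             rng.normal(0, 1, size=y.size),            # X | Y=0 ~ N(0, 1)
             rng.normal(3, 2, size=y.size))            # X | Y=1 ~ N(3, 4)

within  = np.mean([x[y == g].var()  for g in (0, 1)])  # ≈ E[Var(X | Y)]
between = np.var([x[y == g].mean() for g in (0, 1)])   # ≈ Var(E[X | Y])
print(x.var(), within + between)                       # both ≈ 4.75
```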

Law of total expectation

$E[X] = E[E[X \mid Y]]$

For continuous Y:

$E[X] = \int E[X \mid Y = y] \, f_Y(y) \, dy$

For discrete Y:

$E[X] = \sum_y E[X \mid Y = y] \, P(Y = y)$
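
A small worked case (an assumed example): pick a fair coin ($Y = 1$) or a coin with heads probability 0.8 ($Y = 2$), each with probability $\frac{1}{2}$, and let $X$ indicate heads:

$$E[X] = E[X \mid Y = 1] \, P(Y = 1) + E[X \mid Y = 2] \, P(Y = 2) = 0.5 \cdot \tfrac{1}{2} + 0.8 \cdot \tfrac{1}{2} = 0.65$$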

Inequalities

Cauchy-Schwarz

Recall the most common form:

$\left( \sum_i a_i b_i \right)^2 \le \left( \sum_i a_i^2 \right) \left( \sum_i b_i^2 \right)$

And the integral form:

$\left( \int f(x) g(x) \, dx \right)^2 \le \int f(x)^2 \, dx \int g(x)^2 \, dx$

The probabilistic form is:

$E[XY]^2 \le E[X^2] \, E[Y^2]$

Upper bound: this becomes an equality if there is perfect correlation, i.e. $Y = cX$ almost surely for some constant $c$.
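
A numeric spot check of the probabilistic form, with assumed correlated normals:

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.normal(0, 1, size=1_000_000)
y = 2 * x + rng.normal(0, 0.5, size=x.size)  # strongly but not perfectly correlated
lhs = np.mean(x * y) ** 2                    # E[XY]^2
rhs = np.mean(x**2) * np.mean(y**2)          # E[X^2] E[Y^2]
print(lhs <= rhs, lhs, rhs)                  # True; equal only when y = c·x exactly
```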

Markov

$P(X \ge a) \le \frac{E[X]}{a}$ for $X \ge 0$ and $a > 0$

Use case: when you only know the mean and need an upper bound on how often a random variable gets large
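
Sketch: for exponential(1) samples ($E[X] = 1$), compare the empirical tail to the bound $1/a$:

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.exponential(1.0, size=1_000_000)  # nonnegative, E[X] = 1
for a in (2, 5, 10):
    print(a, np.mean(x >= a), 1 / a)      # empirical tail vs Markov bound E[X]/a
```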

Chebyshev

for $k > 0$,

$P(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}$

Use case: when you know the mean and variance and need an upper bound on the spread of the distribution
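
Sketch: for an assumed $N(10, 4)$ sample, compare empirical tails to $1/k^2$ (the bound is loose for the normal, as expected):

```python
import numpy as np

rng = np.random.default_rng(8)
mu, sigma = 10.0, 2.0
x = rng.normal(mu, sigma, size=1_000_000)
for k in (1, 2, 3):
    tail = np.mean(np.abs(x - mu) >= k * sigma)  # P(|X - mu| >= k·sigma)
    print(k, tail, 1 / k**2)                     # empirical tail vs Chebyshev bound
```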

Jensen

$E[g(X)] \ge g(E[X])$ for $g$ convex; reverse the inequality if $g$ is concave
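
Sketch with $g(x) = x^2$ (convex), where Jensen reduces to $E[X^2] \ge (E[X])^2$, i.e. $\mathrm{Var}(X) \ge 0$; checked on assumed exponential samples:

```python
import numpy as np

rng = np.random.default_rng(9)
x = rng.exponential(1.0, size=1_000_000)
print(np.mean(x**2), np.mean(x) ** 2)  # E[g(X)] >= g(E[X]): ≈ 2 vs ≈ 1
```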