the language of statistics. See distributions.
convergence
3 types
from strongest to weakest:
- almost sure: convergence happens with probability 1
    - $P(\lim_{n \to \infty} X_n = X) = 1$
- in probability: the probability of being within any fixed $\epsilon$ of the limit tends to 1
    - $P(|X_n - X| > \epsilon) \to 0$ for all $\epsilon > 0$
- in distribution: the distribution of the variable approaches the distribution of the limit
    - $F_{X_n}(x) \to F_X(x)$ for all $x$ where $F_X$ is continuous
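A quick numerical sketch of convergence in distribution (assumes numpy/scipy; the classic Binomial$(n, \lambda/n) \to$ Poisson$(\lambda)$ example is an illustration, not from these notes):

```python
import numpy as np
from scipy import stats

# Binomial(n, lam/n) converges in distribution to Poisson(lam):
# the max CDF gap over a grid of points shrinks as n grows.
lam = 3.0
ks = np.arange(0, 20)  # evaluation points for both CDFs
for n in [10, 100, 1000, 10000]:
    gap = np.max(np.abs(stats.binom.cdf(ks, n, lam / n) - stats.poisson.cdf(ks, lam)))
    print(f"n={n:>5}  max CDF gap = {gap:.5f}")
```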
sequences
If $X_n$ and $Y_n$ are sequences of random variables (RVs) such that $X_n \to X$ and $Y_n \to Y$ in probability, then:
- $X_n + Y_n \to X + Y$ in probability
- $X_n Y_n \to X Y$ in probability
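A minimal simulation of the product rule (assumes numpy; the normal means 2 and 3 are arbitrary illustrations):

```python
import numpy as np

# Algebra of in-probability limits, empirically: with Xbar_n -> 2 and
# Ybar_n -> 3 in probability, Xbar_n * Ybar_n concentrates at 6.
rng = np.random.default_rng(8)
reps, eps = 2000, 0.1
for n in [10, 100, 1000, 10000]:
    xbar = rng.normal(2.0, 1.0, size=(reps, n)).mean(axis=1)
    ybar = rng.normal(3.0, 1.0, size=(reps, n)).mean(axis=1)
    p = np.mean(np.abs(xbar * ybar - 6.0) > eps)
    print(f"n={n:>5}  P(|Xbar*Ybar - 6| > {eps}) ~ {p:.3f}")
```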
continuous mapping theorem
If $g$ is a continuous function and $X_n \to X$ (almost surely, in probability, or in distribution), then $g(X_n) \to g(X)$ in the same sense.
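A sketch of the continuous mapping theorem in action, assuming numpy; $g = \exp$ and Exponential(1) samples are arbitrary choices:

```python
import numpy as np

# Continuous mapping, empirically: the sample mean -> mu in probability,
# so exp(sample mean) -> exp(mu). Estimate P(|exp(Xbar_n) - exp(mu)| > eps).
rng = np.random.default_rng(0)
mu, eps, reps = 1.0, 0.1, 2000
for n in [10, 100, 1000, 10000]:
    xbar = rng.exponential(scale=mu, size=(reps, n)).mean(axis=1)
    p = np.mean(np.abs(np.exp(xbar) - np.exp(mu)) > eps)
    print(f"n={n:>5}  P(|exp(Xbar) - exp(mu)| > {eps}) ~ {p:.3f}")
```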
LLN
For i.i.d. $X_i$ with mean $\mu$, the sample mean $\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$ converges to $\mu$:
- strong: almost sure convergence
- weak: convergence in probability
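A minimal LLN sketch (assumes numpy; Uniform(0, 1) with $\mu = 0.5$ is an arbitrary choice):

```python
import numpy as np

# LLN, empirically: the running sample mean of one long i.i.d. sequence
# settles at the true mean (mu = 0.5 for Uniform(0, 1)).
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=100_000)
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)
for n in [10, 100, 1000, 100_000]:
    print(f"n={n:>6}  running mean = {running_mean[n - 1]:.4f}")
```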
CLT
For i.i.d. $X_i$ with mean $\mu$ and variance $\sigma^2 < \infty$: $\sqrt{n}(\bar{X}_n - \mu) \to N(0, \sigma^2)$ in distribution, regardless of the distribution of the $X_i$.
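A CLT sketch (assumes numpy/scipy; Exponential(1) is chosen just because it is skewed):

```python
import numpy as np
from scipy import stats

# CLT, empirically: standardized means of Exponential(1) samples
# match standard normal probabilities despite the skewed base distribution.
rng = np.random.default_rng(2)
n, reps = 500, 20_000
xbar = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
z = np.sqrt(n) * (xbar - 1.0) / 1.0  # mu = sigma = 1 for Exponential(1)
for t in [-1.0, 0.0, 1.0]:
    print(f"P(Z <= {t:+.0f}): empirical {np.mean(z <= t):.3f}  normal {stats.norm.cdf(t):.3f}")
```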
Law of total variance
$$\operatorname{Var}(X) = E[\operatorname{Var}(X \mid Y)] + \operatorname{Var}(E[X \mid Y])$$
This is useful for separating uncertainty into systematic variation (2nd term: variability explained by Y) and random noise (1st term: variability unexplained by Y).
Intuition:
- 1st term: average variance within the groups created by Y
    - given Y, what is the residual variance of X?
- 2nd term: variance between group means
    - how much does the expected value of X change as Y varies?
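A numerical check of the decomposition, assuming numpy; the two-group normal mixture is a made-up illustration:

```python
import numpy as np

# Law of total variance, empirically: Y ~ Bernoulli(0.3) picks a group,
# X | Y=0 ~ N(0, 1), X | Y=1 ~ N(5, 4).
# Check Var(X) = E[Var(X|Y)] + Var(E[X|Y]).
rng = np.random.default_rng(3)
n = 1_000_000
y = rng.random(n) < 0.3
x = np.where(y, rng.normal(5.0, 2.0, n), rng.normal(0.0, 1.0, n))
p = y.mean()  # empirical P(Y = 1)
within = (1 - p) * x[~y].var() + p * x[y].var()  # E[Var(X|Y)]
between = (1 - p) * (x[~y].mean() - x.mean()) ** 2 + p * (x[y].mean() - x.mean()) ** 2  # Var(E[X|Y])
print(f"Var(X) = {x.var():.4f}   within + between = {within + between:.4f}")
```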
Law of total expectation
$$E[X] = E[E[X \mid Y]]$$
For continuous Y: $E[X] = \int E[X \mid Y = y] \, f_Y(y) \, dy$
For discrete Y: $E[X] = \sum_y E[X \mid Y = y] \, P(Y = y)$
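A tiny worked example of the discrete case (the coin setup is hypothetical):

```python
# Law of total expectation, discrete case: pick a fair coin (Y=0) or a
# biased coin (Y=1) with probability 1/2 each; X = heads in 2 flips.
p_y = {0: 0.5, 1: 0.5}
e_x_given_y = {0: 2 * 0.5, 1: 2 * 0.8}  # E[X|Y] = 2p for Binomial(2, p)
e_x = sum(e_x_given_y[y] * p_y[y] for y in p_y)
print(e_x)  # 0.5 * 1.0 + 0.5 * 1.6 = 1.3
```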
Inequalities
Cauchy-Schwarz
Recall the most common form: $\left(\sum_i a_i b_i\right)^2 \le \left(\sum_i a_i^2\right)\left(\sum_i b_i^2\right)$. And the integral form: $\left(\int fg \, dx\right)^2 \le \int f^2 \, dx \int g^2 \, dx$.
The probabilistic form is: $E[XY]^2 \le E[X^2] \, E[Y^2]$
Upper bound: this becomes an equality if there is perfect correlation, i.e. $|\rho(X, Y)| = 1$.
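A quick check of the probabilistic form, assuming numpy; note the near-equality when $Y = 3X$:

```python
import numpy as np

# Cauchy-Schwarz, empirically: E[XY]^2 <= E[X^2] E[Y^2],
# with equality when Y is an exact linear multiple of X.
rng = np.random.default_rng(4)
x = rng.normal(size=1_000_000)
for y, label in [(rng.normal(size=x.size), "independent"), (3.0 * x, "y = 3x")]:
    lhs = np.mean(x * y) ** 2
    rhs = np.mean(x ** 2) * np.mean(y ** 2)
    print(f"{label:>11}: E[XY]^2 = {lhs:.4f} <= E[X^2]E[Y^2] = {rhs:.4f}")
```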
Markov
$P(X \ge a) \le \frac{E[X]}{a}$ for $X \ge 0$ and $a > 0$
Use case: when you only know the mean and need an upper bound on how often a random variable gets large
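A sketch with Exponential(1), where $P(X \ge a) = e^{-a}$ can be compared to the bound (assumes numpy):

```python
import numpy as np

# Markov's inequality, empirically: for Exponential(1) (mean 1),
# P(X >= a) sits well under the Markov bound E[X]/a = 1/a.
rng = np.random.default_rng(5)
x = rng.exponential(scale=1.0, size=1_000_000)
for a in [2.0, 5.0, 10.0]:
    print(f"a={a:>4}  P(X >= a) ~ {np.mean(x >= a):.4f}  bound 1/a = {1 / a:.4f}")
```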
Chebyshev
$P(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}$ for $k > 0$
Use case: when you know the mean and variance and need an upper bound on the spread of the distribution
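A sketch with a standard normal (assumes numpy); the bound is loose but valid:

```python
import numpy as np

# Chebyshev's inequality, empirically: for standard normal X (mu=0, sigma=1),
# P(|X - mu| >= k*sigma) is far below the bound 1/k^2.
rng = np.random.default_rng(6)
x = rng.normal(size=1_000_000)
for k in [1.5, 2.0, 3.0]:
    print(f"k={k}  P(|X| >= k) ~ {np.mean(np.abs(x) >= k):.4f}  bound 1/k^2 = {1 / k**2:.4f}")
```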
Jensen
$E[g(X)] \ge g(E[X])$ for $g$ convex; reverse the inequality if $g$ is concave
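A sketch with $g = \exp$ (convex) and $g = \log$ (concave), assuming numpy:

```python
import numpy as np

# Jensen's inequality, empirically: for convex g(x) = exp(x),
# E[g(X)] >= g(E[X]); for concave g(x) = log(x), it reverses.
rng = np.random.default_rng(7)
x = rng.normal(size=1_000_000)
print(f"E[exp(X)] = {np.mean(np.exp(x)):.4f} >= exp(E[X]) = {np.exp(x.mean()):.4f}")
u = rng.uniform(1.0, 2.0, size=1_000_000)  # positive support for log
print(f"E[log(U)] = {np.mean(np.log(u)):.4f} <= log(E[U]) = {np.log(u.mean()):.4f}")
```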