Concentration and its uses
Sub-Gaussian random variables
Sub-Exponential random variables
Let \(X_i\) be a sequence of i.i.d random variables
\[ \mathbb{P}\left[ f(|X-\mu|) \ge f(t) \right] \leq \min_f \frac{\mathbb{E}f(|X-\mu|)}{f(t)} \]
Canonical statistics \(\sum X_i/n \sim \mathbb{E}X\), suppose we want to estimate \(\mathbb{E}|X_1-X_2|\), we would consider using \(\sum_{i<j} \frac{|x_i-x_j|}{\binom{n}{2}}\).
The \(U\)-statistics are of the form
\[U = \frac{1}{\binom{n}{2}} \sum_{i<j} g(x_i, x_j)\]
where \(g\) is a symmetric function of two variables.
Imagine that we have \(\|g\|_\infty <b\) or assume \(|x|<b\),
\[\begin{align*} U(x_1,\dots,x_n) - U(x_1,\dots,x_j',\dots,x_n)| &\le \frac{1}{\binom{n}{2}} \sum_{i, i\neq j} |g(x_i,x_j) - g(x_i,x_j')|\\ &\le \frac{2b(n-1)}{\binom{n}{2}} \\ &= \frac{4b}{n}. \end{align*}\]
\[ \mathbb{P}(|U-\mathbb{E}U| \ge t) \le 2 \exp\left(-\frac{nt^2}{8b^2}\right). \]