\(\newcommand{\abs}[1]{\left\lvert#1\right\rvert}\) \(\newcommand{\norm}[1]{\left\lVert#1\right\rVert}\) \(\newcommand{\inner}[1]{\left\langle#1\right\rangle}\) \(\DeclareMathOperator*{\argmin}{arg\,min}\) \(\DeclareMathOperator*{\argmax}{arg\,max}\) \(\DeclareMathOperator*{\E}{\mathbb{E}}\)

× This section is under development!


ssume that eigenvectors are normalized to unit \(\ell_2\)-norm and have a nonnegative first component. The sign convention guarantees uniqueness of the eigenvector associated with an eigenvalue with geometric multiplicity one.

Vector norms

For \(\mathrm{v} = (v_i)_{i=1}^n \in \mathbb{R}^n\).

\(\norm{\mathrm{v}}_p = (\sum_{i}^n\abs{v_i}^p)^{1/p},\) hence it is a norm for \(p=[1,\infty].\) \(\norm{\mathrm{v}}_0 =\) # of nonzero elements. \(\norm{\mathrm{v}}_\infty = \underset{1\le i \le n}{\max} \abs{v_i}\).

Matrix norms

For $A \in \mathbb{R}^{m \times n}, r = rank(A) \le \min(m,n)$, SVs sorted $s_1 \ge s_2 \ge \ldots$.

Operator/Spectral norm

\(\norm{A} = \underset{x\in\mathbb{R}^n\backslash \{0\}}{\max} \frac{\norm{Ax}_2}{\norm{x}_2} = \underset{x\in S^{n-1}}{\max} \norm{Ax}_2 = \underset{x\in S^{n-1},\ y\in S^{m-1}}{\max} \inner{Ax,y} = s_1(A) = \norm{s}_\infty\)

Frobenius/Euclidean norm

\(\norm{A}_F = \sqrt{\sum_i^m\sum_j^n \abs{A_{ij}}^2} = \sqrt{\sum_i^r s_i^2(A_{i})} = \sqrt{tr(A'A)} = \sqrt{\inner{A,A}} = \norm{s}_2\)

\(\norm{AA'}_F = \norm{A'A}_F \le \norm{A}_F^2\) \(\norm{A+B}_F^2 = \norm{A}_F^2 + \norm{B}_F^2 + 2\inner{A+B}_F\) \(\norm{A}_F \le \norm{A} _* \le \sqrt{r}\norm{A}_F\) if \(A = U\Sigma V'\), then \(\norm{A}_F^2 = \norm{\Sigma}_F^2\) \(\norm{A}_F^2 = \sum_i^n \norm{\mathbb{a}_i}^2,\) where \(\mathbb{a}_i\) are columns.

HS norm

For \(X,Y \in \mathbb{R}^{n\times n}\), the matrix inner product is \(\inner{X,Y} = tr(XY')\), which induces the Hilbert-Schmidt norm \(\norm{X}_{HS} = \sqrt{\inner{X,X}}\).


\(\norm{A} \le \norm{A}_F \le \sqrt{r} \norm{A}\). \(s_i \le \frac{1}{\sqrt{i}} \norm{A}_F\). \(\norm{s}_\infty \le \norm{s}_2 \le \sqrt{r} \norm{s} _\infty\).

To be added a lot more!

Asymptotic Notation

For functions \(f,g\):

  • \(f_n = \mathcal{O}(g_n)\) means \(\exists C\in (0,\infty)\) s.t. \(f_n \le Cg_n\).
  • \(f_n = \Omega(g_n)\) means \(\exists C\in (0,\infty)\) s.t. \(f_n \ge Cg_n\).
  • \(f_n = \Theta(g_n)\) means \(f_n = \mathcal{O}(g_n)\) and \(f_n = \Omega(g_n)\).
  • Note: no \(\cdot_p\) means \(C\) is deterministic, nonrandom.

\(\mathcal{O}_p\) and \(\mathcal{o}_p\)

\(\boxed{X_n = \mathcal{O}_p(g_n) \Longleftrightarrow \forall \epsilon>0, \, \exists M>0 \; \text{ s.t. } \; \mathbb{P}(\abs{X_n/g_n}\ge M) < \epsilon}\) \(\boxed{X_n = \mathcal{o}_p(g_n) \Longleftrightarrow \forall \epsilon>0 \quad \underset{n\to\infty}{\lim} \mathbb{P}(\abs{X_n/g_n}\ge \epsilon) = 0}\)

if \(X_n = \mathcal{O}_p(f_n)\) and \(Y_n = \mathcal{O}_p(g_n)\):

  • \(X_n Y_n = \mathcal{O}_p(f_ng_n)\).
  • \(\abs{X_n}^s = \mathcal{O}_p(f_n^s), \quad s>0\).
  • \(X_n + Y_n = \max\{\mathcal{O}_p(f_n), \mathcal{O}_p(g_n)\}\).

if \(X_n = \mathcal{o}_p(f_n)\) and \(Y_n = \mathcal{o}_p(g_n)\):

  • \(X_n Y_n = \mathcal{o}_p(f_ng_n)\).
  • \(\abs{X_n}^s = \mathcal{o}_p(f_n^s), \quad s>0\).
  • \(X_n + Y_n = \max\{\mathcal{o}_p(f_n), \mathcal{o}_p(g_n)\}\).

if \(X_n = \mathcal{o}_p(f_n)\) and \(Y_n = \mathcal{O}_p(g_n)\):

  • \(X_n Y_n = \mathcal{o}_p(f_ng_n)\).
  • \(X_n + Y_n = \mathcal{O}_p(g_n)\).

Continuous mapping Thm

Given \(f: \mathbb{R}^k \to \mathbb{R}^m\) is “almost surely continuous”.

  • \(X_n \overset{d}{\to} X \Longrightarrow f(X_n) \overset{d}{\to} f(X)\).
  • \(X_n \overset{d}{\to} X \Longrightarrow f(X_n) \overset{p}{\to} f(X)\).
  • \(X_n \overset{d}{\to} X \Longrightarrow f(X_n) \overset{as}{\to} f(X)\).

Slutsky’s Lemma

if \(X_n \overset{d}{\to} X\) and \(Y_n \overset{d}{\to} c\):

  • \(X_n + Y_n \overset{d}{\to} X + c\).
  • \(X_nY_n \overset{d}{\to} cX\).
  • \(X_n/Y_n \overset{d}{\to} X/c\).


Add skinny SVD!

\(\underset{n\times m}{A} = \sum_i^r s_i\mathrm{u}_i\mathrm{v}_i',\) where \(r=rank(A)\).

\(s_i(A) = \sqrt{\lambda_i(AA')} = \sqrt{\lambda_i(A'A)}\).

If \(A\) is symmetric, then also: \(s_i(A) = \abs{\lambda_i(A)}\)

Courant-Fisher min-max Thm

Courant–Fischer variational representation of max eigenvalue & eigenvector:

\[\mathrm{v}_1(\widehat{\Sigma}) = \underset{\norm{z}_2 = 1}{\argmax} \; z'\widehat{\Sigma}z\]

Alternative equivalent variational representation is in terms of the semidefinite program (SDP):

\[Z^* = \underset{Z\in\mathbb{S}^p_+, \, tr(Z)=1}{\argmax} \, tr(\widehat{\Sigma}Z)\]

Eckart-Young-Mirsky Thm


Given a set $Q$ of $q$ points in $\mathbb{R}^N$ with $N$ typically large, we would like to embed these points into a lower-dimensional Euclidean space $\mathbb{R}^n$ while approximately preserving the relative distances between any two of these points. The question is how small can we make $n$ (relative to $q$) and which types of embeddings work?

Given $\epsilon \in (0, 1),$ for every set $Q$ of $q$ points in $\mathbb{R}^N$, if $n$ is a positive integer such that \(n>n_0 = O(\ln(q)/\epsilon^2)\), there exists a Lipschitz f’n $f: \mathbb{R}^N \to \mathbb{R}^n$ s.t.
\((1-\epsilon)\norm{u-v}^2_2 \le \norm{f(u) - f(v)}^2_2 \le (1+\epsilon)\norm{u-v}^2_2, \quad \forall \; u,v\in Q.\)