\(\newcommand{\abs}[1]{\left\lvert#1\right\rvert}\) \(\newcommand{\norm}[1]{\left\lVert#1\right\rVert}\) \(\newcommand{\as}{\overset{a.s.}{\to}}\) \(\DeclareMathOperator*{\E}{\mathbb{E}}\)
Classical vs High-dimensional
Let $n$ be the number of observations and $p$ the number of variables. The classical regime lets $n$ diverge while holding $p$ fixed. In contrast, the high-dimensional regime lets both $n$ and $p$ diverge, with $p/n \to \gamma > 0$. Many classical results break down in that case. Here I consider the eigenvalues and eigenvectors of a high-dimensional covariance matrix. This has immediate implications for covariance estimation, and also for every statistical tool built on covariance estimates: PCA, GLS, GMM, classification, portfolio optimization, etc.
Consider the simple case $X_i \overset{iid}{\sim} \mathcal{N}_p(\mathbf{0}, \Sigma),\quad i=1,\ldots, n.$ How should we estimate $\Sigma$?
Some notation:
Sample covariance estimator $S = \frac{1}{n}\sum_{i=1}^n X_iX_i' = \frac{1}{n} X'X,$ where $X$ is the $n \times p$ matrix with rows $X_i'.$
Eigendecompositions $\Sigma = ULU' = \sum_{j=1}^p \ell_j \mathrm{u}_j \mathrm{u}_j', \quad S = V\Lambda V' = \sum_{j=1}^p \lambda_j \mathrm{v}_j \mathrm{v}_j'.$
Eigenvalues are distinct and sorted in decreasing order.
Eigenvectors are normalized so that the first element is positive.
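The notation above can be made concrete with a minimal NumPy sketch. The function name `sample_cov_eig`, the diagonal `Sigma`, and the sizes `n` and `p` are illustrative assumptions, not part of the original text.

```python
import numpy as np

def sample_cov_eig(X):
    """Return S = X'X / n with eigenvalues (decreasing) and eigenvectors."""
    n = X.shape[0]
    S = X.T @ X / n                     # no centering: the mean is known to be zero
    lam, V = np.linalg.eigh(S)          # eigh returns eigenvalues in ascending order
    order = np.argsort(lam)[::-1]       # re-sort in decreasing order
    lam, V = lam[order], V[:, order]
    signs = np.where(V[0, :] < 0, -1.0, 1.0)
    V = V * signs                       # flip columns so the first element is positive
    return S, lam, V

# Hypothetical example: p = 5 variables, n = 200 observations.
rng = np.random.default_rng(0)
n, p = 200, 5
Sigma = np.diag([5.0, 4.0, 3.0, 2.0, 1.0])
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
S, lam, V = sample_cov_eig(X)
print(lam)
```

Flipping the sign of an eigenvector leaves $\mathrm{v}_j \mathrm{v}_j'$ unchanged, so the sign convention only makes the eigenvectors comparable across samples.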
Classical Regime
In a classical regime, $S$ is a very good estimator (Anderson 1963, Van der Vaart 2000):
Unbiased $\E(S) = \Sigma.$
Consistent $S \as \Sigma$ as $n\to\infty.$
Asymptotically normal eigenvalues \(\sqrt{n}(\lambda_j-\ell_j) \overset{d}{\to} \mathcal{N}(0,2\ell_j^2), \quad j=1,\ldots,p.\)
Invertible (almost surely, once $n \ge p$ and $\Sigma$ is nonsingular).
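The asymptotic normality of the eigenvalues can be checked with a small Monte Carlo experiment. This is only a sketch: the population eigenvalues `ell`, the sample size `n`, and the number of replications `reps` are assumptions chosen for illustration, with $p = 3$ held fixed and $n$ large.

```python
import numpy as np

rng = np.random.default_rng(1)
ell = np.array([3.0, 2.0, 1.0])         # assumed population eigenvalues (distinct)
p, n, reps = ell.size, 2000, 500        # classical regime: p fixed, n large

z = np.empty(reps)
for r in range(reps):
    X = rng.standard_normal((n, p)) * np.sqrt(ell)   # rows ~ N(0, diag(ell))
    S = X.T @ X / n
    lam_max = np.linalg.eigvalsh(S)[-1]              # largest sample eigenvalue
    z[r] = np.sqrt(n) * (lam_max - ell[0])

print("simulated variance:", z.var())
print("theoretical variance 2 * ell_1^2:", 2 * ell[0] ** 2)
```

The simulated variance should land near $2\ell_1^2 = 18$, up to Monte Carlo error.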
It gets trickier in high dimensions. What happens to the eigenvalues and eigenvectors is especially interesting. There are three key features: eigenvalue spreading, eigenvalue bias, and eigenvector inconsistency.
High-dimensional Regime
Eigenvalue spreading
Marchenko-Pastur (1967)
In high dimensions, the sample eigenvalues $\lambda_j$ are more spread out than their population counterparts $\ell_j.$ In fact, the larger $p$ is relative to $n,$ the stronger the spreading.
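A short simulation illustrates the spreading. It is a sketch under assumed settings: an identity population covariance (so every $\ell_j = 1$), a fixed $n = 400$, and three illustrative values of $p$.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400
spreads = []
for p in (40, 200, 400):
    X = rng.standard_normal((n, p))          # rows ~ N(0, I_p), so every ell_j = 1
    lam = np.linalg.eigvalsh(X.T @ X / n)    # sample eigenvalues, ascending
    spreads.append(lam[-1] - lam[0])         # range of the sample spectrum
    print(f"p/n = {p / n:.2f}: min = {lam[0]:.3f}, max = {lam[-1]:.3f}")
```

All population eigenvalues equal one, yet the range of the sample eigenvalues widens as $p/n$ grows.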
Consider the case $\Sigma = I_p,$ i.e. $\ell_1 = \ldots = \ell_p = 1,$ with $p/n \to \gamma \in (0, 1].$
The empirical distribution of the eigenvalues of the sample covariance is \(F_p(x) := \frac{1}{p} \# \{ \lambda_j\le x \}.\)
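The counting definition of $F_p$ translates directly into code. The sketch below uses a hypothetical helper name, `empirical_spectral_cdf`, and a toy set of eigenvalues chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def empirical_spectral_cdf(eigenvalues, x):
    """F_p(x): the fraction of eigenvalues that are <= x."""
    lam = np.sort(np.asarray(eigenvalues))
    # side="right" counts the entries of lam that are <= x
    return np.searchsorted(lam, x, side="right") / lam.size

toy = np.array([0.5, 1.0, 2.0, 3.0])
print(empirical_spectral_cdf(toy, 1.0))   # 2 of the 4 eigenvalues are <= 1.0, so 0.5
```

Because `np.searchsorted` accepts arrays, `x` can also be a grid of points, which is convenient for plotting $F_p$.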
Ukrainian mathematicians Marchenko & Pastur (MP) showed that this empirical distribution converges, $F_p(x) \to F(x),$ with the limit pdf given by: