\(\newcommand{\abs}[1]{\left\lvert#1\right\rvert}\) \(\newcommand{\norm}[1]{\left\lVert#1\right\rVert}\) \(\newcommand{\inner}[1]{\left\langle#1\right\rangle}\) \(\DeclareMathOperator*{\argmin}{arg\,min}\) \(\DeclareMathOperator*{\argmax}{arg\,max}\) \(\DeclareMathOperator*{\E}{\mathbb{E}}\) \(\DeclareMathOperator*{\V}{\mathbb{V}}\)

× This section is under development!

Stein’s paradox

Recall \(\norm{X}^2 = \sum X_i^2\).

\(X = (X_1,\ldots,X_p)'\), where each \(X_i \sim \mathcal{N}(\theta_i,1).\) The goal is to get a good estimator \(\widehat{\theta} = \widehat{\theta}(X)\) in terms of loss \(L(\widehat{\theta}, \theta) = \norm{\widehat{\theta}-\theta}^2\) (Euclidean norm).

Since the loss is random, consider the risk function \(R(\widehat{\theta}, \theta) = \E{L(\widehat{\theta},\theta)}\)

\(\widehat{\theta}\) strictly dominates \(\widetilde{\theta}\) if \(R(\widehat{\theta}, \theta) \le R(\tilde{\theta}, \theta), \forall \theta\) and strictly so for some \(\theta\).

\(\widehat{\theta}\) is admissible if \(\widehat{\theta}\) is not strictly dominated by any other \(\widetilde{\theta}\).

What about \(\widehat{\theta}_0:=X\), which is MLE and UMVUE? It is indeed admissible, but only for \(p=1,2\). Surprisingly, it is inadmissible for \(p\ge 3.\) James&Stein (1961) showed that

\[\boxed{\widehat{\theta}_{JS} := \left( 1-\frac{p-2}{\norm{X}^2} \right)X}\]

strictly dominates \(\widehat{\theta}_0:=X.\)

Proof: Notice that \(\norm{X-\theta}^2 \sim \chi^2_p\). Risk of James-Stein:
\(R(\widehat{\theta}_{JS},\theta) = \E\left\{ \norm{X-\frac{(p-2)X}{\norm{X}^2}- \theta}^2 \right\}\)
\(= \E\left\{ \norm{X-\theta}^2 \right\} - 2 \E\left\{ \norm{(X-\theta)\frac{(p-2)X}{\norm{X}^2}} \right\} + \E\left\{ \norm{\frac{(p-2)X}{\norm{X}^2}}^2 \right\}\)
\(= p - 2(p-2) \sum_i^p \E\left\{ \frac{(X_i-\theta_i)X_i}{\norm{X}^2} \right\} + (p-2)^2 \E\left\{ \frac{1}{\norm{X}^2} \right\}\)
\(= p - 2(p-2)(p-2) \E\left\{ \frac{1}{\norm{X}^2} \right\} + (p-2)^2 \E\left\{ \frac{1}{\norm{X}^2} \right\}\)
\(= p - (p-2)^2 \E\left\{ \frac{1}{\norm{X}^2} \right\} < p,\)
where the fourth equality holds since

Instead of shrinking to \(0\), could also shrink to \(\theta_0\):

\[\boxed{\widehat{\theta}_{JS} = \theta_0 + \left(1-\frac{p-2}{\norm{X}^2}\right)(X-\theta_0)}\]

which in turn is strictly dominated by positive-part James-Stein:

\[\boxed{\widehat{\theta}_{JS+} = \theta_0 + \left(1-\frac{p-2}{\norm{X}^2}\right)_+(X-\theta_0)}\]