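The usual "degrees of freedom" explanation for why the denominator is \(n-1\) rather than \(n\) is really not convincing.
For a random vector \( x \), we can define its expectation and covariance matrix: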
\[ \begin{aligned} \mu &= \mathbb{E}(x) \\ S &= \mathbb{E}[(x-\mu)(x-\mu)^t]\end{aligned}\]
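Suppose we have a set of independent, identically distributed observations \( \{x_i\}_{i=1}^n \). We might want to estimate the parameters as follows (for example, maximum likelihood estimation for the multivariate normal distribution gives exactly these formulas):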
\[ \begin{aligned} \hat{\mu} &= \frac{1}{n} \sum_{i=1}^n x_i \\ \hat{S} &= \frac{1}{n} \sum_{i=1}^n (x_i - \hat{\mu})(x_i - \hat{\mu})^t\end{aligned}\]
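Both estimators are themselves random variables. Consider their expectations, which tell us what our estimates equal on average. By linearity of expectation, the expectation of the mean estimator is easy to compute: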
\[ \begin{aligned} \mathbb{E}(\hat{\mu}) &= \mathbb{E}\left(\frac{1}{n} \sum_{i=1}^n x_i\right) \\ &= \frac{1}{n} \mathbb{E}\left(\sum_{i=1}^n x_i\right) \\ &= \frac{1}{n} \cdot n \cdot \mathbb{E}(x) \\ &= \mu\end{aligned}\]
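This shows that the estimator is unbiased.
For the covariance matrix, however, the situation is different. Since \( x_i \) and \( x_j \) (\( i \ne j \)) are uncorrelated (in particular, when the observations are independent), \( \mathbb{E}(x_i x_j^t) = \mathbb{E}(x)\mathbb{E}(x^t) \). But \( x_i \) is correlated with itself, so this identity does not hold in general when \( i = j \). Consider the expectation of the covariance estimator: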
\[ \begin{aligned} \mathbb{E}(\hat{S}) &= \mathbb{E}\left( \frac{1}{n} \sum_{i=1}^n (x_i - \hat{\mu})(x_i - \hat{\mu})^t \right) \\ &= \frac{1}{n} \mathbb{E}\left( \sum_{i=1}^n (x_i - \hat{\mu})(x_i - \hat{\mu})^t \right) \\ &= \frac{1}{n} \mathbb{E}\left( \sum_{i=1}^n ( x_i x_i^t - \hat{\mu} x_i^t - x_i\hat{\mu}^t + \hat{\mu}\hat{\mu}^t ) \right) \\ &= \frac{1}{n} \sum_{i=1}^n \mathbb{E}( x_i x_i^t - \hat{\mu} x_i^t - x_i\hat{\mu}^t + \hat{\mu}\hat{\mu}^t )\end{aligned}\]
Take a single term of the sum and analyze it:
\[ \begin{aligned} &\quad\; x_i x_i^t - \hat{\mu} x_i^t - x_i\hat{\mu}^t + \hat{\mu}\hat{\mu}^t \\ &= x_i x_i^t - \left( \frac{1}{n} \sum_{u=1}^n x_u \right) x_i^t - x_i\left( \frac{1}{n} \sum_{u=1}^n x_u^t \right) + \left( \frac{1}{n} \sum_{u=1}^n x_u \right)\left( \frac{1}{n} \sum_{v=1}^n x_v^t \right) \\ &= x_i x_i^t - \left( \frac{1}{n} \sum_{u=1}^n x_u \right) x_i^t - x_i\left( \frac{1}{n} \sum_{u=1}^n x_u^t \right) + \frac{1}{n^2} \sum_{1 \le u, v \le n} x_u x_v^t \\ &= x_i x_i^t - \frac{1}{n} \sum_{1 \le u \le n, u \ne i} \left( x_u x_i^t + x_i x_u^t \right) - \frac{2}{n} x_i x_i^t + \frac{1}{n^2} \sum_{1 \le u \ne v \le n} x_u x_v^t + \frac{1}{n^2} \sum_{u=1}^n x_u x_u^t\end{aligned}\]
Taking expectations:
\[ \begin{aligned} &\quad\; \mathbb{E}( x_i x_i^t - \hat{\mu} x_i^t - x_i\hat{\mu}^t + \hat{\mu}\hat{\mu}^t ) \\ &= \mathbb{E}(x x^t) - \frac{2(n-1)}{n} \mathbb{E}(x)\mathbb{E}(x^t) - \frac{2}{n} \mathbb{E}(xx^t) + \frac{n^2-n}{n^2} \mathbb{E}(x)\mathbb{E}(x^t) + \frac{n}{n^2} \mathbb{E}(x x^t) \\ &= \frac{n-1}{n} \left( \mathbb{E}(x x^t) - \mathbb{E}(x)\mathbb{E}(x^t) \right)\end{aligned}\]
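As a quick check of the last equality, the coefficients can be collected term by term. This is only a regrouping of the line above: the cross terms contribute \(n-1\) indices \(u \ne i\), the double sum has \(n^2-n\) ordered pairs \(u \ne v\), and the diagonal has \(n\) terms.
\[ \begin{aligned} \text{coefficient of } \mathbb{E}(xx^t): &\quad 1 - \frac{2}{n} + \frac{n}{n^2} = \frac{n-1}{n} \\ \text{coefficient of } \mathbb{E}(x)\mathbb{E}(x^t): &\quad -\frac{2(n-1)}{n} + \frac{n^2-n}{n^2} = -\frac{n-1}{n}\end{aligned}\]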
Therefore
\[ \begin{aligned} \mathbb{E} (\hat{S}) &= \frac{1}{n} \cdot n \cdot \frac{n-1}{n} \left( \mathbb{E}(x x^t) - \mathbb{E}(x)\mathbb{E}(x^t) \right) \\ &= \frac{n-1}{n} \left( \mathbb{E}(x x^t) - \mathbb{E}(x)\mathbb{E}(x^t) \right) \\ &= \frac{n-1}{n} \left( \mathbb{E} [ (x-\mathbb{E}(x))(x-\mathbb{E}(x))^t ]\right) \\ &= \frac{n-1}{n} S\end{aligned}\]
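The third equality uses the standard second-moment identity for the covariance. A short sketch of why it holds, expanding the product and treating \( \mu = \mathbb{E}(x) \) as a constant:
\[ \begin{aligned} \mathbb{E}[(x-\mu)(x-\mu)^t] &= \mathbb{E}(xx^t) - \mathbb{E}(x)\mu^t - \mu\,\mathbb{E}(x^t) + \mu\mu^t \\ &= \mathbb{E}(xx^t) - \mathbb{E}(x)\mathbb{E}(x^t)\end{aligned}\]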
Hence the estimator \( \hat{S} \) is biased, and the unbiased estimator should be:
\[ \tilde{S} = \frac{1}{n-1} \sum_{i=1}^n (x_i - \hat{\mu})(x_i - \hat{\mu})^t\]
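That this rescaling removes the bias follows directly from \( \mathbb{E}(\hat{S}) = \frac{n-1}{n} S \):
\[ \mathbb{E}(\tilde{S}) = \frac{n}{n-1}\,\mathbb{E}(\hat{S}) = \frac{n}{n-1} \cdot \frac{n-1}{n}\, S = S\]
As a small sanity check, consider the simplest illustrative case (an assumption made here only for the example): scalar observations with \( n = 2 \), independent, with common variance \( \sigma^2 \). Then \( x_1 - \hat{\mu} = \frac{x_1 - x_2}{2} \) and \( x_2 - \hat{\mu} = -\frac{x_1 - x_2}{2} \), so
\[ \begin{aligned} \sum_{i=1}^2 (x_i - \hat{\mu})^2 &= \frac{(x_1 - x_2)^2}{2} \\ \mathbb{E}\left( \frac{(x_1 - x_2)^2}{2} \right) &= \frac{1}{2} \cdot 2\sigma^2 = \sigma^2\end{aligned}\]
Dividing by \( n = 2 \) gives an expected value of \( \sigma^2/2 \), which underestimates the variance by the factor \( \frac{n-1}{n} = \frac{1}{2} \), while dividing by \( n - 1 = 1 \) gives exactly \( \sigma^2 \).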