
Why is the sample variance divided by (n-1)

2022-06-22 00:34:00 subtitle_

Introduction

In probability theory and mathematical statistics there is a classic question: why is the sample variance divided by (n-1)? I did not really understand it when I was a student, and asking the teacher did not produce a convincing explanation either (the class felt rather superficial…), so I looked up material and worked it out myself. It is organized as follows.



1. Prerequisites

Anyone who has studied probability theory and mathematical statistics will recognize the following facts:

1. If the mean (expectation) is $\mathbf{E}(x)=\mu$ and the variance is $\mathbf{D}(x)=\sigma^2$, then $\mathbf{E}(\overline{x})=\mu$ and $\mathbf{D}(\overline{x})=\sigma^2/n$.

2. Note that the population variance $\sigma^2$ and the sample variance $S^2$ have different formulas. First, the denominator of one is $n$ while the other divides by $(n-1)$; second, inside the sum of squares one subtracts the population mean $\mu$ while the other subtracts the sample mean $\overline{x}$. That is,
$$\sigma^2=\frac{\sum_{i=1}^{n}(x_i-\mu)^2}{n},\qquad S^2=\frac{\sum_{i=1}^{n}(x_i-\overline{x})^2}{n-1}$$

3. $\sum_{i=1}^{n}(x_i-\overline{x})=0$, i.e. the deviations from the sample mean always sum to zero.
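The following sketch is a quick numerical check of facts 1–3, assuming a normal population with illustrative parameters $\mu=5$ and $\sigma=2$ (the distribution and the numbers are my own choices, not from the original text):

```python
# Numerical check of facts 1-3, assuming an illustrative normal population.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, trials = 5.0, 2.0, 10, 200_000

# Draw `trials` independent samples of size n and compute each sample mean.
samples = rng.normal(mu, sigma, size=(trials, n))
xbar = samples.mean(axis=1)

# Fact 1: E(xbar) = mu and D(xbar) = sigma^2 / n.
print(xbar.mean())   # ~ 5.0
print(xbar.var())    # ~ sigma^2 / n = 0.4

# Fact 2: the two formulas. The population variance subtracts mu and divides
# by n; the sample variance S^2 subtracts xbar and divides by n - 1 (ddof=1).
x = samples[0]
print(((x - mu) ** 2).sum() / n)   # population-style formula on one sample
print(np.var(x, ddof=1))           # sample variance S^2

# Fact 3: deviations from the sample mean sum to zero (up to floating point).
print((x - x.mean()).sum())
```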

2. Proof idea

Actually, the sample variance $S^2$ is essentially a point estimate of the population variance $\sigma^2$, just as the sample mean $\overline{x}$ is a point estimate of the population mean $\mu$; it is itself a random variable. A good point estimator has two important properties:

(1) The estimator is unbiased: the expected value of the point estimator equals the parameter being estimated. This alone is not enough, because many different unbiased estimators may exist, hence condition (2).

(2) The estimator has minimum variance: among all unbiased estimators of the parameter, its variance is the smallest.

Below we prove that $S^2$ is an unbiased estimator of $\sigma^2$, i.e. that the expected value of this point estimator equals the population parameter being estimated.

3. Proofs

There are two ways to prove it: the first is the one commonly given in textbooks; the second is easier to understand.

Proof 1

$$\begin{aligned}\mathbf{E}(S^2)&=\mathbf{E}\left(\frac{\sum_{i=1}^{n}(x_i-\overline{x})^2}{n-1}\right)\\&=\frac{1}{n-1}\mathbf{E}\left[\sum_{i=1}^{n}(x_i-\overline{x})^2\right]\\&=\frac{1}{n-1}\mathbf{E}\left[\sum_{i=1}^{n}x_i^2-n\overline{x}^2\right]\\&=\frac{1}{n-1}\left[\sum_{i=1}^{n}(\mu^2+\sigma^2)-n\left(\mu^2+\frac{\sigma^2}{n}\right)\right]\\&=\frac{1}{n-1}(n-1)\sigma^2\\&=\sigma^2\end{aligned}$$

Here the fourth line uses $\mathbf{E}(x_i^2)=\mathbf{D}(x_i)+[\mathbf{E}(x_i)]^2=\sigma^2+\mu^2$ and $\mathbf{E}(\overline{x}^2)=\mathbf{D}(\overline{x})+[\mathbf{E}(\overline{x})]^2=\frac{\sigma^2}{n}+\mu^2$.
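The third line of Proof 1 also relies on the algebraic identity $\sum_{i=1}^{n}(x_i-\overline{x})^2=\sum_{i=1}^{n}x_i^2-n\overline{x}^2$. A quick numerical check of that identity (the data values below are arbitrary and purely illustrative):

```python
# Check: sum_i (x_i - xbar)^2 == sum_i x_i^2 - n * xbar^2, on arbitrary data.
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = len(x)
xbar = x.mean()

lhs = ((x - xbar) ** 2).sum()
rhs = (x ** 2).sum() - n * xbar ** 2
print(lhs, rhs)              # both print 32.0
assert np.isclose(lhs, rhs)
```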
Proof 2

Suppose $t$ is a constant:
$$\begin{aligned}\sum_{i=1}^{n}(x_i-t)^2&=\sum_{i=1}^{n}(x_i-\overline{x}+\overline{x}-t)^2\\&=\sum_{i=1}^{n}(x_i-\overline{x})^2+2\sum_{i=1}^{n}(x_i-\overline{x})(\overline{x}-t)+\sum_{i=1}^{n}(\overline{x}-t)^2\\&=\sum_{i=1}^{n}(x_i-\overline{x})^2+2(\overline{x}-t)\sum_{i=1}^{n}(x_i-\overline{x})+\sum_{i=1}^{n}(\overline{x}-t)^2\\&=\sum_{i=1}^{n}(x_i-\overline{x})^2+\sum_{i=1}^{n}(\overline{x}-t)^2\\&=\sum_{i=1}^{n}(x_i-\overline{x})^2+n(\overline{x}-t)^2\end{aligned}$$
where the cross term vanishes because $\sum_{i=1}^{n}(x_i-\overline{x})=0$ (fact 3 above).
Letting $t$ be the population mean $\mu$, we get
$$\sum_{i=1}^{n}(x_i-\overline{x})^2=\sum_{i=1}^{n}(x_i-\mu)^2-n(\overline{x}-\mu)^2$$
We can see that $\sum_{i=1}^{n}(x_i-\overline{x})^2$ and $\sum_{i=1}^{n}(x_i-\mu)^2$ are not equal; they differ by $n(\overline{x}-\mu)^2$. Then
$$\begin{aligned}\mathbf{E}(S^2)&=\mathbf{E}\left(\frac{\sum_{i=1}^{n}(x_i-\overline{x})^2}{n-1}\right)\\&=\frac{1}{n-1}\mathbf{E}\left[\sum_{i=1}^{n}(x_i-\overline{x})^2\right]\\&=\frac{1}{n-1}\mathbf{E}\left[\sum_{i=1}^{n}(x_i-\mu)^2-n(\overline{x}-\mu)^2\right]\\&=\frac{1}{n-1}\left[\sum_{i=1}^{n}\mathbf{E}\left((x_i-\mu)^2\right)-n\,\mathbf{E}\left((\overline{x}-\mu)^2\right)\right]\\&=\frac{1}{n-1}\left(n\sigma^2-n\cdot\frac{\sigma^2}{n}\right)\\&=\sigma^2\end{aligned}$$
where $\mathbf{E}\left((x_i-\mu)^2\right)=\mathbf{D}(x_i)=\sigma^2$ and $\mathbf{E}\left((\overline{x}-\mu)^2\right)=\mathbf{D}(\overline{x})=\frac{\sigma^2}{n}$.
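As a sanity check of the conclusion, the following Monte Carlo sketch (using an arbitrary normal population with my own illustrative parameters) shows that averaging the $(n-1)$-divisor sample variance over many samples recovers $\sigma^2$, while the $n$-divisor version systematically underestimates it by a factor of $(n-1)/n$:

```python
# Monte Carlo check: the (n-1)-divisor estimator is unbiased for sigma^2,
# while the n-divisor estimator is biased low by a factor of (n-1)/n.
import numpy as np

rng = np.random.default_rng(42)
mu, sigma, n, trials = 0.0, 3.0, 5, 500_000

samples = rng.normal(mu, sigma, size=(trials, n))
s2_unbiased = samples.var(axis=1, ddof=1)   # divide by n - 1
s2_biased = samples.var(axis=1, ddof=0)     # divide by n

print("true sigma^2          :", sigma ** 2)           # 9.0
print("mean of (n-1) version :", s2_unbiased.mean())   # ~ 9.0
print("mean of n version     :", s2_biased.mean())     # ~ 9.0 * (n-1)/n = 7.2
```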

Both proofs should help with understanding. I hope this helps you.

Copyright notice: this article was written by [subtitle_]. Please include a link to the original when reposting. Thanks.

Original article: https://yzsam.com/2022/173/202206212254215908.html