Population standard deviation and sample standard deviation

2022-06-23 01:11:00 littleBlackT

Population variance and sample variance

Let $X$ be a random variable following the distribution $F$, and let $\mu = E[X]$ be the expected value (mean) of $X$.

The variance of the random variable $X$ (or of the distribution $F$) is:

$$\sigma^2 = E\left[(X - \mu)^2\right]$$
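As a small illustration of the definition $\sigma^2 = E[(X - \mu)^2]$ with $\mu = E[X]$, the sketch below computes both quantities exactly for a fair six-sided die. The die is a hypothetical example chosen here, not something from the original article; it stands in for a population whose distribution is fully known.

```python
from fractions import Fraction

# Hypothetical example distribution: a fair six-sided die.
outcomes = [1, 2, 3, 4, 5, 6]
prob = Fraction(1, 6)  # each face is equally likely

# Population mean: mu = E[X]
mu = sum(prob * x for x in outcomes)

# Population variance: sigma^2 = E[(X - mu)^2]
sigma2 = sum(prob * (x - mu) ** 2 for x in outcomes)

print(mu)      # 7/2
print(sigma2)  # 35/12
```

No sampling is involved here: when the whole distribution is known, the variance is a fixed number, not an estimate.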

In general, however, it is impossible to observe or test every individual in the population, so the population must be sampled. Because we use the sample to infer the distribution of the population, the sampling must be random, and the sampled values $X_1, X_2, \dots, X_n$ should be treated as a set of random variables. For the sample to reflect the information in the population well, the sampling method also matters. The most common method is called "simple random sampling"; the resulting sample is called a simple random sample, and it must satisfy the following two conditions:

  • Representativeness: each $X_i$ has the same distribution as the population under investigation;
  • Independence: $X_1, X_2, \dots, X_n$ are mutually independent random variables.

Such a sample is called an independent and identically distributed sample, abbreviated as an i.i.d. sample.

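In practice, once an i.i.d. sample $X_1, \dots, X_n$ has been obtained, the sample variance $S^2$ can be used to approximate the population variance $\sigma^2$:

$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2, \qquad \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$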
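To make the two denominators concrete, here is a minimal sketch that computes the sum of squared deviations from the sample mean and divides it once by $n$ and once by $n-1$. The data values are made up for illustration. The results are compared with Python's standard-library functions `statistics.pvariance` (denominator $n$) and `statistics.variance` (denominator $n-1$).

```python
import statistics

# Hypothetical sample of n = 8 observations (made-up values).
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(sample)

x_bar = sum(sample) / n                      # sample mean
ss = sum((x - x_bar) ** 2 for x in sample)   # sum of squared deviations

biased = ss / n          # denominator n
unbiased = ss / (n - 1)  # denominator n - 1 (the sample variance)

# The standard library exposes both conventions.
print(biased, statistics.pvariance(sample))
print(unbiased, statistics.variance(sample))
```

The value with denominator $n-1$ is always the larger of the two, by the factor $n/(n-1)$.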


Understanding it from the perspective of degrees of freedom

In statistics, the degrees of freedom (df) of a statistic is the number of values in the sample that are independent, or free to vary, when the statistic is used to estimate a parameter of the population.

  • Once the mean of a set of $n$ sample values is fixed, knowing $n-1$ of the values determines the $n$-th. The mean therefore acts as one constraint, and because of this constraint the number of degrees of freedom available for estimating the population variance is $n-1$ instead of $n$.

In other words, using the sample mean $\bar{X}$ to estimate the population expectation $\mu$ introduces a constraint that removes one degree of freedom from the data. If we still use $n$ as the denominator, the estimated variance will be too small; this estimate is called the biased sample variance.
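The constraint can be checked numerically. The following sketch, using made-up data, shows that the deviations from the sample mean always sum to zero, so the last deviation can be recovered from the other $n-1$.

```python
# Hypothetical sample (made-up values).
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(sample)
x_bar = sum(sample) / n

# Deviations of each observation from the sample mean.
deviations = [x - x_bar for x in sample]

# Constraint: the deviations sum to zero (up to floating-point error).
total = sum(deviations)

# So the n-th deviation is fixed once the first n - 1 are known.
last_from_rest = -sum(deviations[:-1])

print(total)
print(last_from_rest, deviations[-1])
```

Only $n-1$ of the deviations carry independent information, which is the sense in which one degree of freedom has been used up.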

Understanding it from a rigorous derivation

Consider a set of sample data $X_1, X_2, \dots, X_n$, i.i.d. with mean $\mu$ and variance $\sigma^2$.
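One standard way to carry out the derivation is sketched below. It assumes $X_1, \dots, X_n$ are i.i.d. with mean $\mu$ and variance $\sigma^2$, writes $\bar{X}$ for the sample mean, and uses the notation $S_n^2$ (a label chosen here) for the sample variance with denominator $n$. The key facts used are $E[(X_i - \mu)^2] = \sigma^2$ and $E[(\bar{X} - \mu)^2] = \operatorname{Var}(\bar{X}) = \sigma^2 / n$.

```latex
\begin{aligned}
S_n^2 &= \frac{1}{n}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2
       = \frac{1}{n}\sum_{i=1}^{n}\bigl((X_i - \mu) - (\bar{X} - \mu)\bigr)^2 \\
      &= \frac{1}{n}\sum_{i=1}^{n}(X_i - \mu)^2
         - 2(\bar{X} - \mu)\cdot\frac{1}{n}\sum_{i=1}^{n}(X_i - \mu)
         + (\bar{X} - \mu)^2 \\
      &= \frac{1}{n}\sum_{i=1}^{n}(X_i - \mu)^2 - (\bar{X} - \mu)^2, \\[4pt]
E\left[S_n^2\right]
      &= \sigma^2 - \frac{\sigma^2}{n}
       = \frac{n-1}{n}\,\sigma^2 .
\end{aligned}
```

The third line uses $\frac{1}{n}\sum_{i=1}^{n}(X_i - \mu) = \bar{X} - \mu$, so the cross term equals $-2(\bar{X} - \mu)^2$.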

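In other words, a mathematical derivation shows that the expectation of the biased sample variance $S_n^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$ is not equal to $\sigma^2$: it differs from it by a factor of $\frac{n-1}{n}$. Put another way, using $n$ as the denominator leads to an underestimate of the variance.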

So we need to correct it by the factor $\frac{n}{n-1}$ to obtain the unbiased sample variance:

$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2$$
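The bias can also be seen empirically. The simulation below is a rough sketch, not part of the original article: it repeatedly draws small samples from a normal distribution with known variance (the distribution, sample size, seed, and number of trials are all assumptions chosen for illustration) and averages the two estimators over many trials.

```python
import random

random.seed(0)     # fixed seed so the run is repeatable

n = 5              # sample size (assumed for illustration)
trials = 100_000   # number of repeated samples
sigma = 2.0        # true standard deviation, so the true variance is 4.0

biased_sum = 0.0
unbiased_sum = 0.0
for _ in range(trials):
    xs = [random.gauss(0.0, sigma) for _ in range(n)]
    x_bar = sum(xs) / n
    ss = sum((x - x_bar) ** 2 for x in xs)
    biased_sum += ss / n          # denominator n
    unbiased_sum += ss / (n - 1)  # denominator n - 1

biased_avg = biased_sum / trials
unbiased_avg = unbiased_sum / trials

# Expect roughly (n - 1) / n * 4.0 = 3.2 for the biased estimator
# and roughly 4.0 for the unbiased one.
print(biased_avg, unbiased_avg)
```

With $n = 5$ the estimator with denominator $n$ comes out about 20% too small on average, while the one with denominator $n-1$ averages close to the true variance.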


Closing remarks

The counterintuitive part of this question is why, when we compute the variance from a sample, the denominator must be $n-1$ instead of $n$. I have given two relatively elementary ways of understanding it. Personally, I think the most important point is to distinguish what is estimated from what is true: for instance, the sample mean $\bar{X}$ is used to estimate the population mean $\mu$.


Reference:

https://www.jianshu.com/p/18aaa7b1cb09
https://www.zhihu.com/question/20099757?sort=created
https://www.zhihu.com/question/22983179
https://chinois.jinzhao.wiki/zh-hans/%E6%A8%99%E6%BA%96%E5%B7%AE
https://en.wikipedia.org/wiki/Variance

Copyright notice
This article was written by [littleBlackT]. Please include the original link when reposting. Thanks.
https://yzsam.com/2022/174/202206222331585602.html