当前位置:网站首页>Population standard deviation and sample standard deviation
Population standard deviation and sample standard deviation
2022-06-23 01:11:00 【littleBlackT】
Total variance and Sample variance (sample variance)
set up X To obey the distribution F Random variable of , If It's a random variable X The average of the expected values ()
A random variable X Or distribution F Of variance ( English :Variance) by :
But in general , It is impossible to observe or test every individual of the whole . therefore , The population must be sampled and observed ( sampling ) . Because we use sampling to infer the distribution of the population , So the sampling must be random , Sampling value It should be treated as a set of random variables . Because the purpose of sampling is to make statistical inference on the distribution of the population , In order to make the extracted samples reflect the overall information well , The sampling method must be considered . One of the most common sampling methods is called " Simple random sampling ", The resulting sample is called a simple random sample , It requires the sample to meet the following two points :
Representative : Each of them has the same distribution as the investigated population ; independence : Are independent random variables .
Such a sample is called Independent homologous distribution (independent and identically distributed) sample , abbreviation i.i.d. sample
In practice, sampling results in i.i.d. After sample , Sample variance can be used To approximate the population variance :
Understand from the perspective of degrees of freedom
Statistical degrees of freedom (degree of freedom, df), It refers to when the parameters of the population are estimated by the statistics of the sample , The number of independent or freely varying data in the sample is called the degree of freedom of the statistic .
After the mean value of a set of sample data is determined , If you know one of them n-1 The value of the number , The first n The value of the number is determined . here , The mean value is equivalent to one Limiting conditions , Because of this restriction , The degree of freedom to estimate the population variance is Instead of .
in other words , We use it Sample mean Come on It is estimated that Overall Expectations A restriction condition is generated to reduce the degree of freedom of the remaining data . If you still use As the denominator , The estimated variance will be smaller , We call it biased sample variance( Biased sample variance )
Understand from a rigorous derivation
Consider a set of sample data
in other words , Mathematical derivation can prove The expectation of is not equal to , It's a factor away from it , let me put it another way , use Being a denominator leads to underestimation of variance
So we need to modify it to get Unbiased sample variance (Unbiased sample variance)
Words written at the back
The counterintuitive point of this question is , Why do we take samples to calculate variance , The denominator must be instead of , I have given two relatively elementary ways of understanding , Personally think that , The most important point in understanding is , Distinguish what is It is estimated that Of , What is? real Of , for instance Is used to estimate Of
Reference:
https://www.jianshu.com/p/18aaa7b1cb09
https://www.zhihu.com/question/20099757?sort=created
https://www.zhihu.com/question/22983179
https://chinois.jinzhao.wiki/zh-hans/%E6%A8%99%E6%BA%96%E5%B7%AE
https://en.wikipedia.org/wiki/Variance
边栏推荐
- 一文读懂基于Redis的Amazon MemoryDB数据库
- JMeter associated login 302 type interface
- TiDB VS MySQL
- LINQ 查询
- LINQ 查詢
- graphite statsd接口数据格式说明
- 62. 不同路径
- Binary tree to string and string to binary tree
- Dual cross domain: access allow origin header contains multiple values "*, *", but only one is allowed
- Ansible learning summary (7) -- ansible state management related knowledge summary
猜你喜欢

Development status of full color LED display

Tidb monitoring upgrade: a long way to solve panic

three. JS simulated driving tour art exhibition hall - creating super camera controller

MySQL-Seconds_ behind_ Master accuracy error

【滑动窗口】leetcode992. Subarrays with K Different Integers

Steps to implement a container global component

cadence SPB17.4 - 中文UI设置

崔鹏团队:万字长文梳理「稳定学习」全景图

E-R diagram

Cadence spb17.4 - Allegro - optimize and specify the polyline connection angle of a single electrical line - polyline to arc
随机推荐
Project directory navigation
Node fetch download file
The road of architects starts from "storage selection"
Dual cross domain: access allow origin header contains multiple values "*, *", but only one is allowed
數據庫中數據的儲存結構和方式是什麼?
層次選擇器
最安全的现货白银的心理分析
Yyds dry inventory solution sword finger offer: print the binary tree into multiple lines
Development status of full color LED display
[sliding window] leetcode992 Subarrays with K Different Integers
Ansible learning summary (7) -- ansible state management related knowledge summary
"Hearing" marketing value highlights, Himalaya ushers in a new situation
Phantomjs Usage Summary
Add expiration time for localstorage
总体标准差和样本标准差
层次选择器
Shell logs and printouts
What is the storage structure and mode of data in the database?
使用aggregation API扩展你的kubernetes API
关于测试/开发程序员技术的一些思考,水平很高超的,混不下去了......