Evaluation of classification model
2022-07-23 13:17:00 【weixin_ nine hundred and sixty-one million eight hundred and se】
Accuracy rate
- The most commonly used metric is accuracy, that is, the percentage of predictions that are correct: estimator.score()
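As a minimal sketch of what this percentage means, accuracy can be computed by hand in plain Python. The labels are made up for illustration and the helper name `accuracy` is hypothetical. For scikit-learn classifiers, `estimator.score(X, y)` reports the same quantity (mean accuracy) on the data passed in.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels: 1 = positive class, 0 = negative class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

acc = accuracy(y_true, y_pred)
print(acc)  # 6 of the 8 predictions are correct
```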
Precision and Recall
Confusion matrix
- In a classification task, the predicted results (Predicted Condition) and the true labels (True Condition) can combine in four different ways, which together make up the confusion matrix: true positive (TP), false positive (FP), false negative (FN), and true negative (TN)
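The four cells of the confusion matrix can be counted directly from the labels. This is a sketch for the binary case only; the function name `confusion_counts` and the sample data are assumptions made for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN, TN for a binary task."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1  # predicted positive, actually positive
        elif p == positive:
            fp += 1  # predicted positive, actually negative
        elif t == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


# Hypothetical labels: 1 = positive class, 0 = negative class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(tp, fp, fn, tn)
```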

Precision
- Precision: the proportion of samples predicted as positive that are actually positive
$$P=\frac{TP}{TP+FP}$$
Recall
- Recall: the proportion of actually positive samples that are predicted as positive
$$R=\frac{TP}{TP+FN}$$
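A small worked example of the two formulas, using made-up counts. Returning 0.0 when a denominator is zero is an assumption of this sketch, not something the definitions prescribe.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP), recall = TP / (TP + FN)."""
    # Assumption: report 0.0 when a denominator would be zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts: 50 samples predicted positive, 40 of them correct,
# and 20 real positives that the model missed.
p, r = precision_recall(tp=40, fp=10, fn=20)
print(p, r)  # precision is 40/50, recall is 40/60
```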
The relationship between the two
- Precision and recall are conflicting measures. Generally speaking, when precision is high, recall tends to be low; when recall is high, precision tends to be low.
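One way to see this trade-off is to move the decision threshold of a model that outputs scores. The scores and labels here are invented for illustration, and treating precision as 1.0 when nothing is predicted positive is an assumption of this sketch.

```python
# Hypothetical model scores with their true labels (1 = positive).
scores = [0.95, 0.85, 0.70, 0.60, 0.45, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 1, 0, 0, 0]


def precision_recall_at(threshold, scores, labels):
    """Predict positive when score >= threshold, then compute P and R."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    # Assumption: precision is 1.0 when no sample is predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


strict = precision_recall_at(0.8, scores, labels)  # few, confident positives
loose = precision_recall_at(0.4, scores, labels)   # many positives
print(strict, loose)
```

With the strict threshold every predicted positive is correct but half of the real positives are missed; with the loose threshold all real positives are found at the cost of one false positive.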
P-R curve

- The P-R plot gives an intuitive view of the learner's recall and precision over the whole sample population
- If one learner's P-R curve is completely enclosed by another learner's curve, it can be asserted that the latter performs better than the former
- If the P-R curves of two learners intersect, one option is to compare the areas under the P-R curves, which to some extent represent how often the learner achieves both high precision and high recall.
- But this area is not easy to estimate, so the "Break-Even Point" (BEP) can be used instead. It is the value at which precision = recall; the higher, the better.
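The points of a P-R curve can be obtained by ranking samples by score and treating each rank as a cut-off. The sketch below assumes no tied scores and at least one positive sample. On finite data precision and recall may never be exactly equal, so the hypothetical `break_even_point` helper approximates the BEP by the cut-off where they are closest.

```python
def pr_points(scores, labels):
    """Return (recall, precision) pairs, one per cut-off in the ranking."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_pos = sum(labels)
    points = []
    tp = 0
    for k, (_, y) in enumerate(ranked, start=1):
        tp += y  # the top-k samples are predicted positive
        points.append((tp / total_pos, tp / k))
    return points


def break_even_point(points):
    """Approximate BEP: the point where precision and recall are closest."""
    recall, precision = min(points, key=lambda rp: abs(rp[0] - rp[1]))
    return (recall + precision) / 2


# Hypothetical model scores with their true labels (1 = positive).
scores = [0.95, 0.85, 0.70, 0.60, 0.45, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

points = pr_points(scores, labels)
bep = break_even_point(points)
print(points)
print(bep)
```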
F1 Measure
- BEP is still too simplistic; the F1 measure is used more often:
$$F1=\frac{2\times P\times R}{P+R}$$
Note: the F1 measure is the harmonic mean of precision and recall:
$$\frac{1}{F1}=\frac{1}{2}\left(\frac{1}{P}+\frac{1}{R}\right)$$
- In some applications, precision and recall are not equally important. For example, in a product recommendation system, in order to disturb users as little as possible, we want the recommended content to be what users are really interested in, so precision matters more; in a fugitive information retrieval system, we want to miss as few fugitives as possible, so recall matters more. The general form of the F1 measure, $F_{\beta}$, lets us express different preferences for precision and recall. It is defined as
$$F_{\beta}=\frac{(1+\beta^2)\times P\times R}{(\beta^2\times P)+R}$$
When $\beta>1$, recall has a greater impact; when $\beta<1$, precision has a greater impact; $\beta=1$ reduces to the standard F1.
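A short sketch of the measure, using the standard definition $F_{\beta}=\frac{(1+\beta^2) P R}{\beta^2 P+R}$. The precision and recall values are made up, and returning 0.0 when both are zero is an assumption of this sketch.

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives the F1 measure (harmonic mean of P and R)."""
    if precision == 0 and recall == 0:
        return 0.0  # assumption: avoid division by zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical learner with higher precision than recall.
P, R = 0.8, 0.5

f1 = f_beta(P, R)               # precision and recall weighted equally
f2 = f_beta(P, R, beta=2.0)     # recall weighted more heavily
f_half = f_beta(P, R, beta=0.5) # precision weighted more heavily
print(f1, f2, f_half)
```

Because this learner's precision is higher than its recall, the score rises when precision is emphasized (beta = 0.5) and falls when recall is emphasized (beta = 2).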
ROC and AUC
- The vertical axis of the ROC curve is the "true positive rate" (TPR), which can be understood as: of all positive samples, how many are predicted as positive. The horizontal axis is the "false positive rate" (FPR), which can be understood as: of all negative samples, how many are predicted as positive (that is, wrongly predicted as positive). The two are defined as
$$TPR=\frac{TP}{TP+FN}\qquad FPR=\frac{FP}{FP+TN}$$
- If one learner's ROC curve is completely enclosed by another learner's curve, it can be asserted that the latter performs better than the former
- If the ROC curves of two learners intersect, compare the areas under the ROC curves, namely the AUC.
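The ROC points and the AUC can be sketched in plain Python by sweeping a cut-off down the ranked scores and integrating with the trapezoidal rule. The data is invented, the helper names are hypothetical, and the sketch assumes no tied scores and at least one sample of each class.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs, starting at (0, 0) and ending at (1, 1)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_pos = sum(labels)
    total_neg = len(labels) - total_pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    for _, y in ranked:
        if y == 1:
            tp += 1  # one more positive is predicted positive
        else:
            fp += 1  # one more negative is wrongly predicted positive
        points.append((fp / total_neg, tp / total_pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical model scores with their true labels (1 = positive).
scores = [0.95, 0.85, 0.70, 0.60, 0.45, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

roc = roc_points(scores, labels)
area = auc(roc)
print(area)
```

The result equals the fraction of (positive, negative) pairs in which the positive sample receives the higher score, here 14 of 16 pairs.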