01 Machine learning: evaluation metrics
2022-07-16 06:54:00 【ZERO_ pan】
01 Cross validation
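Split the data into a training set and a test set, build the model on the training set, and evaluate it on the test set to guide further changes. This post calls the approach cross validation; strictly speaking, a single split is hold-out validation, while k-fold cross validation repeats the split so that every sample is used for testing once.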
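A minimal sketch of the two splitting schemes, in plain Python. The function names `holdout_split` and `kfold_splits`, the 20% test ratio, and the fixed seed are illustrative assumptions, not part of the original post.

```python
import random

def holdout_split(n_samples, test_ratio=0.2, seed=0):
    """Shuffle the indices once and cut them into a training part and a test part."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_ratio))
    return indices[n_test:], indices[:n_test]

def kfold_splits(n_samples, k=5, seed=0):
    """Yield (train, test) index lists; every sample lands in a test part exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

train_idx, test_idx = holdout_split(10)
print(len(train_idx), len(test_idx))  # 8 2

for train, test in kfold_splits(10, k=5):
    print("test fold:", sorted(test))
```

In practice one would fit the model on the `train` indices and score it on the `test` indices of each fold, then average the scores.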
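02 Classification problems: confusion matrix
Take the confusion matrix of a binary classifier as an example. The cells are labeled A to D, matching the formulas used in the rest of this post:

| | Predicted positive | Predicted negative |
|---|---|---|
| Actual positive | A (true positive) | B (false negative) |
| Actual negative | C (false positive) | D (true negative) |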
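Accuracy
The number of correctly predicted samples divided by the total number of samples: (A + D) / (A + B + C + D), where A and D are the true positives and true negatives.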
Recall
The number of correctly predicted positive samples divided by the number of samples that are actually positive: A / (A + B), where A is the true positives and B the false negatives.
Meaning: find as many of the positive examples as possible.
Precision
The number of correctly predicted positive samples divided by the number of samples predicted to be positive: A / (A + C), where A is the true positives and C the false positives.
Meaning: how often the model is right when it predicts a positive example.
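F value
The F1 score is the harmonic mean of precision and recall:

$$F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$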
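The four metrics can be computed directly from the confusion-matrix counts. The sketch below assumes the cell naming a = true positives, b = false negatives, c = false positives, d = true negatives; the function name and the example counts are made up for illustration.

```python
def classification_metrics(a, b, c, d):
    """a=TP, b=FN, c=FP, d=TN (assumed cell naming). Returns accuracy, precision, recall, F1."""
    accuracy = (a + d) / (a + b + c + d)
    # Guard against empty denominators; reporting 0.0 in that case is a convention.
    recall = a / (a + b) if (a + b) else 0.0
    precision = a / (a + c) if (a + c) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return accuracy, precision, recall, f1

accuracy, precision, recall, f1 = classification_metrics(a=40, b=10, c=20, d=30)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
# accuracy=0.700 precision=0.667 recall=0.800 f1=0.727
```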
Summary Q & A
Q: A model has 90% accuracy. Does it necessarily perform well?
- Not necessarily.
- Suppose 10% of people have a certain disease. If we predict that every sample is healthy, the model reaches 90% accuracy, yet it is useless.
- Here we need to look at recall and precision. Let the positive class be "sick". Recall is 0 because A = 0 (there are no true positives); precision is 0/0, since nothing is predicted positive, and is conventionally reported as 0.
- So accuracy alone is not enough to judge a model, especially when the data is imbalanced.
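The disease example above can be checked with a few lines of plain Python. The sample size of 100 is an assumption chosen so that 10% of the samples are positive.

```python
# 100 patients, 10 of them sick (label 1); the "model" predicts healthy (0) for everyone.
y_true = [1] * 10 + [0] * 90
y_pred = [0] * 100

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

accuracy = (tp + tn) / len(y_true)
recall = tp / (tp + fn)

print("accuracy:", accuracy)  # accuracy: 0.9
print("recall:", recall)      # recall: 0.0
print("predicted positives:", tp + fp)  # 0, so precision would be 0/0
```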
Q: What is the relationship between recall and precision?
- When screening for a disease, we want to find as many sick people as possible, so we want recall to be as high as possible.
- One way to reach 100% recall is to predict that everyone is sick: better a false alarm than a missed case. But would you say such a model performs well?
- So precision has to be considered too, to keep the number of false alarms as low as possible.
- Precision and recall are not contradictory metrics; they simply emphasize different things.
Q: In what kind of scenario do we put precision first and recall second?
- Scenario: every true positive found earns points, and every flagged sample that is not actually positive loses points.
ROC curve
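The ROC curve plots TPR against FPR and shows how the two quantities change relative to each other.
TPR (true positive rate): the same as recall, A / (A + B). The closer to 1, the better.
FPR (false positive rate): C / (C + D), the fraction of actual negatives that are wrongly predicted positive. The closer to 0, the better.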
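Several special points on the ROC plot have a specific meaning:
- (FPR, TPR) = (0, 0): everything is predicted negative.
- (FPR, TPR) = (1, 1): everything is predicted positive.
- (FPR, TPR) = (0, 1): a perfect classifier.
- Points on the diagonal TPR = FPR: no better than random guessing.
The closer the model is to z1, the perfect-classifier point (FPR = 0, TPR = 1), the better.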
The ROC curve traces how a model's TPR and FPR change as the decision threshold varies.
AUC
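We usually use the area under the ROC curve (the region toward the lower right) as the metric. In practice its value lies between 0.5 and 1.
Why not 0 to 1? Because in binary classification, a model that is always wrong becomes a model that is always right once its predictions are flipped. A model scoring below 0.5 can therefore be inverted to score above 0.5, and 0.5 corresponds to random guessing.
When the AUC (Area Under the Curve) equals 1, the ROC curve passes through the point z1.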
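To make the threshold sweep concrete, here is a small sketch in plain Python that builds the ROC points by hand and integrates them with the trapezoid rule. The helper names `roc_points` and `auc_trapezoid` and the four-sample dataset are illustrative assumptions; it also assumes both classes are present in `y_true`.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, one per distinct threshold, starting at (0, 0)."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc_trapezoid(points):
    """Area under a piecewise-linear curve given as (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(y_true, scores)
print(points)  # [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
print(auc_trapezoid(points))  # 0.75
```

Lowering the threshold moves the point up and to the right: more positives are caught (higher TPR) at the cost of more false alarms (higher FPR).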
scikit-learn Code
| Metric | scikit-learn import |
|---|---|
| Precision | from sklearn.metrics import precision_score |
| Recall | from sklearn.metrics import recall_score |
| F1 | from sklearn.metrics import f1_score |
| Confusion Matrix | from sklearn.metrics import confusion_matrix |
| ROC | from sklearn.metrics import roc_curve |
| AUC | from sklearn.metrics import auc |
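A short usage sketch for the functions in the table, assuming scikit-learn is installed. The labels, scores, and the 0.5 threshold are made-up illustration data.

```python
from sklearn.metrics import (
    auc,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
    roc_curve,
)

y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_score = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.6, 0.55]  # predicted probability of class 1
y_pred = [int(s >= 0.5) for s in y_score]              # hard labels at threshold 0.5

print("precision:", precision_score(y_true, y_pred))
print("recall:", recall_score(y_true, y_pred))
print("f1:", f1_score(y_true, y_pred))

# Rows are true labels and columns are predicted labels, ordered [0, 1],
# so the layout is [[TN, FP], [FN, TP]].
cm = confusion_matrix(y_true, y_pred)
print(cm)

# roc_curve needs scores, not hard labels, because it sweeps the threshold itself.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
roc_auc = auc(fpr, tpr)
print("AUC:", roc_auc)
```

Note that `confusion_matrix` puts the negative class first, so its layout differs from a table that lists the positive class first.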
03 Regression problems
In a regression problem, the smaller the error the better. But we cannot simply add the errors up, because positive and negative errors cancel out. The usual remedy is to take the absolute value or the square of each error.
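Mean absolute error
With m samples, true values $y_i$ and predictions $\hat{y}_i$:

$$\text{MAE} = \frac{1}{m} \sum_{i=1}^{m} \left| y_i - \hat{y}_i \right|$$

Value range: 0 to positive infinity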
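Mean squared error
With m samples, true values $y_i$ and predictions $\hat{y}_i$:

$$\text{MSE} = \frac{1}{m} \sum_{i=1}^{m} \left( y_i - \hat{y}_i \right)^2$$

Value range: 0 to positive infinity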
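R2
With true values $y_i$, predictions $\hat{y}_i$ and the mean of the true values $\bar{y}$:

$$R^2 = 1 - \frac{\text{RSS}}{\text{TSS}}, \qquad \text{RSS} = \sum_{i=1}^{m} \left( y_i - \hat{y}_i \right)^2, \qquad \text{TSS} = \sum_{i=1}^{m} \left( y_i - \bar{y} \right)^2$$

TSS is, up to the factor 1/m, the MSE of a model that always predicts the mean. R2 therefore says that your model should at least beat this simple baseline (a model whose every prediction is the mean).
R2 value range: negative infinity to 1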
Some special values:
- R2 = 0: your model's error (RSS) equals that of a model that predicts only the mean (TSS).
- R2 = 1: your model predicts perfectly. The closer to 1, the better.
- R2 < 0: your model is very poor; simply predicting the mean would do better.
- R2 tending to negative infinity: training may be oscillating without converging.
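The special values of R2 can be reproduced with a small plain-Python sketch. The function names and the four-point dataset are illustrative assumptions; `r2` assumes the true values are not all identical, otherwise TSS is 0.

```python
def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """R2 = 1 - RSS / TSS, where TSS is the error of always predicting the mean."""
    mean = sum(y_true) / len(y_true)
    rss = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    tss = sum((t - mean) ** 2 for t in y_true)
    return 1 - rss / tss

y_true = [3.0, 5.0, 7.0, 9.0]

print(r2(y_true, y_true))                # 1.0  (perfect predictions)
print(r2(y_true, [6.0, 6.0, 6.0, 6.0]))  # 0.0  (always predicting the mean)
print(r2(y_true, [9.0, 7.0, 5.0, 3.0]))  # -3.0 (worse than predicting the mean)
print(mae(y_true, [6.0, 6.0, 6.0, 6.0]), mse(y_true, [6.0, 6.0, 6.0, 6.0]))  # 2.0 5.0
```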
scikit-learn Code
| Metric | scikit-learn import |
|---|---|
| MSE,RMSE | from sklearn.metrics import mean_squared_error |
| MAE | from sklearn.metrics import mean_absolute_error |
| R2 | from sklearn.metrics import r2_score |
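A short usage sketch for the regression metrics in the table, assuming scikit-learn is installed. The data is made up for illustration, and RMSE is taken here as the square root of the MSE rather than through a dedicated function.

```python
import math

from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.0, 7.5, 8.0]

mae_value = mean_absolute_error(y_true, y_pred)
mse_value = mean_squared_error(y_true, y_pred)
rmse_value = math.sqrt(mse_value)  # RMSE is the square root of MSE
r2_value = r2_score(y_true, y_pred)

print(f"MAE={mae_value:.3f} MSE={mse_value:.3f} RMSE={rmse_value:.3f} R2={r2_value:.3f}")
# MAE=0.500 MSE=0.375 RMSE=0.612 R2=0.925
```

Both functions take the true values first and the predictions second; swapping them leaves MAE and MSE unchanged but changes R2.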