Hands-on data analysis: data modeling and model evaluation
2022-06-25 01:22:00 【includeSteven】
Data modeling and evaluation
Introduction
After data processing and preliminary visual analysis, we can use the data to extract the information we want. The first step of the analysis is to build a model; once the model is built, we need to evaluate whether it is reliable.
Data modeling
The modeling library used here is sklearn (scikit-learn), which implements many machine learning algorithms. The figure below can serve as a guide for choosing an algorithm:

Splitting the data set
First, the data set is split into a training set and a test set. Here we use the sklearn.model_selection.train_test_split function; in Jupyter, running train_test_split? displays its documentation.
Note that by default the split is made by random sampling. Whether that is appropriate has to be judged case by case (for example, time-ordered data usually should not be shuffled).
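The split described above might look like the following minimal sketch. The data here is synthetic, generated with make_classification as a stand-in for the cleaned data set, and the test_size, stratify, and random_state values are illustrative assumptions rather than the author's settings.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption; replace with your own features X and labels y).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# stratify=y keeps the class ratio the same in both parts;
# random_state makes the random split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

print(X_train.shape, X_test.shape)
```

Passing shuffle=False instead disables the random sampling and cuts the data in its original order (stratify must then be left unset).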
Model creation
All estimators in sklearn inherit from a common base class (sklearn.base.BaseEstimator). A model is trained with the fit method, and predictions are made with predict.
For classification, you can use logistic regression or a random forest, which correspond to the following two classes:
- sklearn.linear_model.LogisticRegression
- sklearn.ensemble.RandomForestClassifier
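As a rough sketch of how these two classes are used, the example below fits both on synthetic data (an assumption, standing in for the real data set). The hyperparameters max_iter, n_estimators, and random_state are illustrative choices, not values taken from the original.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Both classifiers share the same estimator interface: construct, then fit.
lr = LogisticRegression(max_iter=1000).fit(X_train, y_train)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# score() returns the mean accuracy on the given data.
lr_acc = lr.score(X_test, y_test)
rf_acc = rf.score(X_test, y_test)
print(f"logistic regression: {lr_acc:.2f}, random forest: {rf_acc:.2f}")
```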
Model prediction
Once the model is built, call its predict method with the feature values X to get the predicted label y for each sample.
You can also call predict_proba to get the predicted probability of each label.
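The difference between the two methods can be seen in a small sketch like the one below, again on synthetic data (an assumption). predict returns one label per sample, while predict_proba returns one row per sample with a column for each class listed in classes_.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

lr = LogisticRegression(max_iter=1000).fit(X_train, y_train)

pred = lr.predict(X_test)         # shape (n_samples,): predicted labels
proba = lr.predict_proba(X_test)  # shape (n_samples, n_classes): class probabilities

print(lr.classes_)   # column order of proba
print(pred[:5])
print(np.round(proba[:5], 3))
```

Each row of proba sums to 1, so for a binary problem the second column alone is enough to describe the prediction.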
Evaluation of the model
Cross validation
sklearn.model_selection.cross_val_score(estimator, X_train, y_train, cv=10): returns an array with the score of each cross-validation fold
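A minimal sketch of the call above, assuming a logistic regression estimator and synthetic training data (both are stand-ins, not the author's setup):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in data (an assumption).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# cv=10 splits the training set into 10 folds; each fold is held out once
# while the model is fit on the other 9.
scores = cross_val_score(LogisticRegression(max_iter=1000), X_train, y_train, cv=10)

print(scores)
print(f"mean accuracy: {scores.mean():.3f} (std {scores.std():.3f})")
```

The mean of the fold scores is usually reported together with their spread, since a single fold can be unrepresentative.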
Confusion matrix and related metrics
- sklearn.metrics.confusion_matrix
- sklearn.metrics.classification_report
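The two functions listed above can be combined as in the sketch below, which again assumes synthetic data and a logistic regression model as stand-ins. Both functions take the true labels first and the predicted labels second.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

lr = LogisticRegression(max_iter=1000).fit(X_train, y_train)
pred = lr.predict(X_test)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, pred)
print(cm)

# Text table with precision, recall, F1 score, and support for each class.
report = classification_report(y_test, pred)
print(report)
```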
Drawing the ROC curve
sklearn.metrics.roc_curve returns three arrays: the false positive rates, the true positive rates, and the corresponding thresholds.
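As a sketch of how the return values are obtained, the example below passes the predicted probability of the positive class to roc_curve. The data and model are synthetic stand-ins (assumptions), and roc_auc_score is added here only as a convenient one-number summary; it is not mentioned in the original.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

lr = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# roc_curve needs scores, not hard labels: use the probability of class 1.
y_score = lr.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, y_score)

# Area under the ROC curve: 0.5 is chance level, 1.0 is a perfect ranking.
auc = roc_auc_score(y_test, y_score)
print(f"{len(thresholds)} thresholds, AUC = {auc:.3f}")
```

To draw the curve, fpr and tpr can be passed to a plotting library, for example matplotlib's plt.plot(fpr, tpr), with the false positive rate on the x-axis and the true positive rate on the y-axis.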