Model selection and optimization
2022-06-23 20:39:00 【Mr. Dongye】
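Cross-validation (split all the data into n equal parts)
The most commonly used form is 10-fold cross-validation.
Example:
4-fold cross-validation (split the data into 4 equal parts): each part serves once as the validation set while the other 3 parts are used for training, which gives 4 accuracy scores.
The final result is the mean of the 4 accuracy scores.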
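The 4-fold procedure can be sketched with scikit-learn's KFold. This is a minimal illustration, not the article's own code: the synthetic data from make_classification, the choice of n_neighbors=5, and the names fold_scores and mean_accuracy are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (an assumption; not the article's data set)
X, y = make_classification(
    n_samples=200, n_features=3, n_informative=3, n_redundant=0, random_state=0
)

# Split the data into 4 equal parts; each part is the validation set once
kf = KFold(n_splits=4, shuffle=True, random_state=0)
fold_scores = []
for train_idx, val_idx in kf.split(X):
    model = KNeighborsClassifier(n_neighbors=5)
    model.fit(X[train_idx], y[train_idx])
    # Accuracy on the held-out part
    fold_scores.append(model.score(X[val_idx], y[val_idx]))

# The cross-validation result is the mean of the 4 accuracies
mean_accuracy = float(np.mean(fold_scores))
print(fold_scores)
print(mean_accuracy)
```

Shuffling before splitting is a design choice here; it avoids folds that depend on the row order of the data.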
Grid search: hyperparameter tuning
Preset several hyperparameter combinations for the model, evaluate each combination with cross-validation, and select the best combination to build the final model.
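To make the idea concrete, here is a hand-written version of that loop using cross_val_score. It is only a sketch under stated assumptions: the data is synthetic, and the names candidates, mean_scores, best_k and final_model are hypothetical. GridSearchCV performs the same kind of search automatically.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (an assumption; not the article's data set)
X, y = make_classification(
    n_samples=200, n_features=3, n_informative=3, n_redundant=0, random_state=0
)

# Preset hyperparameter values to try
candidates = [3, 5, 10]

# Evaluate each candidate with 4-fold cross-validation
mean_scores = {}
for k in candidates:
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=4)
    mean_scores[k] = scores.mean()

# Select the candidate with the highest mean accuracy and build the final model
best_k = max(mean_scores, key=mean_scores.get)
final_model = KNeighborsClassifier(n_neighbors=best_k).fit(X, y)
print(mean_scores)
print(best_k)
```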
API
from sklearn.model_selection import GridSearchCV
Example
# coding=utf8
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV

# Widen the pandas display settings so printed frames are not truncated
pd.set_option('display.max_columns', 100)
pd.set_option('display.width', 500)
pd.set_option('display.unicode.ambiguous_as_wide', True)
pd.set_option('display.unicode.east_asian_width', True)

# Load the tab-separated data set; the file has no header row
df = pd.read_csv(
    r'E:\Python machine learning \csv\datingTestSet.txt',
    sep='\t',
    header=None,
    names=['flight', 'icecream', 'game', 'type']
)
# Feature matrix (.values already returns a NumPy array)
df_value = df[['flight', 'icecream', 'game']].values
# Split the data; test_size=0.25 holds out 25% of the rows as the test set
x_train, x_test, y_train, y_test = train_test_split(df_value, df['type'], test_size=0.25)
# Preprocessing: standardize each feature to mean 0 and standard deviation 1
# Formula: X = (x - x̅) / σ, where x̅ is the feature mean and σ its standard deviation
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)  # fit the scaler on the training set only
x_test = scaler.transform(x_test)  # reuse the training mean and standard deviation
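As a quick check of the standardization formula, the sketch below compares a manual (x - mean) / std computation with StandardScaler on a tiny made-up array. The values and the names x, manual and scaled are assumptions for illustration only.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up values: 3 samples, 2 features on very different scales
x = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 700.0]])

# Manual standardization, column by column (population standard deviation)
manual = (x - x.mean(axis=0)) / x.std(axis=0)

# The same transformation with StandardScaler
scaled = StandardScaler().fit_transform(x)

print(np.allclose(manual, scaled))
print(scaled.mean(axis=0), scaled.std(axis=0))
```

After scaling, every column has mean 0 and standard deviation 1 (up to floating-point rounding), so no single feature dominates the distance computation in k-nearest neighbors.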
Grid search
# Use the k-nearest neighbors algorithm
knn = KNeighborsClassifier()
# Candidate hyperparameter values to search
param = {'n_neighbors': [3, 5, 10]}
# Use 2-fold cross-validation
cv = 2
# Run the grid search
gc = GridSearchCV(knn, param_grid=param, cv=cv)
gc.fit(x_train, y_train)
# Accuracy of the best model on the held-out test set
gc_s = gc.score(x_test, y_test)
print(gc_s)
print(gc.best_score_)  # Best mean cross-validation score
print(gc.best_estimator_)  # Model with the best hyperparameters
print(gc.cv_results_)  # Cross-validation results for every hyperparameter combination
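The cv_results_ dictionary is hard to read when printed directly. One option, sketched here on synthetic data rather than the article's file, is to load it into a pandas DataFrame and keep only a few columns. The names search, results and summary are hypothetical.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (an assumption; not the article's data set)
X, y = make_classification(
    n_samples=200, n_features=3, n_informative=3, n_redundant=0, random_state=0
)

# Same grid and fold count as in the article's example
search = GridSearchCV(KNeighborsClassifier(), param_grid={'n_neighbors': [3, 5, 10]}, cv=2)
search.fit(X, y)

# One row per hyperparameter combination
results = pd.DataFrame(search.cv_results_)
summary = results[['param_n_neighbors', 'mean_test_score', 'std_test_score', 'rank_test_score']]
print(summary.sort_values('rank_test_score'))
print(search.best_params_)  # the combination ranked first
```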