01-kNN
2022-07-16 06:39:00 【ZERO_ pan】
Machine learning: 01 — the kNN algorithm (classification)
Theory: https://www.cnblogs.com/listenfwind/p/10311496.html
Code: Python3 machine learning course (classical algorithms and applications) https://coding.imooc.com/class/169.html
Hands-on with the Iris dataset:
import numpy as np
from sklearn import datasets
""" Load data """
iris=datasets.load_iris()
all_data=iris.data # features
all_label=iris.target # label
print('data shape:{},label shape{}'.format(all_data.shape,all_label.shape))
data shape:(150, 4),label shape(150,)
""" Split the data with a random shuffle: 70% for training, 30% for testing """
from sklearn.model_selection import train_test_split
train_data,test_data,train_label,test_label=train_test_split(all_data,all_label,test_size=0.3,random_state=666)
print('train_data shape:{},test_data shape{}'.format(train_data.shape,test_data.shape))
train_data shape:(105, 4),test_data shape(45, 4)
""" Call the kNN algorithm from sklearn """
from sklearn.neighbors import KNeighborsClassifier
knn_clf=KNeighborsClassifier(3) # the positional argument is n_neighbors, i.e. k=3
knn_clf.fit(train_data,train_label) # training; for kNN the training data itself is the model
pred_label=knn_clf.predict(test_data) # predict
print("predict:{}\ntrue:{}".format(pred_label,test_label))
predict:[1 2 1 2 0 1 1 2 1 1 1 0 0 0 2 1 0 2 2 2 1 0 2 0 1 1 0 1 2 2 0 0 1 2 1 1 2
2 0 1 2 2 1 1 0]
true:[1 2 1 2 0 1 1 2 1 1 1 0 0 0 2 1 0 2 2 2 1 0 2 0 1 1 0 1 2 2 0 0 1 2 1 1 2
2 0 1 2 2 1 1 0]
print(pred_label==test_label)
[ True True True True True True True True True True True True
True True True True True True True True True True True True
True True True True True True True True True True True True
True True True True True True True True True]
"""sklearn provides an API that computes accuracy directly"""
knn_clf.score(test_data,test_label)
1.0
""" Compare accuracies to find the best k """
""" Using a for loop """
import matplotlib.pyplot as plt
k_score=[]
for k in range(1,len(train_data)):
    knn_clf=KNeighborsClassifier(n_neighbors=k)
    knn_clf.fit(train_data,train_label)
    k_score.append(knn_clf.score(test_data,test_label))
plt.plot(range(1,len(train_data)),k_score)
plt.xlabel('k')
plt.ylabel('score')
(plot of accuracy vs. k rendered here)
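Beyond eyeballing the plot, the best k can be read off the collected scores with argmax. A standalone sketch, reproducing the same random_state=666 split as above and assuming a smaller sweep (range(1, 21)) than len(train_data), since very large k rarely helps:

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = datasets.load_iris()
train_data, test_data, train_label, test_label = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=666)

# Sweep k and record test accuracy for each value
k_range = range(1, 21)
k_score = [KNeighborsClassifier(n_neighbors=k)
           .fit(train_data, train_label)
           .score(test_data, test_label)
           for k in k_range]

# argmax gives the index of the first best score; map it back to k
best_k = k_range[int(np.argmax(k_score))]
print('best k:', best_k, 'best score:', max(k_score))
```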
Hyperparameter search
Besides k, sklearn's kNN has several other parameters; see the official docs for details: https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html?highlight=kneighborsclassifier#sklearn.neighbors.KNeighborsClassifier
""" Calling the following function shows the kNN algorithm's hyperparameters """
knn_clf.get_params()
{'algorithm': 'auto',
'leaf_size': 30,
'metric': 'minkowski',
'metric_params': None,
'n_jobs': None,
'n_neighbors': 104,
'p': 2,
'weights': 'uniform'}
n_neighbors is k;
p is the exponent of the Minkowski distance (p=1 is Manhattan, p=2 is Euclidean);
weights controls how neighbors are weighted ('uniform' or 'distance');
n_jobs is the number of CPU cores to use; -1 uses them all.
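To make those parameters concrete, here is a minimal sketch that sets all four explicitly (the particular values n_neighbors=5, weights='distance', p=2 are illustrative choices, not from the original):

```python
from sklearn.neighbors import KNeighborsClassifier

clf = KNeighborsClassifier(
    n_neighbors=5,       # k: how many neighbors vote
    weights='distance',  # closer neighbors get larger weight
    p=2,                 # Minkowski exponent: p=2 is Euclidean distance
    n_jobs=-1,           # use all available CPU cores
)
print(clf.get_params())
```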
Now let's grid-search the three parameters above. Reference: https://www.cnblogs.com/caomaoboy/p/12044087.html
"""Grid Search"""
""" With weights='uniform' the parameter p is irrelevant, so there are two parameter groups """
param_grid=[
    {
        'weights':['uniform'], # do not weight neighbors
        'n_neighbors':[k for k in range(1,20)],
    },
    {
        'weights':['distance'], # weight neighbors by distance
        'n_neighbors':[k for k in range(1,20)],
        'p':[p for p in range(1,6)]
    },
]
from sklearn.model_selection import GridSearchCV
knn_clf_grid_search=GridSearchCV(knn_clf,param_grid)
knn_clf_grid_search.fit(train_data,train_label)
GridSearchCV(estimator=KNeighborsClassifier(n_neighbors=104),
param_grid=[{'n_neighbors': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19],
'weights': ['uniform']},
{'n_neighbors': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19],
'p': [1, 2, 3, 4, 5], 'weights': ['distance']}])
""" Returns the best-performing classifier """
knn_clf=knn_clf_grid_search.best_estimator_
""" The best score achieved by the best classifier """
knn_clf_grid_search.best_score_
0.9714285714285715
This differs from the 100% accuracy we got at the beginning, because GridSearchCV evaluates with cross-validation (CV) on the training set rather than scoring on the held-out test set.
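The cross-validated estimate can be reproduced directly with cross_val_score. A standalone sketch, assuming the same random_state=666 split and cv=5 (GridSearchCV's default fold count):

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

iris = datasets.load_iris()
train_data, _, train_label, _ = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=666)

# 5-fold cross-validation on the training set only: each fold is held
# out once while the other four are used for fitting
scores = cross_val_score(KNeighborsClassifier(n_neighbors=3),
                         train_data, train_label, cv=5)
print('per-fold:', scores, 'mean:', scores.mean())
```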
""" The best parameters """
knn_clf_grid_search.best_params_
{'n_neighbors': 7, 'p': 4, 'weights': 'distance'}
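Since GridSearchCV refits the best estimator on the full training set, the last step is to score it on the held-out test set. A standalone sketch reproducing the split and search above:

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

iris = datasets.load_iris()
train_data, test_data, train_label, test_label = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=666)

param_grid = [
    {'weights': ['uniform'], 'n_neighbors': list(range(1, 20))},
    {'weights': ['distance'], 'n_neighbors': list(range(1, 20)),
     'p': list(range(1, 6))},
]
search = GridSearchCV(KNeighborsClassifier(), param_grid)
search.fit(train_data, train_label)

# best_estimator_ was refit on all of train_data; score it on test_data
print('best params:', search.best_params_)
print('test accuracy:', search.best_estimator_.score(test_data, test_label))
```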