Data processing and visualization of machine learning [iris data classification | feature attribute comparison]
2022-06-21 21:23:00 【51CTO】
Table of Contents
- One, Preface
- 1.1 Principle behind this article
- 1.2 Purpose
- 1.3 Objectives and contents
- 1.4 Environment used in this article
- Two, Experimental process
- 2.1 Install scikit-learn and related machine learning modules
- 2.2 Load the iris data set in the program
- 2.3 Use matplotlib to plot and compare the features of the iris data set
- 2.4 Analyze the plotted features to find those that clearly distinguish the iris classes
- Three, Source code for this article
One, Preface
1.1 Principle behind this article
Most machine learning models operate on features. A feature is usually a numerical representation of an input variable that the model can use.
In most cases, collected data must be processed before an algorithm can use it. A data set usually contains many different features, and some of them may be redundant or irrelevant to the value we want to predict. Data processing and visualization help to filter these out.
Feature selection is also valuable because it simplifies the model, reduces training time, avoids the curse of dimensionality, and improves generalization by reducing overfitting.
1.2 Purpose
1. Become familiar with data processing and visualization methods for machine learning
2. Use data processing and visualization methods to analyze the features of the data
1.3 Objectives and contents
1. Install scikit-learn and its related Python packages;
2. Load the iris data set in the program;
3. Use matplotlib to plot and compare the features of the iris data set;
4. Analyze the plotted features to find those that clearly distinguish the iris classes.
1.4 Environment used in this article
1. A PC
2. Windows 10
3. The scikit-learn package
4. Jupyter, PyCharm, or another Python editor
Two, Experimental process
2.1 Install scikit-learn and related machine learning modules
Installation is simple: install the scikit-learn module directly. Installing from a domestic (China-based) mirror saves time.
After running the install command, check whether the scikit-learn module was installed successfully in the local environment.
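A minimal sketch of such a check done from Python, assuming the package was installed with pip (typically `pip install scikit-learn`, optionally with `-i` pointing at a mirror index). The helper name `is_installed` is hypothetical. It only asks the import system whether a module can be located, without importing it.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if Python can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


# scikit-learn is imported under the name "sklearn".
for name in ("sklearn", "matplotlib", "numpy"):
    print(name, "installed" if is_installed(name) else "missing")
```

If `sklearn` is reported as missing, the install step above did not target the interpreter that is running this script.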
2.2 Load the iris data set in the program
We use the load_iris data set. It contains 150 rows. The first four columns are sepal length, sepal width, petal length, and petal width, the four attributes used to identify an iris: 'sepal_len', 'sepal_wid', 'petal_len', 'petal_wid'.
The fifth column is the iris class (one of three: Setosa, Versicolour, Virginica).
After loading the data, we print X to take a look at the 150 samples.
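A short sketch of the loading step, assuming scikit-learn is installed. Note that `load_iris` returns the four feature columns (`data`) and the class column (`target`) as separate arrays. The variable names `X` and `y` are a common convention, not necessarily the ones used in the original listing.

```python
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data      # shape (150, 4): one row per flower, one column per feature
y = iris.target    # shape (150,): class index 0, 1 or 2

print(X.shape)              # (150, 4)
print(iris.feature_names)   # names of the four feature columns
print(iris.target_names)    # names of the three classes
print(X[:5])                # first five samples
```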
2.3 Use matplotlib to plot and compare the features of the iris data set
Because we will use the figure method, we first define the figure size so that the 16 subplots can be laid out properly.
We need to output 16 subplots, so we set a variable to 4 and loop over it twice, with one loop nested inside the other.
The resulting index pairs are 0-0, 0-1, 0-2, 0-3, 1-0, 1-1, ...
There are 16 combinations, and for each one we also need to fetch the corresponding feature values.
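The double traversal can be sketched as two nested loops. The loop variable names `feature` and `feature_other` follow the names used later in this article. `n_features` and `pairs` are illustrative names introduced here.

```python
n_features = 4  # the iris data set has four feature columns

pairs = []
for feature in range(n_features):
    for feature_other in range(n_features):
        pairs.append((feature, feature_other))

print(len(pairs))   # 16
print(pairs[:5])    # [(0, 0), (0, 1), (0, 2), (0, 3), (1, 0)]
```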
We need to set the position of each subplot so that the subplots are drawn in turn. The advantage of this approach is that it is simple; the disadvantage is that it is a little tedious.
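One way to compute each subplot position, sketched under the assumption that the grid is 4 x 4 and filled row by row. The figure size `(16, 16)` is an assumed value, since the original size is not shown, and the `axes` dictionary is an illustrative helper.

```python
import matplotlib.pyplot as plt

n_features = 4
fig = plt.figure(figsize=(16, 16))  # assumed size; the original value is not shown

axes = {}
for feature in range(n_features):
    for feature_other in range(n_features):
        # plt.subplot counts positions from 1, row by row.
        position = feature * n_features + feature_other + 1
        ax = plt.subplot(n_features, n_features, position)
        ax.set_title(f"{feature}-{feature_other}")
        axes[(feature, feature_other)] = ax

print(len(fig.axes))  # 16
```

With this layout, the row is the first feature index and the column is the second, so subplot 1-3 sits in the second row, fourth column.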
We also need to consider the pairs 0-0, 1-1, 2-2 (and likewise 3-3). These are a special case, so we handle them separately.
To use plt.scatter we need to understand its parameters:
matplotlib.pyplot.scatter(x, y, s=None, c=None, marker=None, cmap=None, norm=None, vmin=None, vmax=None, alpha=None, linewidths=None, verts=None, edgecolors=None, hold=None, data=None, **kwargs)
This is the signature from an older matplotlib release; current releases no longer accept verts or hold.
- x, y → the coordinates of the scatter points
- s → the area of each marker
- c → the marker color (blue by default; other colors are specified as in plt.plot())
- marker → the marker style (filled circle, 'o', by default; other styles are specified as in plt.plot())
- alpha → the marker transparency (a number in [0, 1]; 0 is fully transparent, 1 is fully opaque)
- linewidths → the line width of the marker edges
- edgecolors → the color of the marker edges
If feature == feature_other, the two loop values are the same, so x and y would hold the same coordinates, which is not very informative. Instead, we set x to an auto-incrementing variable that runs from 0 to 49.
In all other cases the x and y coordinates differ, and the points can be plotted normally.
In the scatter call, the first two arguments are:
X[0:50,feature],X[0:50,feature_other]
They are the x and y coordinates of the scatter points. The data set holds 150 samples, so we slice out a group of rows (here rows 0 to 49) together with the column for each feature, and then plot the result.
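The slicing explained above can be combined with the diagonal special case roughly as follows. This is a sketch, not the original listing: the helper name `scatter_coords` and its `rows` parameter are hypothetical, and it relies on `load_iris` storing the samples sorted by class, so rows 0 to 49 are the first class.

```python
import numpy as np
from sklearn.datasets import load_iris

X = load_iris().data


def scatter_coords(X, rows, feature, feature_other):
    """Return (x, y) arrays for one group of rows in one subplot."""
    if feature == feature_other:
        # Diagonal subplot: x is a running sample index 0, 1, 2, ...
        y = X[rows, feature]
        x = np.arange(len(y))
    else:
        x = X[rows, feature]
        y = X[rows, feature_other]
    return x, y


x_diag, y_diag = scatter_coords(X, slice(0, 50), 2, 2)
x_pair, y_pair = scatter_coords(X, slice(0, 50), 0, 1)
print(x_diag[:3], y_diag[:3])
print(x_pair[:3], y_pair[:3])
```

The returned arrays can be passed straight to `plt.scatter(x, y)`.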
The syntax to understand here is array slicing:
a[:,1] selects column 1 from every row of a.
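A small NumPy illustration of this slicing syntax, using a made-up 3 x 3 array rather than the iris data:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

print(a[:, 1])     # every row, column 1     -> [2 5 8]
print(a[0:2, 1])   # rows 0 and 1, column 1  -> [2 5]
print(a[1, :])     # row 1, every column     -> [4 5 6]
```

`X[0:50, feature]` follows the same pattern: rows 0 to 49, and the single column selected by `feature`.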
Now we need to set the labels of the X axis and the Y axis. The syntax is as follows:
xlabel(xlabel, fontdict=None, labelpad=None, *, loc=None, **kwargs)
- xlabel: a string, the text of the label.
- fontdict: a dict that controls the font style of the label.
- labelpad: a float, default None; the distance between the label and the axis.
- loc: one of {'left', 'center', 'right'}, default rcParams["xaxis.labellocation"] ('center'); the location of the label.
- **kwargs: Text properties that control the appearance of the label, such as font and text color.
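A brief sketch of these parameters in use, assuming matplotlib 3.3 or later (the `loc` parameter is not available in older releases). The plotted points are placeholders rather than iris data, and the font size and padding values are arbitrary.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.scatter([1, 2, 3], [2, 4, 8])  # placeholder points, not iris data

# plt.xlabel / plt.ylabel act on the current axes (here, `ax`).
plt.xlabel("sepal_len", fontdict={"fontsize": 12}, labelpad=8, loc="center")
plt.ylabel("petal_len", fontdict={"fontsize": 12}, labelpad=8, loc="center")

print(ax.get_xlabel(), ax.get_ylabel())  # sepal_len petal_len
```

For `ylabel`, the accepted `loc` values are 'bottom', 'center' and 'top' rather than 'left', 'center' and 'right'.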
Finally, set the legend position and output the image.
The rendered result is a grid of the 16 subplots described above.
2.4 Analyze the plotted features to find those that clearly distinguish the iris classes
In the figure, subplots 0-2 and 1-3 show the clearest separation.
In other words, the pair sepal length and petal length (0-2) and the pair sepal width and petal width (1-3) clearly distinguish the iris species.
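The visual impression can be cross-checked numerically. The sketch below is a supplementary check, not part of the original analysis: it prints the minimum and maximum of each feature per class, so that features whose class ranges do not overlap stand out.

```python
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

for col, feature_name in enumerate(iris.feature_names):
    print(feature_name)
    for cls, class_name in enumerate(iris.target_names):
        values = X[y == cls, col]  # this feature, restricted to one class
        print(f"  {class_name:<10} min={values.min():.1f} max={values.max():.1f}")
```

In this data set the petal length range of the first class (Setosa) does not overlap with that of the other two classes, which is why subplots involving the petal features separate it so cleanly.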
Three, Source code for this article
The source code involved in this article can be run directly.
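A minimal end-to-end sketch assembled from the steps in section 2.3. It is a reconstruction under stated assumptions, not the author's original listing: the figure size, marker size, colors, legend position and the `classes` and `names` helpers are all illustrative choices.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_iris

X = load_iris().data
n_features = X.shape[1]  # 4

# Row ranges per class; load_iris stores the samples sorted by class.
classes = [
    ("Setosa", slice(0, 50), "red"),
    ("Versicolour", slice(50, 100), "green"),
    ("Virginica", slice(100, 150), "blue"),
]
names = ["sepal_len", "sepal_wid", "petal_len", "petal_wid"]

fig = plt.figure(figsize=(16, 16))  # assumed size
for feature in range(n_features):
    for feature_other in range(n_features):
        position = feature * n_features + feature_other + 1
        plt.subplot(n_features, n_features, position)
        for label, rows, color in classes:
            if feature == feature_other:
                # Diagonal: plot the feature against a running index 0..49.
                y_vals = X[rows, feature]
                x_vals = np.arange(len(y_vals))
            else:
                x_vals = X[rows, feature]
                y_vals = X[rows, feature_other]
            plt.scatter(x_vals, y_vals, s=10, c=color, label=label)
        plt.xlabel("sample index" if feature == feature_other else names[feature])
        plt.ylabel(names[feature_other])
        plt.legend(loc="upper right", fontsize=6)

fig.tight_layout()
```

Call `plt.show()` afterwards in an interactive session, or `fig.savefig(...)` with a file name of your choice to write the figure to disk.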