Sorting out machine learning concepts (no formulas)
2022-06-22 06:06:00 【Startled】
1. What machine learning does
(1) Classification
For example: feed the machine a large number of animal pictures for training, so that it can tell which ones are dogs and which ones are cats.
(2) Labeling
Labeling (tagging) is a generalization of the classification problem. The difference is that the output is not a single category (such as "this is a dog") but a sequence of labels. For example: input an English sentence and output the part of speech of each word in it.
(3) Prediction
Also called regression. For example, from the housing price data of a certain place in previous years, we learn a model that can predict the future trend of house prices.
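A minimal sketch of the house price idea, assuming a straight-line model fitted by least squares. The years and prices are made up for illustration, and fit_line is a hypothetical helper, not something from the original text.

```python
# Made-up (year, price) pairs, purely for illustration.
years = [2017, 2018, 2019, 2020, 2021]
prices = [100.0, 110.0, 120.0, 130.0, 140.0]


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(years, prices)

# Use the learned line to predict a year that is not in the data.
predicted_2022 = slope * 2022 + intercept
print(slope, predicted_2022)
```

Real price data would not sit exactly on a line, but the idea is the same: learn the trend from past years, then read off a future value.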
2. The basic steps of machine learning
In "Statistical Learning Methods", Li Hang sums up the three elements of a statistical learning method as: method = model + strategy + algorithm.
Put more plainly:
(1) Model: the structure used for training. The familiar perceptron, support vector machine, and neural network all count as "models". This choice is crucial. For example: if there is a clear boundary between two classes of data on a two-dimensional plane, a linear model can classify them; but if the two classes are thoroughly mixed together, a linear model cannot give satisfactory results no matter how it is trained.
(2) Strategy: the rule that guides learning. Suppose we chose the linear model WX+b=0 in the first step, where W and b are unknown parameters. We need to feed in training data repeatedly to correct the model (adjust the position of the "straight line"), and we need a guideline that says how to adjust it. We call this guideline the strategy. For the linear model, we can use the total distance from the misclassified points to the line as the guideline for updating parameters: the goal is to make this distance smaller and smaller through continued learning, and once every point is classified correctly it drops to 0. This objective function is usually called the "loss function" or the "empirical risk" (strictly, the empirical risk is the loss averaged over the training data), and the goal is to minimize it.
(3) Algorithm: the method used to update the parameters, such as gradient descent or Newton's method. These methods aim to move the parameters to their optimal values as quickly and reliably as possible.
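The three elements can be sketched on a tiny perceptron. This is only an illustration under assumptions of my own: the three data points, the learning rate of 1.0, and the helper names (score, perceptron_loss, train) are not from the original text, and the loss leaves out the 1/||W|| factor, so it is proportional to the distance rather than equal to it.

```python
# Toy data: two positive points and one negative point (labels are +1 / -1).
points = [(3.0, 3.0), (4.0, 3.0), (1.0, 1.0)]
labels = [1, 1, -1]


def score(w, b, x):
    """Model: the linear function w.x + b; its sign is the predicted class."""
    return w[0] * x[0] + w[1] * x[1] + b


def perceptron_loss(w, b):
    """Strategy: sum of -y * (w.x + b) over the misclassified points."""
    return sum(
        -y * score(w, b, x)
        for x, y in zip(points, labels)
        if y * score(w, b, x) <= 0
    )


def train(learning_rate=1.0, max_epochs=100):
    """Algorithm: stochastic gradient descent, one misclassified point at a time."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            if y * score(w, b, x) <= 0:
                # Move the line toward the misclassified point.
                w[0] += learning_rate * y * x[0]
                w[1] += learning_rate * y * x[1]
                b += learning_rate * y
                mistakes += 1
        if mistakes == 0:
            break  # every training point is on the correct side
    return w, b


w, b = train()
print(w, b, perceptron_loss(w, b))
```

Swapping any one of the three parts (a different model, a different loss, a different update rule) gives a different learning method, which is the point of the "model + strategy + algorithm" split.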
3. Other concepts
(1) Overfitting
This one is easy to understand: the model classifies the training data very well, but makes large errors on new data. For example, suppose there is a set of points on a two-dimensional plane that is roughly linearly distributed. If the model overfits, what it finally learns may be a function of very high degree. Every existing point falls on this function, but when a new point arrives, the predicted value can be far off.
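A small sketch of that situation, assuming six made-up points that roughly follow y = 2x. The "very high degree" function here is the degree-5 polynomial that passes through every training point (built with Lagrange interpolation, a choice of mine), compared with a plain least-squares line.

```python
# Made-up training data: roughly y = 2x with small wiggles.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.5, 1.5, 4.5, 5.5, 8.5, 9.5]
# An unseen point that follows the same trend.
new_x, new_y = 6.0, 12.5


def line_predict(x):
    """Simple model: least-squares straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    return mean_y + (sxy / sxx) * (x - mean_x)


def polynomial_predict(x):
    """Complex model: degree-5 polynomial through all six training points."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def worst_training_error(predict):
    return max(abs(predict(x) - y) for x, y in zip(xs, ys))


print("training error:", worst_training_error(line_predict),
      worst_training_error(polynomial_predict))
print("error on new point:", abs(line_predict(new_x) - new_y),
      abs(polynomial_predict(new_x) - new_y))
```

The polynomial has no error at all on the training points, yet its prediction for the new point is far worse than the line's.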
(2) Regularization
Regularization is introduced to prevent overfitting. As mentioned earlier, our goal is to minimize the loss function. We now add a regular term (penalty term) to the loss function. The regular term generally grows with the complexity of the model: the more complex the model, the larger the term. Our goal then becomes minimizing this combined risk function. As the model becomes more and more complex, the risk function grows instead of shrinking, so this method helps prevent overfitting.
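As a rough sketch, assume the model is a single weight w in y = w * x and that "complexity" is measured by the squared size of w (an L2 penalty, one common choice that the text above does not specify). The data and the names regularized_risk and best_weight are made up for illustration.

```python
# Made-up data that an unregularized model fits exactly with w = 2.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]


def regularized_risk(w, lam):
    """Loss on the data plus a regular term that grows with the size of w."""
    data_loss = sum((y - w * x) ** 2 for x, y in zip(xs, ys))
    penalty = lam * w ** 2
    return data_loss + penalty


def best_weight(lam):
    """Minimizer of regularized_risk: sum(x*y) / (sum(x*x) + lam)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


for lam in [0.0, 1.0, 14.0]:
    w = best_weight(lam)
    print(lam, w, regularized_risk(w, lam))
```

The larger the coefficient lam on the regular term, the more the learned weight is pulled toward zero, trading a little training error for a simpler model.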
(3) Cross validation
Another way to guard against overfitting is cross validation, which is commonly used with neural networks. Why not rely on regular terms alone? My own view is that a neural network is a big black box, too complicated for its complexity to be summed up neatly in one formula, so a validation set is used to catch overfitting as well.
Cross validation here means: we divide the data set into three parts: training set, validation set, and test set. (Strictly speaking, this fixed three-way split is usually called hold-out validation; k-fold cross validation rotates which part is held out.) The training and test sets are easy to understand, but what is the validation set for? At different stages of training, we evaluate the model on the validation set and keep the model with the smallest validation error. Put plainly: when the model looks about done, try it on the validation set; if it performs badly there, it has overfitted.
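A minimal sketch of choosing a model with a validation set. To keep it short, the candidates are different settings of a hypothetical k-nearest-neighbor regressor rather than different training stages of a neural network, and the data and the hand-made three-way split are invented for illustration.

```python
# Made-up 1-D regression data, split by hand into three parts.
train_set = [(0.0, 0.4), (2.0, 4.3), (4.0, 7.8), (6.0, 12.5), (8.0, 15.6), (10.0, 20.4)]
validation_set = [(1.0, 2.1), (5.0, 9.8), (9.0, 18.2)]
test_set = [(3.0, 6.1), (7.0, 13.9)]


def knn_predict(x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train_set, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def mean_squared_error(data, k):
    return sum((knn_predict(x, k) - y) ** 2 for x, y in data) / len(data)


# Compare the candidates on the validation set only.
candidate_ks = [1, 2, 4]
validation_errors = {k: mean_squared_error(validation_set, k) for k in candidate_ks}
best_k = min(validation_errors, key=validation_errors.get)

# The test set is touched once, after the choice has been made.
test_error = mean_squared_error(test_set, best_k)
print(validation_errors, best_k, test_error)
```

Keeping the test set out of the selection step is what makes the final error an honest estimate of how the chosen model behaves on new data.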
(4) Supervised and unsupervised models
A supervised model is one whose training data carries category information. For example, training data for a supervised model might be: "barks, wags its tail, sticks its tongue out when hot: dog" and "meows, aloof, catches mice: cat". The model must be told in the end that these are the characteristics of a dog or a cat, so that the next time it meets similar features it answers "this is a dog / cat". An unsupervised model gets only the features, with no category labels. Through learning, it can extract higher-order features on its own and find the differences between dogs and cats, so it can also separate them into groups, although without labels it cannot name the groups. This is called an unsupervised model.
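A small sketch of the contrast, assuming each animal is described by one made-up number (say, body weight in kg). The supervised part uses a nearest-class-mean rule and the unsupervised part a 1-D two-group k-means; both are stand-ins I chose for illustration, not methods named in the text.

```python
# One made-up feature per animal; only the supervised model sees the labels.
features = [4.0, 4.5, 5.0, 20.0, 22.0, 24.0]
labels = ["cat", "cat", "cat", "dog", "dog", "dog"]


def supervised_fit(features, labels):
    """Supervised: learn one mean feature value per labeled class."""
    sums, counts = {}, {}
    for value, label in zip(features, labels):
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def supervised_predict(class_means, value):
    """Answer with the class whose mean is closest to the new value."""
    return min(class_means, key=lambda label: abs(class_means[label] - value))


def two_means(features, rounds=10):
    """Unsupervised: split the values into two groups without any labels."""
    centers = [min(features), max(features)]
    groups = [[], []]
    for _ in range(rounds):
        groups = [[], []]
        for value in features:
            nearest = 0 if abs(value - centers[0]) <= abs(value - centers[1]) else 1
            groups[nearest].append(value)
        centers = [sum(group) / len(group) for group in groups]
    return centers, groups


class_means = supervised_fit(features, labels)
print(supervised_predict(class_means, 6.0))  # a named class

centers, groups = two_means(features)
print(groups)  # two unnamed groups
```

Both approaches separate the animals, but only the supervised one can say which group is "cat" and which is "dog".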
4. General steps
(1) Define the model (the algorithm's formula)
(2) Define the loss function and select the optimization algorithm
(3) Train iteratively on the data
(4) Evaluate accuracy on a test set or validation set
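The four steps can be sketched end to end. This is just one possible instance, assuming a one-feature logistic regression, the logarithmic loss, and plain gradient descent; the data, learning rate, and iteration count are arbitrary choices for illustration.

```python
import math

# Made-up data: negative x values belong to class 0, positive ones to class 1.
train_x = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
train_y = [0, 0, 0, 1, 1, 1]
test_x = [-2.5, -1.5, 1.5, 2.5]
test_y = [0, 0, 1, 1]


# Step 1: define the model formula.
def predict_probability(w, b, x):
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


# Step 2: define the loss function (log loss) and the optimizer (gradient descent).
def log_loss(w, b):
    total = 0.0
    for x, y in zip(train_x, train_y):
        p = predict_probability(w, b, x)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(train_x)


def gradient_step(w, b, learning_rate):
    n = len(train_x)
    grad_w = sum((predict_probability(w, b, x) - y) * x
                 for x, y in zip(train_x, train_y)) / n
    grad_b = sum(predict_probability(w, b, x) - y
                 for x, y in zip(train_x, train_y)) / n
    return w - learning_rate * grad_w, b - learning_rate * grad_b


# Step 3: train iteratively.
w, b = 0.0, 0.0
initial_loss = log_loss(w, b)
for _ in range(200):
    w, b = gradient_step(w, b, learning_rate=0.1)
final_loss = log_loss(w, b)

# Step 4: evaluate accuracy on held-out data.
correct = sum((predict_probability(w, b, x) >= 0.5) == (y == 1)
              for x, y in zip(test_x, test_y))
accuracy = correct / len(test_x)
print(initial_loss, final_loss, accuracy)
```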
5. Summary
The ultimate goal of machine learning is to learn a model. Models include discriminative models, such as k-nearest neighbors, the perceptron, decision trees, logistic regression, and SVM, and generative models, such as naive Bayes and Markov chains. I do not yet understand the difference between discriminative and generative models well; I will add it later once I do.
To learn this model (that is, to determine its unknown parameters), we need a strategy, namely minimizing the loss function. Different scenarios use different loss functions, such as the 0-1 loss, the squared loss, the absolute loss, and the logarithmic loss. To prevent overfitting, regularization is introduced.
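For reference, the four loss functions named above can be written out for a single sample. The function names are my own, and the logarithmic loss is shown for the two-class case with labels 0 and 1, which is an assumption.

```python
import math


def zero_one_loss(y_true, y_pred):
    """0 if the prediction is right, 1 if it is wrong."""
    return 0 if y_true == y_pred else 1


def squared_loss(y_true, y_pred):
    """Square of the difference between target and prediction."""
    return (y_true - y_pred) ** 2


def absolute_loss(y_true, y_pred):
    """Absolute difference between target and prediction."""
    return abs(y_true - y_pred)


def log_loss(y_true, prob_of_class_1):
    """Negative log of the probability the model gave to the true class (0 or 1)."""
    p = prob_of_class_1 if y_true == 1 else 1.0 - prob_of_class_1
    return -math.log(p)


print(zero_one_loss("dog", "cat"))
print(squared_loss(3.0, 1.0))
print(absolute_loss(3.0, 1.0))
print(log_loss(1, 0.5))
```

The 0-1 loss suits classification with hard labels, the squared and absolute losses suit numeric prediction, and the logarithmic loss suits models that output probabilities.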
Finally, to minimize the loss function, an optimization algorithm is introduced, such as gradient descent or Newton's method.
To make the model perform well, we have to do all three of the parts above well, and many details are involved. The above is only a very high-level, rough summary, because whenever we set out to solve a problem, difficulties are bound to come up: what if gradient descent converges very slowly, or how should the parameters be initialized so that the algorithm converges as quickly as possible? Each problem has its own solution, but as long as we keep in mind what our goal is, we will not get bogged down or lost, and will soon grasp the essence of the algorithm.
Lastly, I have been studying machine learning for several months, stumbling along the way and giving up more than once. The reason, for the most part, was being confused by overly specialized terms or scared off by endless formulas, so that I could not grasp the overall structure of the subject from a macro perspective and kept missing the gist. So I have written this short summary of my detours, hoping it helps students who are wandering at the gate of machine learning, as confused as I was. Since most of the content is my own understanding, please correct me if anything is imprecise or wrong!