
Machine Learning Concepts Organized (No Formulas)

2022-06-22 06:06:00 Startled

1. What machine learning does

(1) Classification

For example: feed in a large number of animal pictures for training, so that the machine learns to distinguish which pictures are dogs and which are cats.

(2) Tagging

Tagging is a generalization of the classification problem. The difference is that the output is not a single category (such as "this is a dog") but a sequence of tags. For example: input an English sentence, and output the part of speech of each word in the sentence.
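The sentence-to-tag-sequence idea can be sketched with a toy part-of-speech tagger. The tiny lexicon below is a made-up illustration, not a real tagging model:

```python
# A toy tagger: the input is a sentence (a sequence of words) and the
# output is a sequence of labels, one per word. Real taggers learn this
# mapping from data; here we fake it with a hand-written lexicon.
LEXICON = {
    "the": "DET", "dog": "NOUN", "chases": "VERB", "cat": "NOUN",
}

def tag(sentence):
    """Return one tag per word; unknown words get the placeholder tag 'X'."""
    return [LEXICON.get(word, "X") for word in sentence.lower().split()]

print(tag("The dog chases the cat"))  # ['DET', 'NOUN', 'VERB', 'DET', 'NOUN']
```

The point is only the shape of the problem: one label per input position, rather than one label for the whole input.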

(3) Prediction

Also called regression. For example, from the housing price data of a certain area over past years, learn a model that can predict the future trend of house prices.
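A minimal regression sketch, fitting a line y = w·x + b to yearly price data with the closed-form least-squares solution for one feature. The years and prices below are made-up numbers for illustration:

```python
# Fit y = w*x + b by ordinary least squares (single-feature closed form),
# then extrapolate to the next year. All data values are invented.
years  = [2015, 2016, 2017, 2018, 2019]
prices = [10.0, 11.2, 12.1, 13.4, 14.3]   # e.g. thousand yuan per square meter

n = len(years)
mean_x = sum(years) / n
mean_y = sum(prices) / n
w = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, prices)) / \
    sum((x - mean_x) ** 2 for x in years)
b = mean_y - w * mean_x

print(round(w * 2020 + b, 2))  # ≈ 15.44, the extrapolated prediction for 2020
```

This is exactly the "learn a model, then predict" loop described above, just with the simplest possible model.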

2. The basic steps of machine learning

In Statistical Learning Methods, Li Hang summarizes the three elements of a statistical learning method as: method = model + strategy + algorithm.

Put more plainly:

(1) Model: the structure to be trained. Familiar examples such as the perceptron, the support vector machine, and neural networks are all "models". This step is crucial. For example: if the two classes of data on a two-dimensional plane have a clear boundary between them, a linear model can classify them; but if they are thoroughly mixed together, then no matter how you train a linear model, you will not get a satisfactory result.

(2) Strategy: the rule that guides learning. Suppose we chose the linear model WX + b = 0 in the first step, where W is an unknown parameter. We train on the input data over and over to correct the model (adjust the position of the "line"), and we need a criterion to guide the adjustment; this criterion is the strategy. In the linear model, for example, we can take reducing the total distance from the misclassified points to the line as the criterion for updating the parameters. Our goal is to make this distance smaller and smaller through learning; when it reaches 0, there are no misclassified points left. This objective function is usually called the "loss function" or "empirical risk", and the goal is to minimize it.
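The "total distance from misclassified points to the line" criterion is essentially the perceptron loss, which can be sketched directly. The weights and data points below are toy values chosen for illustration:

```python
# Score each point against the line w·x + b = 0 and sum a penalty for
# every misclassified one; the penalty -y*(w·x + b) is proportional to
# the point's distance from the boundary.
def perceptron_loss(w, b, points, labels):
    """Sum of -y*(w·x + b) over misclassified points (labels y in {-1, +1})."""
    loss = 0.0
    for (x1, x2), y in zip(points, labels):
        margin = y * (w[0] * x1 + w[1] * x2 + b)
        if margin <= 0:          # wrong side of (or exactly on) the line
            loss += -margin
    return loss

points = [(2.0, 2.0), (1.0, 1.0), (-1.0, -1.0)]
labels = [+1, +1, -1]
print(perceptron_loss([1.0, 1.0], 0.0, points, labels))    # 0.0: all correct
print(perceptron_loss([-1.0, -1.0], 0.0, points, labels))  # 8.0: all wrong
```

Driving this number to 0 is exactly the learning goal the text describes.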

(3) Algorithm: the method used to update the parameters, such as gradient descent or Newton's method. These methods adjust the parameters toward the optimum as quickly and as well as possible.
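Gradient descent itself fits in a few lines. As a stand-in for a real loss, the sketch below minimizes the one-dimensional function f(x) = (x - 3)², whose minimum is at x = 3; the learning rate and step count are arbitrary choices:

```python
# Gradient descent on f(x) = (x - 3)^2, whose gradient is 2*(x - 3).
# Each step moves the parameter against the gradient.
def gradient_descent(start, lr=0.1, steps=100):
    x = start
    for _ in range(steps):
        grad = 2 * (x - 3)   # derivative of the loss at the current x
        x -= lr * grad       # update rule: step downhill
    return x

x_min = gradient_descent(start=0.0)
print(round(x_min, 4))  # converges to the true minimum x = 3
```

Replacing f with a real loss function and x with the model's parameters gives the training procedures used in practice.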

3. Other concepts

(1) Overfitting

Easy to understand: the model classifies the training data very well, but has large errors on new data. For example, on a two-dimensional plane there is a set of data that is roughly linearly distributed. If we fit it, the learned model may end up being a polynomial of very high degree; although it makes every existing data point fall exactly on the curve, the prediction for a new data point can deviate a lot.
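This can be demonstrated concretely: the five noisy, roughly linear points below (made up for illustration) are fitted exactly by the unique degree-4 polynomial through them, which then extrapolates far from the obvious trend y ≈ x:

```python
# Overfitting in miniature: an exact polynomial fit (zero training error)
# versus the linear trend. Lagrange interpolation gives the unique
# polynomial passing through all the points.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 1.1, 1.9, 3.2, 3.9]   # roughly y = x, with small noise

def lagrange(x, xs, ys):
    """Value at x of the unique polynomial through all (xs, ys) points."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

print(round(lagrange(2.0, xs, ys), 2))  # 1.9: exact on a training point
print(round(lagrange(5.0, xs, ys), 2))  # ≈ 1.1, far from the trend value ~5
```

Zero error on the training data, yet a wildly wrong prediction on a new input: that is overfitting.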

(2) Regularization

To prevent overfitting, we introduce regularization. As mentioned earlier, our goal is to minimize the loss function. We add a regularization term to the loss function; this term generally grows with the complexity of the model: the more complex the model, the larger the term. Our goal then becomes minimizing this combined risk function, and since the risk grows as the model becomes more complex, the method effectively discourages overfitting.
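The "loss plus complexity penalty" objective can be sketched with an L2 (ridge-style) penalty on the weights. All the numbers below, including the lambda value, are toy assumptions for illustration:

```python
# Regularized objective: training loss + lambda * (sum of squared weights).
# A complex model with zero training loss can still score worse than a
# simple model with some training loss.
def objective(train_loss, weights, lam=0.5):
    penalty = sum(w * w for w in weights)
    return train_loss + lam * penalty

simple   = objective(train_loss=1.0, weights=[0.9, -0.3])       # = 1.45
complex_ = objective(train_loss=0.0, weights=[4.0, -3.0, 2.5])  # = 15.625
print(simple < complex_)  # True: the simpler model wins despite higher loss
```

The penalty shifts the optimum away from models that fit the training data exactly at the cost of huge weights.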

(3) Cross validation

Another way to prevent overfitting is cross-validation, often used with neural networks. Why not just use a regularization term? In my view, a neural network is a big black box, too complex for its behavior to be captured by a single formula, so neural networks rely on cross-validation to prevent overfitting.

It works like this: we divide the data set into three parts: a training set, a validation set, and a test set. Training and testing are easy to understand, but what is the validation set for? At different stages of learning, we evaluate the model on the validation set and choose the model with the smallest validation error. Roughly speaking: when the model looks about right, try it on the validation set; if it performs badly there, it has probably overfitted.
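A minimal sketch of the three-way split described above, plus model selection by validation error. The 60/20/20 proportions and the per-checkpoint error numbers are illustrative assumptions, not from the text:

```python
# Split a dataset into training / validation / test sets, then pick the
# checkpoint whose validation error is lowest.
def split(data, train_frac=0.6, val_frac=0.2):
    n_train = int(len(data) * train_frac)
    n_val = int(len(data) * val_frac)
    return (data[:n_train],
            data[n_train:n_train + n_val],
            data[n_train + n_val:])

data = list(range(10))                  # stand-in for (features, label) pairs
train, val, test = split(data)
print(len(train), len(val), len(test))  # 6 2 2

# Evaluate saved checkpoints on the validation set and keep the best one.
val_errors = {"epoch 5": 0.31, "epoch 10": 0.24, "epoch 15": 0.29}
best = min(val_errors, key=val_errors.get)
print(best)                             # epoch 10
```

The test set is touched only once, at the very end, so it gives an honest estimate of performance on new data.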

(4) Supervised and unsupervised models

A supervised model means the training data carries category information. For example, training data for a supervised model: "woof, wags its tail, sticks its tongue out when hot — dog"; "meow, aloof, catches mice — cat". At the end we must tell the model that these are the characteristics of dogs and cats, so that the next time the model sees similar characteristics it can answer "this is a dog/cat". An unsupervised model gets only the features, with no category labels. Through learning, the model can automatically extract higher-order features and discover the differences between dogs and cats, so it can also classify them; this is called an unsupervised model.
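The contrast can be sketched on one invented one-dimensional feature (a "dog-ness" score): with labels we learn a decision threshold; without labels we can still discover two groups by clustering. Everything below is a toy assumption:

```python
# Supervised vs. unsupervised on the same toy feature.
features = [0.9, 0.8, 0.95, 0.1, 0.2, 0.15]            # "dog-ness" scores
labels   = ["dog", "dog", "dog", "cat", "cat", "cat"]  # supervised setting only

# Supervised: use the labels to place a threshold between the class means.
dog = [f for f, l in zip(features, labels) if l == "dog"]
cat = [f for f, l in zip(features, labels) if l == "cat"]
threshold = (sum(dog) / len(dog) + sum(cat) / len(cat)) / 2

def classify(x):
    return "dog" if x > threshold else "cat"

# Unsupervised: two-cluster 1-D k-means, using no labels at all.
def kmeans_1d(xs, steps=10):
    c1, c2 = min(xs), max(xs)           # initialize centroids at the extremes
    for _ in range(steps):
        g1 = [x for x in xs if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in xs if abs(x - c1) > abs(x - c2)]
        c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)
    return sorted([c1, c2])

print(classify(0.85))       # dog
print(kmeans_1d(features))  # two cluster centres, near 0.15 and 0.88
```

Both approaches separate the same data; the difference is only whether the category information was given or discovered.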

4. General steps

(1) Define the model formula

(2) Define the loss function and choose an optimization algorithm

(3) Train iteratively on the data

(4) Evaluate accuracy on the test set or validation set
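The four steps above can be sketched in one minimal training loop: a one-parameter model y = w·x, a squared loss, gradient-descent updates, and a held-out evaluation. The data (which follows y = 2x exactly) and the learning rate are illustrative assumptions:

```python
# (1) model y = w*x, (2) squared loss + gradient descent,
# (3) iterate over the data, (4) evaluate on held-out data.
train_x, train_y = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
test_x,  test_y  = [4.0], [8.0]

w = 0.0                                   # (1) the model's single parameter
for _ in range(200):                      # (3) repeated passes over the data
    for x, y in zip(train_x, train_y):
        grad = 2 * (w * x - y) * x        # (2) gradient of (w*x - y)^2
        w -= 0.01 * grad                  #     gradient-descent update
test_err = sum((w * x - y) ** 2           # (4) squared error on the test set
               for x, y in zip(test_x, test_y))
print(round(w, 3), round(test_err, 6))    # w ≈ 2.0, test error ≈ 0
```

Since the data follow y = 2x, training drives w to 2 and the test error to essentially zero.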

5. Summary

The ultimate goal of machine learning is to learn a model. Models include discriminative models such as k-nearest neighbors, the perceptron, decision trees, logistic regression, and SVMs, and generative models such as naive Bayes and Markov chains. I don't yet fully understand the difference between discriminative and generative models; I will add it later once I do.

To learn this model (that is, to determine its unknown parameters), we need a strategy: minimizing a loss function. Different scenarios use different kinds of loss functions, such as the 0-1 loss, the squared loss, the absolute loss, and the logarithmic loss. To prevent overfitting, regularization is introduced.
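The four named losses, written out for a single prediction (the log loss here takes a predicted probability for the true class):

```python
import math

def zero_one_loss(y_true, y_pred):   # 0-1 loss: counts a mistake as 1
    return 0 if y_true == y_pred else 1

def square_loss(y_true, y_pred):     # squared loss
    return (y_true - y_pred) ** 2

def absolute_loss(y_true, y_pred):   # absolute loss
    return abs(y_true - y_pred)

def log_loss(p_true_class):          # logarithmic loss
    return -math.log(p_true_class)

print(zero_one_loss("dog", "cat"), square_loss(3.0, 2.5),
      absolute_loss(3.0, 2.5), round(log_loss(0.5), 4))
```

Each one measures "how wrong" a prediction is in a different way, which is why different scenarios pick different losses.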

Finally, to minimize the loss function, optimization algorithms are introduced, such as gradient descent and Newton's method.

To make a model work well, we must do all three of these parts well, and many details are involved. The above is only a very macroscopic and rough summary, because solving one problem always raises new ones: what if gradient descent converges very slowly? How should the parameters be initialized so that the algorithm converges as quickly as possible? Each problem has its own solutions, but as long as we keep our goal in mind we won't get bogged down or lost, and we will soon grasp the essence of each algorithm.

Finally: I have been learning machine learning for several months, stumbling along and giving up more than once. The reason, mostly, is being confused by overly technical terminology, or scared off by endless formulas, and thus failing to grasp the overall context of the subject from a macroscopic view. So I have written this small summary of the detours I took, hoping it helps students who, like me, have been wandering in confusion at the gate of machine learning. Since most of this content is my own understanding, please correct me if anything is imprecise or wrong!


 



Copyright notice
This article was written by [Startled]. Please include a link to the original when reposting:
https://yzsam.com/2022/173/202206220544017397.html