Self-Taught Machine Learning Series 1: The Basic Framework of Machine Learning
2022-06-26 09:09:00 【ML_python_get√】
The basic framework of machine learning
1 Basic ideas of machine learning
- The basic pipeline of machine learning: data acquisition, feature extraction, data transformation, model training, model selection, and model evaluation
- Supervised learning: given raw features and problem labels, mine the rules and learn a model that can answer new questions
- Unsupervised learning: look for patterns in the raw features alone, with no labels
- Reinforcement learning: maximize cumulative reward; there is no absolutely correct label, nor is the goal to discover structure in the data
1.1 Model selection
- How do we select model parameters? Cross-validation
- For a typical regression problem: minimize the mean squared error
- Overfitting: variance grows as model complexity increases while bias shrinks, so the test mean squared error follows a U-shaped curve
- Cross-validation:
  - The simplest approach: hold out a fixed fraction of the training set as a validation set that takes no part in training; this shrinks the data available for fitting and thus hurts accuracy
  - The usual choice is K-fold cross-validation: divide all samples into K parts (typically 3-20), use one part as the validation set each time, and repeat K times until every part has served for validation.
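As a concrete illustration, here is a minimal K-fold cross-validation sketch using scikit-learn; the synthetic data and the grid of ridge penalties are illustrative assumptions, not part of the original notes:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Synthetic regression data (assumption for illustration only).
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)

# Score each candidate penalty with 5-fold CV and compare mean MSE.
for alpha in [0.01, 0.1, 1.0, 10.0]:
    scores = cross_val_score(Ridge(alpha=alpha), X, y,
                             cv=5, scoring="neg_mean_squared_error")
    print(f"alpha={alpha:<5} mean CV MSE = {-scores.mean():.2f}")
```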
1.2 Model evaluation
- Regression problems: mean squared error
- Classification problems: confusion-matrix counts, written as (prediction, truth) with 1 = diseased and 0 = healthy: hit (1,1), false alarm (1,0), miss (0,1), correct rejection (0,0)
  - Accuracy: (hits + correct rejections) / total; misleading when the incidence rate is very low, since always predicting "healthy" already scores high
  - Precision: hits / (hits + false alarms)
  - Recall (hit rate): hits / (hits + misses)
  - False alarm rate: false alarms / (false alarms + correct rejections)
- ROC: the curve of hit rate against false alarm rate as the decision threshold varies
- AUC: the area under the ROC curve; the closer to 1, the better the classifier
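A minimal sketch of these metrics with scikit-learn; the labels and scores below are made up for illustration (1 = diseased, 0 = healthy):

```python
from sklearn.metrics import (confusion_matrix, accuracy_score,
                             precision_score, recall_score, roc_auc_score)

y_true  = [1, 1, 0, 0, 1, 0, 0, 1, 0, 0]   # ground truth (illustrative)
y_pred  = [1, 0, 0, 1, 1, 0, 0, 1, 0, 0]   # hard predictions
y_score = [0.9, 0.4, 0.2, 0.6, 0.8, 0.1, 0.3, 0.7, 0.2, 0.1]  # predicted P(y=1)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("hits:", tp, " false alarms:", fp, " misses:", fn, " correct rejections:", tn)
print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("AUC      :", roc_auc_score(y_true, y_score))  # uses the scores, not hard labels
```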
2 Common machine learning methods
- Supervised learning: generalized linear models, linear discriminant analysis, support vector machines, decision trees, random forests, neural networks, K-nearest neighbors
- Unsupervised learning: clustering, dimensionality reduction (e.g. PCA)
2.1 Generalized linear model
- Simple regression: a single predictor
- Multiple regression: several predictors
- Ridge regression: L2 regularization
- Lasso: L1 regularization
- Logistic regression: binary classification; an improvement over the linear probability model
- Ordinal classification: multi-class problems whose response categories are ordered; fit N-1 logistic regressions
- OvR (one vs. rest): split the samples into one class vs. all the others, run N logistic regressions, and obtain a probability for each class
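A minimal sketch of this family in scikit-learn; the synthetic data, penalty strengths, and median-split label are illustrative assumptions:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression, LogisticRegression, Ridge

# Regression data with only 3 truly informative features (assumption).
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)

ols   = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)   # L2 penalty shrinks coefficients toward 0
lasso = Lasso(alpha=1.0).fit(X, y)   # L1 penalty sets some coefficients exactly to 0
print("nonzero lasso coefficients:", np.sum(lasso.coef_ != 0), "of", X.shape[1])

# Logistic regression on a binary label derived from y (illustrative only).
label = (y > np.median(y)).astype(int)
logit = LogisticRegression().fit(X, label)
print("logistic training accuracy:", logit.score(X, label))
```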
2.2 Linear discriminant analysis and quadratic discriminant analysis
- Logistic regression is poorly suited to cases where the two classes are well separated
- LDA: linear discriminant analysis, an extension beyond logistic regression; it assumes the samples follow normal distributions and estimates the coefficients from sample moments
- QDA: quadratic discriminant analysis; the discriminant function is quadratic, so the decision boundary is a curve
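A minimal sketch comparing the two discriminant methods in scikit-learn; the synthetic two-class data are an illustrative assumption:

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=4, random_state=0)

# LDA fits one shared covariance (linear boundary); QDA fits one covariance
# per class (quadratic boundary).
for name, clf in [("LDA", LinearDiscriminantAnalysis()),
                  ("QDA", QuadraticDiscriminantAnalysis())]:
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: mean CV accuracy = {acc:.3f}")
```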
2.3 Support vector machine
- Divide the sample space with a hyperplane: picture an enormous sheet of paper splitting the space into two halves
- The hyperplane is determined by only a finite number of points, called the support vectors
- The XOR gate problem: the output is 1 only for the inputs (1,0) and (0,1); no straight line can separate the classes
- Lift the data to a higher dimension, e.g. introduce an interaction term x1*x2 as in regression, a Taylor expansion, etc.
- SVM introduces kernel functions to compute the hyperplane implicitly: linear kernel, polynomial kernel, Gaussian kernel
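A minimal sketch showing a Gaussian (RBF) kernel SVM solving the XOR problem that no linear separator can; the kernel parameters are illustrative assumptions:

```python
from sklearn.svm import SVC

# The four XOR points from the notes: output 1 only for (1,0) and (0,1).
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

# An RBF kernel implicitly lifts the points into a higher-dimensional space
# where they become linearly separable.
clf = SVC(kernel="rbf", gamma=1.0, C=10.0).fit(X, y)
print(clf.predict(X))  # recovers [0 1 1 0]
```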
2.4 Decision trees and random forests
- Decision tree: each node is split into child nodes by some rule; the terminal leaf nodes give the classification result
- Choosing the splitting feature: pick the one that maximizes the information gain after the split, i.e. the reduction in entropy sum(-p log p)
- Avoiding overfitting: pruning, or stopping branching early
- C4.5: classification only; cannot combine features
- CART: each node splits into exactly two child nodes; supports feature combinations; usable for both classification and regression
- Advantages: fast to train, handles non-numeric features, performs nonlinear classification
- Disadvantages: unstable, sensitive to the training sample, prone to overfitting
- Ensemble methods:
  - Bootstrap: sampling with replacement yields bootstrap datasets of the same size as the original; repeat N times and train a weak classifier on each dataset
  - Bagging: built on the bootstrap; the final classification is obtained by voting over (or averaging) the weak classifiers
  - Parallel method: bagging → random forests; row sampling produces the bootstrap datasets, column sampling randomly selects m features, and the final classification is the vote of N decision trees
  - Serial method: AdaBoost and gradient-boosted decision trees (GBDT); train a weak classifier on the data, increase the weights of the misclassified samples, and keep training
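A minimal sketch contrasting the parallel and serial ensembles in scikit-learn; the data and hyperparameters are illustrative assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Parallel: bagged trees with row sampling plus random column sampling.
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
# Serial: each new tree focuses on the errors of the previous ones.
gbdt = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, random_state=0)

for name, clf in [("random forest", forest), ("GBDT", gbdt)]:
    print(f"{name}: mean CV accuracy = {cross_val_score(clf, X, y, cv=5).mean():.3f}")
```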
2.5 Neural networks and deep learning
- Basic idea: a neuron has two states, excited and inhibited; its dendrites receive stimulation from the previous neuron, and only when the potential reaches a certain threshold does the neuron switch to the excited state; the electrical signal then travels along the axon and across synapses to the dendrites of the next neuron, and these connections form a huge network
- Input layer: linear weighting
- Hidden layers: activation functions such as ReLU, sigmoid, tanh
- Output layer: softmax or sigmoid for classification, the identity for regression
- Too many layers: the parameters become hard to estimate and gradients vanish; convolutional neural networks (CNNs) address this with local connections
- Image recognition: CNNs
- Time-series problems: recurrent neural networks (RNNs) and long short-term memory networks (LSTMs)
- Unsupervised learning: generative adversarial networks (GANs)
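A minimal sketch of a feed-forward network (multi-layer perceptron) in scikit-learn; the architecture and data are illustrative assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Two ReLU hidden layers; for classification the output layer applies a
# logistic/softmax transform automatically.
mlp = MLPClassifier(hidden_layer_sizes=(32, 16), activation="relu",
                    max_iter=500, random_state=0).fit(X, y)
print("training accuracy:", mlp.score(X, y))
```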
2.6 KNN
- Supervised learning
- The underlying assumption: if two samples have similar features, they belong to the same category
- This idea yields a classification rule: each point's category is decided by the categories of its K nearest neighbors
- Choosing K: too small overfits, too large underfits; pick it by cross-validation
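A minimal sketch of choosing K by cross-validation; the data and the candidate grid are illustrative assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Small K tends to overfit, large K to underfit; compare by 5-fold CV.
for k in [1, 3, 5, 9, 15]:
    acc = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
    print(f"K={k:<2} mean CV accuracy = {acc:.3f}")
```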
2.7 Clustering
- Unsupervised learning: partition the samples into K clusters, grouping similar objects into the same cluster.
- K-means (a sketch follows this list):
  - Randomly choose K points as centroids; assign each sample point to the cluster of its nearest centroid
  - Take the mean of each cluster as the new centroid and reassign the sample points, giving the result of one iteration
  - Repeat the process until the clusters no longer change
  - Drawbacks: heavily influenced by the choice of K, affected by outliers, slow to converge
- Hierarchical clustering: arrange the samples into a hierarchy, either splitting top-down or merging bottom-up
- Spectral clustering: treat each object as a vertex V of a graph and the similarity between vertices as the weight of the connecting edge E, giving an undirected weighted similarity graph G(V,E)
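A minimal sketch of the K-means procedure with scikit-learn; the blob data and K=3 are illustrative assumptions:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

# n_init restarts guard against a bad random choice of initial centroids.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("centroids:\n", km.cluster_centers_)
print("first 10 cluster labels:", km.labels_[:10])
```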
2.8 Dimension reduction
- PCA (a sketch follows at the end of this section):
  - Assumes that the directions in which the independent variables vary most also produce the largest response in the dependent variable
  - Find the directions of greatest variance in the space and use them as new features; the data must be standardized first
  - The number of components is determined by cross-validation
- Partial least squares:
  - When the PCA assumption fails, build new features from the relationship between the features and the dependent variable
  - Regress Y on X and, using the significant regression coefficients as weights, combine the features into Z1
  - Regress X on Z1 and take the residuals, i.e. the part not explained by Z1, as new features
  - Regress Y on the new features to obtain significant regression coefficients, whose linear combination gives Z2
  - Repeat the process M times
- Fisher linear discriminant analysis:
  - Core idea: make the within-class scatter as small as possible and the between-class scatter as large as possible
  - Compute the centers of the two classes
  - Compute the (within-class) scatter matrix of each class and add them to get the total within-class scatter matrix S
  - Compute the between-class scatter matrix Sb
  - After projecting onto w·x, we want the between-class distance w'Sbw as large as possible and the within-class distance w'Sw as small as possible, so we maximize w'Sbw / w'Sw
  - The optimal w yields the projected features
- Nonlinear dimensionality reduction:
  - Locally linear embedding (LLE)
  - Isomap (geodesic distance)
  - Laplacian eigenmaps
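A minimal sketch of PCA as described above, standardizing first and then keeping the directions of greatest variance; the data and the choice of three components are illustrative assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, _ = make_classification(n_samples=200, n_features=10, random_state=0)

# Standardize, then project onto the top-3 variance directions.
pipe = make_pipeline(StandardScaler(), PCA(n_components=3))
Z = pipe.fit_transform(X)
print("new feature shape:", Z.shape)
print("explained variance ratio:", pipe.named_steps["pca"].explained_variance_ratio_)
```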