Summary of Baimian machine learning
2022-07-24 14:20:00 【Strong fight】
One Feature Engineering
1 Feature normalization
Why normalize numerical features: to make different indicators comparable by bringing all features into roughly the same numerical range.
Common methods :
① Linear function normalization (min-max scaling): linearly rescales the original data so that the result falls in the range [0, 1].
X_norm = (X - X_min) / (X_max - X_min)
② Zero-mean normalization (z-score normalization): maps the original data to a distribution with mean 0 and standard deviation 1.
z = (X - μ) / σ, where μ is the mean and σ is the standard deviation of the original feature.
(Models solved with gradient descent usually need normalized inputs. Normalization is not needed for decision trees: when splitting a node, a decision tree mainly relies on the information gain ratio of dataset D with respect to feature x, and that ratio does not change when x is rescaled.)
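The two normalization methods above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the function names and the sample `heights_cm` values are made up for the example, and the standard deviation is assumed to be the population standard deviation.

```python
import math

def min_max_normalize(values):
    """Linearly rescale values to [0, 1]: (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so the range is zero; map everything to 0.0.
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]

def z_score_normalize(values):
    """Shift to mean 0 and scale to standard deviation 1: (x - mu) / sigma."""
    n = len(values)
    mu = sum(values) / n
    # Population standard deviation (divides by n); an assumption of this sketch.
    sigma = math.sqrt(sum((x - mu) ** 2 for x in values) / n)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]

# Hypothetical sample feature, used only for illustration.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
scaled = min_max_normalize(heights_cm)
standardized = z_score_normalize(heights_cm)
```

After min-max scaling the smallest value maps to 0 and the largest to 1; after z-score normalization the values have mean 0 and standard deviation 1.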
2 Categorical features
① Ordinal encoding: usually used for categories that have an order relationship, e.g. high income (3), middle income (2), low income (1).
② One-hot encoding: used for features whose categories have no order relationship, e.g. male (1, 0), female (0, 1).
③ Binary encoding: first assign each category an ID using ordinal encoding, then use the binary representation of that ID as the result. (Binary encoding essentially hashes the ID into binary, producing a 0/1 feature vector with fewer dimensions than one-hot encoding, which saves storage space.)
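The three encodings can be compared side by side with a small sketch. The function names are illustrative, and the `blood_types` list is a hypothetical four-category feature that is not part of the notes above; it is only there to show that binary encoding needs fewer dimensions than one-hot encoding.

```python
def ordinal_encode(value, ordered_categories):
    """Map a category to its 1-based rank in a list ordered from low to high."""
    return ordered_categories.index(value) + 1

def one_hot_encode(value, categories):
    """Return a 0/1 vector with a single 1 at the category's position."""
    return [1 if c == value else 0 for c in categories]

def binary_encode(value, categories):
    """Assign a 1-based ID, then write that ID in fixed-width binary."""
    category_id = categories.index(value) + 1
    width = len(categories).bit_length()  # bits needed for the largest ID
    return [int(bit) for bit in format(category_id, f"0{width}b")]

income_levels = ["low", "middle", "high"]   # ordered categories
genders = ["male", "female"]                # unordered categories
blood_types = ["A", "B", "AB", "O"]         # hypothetical unordered feature

high_income_code = ordinal_encode("high", income_levels)
male_vector = one_hot_encode("male", genders)
ab_one_hot = one_hot_encode("AB", blood_types)
ab_binary = binary_encode("AB", blood_types)
```

With four categories, the one-hot vector has four dimensions while the binary code has three; the gap widens as the number of categories grows.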
3 Feature combinations / handling high-dimensional combined features
To improve a model's fitting ability, two first-order features can be combined into a second-order feature. This works well for ordinary features, but a problem arises once ID-type features are introduced.
If the number of users is m and the number of items is n, the number of parameters to learn is m × n. In an Internet setting, where users and items can each number in the tens of millions, learning m × n parameters is almost impossible. Representing each user and each item with a k-dimensional low-dimensional vector reduces the number of parameters to m × k + n × k. (This is equivalent to matrix factorization.)
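A quick calculation shows how much the low-dimensional representation saves. The sizes below (ten million users, ten million items, k = 32) are assumed values chosen only for illustration, and the function names are hypothetical.

```python
def pairwise_param_count(num_users, num_items):
    """One weight per (user, item) pair: m * n parameters."""
    return num_users * num_items

def factorized_param_count(num_users, num_items, k):
    """One k-dimensional vector per user and per item: m * k + n * k parameters."""
    return num_users * k + num_items * k

def interaction_weight(user_vec, item_vec):
    """The weight of a (user, item) pair is the dot product of their vectors."""
    return sum(u * v for u, v in zip(user_vec, item_vec))

m, n, k = 10_000_000, 10_000_000, 32  # assumed sizes, for illustration only
full_params = pairwise_param_count(m, n)
low_rank_params = factorized_param_count(m, n, k)
example_weight = interaction_weight([1.0, 2.0], [3.0, 0.5])
```

Under these assumed sizes the parameter count drops from 10^14 to 6.4 × 10^8, and each pair's weight is recovered as a dot product instead of being stored separately.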
Given the raw input, how can the decision trees be constructed effectively? One option is the gradient boosting decision tree, whose idea is to build each new decision tree on the residual left by the previously built trees.
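The residual-fitting idea can be sketched with the simplest possible trees: depth-1 regression stumps on a single feature. This is only a toy under stated assumptions (squared-error loss, one numeric feature, no shrinkage), and every name and data point below is made up for the example.

```python
def fit_stump(xs, residuals):
    """Pick the threshold on a 1-D feature that minimizes squared error.

    Returns (threshold, left_value, right_value). Assumes xs has at least
    two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]

def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value

def fit_boosted_stumps(xs, ys, num_trees=10):
    """Start from the mean, then fit each new stump to the current residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(num_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + predict_stump(stump, x) for p, x in zip(predictions, xs)]
    return base, stumps, predictions

def mean_squared_error(ys, predictions):
    return sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys)

# Hypothetical toy data: a step-like relationship with a little noise.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.2]
base, stumps, fitted = fit_boosted_stumps(xs, ys)
base_mse = mean_squared_error(ys, [base] * len(ys))
boosted_mse = mean_squared_error(ys, fitted)
```

Each round only has to explain what the earlier trees got wrong, so the training error of the ensemble does not increase from round to round.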
4 Text representation models
① Bag-of-words model and N-gram model
Split the whole text into words. Each article can then be expressed as a long vector in which each dimension represents a word, and the weight of that dimension reflects how important the word is in the original article.
TF-IDF(t,d) = TF(t,d) × IDF(t)
where TF(t,d) is the frequency of word t in document d, and IDF(t) is the inverse document frequency:
IDF(t) = log( total number of articles / (number of articles containing t + 1) )
Usually, a group of n consecutive words (n <= N), i.e. an N-gram, can also be treated as a separate feature in the vector representation; this gives the N-gram model.
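The TF-IDF weighting and N-gram extraction described above can be sketched as follows. This sketch assumes TF is the raw count of the word in the document and uses the natural logarithm; the function names and the tiny corpus are hypothetical.

```python
import math
from collections import Counter

def tf_idf(documents):
    """documents: list of token lists. Returns one {term: weight} dict per document.

    Uses IDF(t) = log(total documents / (documents containing t + 1)).
    """
    num_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in documents:
        term_counts = Counter(tokens)  # raw term frequency (an assumption)
        weights.append({
            term: count * math.log(num_docs / (doc_freq[term] + 1))
            for term, count in term_counts.items()
        })
    return weights

def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Hypothetical toy corpus, already split into words.
corpus = [
    ["decision", "tree", "tree"],
    ["gradient", "descent"],
    ["gradient", "boosting"],
    ["gradient", "feature"],
]
weights = tf_idf(corpus)
bigrams = ngrams(["natural", "language", "processing"], 2)
```

A word that appears in few documents ("tree") gets a high weight, while a word that appears in most documents ("gradient") gets a weight near zero. With the +1 in the denominator, a word present in every document gets a slightly negative weight.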
② Topic model
③ Word embedding and deep learning models
Word embedding is a family of models that turn words into vectors. The core idea is to map each word to a dense vector in a low-dimensional space (usually K = 50 to 300 dimensions).
If a document has N words, it can be represented by an N × K matrix.
Deep learning models provide a way to do feature engineering automatically: each hidden layer in the model can be viewed as corresponding to features at a different level of abstraction.
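The N × K document representation mentioned above amounts to stacking one K-dimensional vector per word. In the sketch below the embedding table is hypothetical, with K = 3 and made-up values; real embeddings would be learned and typically have 50 to 300 dimensions.

```python
# Hypothetical embedding table with K = 3; the values are made up.
embeddings = {
    "machine": [0.2, -0.1, 0.7],
    "learning": [0.4, 0.3, -0.2],
    "is": [0.0, 0.1, 0.0],
    "fun": [-0.5, 0.6, 0.1],
}

def document_matrix(tokens, embedding_table):
    """Stack the embedding of each word: N words give an N x K matrix."""
    return [embedding_table[token] for token in tokens]

doc_tokens = ["machine", "learning", "is", "fun"]
matrix = document_matrix(doc_tokens, embeddings)
num_rows, num_cols = len(matrix), len(matrix[0])
```

Here the four-word document becomes a 4 × 3 matrix, one row per word in order.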
Three Classical algorithms
3 Decision tree
A decision tree classifies sample data top-down in a tree structure made up of nodes and directed edges. Nodes are either internal nodes or leaf nodes: each internal node represents a feature or attribute, and each leaf node represents a category.
Applying the idea of ensemble learning to decision trees yields models such as random forests and gradient boosting decision trees.
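What are the common heuristic functions for decision trees?
ID3: maximum information gain, g(D,A) = H(D) - H(D|A)
C4.5: maximum information gain ratio, g_R(D,A) = g(D,A) / H_A(D), where H_A(D) is the entropy of D with respect to the values of feature A
CART: Gini index; the split with the smallest Gini index is chosen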
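The three criteria can be computed directly from label counts. The sketch below assumes discrete feature values and base-2 logarithms; the function names and the four-sample dataset are made up for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """H(D) = -sum(p * log2(p)) over the class proportions p."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini(D) = 1 - sum(p^2) over the class proportions p."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def partition(feature_values, labels):
    """Group the labels by the value of the feature."""
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    return list(groups.values())

def information_gain(feature_values, labels):
    """g(D, A) = H(D) - H(D|A), used by ID3 (larger is better)."""
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in partition(feature_values, labels))
    return entropy(labels) - conditional

def gain_ratio(feature_values, labels):
    """g_R(D, A) = g(D, A) / H_A(D), used by C4.5 (larger is better)."""
    split_entropy = entropy(feature_values)  # H_A(D): entropy of the feature's values
    if split_entropy == 0:
        return 0.0  # the feature takes a single value, so it cannot split D
    return information_gain(feature_values, labels) / split_entropy

def gini_index(feature_values, labels):
    """Weighted Gini of the partitions, used by CART (smaller is better)."""
    n = len(labels)
    return sum(len(g) / n * gini(g) for g in partition(feature_values, labels))

# Hypothetical toy data: feature_a separates the classes, feature_b does not.
labels = ["yes", "yes", "no", "no"]
feature_a = ["a", "a", "b", "b"]
feature_b = ["x", "y", "x", "y"]
```

On this toy data `feature_a` has information gain 1.0 and Gini index 0.0, while `feature_b` has information gain 0.0 and Gini index 0.5, so all three criteria prefer `feature_a`.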
How is a decision tree pruned?
Pre-pruning stops the growth of the decision tree early. Post-pruning prunes a fully grown, overfitted decision tree.
Pre-pruning can decide when to stop growing the tree in the following ways:
1. Stop when the tree reaches a certain depth.
2. Stop when the number of samples reaching the current node falls below a certain threshold.
3. Compute the improvement in test-set accuracy for each split, and stop expanding when it falls below a certain threshold.
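The three pre-pruning rules above can be combined into a single stopping check that a tree builder would call before each split. The function name and the default thresholds are hypothetical values chosen for illustration, not recommendations.

```python
def should_stop_splitting(depth, num_samples, accuracy_gain,
                          max_depth=5, min_samples=10, min_accuracy_gain=0.01):
    """Return True if any pre-pruning rule says the node should not be split.

    depth: depth of the current node (root = 0)
    num_samples: number of samples that reached the current node
    accuracy_gain: accuracy improvement the candidate split would bring
    The three thresholds are assumed defaults, for illustration only.
    """
    if depth >= max_depth:                   # rule 1: depth limit reached
        return True
    if num_samples < min_samples:            # rule 2: too few samples at the node
        return True
    if accuracy_gain < min_accuracy_gain:    # rule 3: split barely helps accuracy
        return True
    return False

keep_growing = not should_stop_splitting(depth=2, num_samples=200, accuracy_gain=0.05)
too_deep = should_stop_splitting(depth=5, num_samples=200, accuracy_gain=0.05)
too_few = should_stop_splitting(depth=2, num_samples=4, accuracy_gain=0.05)
too_small_gain = should_stop_splitting(depth=2, num_samples=200, accuracy_gain=0.001)
```

Any one rule is enough to stop the split, which keeps the tree small at the cost of possibly stopping before a later, more useful split.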