Machine learning note 7: powerful neural network representation
2022-06-22 05:43:00 【Amyniez】
Contents
1. Neural networks (a great AI technique)
1.1 Motivating problem
- The drawback of linear regression and logistic regression: when there are too many features, the computational cost becomes very large and efficiency drops sharply. Neural networks solve this problem well.
- For example: suppose we have 100 features and want to use all 100 of them to construct a nonlinear polynomial model. The number of feature combinations is staggering. Even if we use only pairwise combinations, $x_1x_2 + x_2x_3 + x_3x_4 + \ldots + x_{99}x_{100}$, there are already about 5000 combined features ($\binom{100}{2} = 4950$, to be exact), far too many for logistic regression to compute efficiently.

1.2 Example: a visual object recognition model (identifying the objects in an image)
- Take many pictures of cars and many pictures of non-cars, and use the pixel values of each picture (e.g., intensity or brightness) as features. Use grayscale images only (not RGB), so each pixel has a single value. Pick two pixels at two different positions in the picture and train a logistic regression algorithm that uses these two pixel values to decide whether the picture shows a car:


A small 50×50-pixel image, with every pixel treated as a feature, already gives 2500 features. If we go further and form pairwise feature combinations for a polynomial model, there are about $2500^2/2$ (roughly 3 million) features. An ordinary logistic regression model cannot handle that many features effectively, so a neural network is needed.
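As a quick sanity check on these counts, here is a minimal sketch (plain Python standard library; no course code is assumed) that counts the pairwise combinations for both cases:

```python
from math import comb

# Pairwise combinations of 100 features (the x1*x2 + x2*x3 + ... example)
print(comb(100, 2))    # 4950, i.e. roughly 5000

# Pairwise combinations of the 2500 pixel features of a 50x50 image
print(comb(2500, 2))   # 3123750, roughly 2500^2 / 2, about 3 million
```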
1.3 What is a neural network?
Neural networks were born from the attempt to design algorithms that mimic the brain (the human brain being the best learning machine we know). The hypothesis: the brain does everything it does, in all its different ways, without needing thousands of different programs. Instead, the brain may rely on a single learning algorithm. Since the same brain tissue can process light, sound, or touch signals, there may be one learning algorithm (rather than thousands) that can handle vision, hearing, and touch alike.
2. Model representation
The neural network model is built from many neurons, each of which is a learning model. These neurons (also called activation units) take some features as input and produce an output based on their own model. For example, a neural network modeled on neurons looks like this:

$x_1, x_2, x_3$ are the input units; the raw data is fed into them.
$a_1, a_2, a_3$ are the intermediate units; they process the data and pass it on to the next layer.
Finally comes the output unit, which is responsible for computing $h_\theta(x)$. A neural network model is a network of many logistic units organized into layers, where the output variables of each layer are the input variables of the next layer. The first layer is the input layer (Input Layer), the middle layers are called the hidden layers (Hidden Layers), and the last layer is called the output layer (Output Layer). A bias unit (bias unit) is also added to each layer:

Model notation:
$a_i^{(j)}$: the activation of unit $i$ in layer $j$.
$\theta^{(j)}$: the weight matrix mapping from layer $j$ to layer $j+1$. For example, $\theta^{(1)}$ is the matrix of weights mapping from the first layer to the second layer. Its size: the number of activation units in layer $j+1$ gives the number of rows, and the number of activation units in layer $j$ plus one gives the number of columns. For example, in the neural network in the figure above, $\theta^{(1)}$ has size 3 × 4.
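To make this dimension rule concrete, here is a small sketch (the 3-3-1 layer sizes match the figure's $\theta^{(1)}$ being 3 × 4, but are otherwise an assumption for illustration) that builds weight matrices with the stated shapes:

```python
import numpy as np

# Layer sizes: 3 inputs, 3 hidden units, 1 output (assumed to match the figure).
layer_sizes = [3, 3, 1]

# theta^{(j)} maps layer j to layer j+1:
# rows = units in layer j+1, columns = units in layer j plus one (the bias unit).
thetas = [np.random.randn(layer_sizes[j + 1], layer_sizes[j] + 1)
          for j in range(len(layer_sizes) - 1)]

for j, theta in enumerate(thetas, start=1):
    print(f"theta^({j}) shape: {theta.shape}")
# theta^(1) shape: (3, 4)
# theta^(2) shape: (1, 4)
```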
The activation units and the output are expressed as:
Weight matrices:
Therefore, each $a$ is determined by all the $x$ values of the previous layer together with the weight corresponding to each $x$. This left-to-right computation is called the forward propagation algorithm (FORWARD PROPAGATION).
If we represent $x$, $\theta$, and $a$ each as a matrix, we get $\theta \cdot X = a$:
3. Vectorized computation of the forward propagation algorithm
First, the computation of the second layer of the neural network:

Let $z^{(2)} = \theta^{(1)} x$; then $a^{(2)} = g(z^{(2)})$. After the computation, add $a_0^{(2)} = 1$:

Forward propagation algorithm:

In short, $a^{(n)} = g(\theta^{(n-1)} a^{(n-1)})$, where $n$ denotes layer $n$.
Note: to compute over the whole training set, the training-set feature matrix needs to be transposed so that the features of one example lie in one column. For example:

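Putting the pieces together, here is a minimal NumPy sketch of vectorized forward propagation (the sigmoid $g$, the layer sizes, and the random weights are assumptions for illustration, not the course's code):

```python
import numpy as np

def sigmoid(z):
    """The activation function g used throughout these notes."""
    return 1.0 / (1.0 + np.exp(-z))

def forward_propagation(X, thetas):
    """Vectorized forward propagation.

    X      : (n_features, m) matrix -- one training example per column,
             as the transposition note above requires.
    thetas : list of weight matrices, theta^{(j)} mapping layer j to j+1.
    """
    a = X
    for theta in thetas:
        # Add the bias unit a_0 = 1 as an extra row of ones.
        a = np.vstack([np.ones((1, a.shape[1])), a])
        # a^{(n)} = g(theta^{(n-1)} a^{(n-1)})
        a = sigmoid(theta @ a)
    return a

# Example with assumed sizes: 3 inputs, 3 hidden units, 1 output.
rng = np.random.default_rng(0)
thetas = [rng.standard_normal((3, 4)), rng.standard_normal((1, 4))]
X = rng.standard_normal((3, 5))          # 5 training examples as columns
print(forward_propagation(X, thetas))    # shape (1, 5): one prediction each
```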
4. Properties of neural networks
4.1 Comparing the features used by logistic regression and neural networks
A neural network can learn its own series of new features:
- In logistic regression, the model is restricted to the original features $x$. Although polynomial terms can combine these features (e.g., pairwise combinations), the model is still limited by the original features.
- In a neural network, the original features are only the input layer. The prediction made by the output layer uses the features of the second layer, not the original features of the input layer. In other words, the features in the second layer are a series of new features that the neural network has learned in order to predict the output variable.
4.2 Logical operations in a neural network
The computation of a single-layer neuron (no hidden layer) can be used to express logical operations (AND, OR).
Logical AND (AND):

If we set $\theta_0 = -30$, $\theta_1 = 20$, $\theta_2 = 20$, then the output function is $h_\theta(x) = g(-30 + 20x_1 + 20x_2)$.
That is, when $z$ is positive the output is true; when $z$ is negative, it is false.
In summary, the AND function: $h_\theta(x) \approx x_1 \text{ AND } x_2$.

Logical OR (OR):

Assume $\theta_0 = -10$, $\theta_1 = 20$, $\theta_2 = 20$.
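As a quick check that these weights behave as claimed, here is a tiny sketch (the sigmoid helper and the truth-table loop are illustrative, not course code) of a single sigmoid neuron computing AND and OR:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neuron(theta, x1, x2):
    """A single sigmoid unit: h = g(theta_0 + theta_1*x1 + theta_2*x2)."""
    return sigmoid(theta[0] + theta[1] * x1 + theta[2] * x2)

and_theta = (-30, 20, 20)   # weights for AND from the text
or_theta = (-10, 20, 20)    # weights for OR from the text

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2,
              round(neuron(and_theta, x1, x2)),   # 1 only when both are 1
              round(neuron(or_theta, x1, x2)))    # 1 when at least one is 1
```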
5. Neural network examples
When the input features are Boolean (0 or 1), a single activation unit can act as a logical operator; to represent different operators, we only need to choose different weights.
Logical AND (AND): the result is 1 only when both inputs are 1;

Logical OR (OR): the result is 1 as long as at least one input is 1;


Logical XOR (XOR): the result is 1 when the two inputs differ, and 0 when they are the same;

Logical XNOR (XNOR): the result is 1 when the inputs are the same, and 0 when they differ; that is, $\text{XNOR} = (x_1\ \text{AND}\ x_2)\ \text{OR}\ ((\text{NOT}\ x_1)\ \text{AND}\ (\text{NOT}\ x_2))$.
First construct the neuron for the $(\text{NOT}\ x_1)\ \text{AND}\ (\text{NOT}\ x_2)$ part:
Implementation of the XNOR operator:
The diagram given by Andrew Ng:

In this way we can gradually construct more and more complex functions and obtain more and more powerful features. This is the power of neural networks!
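To see the composition at work, here is a sketch that stacks three logical units into XNOR (the weights $(10, -20, -20)$ for the $(\text{NOT}\ x_1)\ \text{AND}\ (\text{NOT}\ x_2)$ unit are the usual ones from Ng's slides; treat them as an assumption):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def unit(theta, x1, x2):
    return sigmoid(theta[0] + theta[1] * x1 + theta[2] * x2)

AND = (-30, 20, 20)
OR = (-10, 20, 20)
NOT_AND_NOT = (10, -20, -20)   # (NOT x1) AND (NOT x2)

def xnor(x1, x2):
    # Hidden layer: two logical units computed from the inputs.
    a1 = unit(AND, x1, x2)
    a2 = unit(NOT_AND_NOT, x1, x2)
    # Output layer: OR of the two hidden features.
    return unit(OR, a1, a2)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(xnor(x1, x2)))   # 1 when x1 == x2, else 0
```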
6. Multi-class classification in neural networks
For example, suppose we train a neural network to recognize pedestrians (pedestrian), cars (car), motorcycles (motorcycle), and trucks (truck); the output layer should then have 4 values. For example, the first value (1 or 0) is used to predict whether the image is a pedestrian, the second value to decide whether it is a car, and so on.
The input vector $x$ has three dimensions, there are two hidden layers, and the output layer has 4 neurons representing the 4 classes. That is, for every example the output layer produces a vector $[a\ b\ c\ d]^T$ in which exactly one of $a, b, c, d$ is 1, indicating the current class. The structure of the neural network is as follows:

The output of the neural network is one of four possible vectors:
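A small sketch of what those one-hot targets and predictions look like (the class list and the sample output-layer activation are assumptions for illustration):

```python
import numpy as np

classes = ["pedestrian", "car", "motorcycle", "truck"]

# One-hot target vectors: exactly one entry is 1 per class.
targets = np.eye(4)
print(targets[1])            # [0. 1. 0. 0.]  -> "car"

# A hypothetical output-layer activation h_theta(x) for one image:
h = np.array([0.05, 0.88, 0.04, 0.10])
print(classes[int(np.argmax(h))])   # predicted class: "car"
```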