
Machine Learning Notes 7: The Power of Neural Network Representation

2022-06-22 05:43:00 Amyniez

1. Neural networks (a great AI idea)

1.1 Motivating the problem

  1. A drawback of linear regression and logistic regression: when there are too many features, the computational cost becomes very large and efficiency drops. Neural networks solve this problem well.
  2. For example: suppose we have 100 features and want to build a nonlinear polynomial model from them. Even if we use only products of pairs of features, $x_1x_2 + x_1x_3 + x_1x_4 + \dots + x_{99}x_{100}$, we end up with $\binom{100}{2} = 4950$, i.e. roughly 5000, combined features (the sketch below verifies this count). That is far too many features for logistic regression to compute efficiently.
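The count above is easy to verify. Here is a minimal Python sketch (standard library only; the variable names are mine, not from the notes) that enumerates the pairwise terms:

```python
from itertools import combinations

n = 100
# every product x_i * x_j with i < j is one combined feature
quadratic_terms = list(combinations(range(1, n + 1), 2))
print(len(quadratic_terms))  # 4950 -- roughly 5000 combined features
```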

1.2 Example: a visual object recognition model (identifying objects in a picture)

  1. Take many pictures of cars and many pictures of non-cars, and use the pixel values of each picture (intensity/brightness) as features. If we restrict ourselves to grayscale images (not RGB), each pixel has a single value. Pick two pixels at two different positions in the picture and train a logistic regression algorithm that uses these two pixel values to decide whether the picture shows a car:
     [Figure: car and non-car training images plotted in the space of the two pixel intensities]
    Even with a small 50×50-pixel picture, treating every pixel as a feature already gives 2500 features. Going further and forming pairwise feature combinations into a polynomial model produces roughly $2500^2/2 \approx 3$ million features (see the sketch below). An ordinary logistic regression model cannot handle that many features effectively, which is why we need neural networks.
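As a rough illustration of treating pixels as features, here is a small numpy sketch; the random array is a stand-in for a real grayscale image, which would normally be loaded from a file:

```python
import numpy as np

image = np.random.rand(50, 50)  # stand-in for a 50x50 grayscale picture
x = image.reshape(-1)           # flatten: every pixel becomes one feature
n = x.size                      # 2500 raw features
print(n * (n - 1) // 2)         # 3123750 pairwise terms, about 2500^2 / 2
```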

1.3 What is a neural network?

   Neural networks arose from the attempt to design algorithms that imitate the brain (the human brain is the best learning machine we know of). The hypothesis: the brain does all of its different tasks without needing thousands of different programs. Instead, the way the brain works may rely on a single learning algorithm. Since the same brain tissue can learn to process visual, auditory, or tactile signals, there may be one learning algorithm (rather than thousands of algorithms) that can handle vision, hearing, and touch alike.

2. Model representation

  1. A neural network model is built from many neurons, each of which is a simple learning model. These neurons (also called activation units) take some features as input and, based on their own model, produce an output. For example, a network modeled on neurons:
     [Figure: a three-layer network of neuron-like units]
     $x_1, x_2, x_3$ are the input units; the raw data are fed into them.
     $a_1, a_2, a_3$ are the intermediate units; they process the data and pass it on to the next layer.
     Finally there is the output unit, which is responsible for computing $h_\theta(x)$.

  2. A neural network model is a network of many logistic units organized into layers, where the output variables of each layer are the input variables of the next layer. The first layer is called the input layer (Input Layer), the middle layers are called hidden layers (Hidden Layers), and the last layer is called the output layer (Output Layer). In addition, a bias unit is added to every layer:
     [Figure: a three-layer network with bias units added to each layer]
    Model notation:
    $a_i^{(j)}$: the activation of unit $i$ in layer $j$.
    $\Theta^{(j)}$: the weight matrix that maps from layer $j$ to layer $j+1$; for example, $\Theta^{(1)}$ is the matrix of weights mapping from the first layer to the second layer. Its size: the number of activation units in layer $j+1$ rows by the number of units in layer $j$ plus one columns. In the network in the figure above, $\Theta^{(1)}$ is therefore 3 × 4.
    The activation units and the output are expressed as:
    $a_1^{(2)} = g(\Theta_{10}^{(1)}x_0 + \Theta_{11}^{(1)}x_1 + \Theta_{12}^{(1)}x_2 + \Theta_{13}^{(1)}x_3)$
    $a_2^{(2)} = g(\Theta_{20}^{(1)}x_0 + \Theta_{21}^{(1)}x_1 + \Theta_{22}^{(1)}x_2 + \Theta_{23}^{(1)}x_3)$
    $a_3^{(2)} = g(\Theta_{30}^{(1)}x_0 + \Theta_{31}^{(1)}x_1 + \Theta_{32}^{(1)}x_2 + \Theta_{33}^{(1)}x_3)$
    $h_\Theta(x) = g(\Theta_{10}^{(2)}a_0^{(2)} + \Theta_{11}^{(2)}a_1^{(2)} + \Theta_{12}^{(2)}a_2^{(2)} + \Theta_{13}^{(2)}a_3^{(2)})$
    where the weight matrix $\Theta^{(1)}$ collects the entries $\Theta_{ij}^{(1)}$ row by row.
    Thus each $a$ is determined by all of the $x$'s in the previous layer together with the weight corresponding to each $x$. This left-to-right computation is therefore called forward propagation (FORWARD PROPAGATION).
    If we represent $x$, $\Theta$, and $a$ each as matrices, the computation becomes $\Theta \cdot X = a$, with the activation function $g$ applied elementwise, as the sketch below illustrates.
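To make this concrete, here is a minimal numpy sketch of one forward step through the 3-input, 3-hidden-unit, single-output network above; the weights are random stand-ins, not learned values:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, 0.5, -1.2, 0.7])  # [x0 = 1 (bias), x1, x2, x3]
theta1 = np.random.randn(3, 4)       # Theta^(1): 3 hidden units x (3 inputs + bias)
theta2 = np.random.randn(1, 4)       # Theta^(2): 1 output x (3 hidden units + bias)

a2 = sigmoid(theta1 @ x)             # activations a_1..a_3 of layer 2
a2 = np.concatenate(([1.0], a2))     # prepend the bias unit a_0^(2) = 1
h = sigmoid(theta2 @ a2)             # h_theta(x), the network's output
print(h)
```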

3. Vectorized computation in forward propagation

   First, the computation of the second layer of the network:
 $x = \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{bmatrix}, \qquad z^{(2)} = \begin{bmatrix} z_1^{(2)} \\ z_2^{(2)} \\ z_3^{(2)} \end{bmatrix}$
Let $z^{(2)} = \Theta^{(1)}x$; then $a^{(2)} = g(z^{(2)})$, and after computing it, append $a_0^{(2)} = 1$.

 $z^{(3)} = \Theta^{(2)}a^{(2)}, \qquad h_\Theta(x) = a^{(3)} = g(z^{(3)})$
The forward propagation algorithm, in summary: $a^{(n)} = g(\Theta^{(n-1)}a^{(n-1)})$, where $n$ denotes the $n$-th layer.

   Note: to compute over the entire training set at once, the training-set feature matrix needs to be transposed so that the features of one example lie in one column. For example:

 $z^{(2)} = \Theta^{(1)}X^{T}, \qquad a^{(2)} = g(z^{(2)})$
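Putting this section together, here is a hedged numpy sketch of fully vectorized forward propagation; the 3-4-1 shapes match the example network, and the weights are random stand-ins:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, thetas):
    # X: (n_features, m), one example per column (the transposed training set)
    # thetas[j] is the weight matrix mapping layer j+1 to layer j+2
    A = X
    for theta in thetas:
        A = np.vstack([np.ones((1, A.shape[1])), A])  # add the bias row a_0 = 1
        A = sigmoid(theta @ A)                        # a^(n) = g(Theta^(n-1) a^(n-1))
    return A

m = 5
X = np.random.randn(3, m)  # 3 features, 5 examples, already transposed
thetas = [np.random.randn(3, 4), np.random.randn(1, 4)]
print(forward(X, thetas).shape)  # (1, 5): one prediction per example
```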

4. Characteristics of neural networks

4.1 Features in logistic regression vs. neural networks

   A neural network can learn its own series of new features.

  1. In logistic regression, the model is limited to the original features $x$. Although we can combine these features using polynomial terms (e.g., products of pairs), the model is still constrained by the original features.
  2. In a neural network, the original features form only the input layer. The prediction made at the output layer uses the features of the second layer, not the original features of the input layer. The features in the second layer can be seen as a series of new features that the network has learned in order to predict the output variable.

4.2 Logical operations in a neural network

   The computation of a single layer of neurons (with no hidden layer) can be used to express the logical operations AND and OR; both are verified in the sketch after this list.

  1. Logical AND:
     If we set $\theta_0 = -30,\ \theta_1 = 20,\ \theta_2 = 20$, the output function is $h_\theta(x) = g(-30 + 20x_1 + 20x_2)$.
     When $z$ is a large positive number the output is approximately 1 (true); when $z$ is a large negative number it is approximately 0 (false).
     In summary, this gives the AND function: $h_\theta(x) \approx x_1 \text{ AND } x_2$.

  2. Logical OR:
     Setting $\theta_0 = -10,\ \theta_1 = 20,\ \theta_2 = 20$ gives $h_\theta(x) = g(-10 + 20x_1 + 20x_2) \approx x_1 \text{ OR } x_2$.
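The two weight settings above can be checked directly with a truth table. A minimal sketch (the `neuron` helper is illustrative, not from the notes):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neuron(theta, x1, x2):
    # a single unit: g(theta_0 + theta_1 * x1 + theta_2 * x2)
    return sigmoid(theta[0] + theta[1] * x1 + theta[2] * x2)

AND = (-30, 20, 20)
OR = (-10, 20, 20)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2,
              round(neuron(AND, x1, x2)),  # 1 only when both inputs are 1
              round(neuron(OR, x1, x2)))   # 1 when at least one input is 1
```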

5. Examples of neural networks

   When the input features are Boolean (0 or 1), a single activation unit can act as a logical operator; to represent different operators we only need to choose different weights.

  1. Logical AND: the output is 1 only when both inputs are 1;

  2. Logical OR: the output is 1 as long as at least one input is 1;

  3. Exclusive OR (XOR): the output is 1 when the two inputs differ and 0 when they are the same; unlike AND and OR, XOR cannot be expressed by a single neuron, which is why the next example needs a hidden layer.

  4. Logical XNOR (equivalence): the output is 1 when the inputs are the same and 0 when they differ; i.e. $\text{XNOR} = (x_1\ \text{AND}\ x_2)\ \text{OR}\ ((\text{NOT}\ x_1)\ \text{AND}\ (\text{NOT}\ x_2))$.
     First construct the neuron for the $(\text{NOT}\ x_1)\ \text{AND}\ (\text{NOT}\ x_2)$ part (weights $\theta_0 = 10,\ \theta_1 = -20,\ \theta_2 = -20$), then feed it, together with the AND neuron, into an OR neuron to implement the XNOR operator (see the sketch after this list):
     [Figure: Andrew Ng's slides showing the complete XNOR network and its truth table]
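To see the two-layer construction end to end, here is a hedged sketch that stacks the three units; the helper functions are mine, and the weight triples follow the values quoted above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neuron(theta, x1, x2):
    return sigmoid(theta[0] + theta[1] * x1 + theta[2] * x2)

AND = (-30, 20, 20)    # x1 AND x2
NOR = (10, -20, -20)   # (NOT x1) AND (NOT x2)
OR = (-10, 20, 20)     # x1 OR x2

def xnor(x1, x2):
    a1 = neuron(AND, x1, x2)   # hidden unit 1
    a2 = neuron(NOR, x1, x2)   # hidden unit 2
    return neuron(OR, a1, a2)  # the output unit combines the hidden layer

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(xnor(x1, x2)))  # 1 when x1 == x2, else 0
```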

   In this way we can gradually construct more and more complex functions and obtain more and more powerful features. This is the power of neural networks!

6. Multi-class classification in neural networks

   For example, suppose we train a neural network to recognize pedestrians (pedestrian), cars (car), motorcycles (motorcycle), and trucks (truck). The output layer should then have 4 values: the first value is 1 or 0 according to whether the image shows a pedestrian, the second value indicates whether it is a car, and so on.
   The input vector $x$ has three dimensions, there are two hidden layers, and the output layer has 4 neurons representing the 4 classes. For each example the output layer produces a vector $[a\ b\ c\ d]^T$ in which exactly one of $a, b, c, d$ is 1, indicating the predicted class. The structure of the neural network:

 [Figure: a network with a 3-dimensional input, two hidden layers, and 4 output units]
The output of the neural network is one of four possible one-hot vectors:
 $\begin{bmatrix}1\\0\\0\\0\end{bmatrix}, \begin{bmatrix}0\\1\\0\\0\end{bmatrix}, \begin{bmatrix}0\\0\\1\\0\end{bmatrix}, \begin{bmatrix}0\\0\\0\\1\end{bmatrix}$
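Reading the predicted class off the 4-unit output layer is an argmax over the activations. A small sketch with made-up values:

```python
import numpy as np

# hypothetical output-layer activations for one example
h = np.array([0.05, 0.88, 0.03, 0.02])  # [pedestrian, car, motorcycle, truck]
classes = ["pedestrian", "car", "motorcycle", "truck"]
print(classes[int(np.argmax(h))])       # -> "car"

# the matching one-hot training label y for a car would be:
y = np.array([0, 1, 0, 0])
```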

Copyright notice: this article was written by [Amyniez]. Please include a link to the original when reposting: https://yzsam.com/2022/173/202206220529590296.html