1.23 Neural Network
2022-06-26 08:48:00 【Thick Cub with thorns】
Contents
- Chapter 8: Neural Networks: Representation
- 8.1 Nonlinear Hypotheses
- 8.2 Neurons and the Brain
- 8.3 Model Representation I
- 8.4 Model Representation II
- 8.5 Examples and Intuitions I (networks can implement AND and OR gates)
- 8.6 Examples and Intuitions II (more complex functions can be constructed)
- 8.7 Multiclass Classification
Chapter 8: Neural Networks: Representation
8.1 Nonlinear Hypotheses
Reference video: 8 - 1 - Non-linear Hypotheses (10 min).mkv
As we saw earlier, linear regression and logistic regression share a drawback: when there are many features, the computational load becomes very large.
Here is an example:
![](…/images/5316b24cd40908fb5cb1db5a055e4de5.png)
When we use multiple polynomial terms of $x_1$ and $x_2$ for prediction, we can fit the data well.
We have already seen that nonlinear polynomial terms help us build better classification models. Now suppose we have a large number of features, say more than 100 variables, and we want to use these 100 features to build a nonlinear polynomial model. The number of feature combinations is staggering: even if we keep only products of two features, $(x_1x_2 + x_1x_3 + x_1x_4 + \dots + x_2x_3 + x_2x_4 + \dots + x_{99}x_{100})$, we end up with close to 5000 combined features. That is far too many features for ordinary logistic regression to compute.
Suppose we want to train a model to recognize visual objects (for example, to tell whether a picture contains a car). How can we do this? One approach is to take many pictures of cars and many pictures of non-cars, and then use the values of the pixels in these images (saturation or brightness) as features.
If we use only grayscale images, each pixel has a single value (rather than RGB values). We could pick two pixels at two different positions in the picture and train a logistic regression algorithm that uses those two pixel values to judge whether the picture shows a car:
![](…/images/3ac5e06e852ad3deef4cba782ebe425b.jpg)
If we use small 50x50-pixel pictures and treat every pixel as a feature, we get 2500 features. If we then combine pairs of features into a polynomial model, we get about $2500^2/2$ (nearly 3 million) features. An ordinary logistic regression model cannot handle that many features effectively; this is where we need neural networks.
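To make the combinatorial explosion concrete, here is a minimal Python sketch (not from the original lecture) that counts the pairwise product terms for 100 features and for the 2500 pixels of a 50x50 grayscale image:

```python
from math import comb

# Number of pairwise product terms x_i * x_j (i < j) for n raw features.
for n in (100, 2500):
    pairs = comb(n, 2)  # n*(n-1)/2, close to n^2/2 for large n
    print(f"n = {n:>4}: {pairs:,} pairwise features")

# n =  100: 4,950 pairwise features
# n = 2500: 3,123,750 pairwise features (about 3 million)
```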
8.2 Neurons and the Brain
Reference video: 8 - 2 - Neurons and the Brain (8 min).mkv
Neural networks are a very old algorithm, originally invented with the goal of building machines that mimic the brain.
In this course I will introduce you to neural networks, because they can solve many different machine learning problems.
Neural networks gradually emerged in the 1980s and 1990s and were widely used, but for various reasons their use declined in the late 1990s. Recently, however, neural networks have made a comeback. One reason is that they are a computationally expensive algorithm, and only the faster computers of recent years are really able to run large-scale neural networks. For this reason, and other technical factors we will discuss later, today's neural networks are the state-of-the-art technique for many applications. When we try to simulate the brain, the goal is to build machines that work the same way the human brain does. The brain can learn to process images by seeing rather than hearing, and it can learn to handle our sense of touch.
![](…/images/7912ea75bc7982998870721cb1177226.jpg)
This small red area of the brain is your auditory cortex. As you understand my words right now, you are relying on your ears: the ear receives sound signals and passes them to your auditory cortex, and that is why you can understand what I say.
Here are a few more examples:
![](…/images/2b74c1eeff95db47f5ebd8aef1290f09.jpg)
This picture shows an example of learning to "see" with the tongue. The idea is as follows: this is a system called BrainPort, currently in clinical trials with the FDA (the US Food and Drug Administration), which can help blind people see. You wear a grayscale camera on your forehead, facing forward, which captures a low-resolution grayscale image of whatever is in front of you. A wire runs to an electrode array mounted on your tongue, and each pixel is mapped to a location on the tongue, with (perhaps) a high voltage corresponding to a dark pixel and a low voltage to a bright pixel. Even with its current capabilities, this system lets people learn to "see" with their tongues within tens of minutes.
![](…/images/95c020b2227ca4b9a9bcbd40099d1766.png)
This is the second example, about human echolocation, or human sonar. There are two ways to do it: you can snap your fingers or click your tongue. Some blind people really have received such training in school and have learned to interpret the patterns of sound waves bouncing back from their environment: this is sonar. If you search on YouTube, you will find videos about an amazing child who had his eyeballs removed because of cancer. Although he lost his eyes, by snapping his fingers he can walk around without bumping into anything, he can skate, and he can even throw a basketball into the hoop. Remember, this is a child with no eyes.
![](…/images/697ae58b1370e81749f9feb333bdf842.png)
The third example is a haptic belt. If you wear it around your waist, a buzzer sounds whenever you face north. It gives a person a sense of direction, somewhat like the way birds perceive direction.
There are also some stranger examples:
![](…/images/1ee5c76a62b35384491c603bb54c8c0c.png)
If you implant a third eye into a frog, the frog can learn to use that eye too. This is quite surprising: you can plug almost any sensor into the brain, and the brain's learning algorithm will find a way to learn from the data and process it. In a sense, if we could figure out the brain's learning algorithm and then run that algorithm, or something similar, on a computer, it might be our best attempt at moving toward artificial intelligence. The dream of artificial intelligence is that one day we can build truly intelligent machines.
Neural networks may open a window onto that distant dream of artificial intelligence, but the reason I teach neural networks in this class is mainly that they are the most effective technique for modern machine learning applications. So in the next few lessons we will begin to dig into the technical details of neural networks.
8.3 Model Representation I
Reference video: 8 - 3 - Model Representation I (12 min).mkv
To build a neural network model, we first need to think about the neurons in the brain. Each neuron can be viewed as a processing unit/nucleus (processing unit/Nucleus); it has many inputs/dendrites (input/Dendrite) and one output/axon (output/Axon). A neural network is a network of large numbers of neurons interconnected and communicating through electrical pulses.
![](…/images/3d93e8c1cd681c2b3599f05739e3f3cc.jpg)
Here is a schematic diagram of a group of neurons. Neurons communicate using weak electrical currents, also called action potentials. When a neuron wants to send a message, it sends a weak current along its axon to other neurons; that is what the axon is for.
The neural network model is built from many neurons, each of which is a learning model. These neurons (also called activation units, activation unit) take some features as input and produce an output according to their own model. The figure below shows an example of a neuron that uses a logistic regression model as its learning model. In a neural network, the parameters can also be called weights (weight).
![](…/images/c2233cd74605a9f8fe69fd59547d3853.jpg)
We design a neural network modeled on such neurons; it looks like this:
![](…/images/fbb4ffb48b64468c384647d45f7b86b5.png)
Here $x_1$, $x_2$, $x_3$ are the input units (input units), into which we feed the raw data.
$a_1$, $a_2$, $a_3$ are the intermediate units; they process the data and pass it on to the next layer.
Finally there is the output unit, which is responsible for computing $h_\theta(x)$.
A neural network model is a network of many logistic units organized into layers, where the output variables of each layer are the input variables of the next layer. The figure below shows a 3-layer neural network. The first layer is called the input layer (Input Layer), the last layer is called the output layer (Output Layer), and the layers in between are called hidden layers (Hidden Layers). We add a bias unit (bias unit) to each layer:
![](…/images/8293711e1d23414d0a03f6878f5a2d91.jpg)
Here is some notation to help describe the model:
$a_i^{(j)}$ denotes the $i$-th activation unit in layer $j$. $\Theta^{(j)}$ is the matrix of weights mapping from layer $j$ to layer $j+1$; for example, $\Theta^{(1)}$ is the matrix of weights mapping from the first layer to the second layer. Its size: the number of rows is the number of activation units in layer $j+1$, and the number of columns is the number of activation units in layer $j$ plus one. For example, in the neural network shown above, $\Theta^{(1)}$ has size 3×4.
For the model shown above, the activation units and the output are expressed as:
$a_1^{(2)} = g(\Theta_{10}^{(1)}x_0 + \Theta_{11}^{(1)}x_1 + \Theta_{12}^{(1)}x_2 + \Theta_{13}^{(1)}x_3)$
$a_2^{(2)} = g(\Theta_{20}^{(1)}x_0 + \Theta_{21}^{(1)}x_1 + \Theta_{22}^{(1)}x_2 + \Theta_{23}^{(1)}x_3)$
$a_3^{(2)} = g(\Theta_{30}^{(1)}x_0 + \Theta_{31}^{(1)}x_1 + \Theta_{32}^{(1)}x_2 + \Theta_{33}^{(1)}x_3)$
$h_\Theta(x) = g(\Theta_{10}^{(2)}a_0^{(2)} + \Theta_{11}^{(2)}a_1^{(2)} + \Theta_{12}^{(2)}a_2^{(2)} + \Theta_{13}^{(2)}a_3^{(2)})$
The discussion above fed only one row of the feature matrix (one training example) to the neural network; we need to feed the whole training set to the neural network algorithm to learn the model.
Note that each $a$ is determined by all the $x$ values of the previous layer together with the parameters associated with each $x$.
(We call this left-to-right algorithm forward propagation (FORWARD PROPAGATION).)
If we write $x$, $\theta$, and $a$ each as a matrix:
![](…/images/20171101224053.png)
then we can write $\theta \cdot X = a$.
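To make the notation concrete, here is a minimal NumPy sketch (not from the course) of forward propagation through the 3-input, 3-hidden-unit, 1-output network above; the weight values are random placeholders:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Theta1 maps layer 1 -> layer 2: shape (3, 4) = (hidden units, inputs + bias).
# Theta2 maps layer 2 -> layer 3: shape (1, 4) = (outputs, hidden units + bias).
rng = np.random.default_rng(0)
Theta1 = rng.normal(scale=0.1, size=(3, 4))
Theta2 = rng.normal(scale=0.1, size=(1, 4))

x = np.array([0.5, -1.2, 3.0])      # one training example (x1, x2, x3)
a1 = np.insert(x, 0, 1.0)           # prepend the bias unit x0 = 1
a2 = sigmoid(Theta1 @ a1)           # activations of the hidden layer
a2 = np.insert(a2, 0, 1.0)          # prepend the bias unit a0^(2) = 1
h = sigmoid(Theta2 @ a2)            # h_theta(x), a value in (0, 1)
print(h)
```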
8.4 Model Representation II
Reference video: 8 - 4 - Model Representation II (12 min).mkv
(FORWARD PROPAGATION)
Compared with coding with loops, vectorization makes the computation simpler. Taking the neural network above as an example, let's compute the values of the second layer:
![](…/images/303ce7ad54d957fca9dbb6a992155111.png)
![](…/images/2e17f58ce9a79525089a1c2e0b4c0ccc.png)
We let $z^{(2)} = \Theta^{(1)}x$, so that $a^{(2)} = g(z^{(2)})$; after computing it, we add $a_0^{(2)} = 1$. The computed output value is:
![](…/images/43f1cb8a2a7e9a18f928720adc1fac22.png)
We let $z^{(3)} = \Theta^{(2)}a^{(2)}$, so that $h_\theta(x) = a^{(3)} = g(z^{(3)})$.
This computes only one training example from the training set. If we want to compute the whole training set at once, we need to transpose the training-set feature matrix so that the features of each example occupy one column. That is:
$z^{(2)} = \Theta^{(1)} \times X^T$
$a^{(2)} = g(z^{(2)})$
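Here is a hedged NumPy sketch of the same computation vectorized over a whole training set, following the convention above of transposing $X$ so that each example occupies one column (weights again random placeholders):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

m = 5                                      # number of training examples
rng = np.random.default_rng(0)
X = rng.normal(size=(m, 3))                # design matrix, one example per row
Theta1 = rng.normal(scale=0.1, size=(3, 4))
Theta2 = rng.normal(scale=0.1, size=(1, 4))

A1 = np.vstack([np.ones(m), X.T])          # (4, m): bias row on top of X^T
Z2 = Theta1 @ A1                           # z^(2) = Theta^(1) X^T, shape (3, m)
A2 = np.vstack([np.ones(m), sigmoid(Z2)])  # add a_0^(2) = 1 for every column
H = sigmoid(Theta2 @ A2)                   # (1, m): one prediction per example
print(H.shape)                             # (1, 5)
```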
To better understand how neural networks work, let's first cover up the left half:
![](…/images/6167ad04e696c400cb9e1b7dc1e58d8a.png)
The right half is simply logistic regression (Logistic Regression) taking $a_0, a_1, a_2, a_3$ as inputs and producing the output $h_\theta(x)$:
![](…/images/10342b472803c339a9e3bc339188c5b8.png)
In fact, a neural network is just like logistic regression, except that the input vector $[x_1 \sim x_3]$ of logistic regression has become the middle layer $[a_1^{(2)} \sim a_3^{(2)}]$, that is: $h_\theta(x) = g(\Theta_0^{(2)}a_0^{(2)} + \Theta_1^{(2)}a_1^{(2)} + \Theta_2^{(2)}a_2^{(2)} + \Theta_3^{(2)}a_3^{(2)})$
We can regard $a_0, a_1, a_2, a_3$ as higher-level feature values, i.e. evolved versions of $x_0, x_1, x_2, x_3$. Because they are determined by $x$ and $\theta$, and the parameters are learned by gradient descent, the $a$ values keep changing and become more and more powerful. These higher-level features can therefore be far more expressive than simply taking powers of $x$, and they predict new data better.
This is the advantage of neural networks over logistic regression and linear regression.
8.5 Examples and Intuitions I (networks can implement AND and OR gates)
Reference video: 8 - 5 - Examples and Intuitions I (7 min).mkv
Essentially, a neural network can learn its own series of features. In ordinary logistic regression, we are limited to the original features in the data, $x_1, x_2, \dots, x_n$; although we can combine them with polynomial terms, we are still constrained by those original features. In a neural network, the original features are only the input layer. In our three-layer example above, the third layer (the output layer) uses the features of the second layer, not the original features of the input layer. We can think of the second-layer features as a new set of features the network has learned in order to predict the output variable.
In a neural network, single-layer neurons (with no hidden layer) can be used to represent logical operations, such as logical AND (AND) and logical OR (OR).
An example: logical AND (AND). In the figure below, the left half is the expression computed by the network's output layer, the upper right is the sigmoid function, and the lower right is the truth table.
We can use such a neural network to represent the AND function:
![](…/images/809187c1815e1ec67184699076de51f2.png)
where $\theta_0 = -30$, $\theta_1 = 20$, $\theta_2 = 20$.
Our output function $h_\theta(x)$ is: $h_\Theta(x) = g(-30 + 20x_1 + 20x_2)$
We know the graph of $g(x)$ looks like this:
![](…/images/6d652f125654d077480aadc578ae0164.png)
![](…/images/f75115da9090701516aa1ff0295436dd.png)
So we have $h_\Theta(x) \approx x_1\ \text{AND}\ x_2$: our $h_\Theta(x)$ is the AND function.
Next, let's look at another network, the OR function:
![](…/images/aa27671f7a3a16545a28f356a2fb98c0.png)
OR is the same as AND overall; the only difference is the weight values.
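As a check, here is a small Python sketch (not from the lecture) that evaluates both truth tables, using the AND weights $(-30, 20, 20)$ from above and the OR weights $(-10, 20, 20)$ shown in the next section's figure:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron(theta0, theta1, theta2, x1, x2):
    return sigmoid(theta0 + theta1 * x1 + theta2 * x2)

for x1 in (0, 1):
    for x2 in (0, 1):
        h_and = neuron(-30, 20, 20, x1, x2)   # AND weights
        h_or = neuron(-10, 20, 20, x1, x2)    # OR weights
        print(f"x1={x1} x2={x2}  AND~{h_and:.4f}  OR~{h_or:.4f}")

# AND is ~1 only when x1 = x2 = 1; OR is ~1 whenever either input is 1.
```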
8.6 Examples and Intuitions II (more complex functions can be constructed)
Reference video: 8 - 6 - Examples and Intuitions II (10 min).mkv
Binary logical operators (BINARY LOGICAL OPERATORS): when the input features are Boolean (0 or 1), we can use a single activation unit as a binary logical operator; to represent different operators, we just choose different weights.
The neuron in the figure below (with the three weights -30, 20, 20) can be regarded as acting as a logical AND (AND):
![](…/images/57480b04956f1dc54ecfc64d68a6b357.png)
The neuron in the figure below (with the three weights -10, 20, 20) can be regarded as acting as a logical OR (OR):
![](…/images/7527e61b1612dcf84dadbcf7a26a22fb.png)
The neuron in the figure below (with the two weights 10, -20) can be regarded as acting as a logical NOT (NOT):
![](…/images/1fd3017dfa554642a5e1805d6d2b1fa6.png)
We can combine neurons into more complex neural networks to implement more complex operations. For example, suppose we want to implement the XNOR function (the two inputs must be equal: both 1 or both 0), i.e. $\text{XNOR} = (x_1\ \text{AND}\ x_2)\ \text{OR}\ ((\text{NOT}\ x_1)\ \text{AND}\ (\text{NOT}\ x_2))$
First, we construct a neuron that expresses the $(\text{NOT}\ x_1)\ \text{AND}\ (\text{NOT}\ x_2)$ part:
![](…/images/4c44e69a12b48efdff2fe92a0a698768.png)
Then we combine the neuron representing AND, the neuron representing $(\text{NOT}\ x_1)\ \text{AND}\ (\text{NOT}\ x_2)$, and the neuron representing OR:
![](…/images/432c906875baca78031bd337fe0c8682.png)
We now have a neural network that implements the $\text{XNOR}$ operator.
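Here is a short Python sketch (not from the lecture) of this two-layer XNOR construction, chaining the AND, $(\text{NOT}\ x_1)\ \text{AND}\ (\text{NOT}\ x_2)$, and OR neurons; the weights $(10, -20, -20)$ assumed for the NOT-AND neuron extend the single-input NOT weights above and are stated here as an assumption, since the figure itself is not visible:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def xnor(x1, x2):
    a1 = sigmoid(-30 + 20 * x1 + 20 * x2)    # x1 AND x2
    a2 = sigmoid(10 - 20 * x1 - 20 * x2)     # (NOT x1) AND (NOT x2), assumed weights
    return sigmoid(-10 + 20 * a1 + 20 * a2)  # a1 OR a2

for x1 in (0, 1):
    for x2 in (0, 1):
        print(f"x1={x1} x2={x2}  XNOR~{xnor(x1, x2):.4f}")
# ~1 when the inputs agree, ~0 when they differ.
```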
In this way we can gradually construct more and more complex functions and obtain more and more powerful feature values.
This is the power of neural networks.
8.7 Multiclass Classification
Reference video: 8 - 7 - Multiclass Classification (4 min).mkv
What do we do when there are more than two categories to classify, i.e. $y = 1, 2, 3, \dots$? For example, suppose we want to train a neural network algorithm to recognize pedestrians, cars, motorcycles, and trucks. Then the output layer should have 4 values: for instance, the first value is 1 or 0 according to whether the image shows a pedestrian, the second value indicates whether it is a car, and so on.
The input vector $x$ has three dimensions, there are two hidden layers, and the output layer has 4 neurons representing the 4 classes. For every example, the output layer produces a vector $[a\ b\ c\ d]^T$ in which exactly one of $a, b, c, d$ is 1, indicating the predicted class. Below is one possible structure for such a neural network:
![](…/images/f3236b14640fa053e62c73177b3474ed.jpg)
![](…/images/685180bf1774f7edd2b0856a8aae3498.png)
The output of the neural network algorithm is one of four possible cases:
![](…/images/5e1a39d165f272b7f145c68ef78a3e13.png)
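As a rough sketch of this one-of-four output in code: the hidden-layer sizes below are invented for illustration (the course figure does not specify them), and the weights are random placeholders, so the prediction is meaningless until the network is trained:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def add_bias(a):
    return np.insert(a, 0, 1.0)

# Placeholder weights for a 3 -> 5 -> 5 -> 4 network (two hidden layers).
rng = np.random.default_rng(0)
Theta1 = rng.normal(scale=0.1, size=(5, 4))   # layer 1 -> 2
Theta2 = rng.normal(scale=0.1, size=(5, 6))   # layer 2 -> 3
Theta3 = rng.normal(scale=0.1, size=(4, 6))   # layer 3 -> output

x = np.array([0.2, -0.7, 1.5])                # one 3-dimensional input
a2 = sigmoid(Theta1 @ add_bias(x))
a3 = sigmoid(Theta2 @ add_bias(a2))
h = sigmoid(Theta3 @ add_bias(a3))            # 4 outputs, one per class

labels = ["pedestrian", "car", "motorcycle", "truck"]
print(h)
print("predicted:", labels[int(np.argmax(h))])
```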