Day 5 of DL
2022-07-16 07:51:00 【The sun is falling】
How do you understand backpropagation? Explain how it works.
This question is designed to test your knowledge of how neural networks work. You need to make the following points clear:
- The forward pass (forward computation) sends the input through each layer, using that layer's weights, to produce the prediction yp. The loss function L(yp, yt) measures the difference between the model output yp and the true label yt, and its value shows how good the model is. If the loss is too high, we need a way to reduce it. In essence, training a neural network means minimizing the loss function.
- To reduce the loss, we need derivatives. Backpropagation computes the gradient of the loss with respect to the weights of each layer. Using these gradients, an optimizer (Adam, SGD, AdaDelta, ...) applies a gradient-descent-style update to the weights of the network.
- Backpropagation uses the chain rule to compute the gradients layer by layer, from the last layer back to the first.
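The three points above can be sketched on a tiny network. This is a minimal illustration, not a reference implementation: the two-weight network yp = w2 * tanh(w1 * x), the squared-error loss, and the names `forward`, `backward`, and `train` are all assumptions chosen to keep the chain rule visible.

```python
import math

def forward(w1, w2, x):
    """Forward pass: compute the hidden activation h and the prediction yp."""
    h = math.tanh(w1 * x)
    yp = w2 * h
    return h, yp

def loss_fn(yp, yt):
    """Squared-error loss L(yp, yt) (an assumed choice for this sketch)."""
    return 0.5 * (yp - yt) ** 2

def backward(w1, w2, x, yt):
    """Backward pass: apply the chain rule from the loss back to each weight."""
    h, yp = forward(w1, w2, x)
    dL_dyp = yp - yt                     # derivative of the loss w.r.t. the output
    dL_dw2 = dL_dyp * h                  # last layer first
    dL_dh = dL_dyp * w2                  # propagate back to the hidden activation
    dL_dw1 = dL_dh * (1 - h ** 2) * x    # tanh'(z) = 1 - tanh(z)**2
    return dL_dw1, dL_dw2

def train(w1, w2, x, yt, lr=0.1, steps=200):
    """Plain gradient descent: step against the gradient, record the loss."""
    losses = []
    for _ in range(steps):
        _, yp = forward(w1, w2, x)
        losses.append(loss_fn(yp, yt))
        g1, g2 = backward(w1, w2, x, yt)
        w1 -= lr * g1
        w2 -= lr * g2
    return w1, w2, losses

w1, w2, losses = train(0.5, 0.3, x=1.5, yt=0.8)
print(f"first loss = {losses[0]:.4f}, last loss = {losses[-1]:.6f}")
```

A real framework derives the backward pass automatically and uses an optimizer such as Adam or SGD in place of the hand-written update, but the order of operations is the same: forward pass, loss, gradients from the last layer to the first, then the weight update.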
What happens when the learning rate is too high or too low?
When the learning rate is set too low, training is very slow because each update changes the weights only slightly. Many updates are needed before the model reaches a local optimum.
If the learning rate is set too high, the weight updates are too large and the model may fail to converge. A single update step can overshoot a local optimum, after which the model keeps jumping around that point instead of settling into it.
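Both failure modes show up even on a one-dimensional problem. The sketch below is only an illustration under assumed values: the objective f(w) = w**2, the starting point 5.0, the 50 steps, and the three learning rates are not from the original text.

```python
def gradient_descent(lr, w0=5.0, steps=50):
    """Minimize f(w) = w**2 (gradient 2*w) and return the weight trajectory."""
    w = w0
    history = [w]
    for _ in range(steps):
        w -= lr * 2 * w   # gradient descent update
        history.append(w)
    return history

# Too low, reasonable, and too high (assumed values for this toy problem).
for lr in (0.001, 0.1, 1.1):
    final_w = gradient_descent(lr)[-1]
    print(f"lr={lr}: final w = {final_w:.4g}")
```

With the smallest rate the weight has barely moved from its starting point after 50 steps; with the largest, every step overshoots the minimum at w = 0 by more than it corrects, so the weight grows in magnitude instead of converging.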
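When the image size is doubled, by how many times does the number of CNN parameters change? Why?
The number of parameters in a CNN's convolutional layers depends on the number and size of the filters, not on the size of the input image. Therefore, doubling the image size does not change the number of convolutional parameters. (The exception is a fully connected layer fed by a flattened feature map: its input size, and so its parameter count, grows with the image.)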
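The arithmetic behind this can be checked directly. The helpers below are hypothetical, written only for this illustration, and the layer sizes (3 input channels, 16 filters of 3x3, no padding, stride 1, 10 dense outputs) are assumed values.

```python
def conv2d_param_count(in_channels, out_channels, kernel_h, kernel_w, bias=True):
    """Parameters of one conv layer: one kernel per (in, out) channel pair, plus biases."""
    weights = kernel_h * kernel_w * in_channels * out_channels
    biases = out_channels if bias else 0
    return weights + biases

def conv2d_output_size(height, width, kernel_h, kernel_w, stride=1, padding=0):
    """Spatial size of the feature map; this is what changes with the image size."""
    out_h = (height + 2 * padding - kernel_h) // stride + 1
    out_w = (width + 2 * padding - kernel_w) // stride + 1
    return out_h, out_w

def dense_param_count(in_features, out_features):
    """Parameters of a fully connected layer, for comparison."""
    return in_features * out_features + out_features

for size in (32, 64):
    out_h, out_w = conv2d_output_size(size, size, 3, 3)
    conv_params = conv2d_param_count(3, 16, 3, 3)          # no image-size argument
    dense_params = dense_param_count(16 * out_h * out_w, 10)
    print(f"{size}x{size} image: conv params = {conv_params}, "
          f"feature map = {out_h}x{out_w}, dense-after-flatten params = {dense_params}")
```

The convolutional count has no image-size argument at all, which is the point of the answer above. The dense layer is included only to show where the input size does matter: it sees the flattened feature map, so its count changes when the image does.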
Explain the trade-off between bias and variance
What is bias? Bias is the difference between the model's average prediction and the actual value we are trying to predict. A model with high bias pays too little attention to the training data. This makes the model too simple, so it achieves good accuracy on neither the training set nor the test set. This phenomenon is called underfitting.
Variance can be understood simply as the spread of the model's outputs for a given data point. The higher the variance, the more closely the model follows the training data instead of generalizing to data it has never seen. Such a model achieves very good results on the training set but very poor results on the test set. This phenomenon is called overfitting.
The relationship between these two concepts can be seen in the figure below :
In the diagram above, the center of the circle is a model that perfectly predicts the exact values. In practice, you never find a model that good. The further we move from the center of the circle, the worse our predictions get.
We can change the model so that as many of its predictions as possible land in the center of the circle. This requires balancing bias against variance. If the model is too simple, with few parameters, it may have high bias and low variance.
On the other hand, if the model has many parameters, it will tend to have high variance and low bias. This trade-off is the basis for choosing the model's complexity when designing an algorithm.
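The trade-off can also be estimated numerically by fitting models on many resampled training sets. The sketch below is an illustration under assumptions, not a method from the original text: the sine target, the noise level, and the two stand-in models (a constant predictor as the "too simple" model and a 1-nearest-neighbor predictor as the very flexible one) were chosen only because they need no external libraries.

```python
import math
import random

def true_fn(x):
    """Assumed ground-truth function that the models try to learn."""
    return math.sin(2 * math.pi * x)

def sample_training_set(rng, n=20, noise=0.3):
    """Draw n noisy observations of true_fn on [0, 1)."""
    xs = [rng.random() for _ in range(n)]
    ys = [true_fn(x) + rng.gauss(0.0, noise) for x in xs]
    return xs, ys

def fit_constant(xs, ys):
    """Very simple model: always predict the mean training label."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_nearest_neighbor(xs, ys):
    """Very flexible model: copy the label of the closest training point."""
    def predict(x):
        i = min(range(len(xs)), key=lambda j: abs(xs[j] - x))
        return ys[i]
    return predict

def bias_variance(fit, trials=200, seed=0):
    """Estimate squared bias and variance, averaged over a grid of test points."""
    rng = random.Random(seed)
    test_xs = [i / 49 for i in range(50)]
    preds = [[] for _ in test_xs]
    for _ in range(trials):
        model = fit(*sample_training_set(rng))
        for k, x in enumerate(test_xs):
            preds[k].append(model(x))
    bias_sq = 0.0
    variance = 0.0
    for x, p in zip(test_xs, preds):
        mean_p = sum(p) / len(p)
        bias_sq += (mean_p - true_fn(x)) ** 2                    # average prediction vs. truth
        variance += sum((v - mean_p) ** 2 for v in p) / len(p)   # spread across training sets
    return bias_sq / len(test_xs), variance / len(test_xs)

simple_bias, simple_var = bias_variance(fit_constant)
flexible_bias, flexible_var = bias_variance(fit_nearest_neighbor)
print(f"constant model:   bias^2 = {simple_bias:.3f}, variance = {simple_var:.3f}")
print(f"nearest neighbor: bias^2 = {flexible_bias:.3f}, variance = {flexible_var:.3f}")
```

The constant model should show the larger squared bias and the nearest-neighbor model the larger variance, which matches the underfitting and overfitting descriptions above.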