Learning rate optimization strategy
2022-07-24 05:57:00 【Didi'cv】
Warmup Strategy
In deep learning, the model's initial weights are randomly generated when training starts. With a Warmup strategy, the model trains at a smaller learning rate at the beginning. After a set number of iterations, once the model has become more stable, training switches to the preset learning rate. Warming up the learning rate in this way can keep the model from oscillating, speed up the convergence of the network, and improve the results.
The experiments use Gradual Warmup, one variant of the Warmup strategy. During the warm-up phase, the learning rate rises gradually as the iteration count increases, reaching the preset value at the end of the phase, and the rest of training then proceeds from there. This avoids a sudden jump in the learning rate and the sharp rise in training error that would come with it.
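As a minimal sketch of Gradual Warmup, the function below raises the learning rate linearly from a small starting value to the preset value. The linear shape, the function name `gradual_warmup_lr`, and the parameters `warmup_steps` and `start_lr` are assumptions made for illustration; the original post does not state which ramp was used.

```python
def gradual_warmup_lr(step, base_lr, warmup_steps, start_lr=0.0):
    """Return the learning rate for a given iteration (hypothetical helper).

    Assumes a linear ramp from start_lr to base_lr over warmup_steps
    iterations; after the warm-up phase the preset base_lr is returned.
    """
    if warmup_steps <= 0 or step >= warmup_steps:
        # Warm-up finished (or disabled): use the preset learning rate.
        return base_lr
    # Fraction of the warm-up phase completed so far, in [0, 1).
    progress = step / warmup_steps
    return start_lr + (base_lr - start_lr) * progress


if __name__ == "__main__":
    # Print the schedule for a 5-step warm-up toward a preset rate of 0.1.
    for step in range(8):
        print(step, gradual_warmup_lr(step, base_lr=0.1, warmup_steps=5))
```

The value returned here would be assigned to the optimizer's learning rate before each update step.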


Poly Strategy
The learning rate is a hyperparameter that strongly affects how the model's weights are updated. The model can only be optimized well when the initial learning rate is set reasonably: too small a value leads to slow convergence, and too large a value leads to instability or failure to converge. The learning rate also needs to change as training progresses, so the schedule it follows is very important. Deep learning offers many such schedules, for example the Fixed, Poly, and Sigmoid strategies. The experiments here add the Poly learning rate decay strategy to SGD optimization to set the current learning rate.
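The formula used in the experiments did not survive in this copy of the post. As a sketch only, the commonly used form of Poly decay is lr = base_lr * (1 - iter / max_iter) ^ power, which the function below implements. The name `poly_lr`, the parameter names, and the default `power=0.9` are assumptions for illustration, not values confirmed by the source.

```python
def poly_lr(step, base_lr, max_steps, power=0.9):
    """Polynomial ("Poly") learning rate decay (hypothetical helper).

    Assumes the common form base_lr * (1 - step / max_steps) ** power.
    The power value is an assumed default, not taken from the post.
    """
    # Clamp so the rate stays at 0 once max_steps has been reached.
    step = min(step, max_steps)
    return base_lr * (1 - step / max_steps) ** power


if __name__ == "__main__":
    # Print the decayed rate at a few points of a 100-iteration run.
    for step in (0, 25, 50, 75, 100):
        print(step, poly_lr(step, base_lr=0.01, max_steps=100))
```

With `power=1.0` the decay is linear; values below 1 keep the learning rate higher for longer before it falls to zero at `max_steps`.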
