PyTorch Learning Notes -- Summary of Common Functions 3
2022-07-25 15:41:00 【whut_L】
1--torch.optim.SGD() function extensions
import torch

LEARNING_RATE = 0.01    # learning rate for gradient descent
MOMENTUM = 0.9          # momentum (impulse) factor
WEIGHT_DECAY = 0.0005   # weight-decay coefficient

net = torch.nn.Linear(10, 2)  # example model; the original `net` is assumed to be defined elsewhere

optimizer = torch.optim.SGD(
    net.parameters(),
    lr=LEARNING_RATE,
    momentum=MOMENTUM,
    weight_decay=WEIGHT_DECAY,
    nesterov=True
)

Parameter interpretation: lr is the learning rate; momentum is the momentum (impulse) factor; weight_decay is the weight-decay coefficient (it applies L2 regularization); nesterov enables Nesterov momentum.
Conventional gradient descent:

θ_{t+1} = θ_t − l · ∇J(θ_t)

where l is the learning rate, J(θ) the loss function, and ∇ the gradient.
Gradient descent with momentum:

v_t = m · v_{t−1} + ∇J(θ_t)
θ_{t+1} = θ_t − l · v_t

where m is the momentum (impulse) factor and l is the learning rate.
Gradient descent with Nesterov momentum (in the variant PyTorch implements, the momentum-corrected gradient is used for the step):

v_t = m · v_{t−1} + ∇J(θ_t)
θ_{t+1} = θ_t − l · (∇J(θ_t) + m · v_t)
Gradient descent with weight_decay:

θ_{t+1} = θ_t − l · (∇J(θ_t) + λ · θ_t)

Its main effect is to add L2 regularization to the loss function (equivalent to adding (λ/2)·‖θ‖² to the loss). It is strongly recommended to see Reference link 1 to understand the role of L2 regularization, i.e., how it helps avoid overfitting, and Reference link 2 to understand weight decay.
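The update rules above can be sketched in a few lines of pure Python. This is an illustrative re-implementation on a scalar parameter, following the order PyTorch's SGD uses (weight decay folded into the gradient, then the momentum buffer, then the optional Nesterov correction); it is not the library's actual code:

```python
def sgd_step(theta, grad, v, lr=0.01, momentum=0.9,
             weight_decay=0.0005, nesterov=True):
    """One manual SGD update on a scalar parameter (illustrative sketch)."""
    g = grad + weight_decay * theta             # weight decay: g <- g + lambda*theta (L2)
    v = momentum * v + g                        # momentum buffer: v_t = m*v_{t-1} + g
    step = g + momentum * v if nesterov else v  # Nesterov: step with g + m*v_t
    return theta - lr * step, v

theta, v = 1.0, 0.0                    # initial parameter and momentum buffer
theta, v = sgd_step(theta, grad=0.5, v=v)  # one step with gradient 0.5
```

To check such a sketch, one can run a single step of `torch.optim.SGD` on a one-element tensor with the same hyperparameters and compare the resulting value.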
2--torch.manual_seed() and torch.cuda.manual_seed() functions
torch.manual_seed(): sets the seed for the CPU, ensuring the random numbers generated in each run are fixed, i.e., initialization is identical;
torch.cuda.manual_seed(): sets the seed for the current GPU; its effect is the same as torch.manual_seed();
torch.cuda.manual_seed_all(): sets the seed for all GPUs.
In a neural network, parameters are randomly initialized by default. Different initial parameters often lead to different results, and when we get a good result we usually want it to be reproducible. In PyTorch, setting the random seed guarantees that initialization is the same on every run of the code, so that the same algorithm or neural-network program produces the same result. Reference link 1 Reference link 2
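A minimal check that the seed fixes the CPU random stream (assumes PyTorch is installed; the seed value 42 is arbitrary):

```python
import torch

torch.manual_seed(42)   # fix the CPU random number generator
a = torch.randn(3)

torch.manual_seed(42)   # resetting the same seed replays the same stream
b = torch.randn(3)

print(torch.equal(a, b))  # True: the two draws are identical
```

The same pattern applies to `torch.cuda.manual_seed()` when sampling on a GPU device.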