PyTorch learning notes -- Summary of common functions 3
2022-07-25 15:41:00 【whut_ L】
1--torch.optim.SGD() function in detail
import torch

LEARNING_RATE = 0.01   # learning rate for gradient descent
MOMENTUM = 0.9         # momentum factor
WEIGHT_DECAY = 0.0005  # weight decay coefficient

# Placeholder model for illustration; substitute your own network.
net = torch.nn.Linear(10, 2)

optimizer = torch.optim.SGD(
    net.parameters(),
    lr=LEARNING_RATE,
    momentum=MOMENTUM,
    weight_decay=WEIGHT_DECAY,
    nesterov=True,
)

Parameter interpretation: lr is the learning rate; momentum is the momentum factor; weight_decay is the weight decay coefficient (it applies L2 regularization); nesterov=True enables Nesterov momentum.
Standard gradient descent algorithm:
$$\theta \leftarrow \theta - l \, \nabla J(\theta)$$
where l is the learning rate, J(θ) is the loss function, and ∇ denotes the gradient.
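The plain gradient descent step can be sketched in a few lines of Python. The toy loss J(θ) = (θ - 3)², the starting point, and the function names are assumptions chosen for illustration; they are not part of the original notes.

```python
def grad_J(theta):
    """Gradient of the toy loss J(theta) = (theta - 3)**2 (assumed for illustration)."""
    return 2.0 * (theta - 3.0)


def gradient_descent(theta, lr, steps):
    """Plain gradient descent: theta <- theta - lr * grad_J(theta)."""
    for _ in range(steps):
        theta = theta - lr * grad_J(theta)
    return theta


theta_final = gradient_descent(theta=0.0, lr=0.1, steps=100)
print(theta_final)  # approaches the minimizer theta = 3
```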
Gradient descent with momentum:

m is the momentum factor and l is the learning rate.
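A minimal sketch of a momentum update follows. It uses the form that the torch.optim.SGD documentation gives for zero dampening: the velocity accumulates gradients, v ← m·v + ∇J(θ), and the parameter moves by θ ← θ - l·v. Other texts fold the learning rate into the velocity instead; for a constant learning rate the two forms are equivalent. The toy loss and all names are assumptions for illustration.

```python
def grad_J(theta):
    """Gradient of the toy loss J(theta) = (theta - 3)**2 (assumed for illustration)."""
    return 2.0 * (theta - 3.0)


def sgd_momentum(theta, lr, m, steps):
    """Gradient descent with momentum factor m and learning rate lr."""
    v = 0.0  # velocity, starts at zero
    for _ in range(steps):
        g = grad_J(theta)
        v = m * v + g            # accumulate past gradients
        theta = theta - lr * v   # step along the velocity
    return theta


theta_final = sgd_momentum(theta=0.0, lr=0.01, m=0.9, steps=500)
print(theta_final)  # approaches the minimizer theta = 3
```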
Gradient descent with Nesterov momentum:
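As a rough sketch, the Nesterov variant can be written in the form the torch.optim.SGD documentation describes for nesterov=True: the velocity is updated as v ← m·v + ∇J(θ), and the step then uses the "look-ahead" direction ∇J(θ) + m·v instead of v alone. Textbook presentations that evaluate the gradient at a shifted point are a reparametrization of the same idea. The toy loss and all names below are assumptions for illustration.

```python
def grad_J(theta):
    """Gradient of the toy loss J(theta) = (theta - 3)**2 (assumed for illustration)."""
    return 2.0 * (theta - 3.0)


def sgd_nesterov(theta, lr, m, steps):
    """Gradient descent with Nesterov momentum (momentum factor m, learning rate lr)."""
    v = 0.0  # velocity, starts at zero
    for _ in range(steps):
        g = grad_J(theta)
        v = m * v + g                      # same velocity update as plain momentum
        theta = theta - lr * (g + m * v)   # step along the look-ahead direction
    return theta


theta_final = sgd_nesterov(theta=0.0, lr=0.01, m=0.9, steps=500)
print(theta_final)  # approaches the minimizer theta = 3
```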

Gradient descent with weight_decay:
Its main effect is to add an L2 regularization term to the loss function. Reference link 1 is strongly recommended for understanding the role of L2 regularization, that is, how it helps avoid overfitting; see Reference link 2 for weight decay.
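In torch.optim.SGD, weight_decay adds weight_decay · θ to the gradient before the step, which is the gradient of the loss plus an L2 penalty (weight_decay / 2) · θ². The sketch below checks that equivalence numerically on a toy loss; the loss, the helper names, and the finite-difference check are assumptions for illustration.

```python
def J(theta):
    """Toy loss J(theta) = (theta - 3)**2 (assumed for illustration)."""
    return (theta - 3.0) ** 2


def grad_J(theta):
    """Analytic gradient of the toy loss."""
    return 2.0 * (theta - 3.0)


def step_with_weight_decay(theta, lr, weight_decay):
    """One step where weight_decay * theta is added to the gradient."""
    g = grad_J(theta) + weight_decay * theta
    return theta - lr * g


def numeric_grad(f, x, eps=1e-6):
    """Central-difference estimate of df/dx."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


def penalized_loss(theta, weight_decay=0.0005):
    """Loss with an explicit L2 penalty (weight_decay / 2) * theta**2."""
    return J(theta) + 0.5 * weight_decay * theta ** 2


theta = 1.0
via_weight_decay = step_with_weight_decay(theta, lr=0.01, weight_decay=0.0005)
via_l2_penalty = theta - 0.01 * numeric_grad(penalized_loss, theta)
print(via_weight_decay, via_l2_penalty)  # the two updates agree up to rounding
```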
2--torch.manual_seed() and torch.cuda.manual_seed() functions
torch.manual_seed(): sets the random seed for the CPU so that the random numbers generated in each experiment are the same, i.e. the initialization is identical (in recent PyTorch versions this call also seeds the CUDA generators);
torch.cuda.manual_seed(): sets the random seed for the current GPU; otherwise it behaves like torch.manual_seed();
torch.cuda.manual_seed_all(): sets the random seed for all GPUs.
In a neural network, parameters are initialized randomly by default. Different initial parameters often lead to different results, and when we get a good result we usually want to be able to reproduce it. In PyTorch, setting the random seed makes the initialization identical every time the code runs, so that the same algorithm or network program starts from the same state and its results can be repeated. Reference link 1 Reference link 2
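A small sketch of this effect: re-seeding before building a layer gives the same initial weights, while a different seed gives different ones. The helper name init_weights and the layer size are assumptions for illustration; the CUDA call is guarded so the example also runs on a CPU-only machine.

```python
import torch


def init_weights(seed):
    """Seed the generators, build a small layer, and return its initial weights."""
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)  # seed every visible GPU as well
    layer = torch.nn.Linear(4, 2)  # weights are initialized randomly
    return layer.weight.detach().clone()


w1 = init_weights(0)
w2 = init_weights(0)
w3 = init_weights(1)
print(torch.equal(w1, w2))  # same seed, same initialization
print(torch.equal(w1, w3))  # different seed, different initialization
```

Seeding fixes the initialization; some GPU operations can still be nondeterministic, so identical seeds are a necessary step toward repeatable runs rather than a full guarantee.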