YOLOv5 Hyperparameter Settings and Data Augmentation Analysis
1、Introduction to the YOLOv5 hyperparameter configuration files
YOLOv5 has about 30 hyperparameters used for various training settings. They are defined in yaml files under the ./data directory. A better initial guess produces a better final result, so it is important to initialize these values properly before evolution. If in doubt, simply use the default values, which are optimized for YOLOv5 COCO training from scratch.
The YOLOv5 hyperparameter files are data/hyp.finetune.yaml (for the VOC dataset) and hyp.scratch.yaml (for the COCO dataset).
1、yolov5/data/hyps/hyp.scratch-low.yaml (YOLOv5 COCO training from scratch, low augmentation)
# Hyperparameters for low-augmentation COCO training from scratch
# python train.py --batch 64 --cfg yolov5n6.yaml --weights '' --data coco.yaml --img 640 --epochs 300 --linear
# See tutorials for hyperparameter evolution https://github.com/ultralytics/yolov5#tutorials
lr0: 0.01 # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.01 # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937 # SGD momentum/Adam beta1
weight_decay: 0.0005 # optimizer weight decay 5e-4
warmup_epochs: 3.0 # warmup epochs (fractions ok)
warmup_momentum: 0.8 # warmup initial momentum
warmup_bias_lr: 0.1 # warmup initial bias learning rate
box: 0.05 # box loss gain
cls: 0.5 # cls loss gain
cls_pw: 1.0 # cls BCELoss positive_weight
obj: 1.0 # obj loss gain (scale with pixels)
obj_pw: 1.0 # obj BCELoss positive_weight
iou_t: 0.20 # IoU training threshold
anchor_t: 4.0 # anchor-multiple threshold
# anchors: 3 # anchors per output layer (0 to ignore)
fl_gamma: 0.0 # focal loss gamma (efficientDet default gamma=1.5)
# Color augmentation: hue (Hue), saturation (Saturation), value/brightness (Value)
hsv_h: 0.015 # image HSV-Hue augmentation (fraction)
hsv_s: 0.7 # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4 # image HSV-Value augmentation (fraction)
# Image rotation
degrees: 0.0 # image rotation (+/- deg)
# Image translation
translate: 0.1 # image translation (+/- fraction)
# Affine scaling
scale: 0.5 # image scale (+/- gain)
# Affine shear coefficient
shear: 0.0 # image shear (+/- deg)
# Perspective transform
perspective: 0.0 # image perspective (+/- fraction), range 0-0.001; 0.0 = affine transform, >0 = perspective transform
flipud: 0.0 # image flip up-down (probability)
fliplr: 0.5 # image flip left-right (probability)
mosaic: 1.0 # image mosaic (probability)
mixup: 0.0 # image mixup (probability); only applied when mosaic is enabled (see the sketch below)
copy_paste: 0.0 # segment copy-paste (probability); only applied when mosaic is enabled
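
The mixup entry only takes effect when mosaic is enabled because YOLOv5 applies it by blending two mosaics. A minimal sketch of the blending step, modeled on the mixup function in yolov5/utils/augmentations.py:

import numpy as np

def mixup(im, labels, im2, labels2):
    # MixUp (https://arxiv.org/abs/1710.09412): blend two images, keep both label sets
    r = np.random.beta(32.0, 32.0)  # mixup ratio, alpha = beta = 32.0
    im = (im * r + im2 * (1 - r)).astype(np.uint8)
    labels = np.concatenate((labels, labels2), 0)
    return im, labels
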
2、yolov5/data/hyps/hyp.scratch-med.yaml (medium augmentation)
# YOLOv5 by Ultralytics, GPL-3.0 license
# Hyperparameters for medium-augmentation COCO training from scratch
# python train.py --batch 32 --cfg yolov5m6.yaml --weights '' --data coco.yaml --img 1280 --epochs 300
# See tutorials for hyperparameter evolution https://github.com/ultralytics/yolov5#tutorials
lr0: 0.01 # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.1 # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937 # SGD momentum/Adam beta1
weight_decay: 0.0005 # optimizer weight decay 5e-4
warmup_epochs: 3.0 # warmup epochs (fractions ok)
warmup_momentum: 0.8 # warmup initial momentum
warmup_bias_lr: 0.1 # warmup initial bias lr
box: 0.05 # box loss gain
cls: 0.3 # cls loss gain
cls_pw: 1.0 # cls BCELoss positive_weight
obj: 0.7 # obj loss gain (scale with pixels)
obj_pw: 1.0 # obj BCELoss positive_weight
iou_t: 0.20 # IoU training threshold
anchor_t: 4.0 # anchor-multiple threshold
# anchors: 3 # anchors per output layer (0 to ignore)
fl_gamma: 0.0 # focal loss gamma (efficientDet default gamma=1.5)
hsv_h: 0.015 # image HSV-Hue augmentation (fraction)
hsv_s: 0.7 # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4 # image HSV-Value augmentation (fraction)
degrees: 0.0 # image rotation (+/- deg)
translate: 0.1 # image translation (+/- fraction)
scale: 0.9 # image scale (+/- gain)
shear: 0.0 # image shear (+/- deg)
perspective: 0.0 # image perspective (+/- fraction), range 0-0.001
flipud: 0.0 # image flip up-down (probability)
fliplr: 0.5 # image flip left-right (probability)
mosaic: 1.0 # image mosaic (probability)
mixup: 0.1 # image mixup (probability)
copy_paste: 0.0 # segment copy-paste (probability)
3、yolov5/data/hyps/hyp.scratch-high.yaml (high augmentation)
# YOLOv5 by Ultralytics, GPL-3.0 license
# Hyperparameters for high-augmentation COCO training from scratch
# python train.py --batch 32 --cfg yolov5m6.yaml --weights '' --data coco.yaml --img 1280 --epochs 300
# See tutorials for hyperparameter evolution https://github.com/ultralytics/yolov5#tutorials
lr0: 0.01 # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.1 # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937 # SGD momentum/Adam beta1
weight_decay: 0.0005 # optimizer weight decay 5e-4
warmup_epochs: 3.0 # warmup epochs (fractions ok)
warmup_momentum: 0.8 # warmup initial momentum
warmup_bias_lr: 0.1 # warmup initial bias lr
box: 0.05 # box loss gain
cls: 0.3 # cls loss gain
cls_pw: 1.0 # cls BCELoss positive_weight
obj: 0.7 # obj loss gain (scale with pixels)
obj_pw: 1.0 # obj BCELoss positive_weight
iou_t: 0.20 # IoU training threshold
anchor_t: 4.0 # anchor-multiple threshold
# anchors: 3 # anchors per output layer (0 to ignore)
fl_gamma: 0.0 # focal loss gamma (efficientDet default gamma=1.5)
hsv_h: 0.015 # image HSV-Hue augmentation (fraction)
hsv_s: 0.7 # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4 # image HSV-Value augmentation (fraction)
degrees: 0.0 # image rotation (+/- deg)
translate: 0.1 # image translation (+/- fraction)
scale: 0.9 # image scale (+/- gain)
shear: 0.0 # image shear (+/- deg)
perspective: 0.0 # image perspective (+/- fraction), range 0-0.001
flipud: 0.0 # image flip up-down (probability)
fliplr: 0.5 # image flip left-right (probability)
mosaic: 1.0 # image mosaic (probability)
mixup: 0.1 # image mixup (probability)
copy_paste: 0.1 # segment copy-paste (probability)
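
To train with one of these presets, pass it to train.py via the --hyp flag, e.g. python train.py --data coco.yaml --hyp data/hyps/hyp.scratch-med.yaml. Internally the file is read into a plain dict; a minimal sketch of that loading step (standard repo layout assumed):

import yaml

# Standard repo layout assumed; point the path at any preset or a custom copy
with open('data/hyps/hyp.scratch-low.yaml', errors='ignore') as f:
    hyp = yaml.safe_load(f)  # hyperparameter dict

print(hyp['lr0'], hyp['mosaic'])  # 0.01 1.0
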
2、OneCycleLR learning rate
Sets the learning rate of each parameter group according to the 1cycle policy. The 1cycle policy anneals the learning rate from an initial learning rate up to a maximum learning rate, then from that maximum down to a minimum learning rate far below the initial one (see the paper "Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates").
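
A minimal sketch of PyTorch's built-in scheduler for this policy (the model, loader sizes, and learning rates here are placeholders):

import torch

model = torch.nn.Linear(10, 2)  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.937)
epochs, steps_per_epoch = 300, 100  # placeholder sizes

# OneCycleLR ramps the LR up to max_lr, then anneals it far below the initial LR
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=0.01, epochs=epochs, steps_per_epoch=steps_per_epoch)

for epoch in range(epochs):
    for _ in range(steps_per_epoch):
        optimizer.zero_grad()
        loss = model(torch.randn(4, 10)).sum()  # placeholder loss
        loss.backward()
        optimizer.step()
        scheduler.step()  # OneCycleLR is stepped per batch, not per epoch

Note that YOLOv5 itself builds this one-cycle shape with a LambdaLR over a cosine lambda falling from lr0 to lr0 * lrf, rather than calling torch's OneCycleLR directly.
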
3、Warmup
Warmup is a learning-rate optimization trick that first appeared in the ResNet paper: at the start of training a small learning rate is used, and after a warmup period (e.g. 10 epochs or 10,000 steps) training continues with the preset learning rate.
Why use warmup?
At the start of training the weights are randomly initialized and the model knows nothing about the data. During the first epochs the model adjusts its parameters rapidly to the inputs; with a large learning rate it is very likely to veer off course, and pulling it back costs many extra epochs.
Once the model has trained for a while and has some prior knowledge of the data, a larger learning rate is less likely to make it learn biased representations, so the rate can be raised to speed up training.
After the model has trained with a large learning rate for a while, its parameter distribution is relatively stable and it should no longer chase new features aggressively; keeping the learning rate large would disturb this stability, so annealing to a smaller learning rate is the better choice.
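
YOLOv5 implements warmup itself by linearly interpolating the learning rate (and momentum) over the first warmup_epochs, controlled by the warmup_* hyperparameters above. A simplified sketch of that interpolation (names and step counts are illustrative, not the exact train.py code):

import numpy as np

def warmup_lr(step, warmup_steps, target_lr, warmup_bias_lr=0.1, is_bias=False):
    # Linearly ramp the LR to its target over warmup_steps;
    # bias parameters start high (warmup_bias_lr), all others start at 0.
    start = warmup_bias_lr if is_bias else 0.0
    return float(np.interp(step, [0, warmup_steps], [start, target_lr]))

# e.g. warmup_epochs=3 with 1000 iterations per epoch:
for step in (0, 1500, 3000):
    print(step, warmup_lr(step, warmup_steps=3000, target_lr=0.01))
# 0 0.0 / 1500 0.005 / 3000 0.01
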
PyTorch itself has no dedicated warmup interface; one option is the third-party package pytorch_warmup, which can be installed with pip install pytorch_warmup.
1、When the learning-rate schedule uses the global iteration number, an untuned linear warmup can be used like this:
import torch
import pytorch_warmup as warmup

optimizer = torch.optim.AdamW(params, lr=0.001, betas=(0.9, 0.999), weight_decay=0.01)
num_steps = len(dataloader) * num_epochs
lr_scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_steps)
warmup_scheduler = warmup.UntunedLinearWarmup(optimizer)

for epoch in range(1, num_epochs + 1):
    for batch in dataloader:
        optimizer.zero_grad()
        loss = ...
        loss.backward()
        optimizer.step()
        with warmup_scheduler.dampening():  # scales the LR by the warmup factor w(t)
            lr_scheduler.step()
2、If you use the learning-rate scheduler "chaining" supported by PyTorch 1.4.0 and above, you can simply list the learning-rate schedulers as a suite of the with statement:
lr_scheduler1 = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9)
lr_scheduler2 = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)
warmup_scheduler = warmup.UntunedLinearWarmup(optimizer)

for epoch in range(1, num_epochs + 1):
    for batch in dataloader:
        ...
        optimizer.step()
        with warmup_scheduler.dampening():
            lr_scheduler1.step()
            lr_scheduler2.step()
3、When the learning-rate schedule is stepped per epoch, the warmup schedule can be used like this:
lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[num_epochs // 3], gamma=0.1)
warmup_scheduler = warmup.UntunedLinearWarmup(optimizer)

for epoch in range(1, num_epochs + 1):
    for i, batch in enumerate(dataloader):
        optimizer.zero_grad()
        loss = ...
        loss.backward()
        optimizer.step()
        if i < len(dataloader) - 1:
            with warmup_scheduler.dampening():  # advance the warmup every iteration
                pass
    with warmup_scheduler.dampening():  # on the epoch's last iteration, also step the epoch scheduler
        lr_scheduler.step()
4、Warmup Schedules
1、Manual Warmup
The warmup factor w(t) depends on a warmup period that must be specified manually, for both linear and exponential warmup.
1、 Linear
w(t) = min(1, t / warmup_period)
warmup_scheduler = warmup.LinearWarmup(optimizer, warmup_period=2000)
2、Untuned Exponential
The untuned exponential warmup chooses the warmup period automatically from Adam's beta2 parameter:
warmup_period = 1 / (1 - beta2)
warmup_scheduler = warmup.UntunedExponentialWarmup(optimizer)
3、 RAdam Warmup
The warmup factor depends on Adam’s beta2 parameter for RAdamWarmup. Please see the original paper for the details.
warmup_scheduler = warmup.RAdamWarmup(optimizer)
4、 Apex’s Adam
The Apex library provides an Adam optimizer tuned for CUDA devices, FusedAdam. The FusedAdam optimizer can be used with the warmup schedulers. For example:
optimizer = apex.optimizers.FusedAdam(params, lr=0.001, betas=(0.9, 0.999), weight_decay=0.01)
lr_scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_steps)
warmup_scheduler = warmup.UntunedLinearWarmup(optimizer)
4、YOLOv5 data augmentation (yolov5-v6/utils/datasets.py)
Once training starts, you can inspect the effect of the augmentation policy in the train_batch*.jpg images. These images are saved in your training log directory, usually yolov5/runs/train/exp.
train_batch0.jpg shows the mosaics and labels of training batch 0.
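
Mosaic, the centerpiece of this pipeline, stitches four training images into one around a random center point, exposing the model to objects at varied scales and contexts. A heavily simplified sketch of the stitching (the real load_mosaic in utils/datasets.py also resizes the inputs, translates and clips the labels, and applies random_perspective afterwards):

import random
import numpy as np

def simple_mosaic(imgs, s=640):
    # Stitch four s x s images into a 2s x 2s canvas around a random center
    yc = int(random.uniform(s * 0.5, s * 1.5))  # mosaic center y
    xc = int(random.uniform(s * 0.5, s * 1.5))  # mosaic center x
    canvas = np.full((2 * s, 2 * s, 3), 114, dtype=np.uint8)  # gray background
    corners = [(0, 0), (0, xc), (yc, 0), (yc, xc)]  # top-left corner of each quadrant
    sizes = [(yc, xc), (yc, 2 * s - xc), (2 * s - yc, xc), (2 * s - yc, 2 * s - xc)]
    for img, (y0, x0), (qh, qw) in zip(imgs, corners, sizes):
        h, w = min(qh, img.shape[0]), min(qw, img.shape[1])  # crop to quadrant/image size
        canvas[y0:y0 + h, x0:x0 + w] = img[:h, :w]
    return canvas

# imgs = [np.zeros((640, 640, 3), np.uint8) for _ in range(4)]  # placeholder inputs
# simple_mosaic(imgs).shape  # (1280, 1280, 3)
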
5、Integrating Albumentations into YOLOv5 to add new augmentation methods
To use albumentations simply pip install -U albumentations and then update the augmentation pipeline as you see fit in the new Albumentations class in yolov5/utils/augmentations.py. Note these Albumentations operations run in addition to the YOLOv5 hyperparameter augmentations, i.e. defined in hyp.scratch.yaml.
Here’s an example that applies Blur, MedianBlur and ToGray albumentations in addition to the YOLOv5 hyperparameter augmentations normally applied to your training mosaics:
import logging
import random

import numpy as np

from utils.general import check_version, colorstr  # yolov5 helper functions


class Albumentations:
    # YOLOv5 Albumentations class (optional, used if package is installed)
    def __init__(self):
        self.transform = None
        try:
            import albumentations as A
            check_version(A.__version__, '1.0.3')  # version requirement

            self.transform = A.Compose([
                A.Blur(blur_limit=50, p=0.1),
                A.MedianBlur(blur_limit=51, p=0.1),
                A.ToGray(p=0.3)],
                bbox_params=A.BboxParams(format='yolo', label_fields=['class_labels']))

            logging.info(colorstr('albumentations: ') + ', '.join(f'{x}' for x in self.transform.transforms))
        except ImportError:  # package not installed, skip
            pass
        except Exception as e:
            logging.info(colorstr('albumentations: ') + f'{e}')

    def __call__(self, im, labels, p=1.0):
        if self.transform and random.random() < p:
            new = self.transform(image=im, bboxes=labels[:, 1:], class_labels=labels[:, 0])  # transformed
            im, labels = new['image'], np.array([[c, *b] for c, b in zip(new['class_labels'], new['bboxes'])])
        return im, labels
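
In the YOLOv5 dataset class this transform is built once and then called per image. A usage sketch (here im is an HWC uint8 image and labels an (n, 5) array of [class, x, y, w, h] in normalized YOLO format):

albumentations = Albumentations()  # built once, e.g. in the Dataset __init__
# inside __getitem__, before the image is converted to a tensor:
# im, labels = albumentations(im, labels, p=1.0)
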
You can also integrate additional Albumentations augmentations into YOLOv5. The best place to insert the Albumentations call into the YOLOv5 dataloader is here:
if self.augment:
    # Augment imagespace
    if not mosaic:
        img, labels = random_perspective(img, labels,
                                         degrees=hyp['degrees'],
                                         translate=hyp['translate'],
                                         scale=hyp['scale'],
                                         shear=hyp['shear'],
                                         perspective=hyp['perspective'])

    # Augment colorspace
    augment_hsv(img, hgain=hyp['hsv_h'], sgain=hyp['hsv_s'], vgain=hyp['hsv_v'])

    # Apply cutouts
    # if random.random() < 0.9:
    #     labels = cutout(img, labels)
Here img is the image and labels are its box labels. Note that any Albumentations augmentations you add are applied in addition to the existing automatic YOLOv5 augmentations defined in the hyperparameter file.
6、Defining the evaluation metric (fitness)
Fitness is the value we seek to maximize. In YOLOv5 the default fitness function is defined as a weighted combination of metrics: mAP@0.5 contributes 10% of the weight and mAP@0.5:0.95 contributes the remaining 90%, with Precision P and Recall R absent. You can adjust the weights to your own needs, or keep the default fitness definition (recommended).
yolov5/utils/metrics.py, lines 12 to 16 at commit 4103ce9:
def fitness(x):
    # Model fitness as a weighted combination of metrics
    w = [0.0, 0.0, 0.1, 0.9]  # weights for [P, R, mAP@0.5, mAP@0.5:0.95]
    return (x[:, :4] * w).sum(1)
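
If Precision and Recall matter in your application, you can give them nonzero weight. An illustrative variant (not the repo default):

def fitness(x):
    # Example: grant P and R a small share of the fitness
    w = [0.1, 0.1, 0.1, 0.7]  # weights for [P, R, mAP@0.5, mAP@0.5:0.95]
    return (x[:, :4] * w).sum(1)
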
7、Evolve (hyperparameter evolution)
# Single-GPU
python train.py --epochs 10 --data coco128.yaml --weights yolov5s.pt --cache --evolve

# Multi-GPU
for i in 0 1 2 3 4 5 6 7; do
  sleep $(expr 30 \* $i) &&  # 30-second delay (optional)
  echo 'Starting GPU '$i'...' &&
  nohup python train.py --epochs 10 --data coco128.yaml --weights yolov5s.pt --cache --device $i --evolve > evolve_gpu_$i.log &
done

# Multi-GPU bash-while (not recommended)
for i in 0 1 2 3 4 5 6 7; do
  sleep $(expr 30 \* $i) &&  # 30-second delay (optional)
  echo 'Starting GPU '$i'...' &&
  "$(while true; do nohup python train.py... --device $i --evolve 1 > evolve_gpu_$i.log; done)" &
done
# YOLOv5 Hyperparameter Evolution Results
# Best generation: 287
# Last generation: 300
# metrics/precision, metrics/recall, metrics/mAP_0.5, metrics/mAP_0.5:0.95, val/box_loss, val/obj_loss, val/cls_loss
# 0.54634, 0.55625, 0.58201, 0.33665, 0.056451, 0.042892, 0.013441
lr0: 0.01 # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.2 # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937 # SGD momentum/Adam beta1
weight_decay: 0.0005 # optimizer weight decay 5e-4
warmup_epochs: 3.0 # warmup epochs (fractions ok)
warmup_momentum: 0.8 # warmup initial momentum
warmup_bias_lr: 0.1 # warmup initial bias lr
box: 0.05 # box loss gain
cls: 0.5 # cls loss gain
cls_pw: 1.0 # cls BCELoss positive_weight
obj: 1.0 # obj loss gain (scale with pixels)
obj_pw: 1.0 # obj BCELoss positive_weight
iou_t: 0.20 # IoU training threshold
anchor_t: 4.0 # anchor-multiple threshold
# anchors: 3 # anchors per output layer (0 to ignore)
fl_gamma: 0.0 # focal loss gamma (efficientDet default gamma=1.5)
hsv_h: 0.015 # image HSV-Hue augmentation (fraction)
hsv_s: 0.7 # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4 # image HSV-Value augmentation (fraction)
degrees: 0.0 # image rotation (+/- deg)
translate: 0.1 # image translation (+/- fraction)
scale: 0.5 # image scale (+/- gain)
shear: 0.0 # image shear (+/- deg)
perspective: 0.0 # image perspective (+/- fraction), range 0-0.001
flipud: 0.0 # image flip up-down (probability)
fliplr: 0.5 # image flip left-right (probability)
mosaic: 1.0 # image mosaic (probability)
mixup: 0.0 # image mixup (probability)
copy_paste: 0.0 # segment copy-paste (probability)
We suggest a minimum of 300 generations of evolution for best results. Note that evolution is generally expensive and time-consuming, since the base scenario is trained hundreds of times, possibly requiring hundreds or thousands of GPU hours.
8、Hyperparameter visualization
evolve.csv is plotted as evolve.png by utils.plots.plot_evolve() after evolution finishes with one subplot per hyperparameter showing fitness (y axis) vs hyperparameter values (x axis). Yellow indicates higher concentrations. Vertical distributions indicate that a parameter has been disabled and does not mutate. This is user selectable in the meta dictionary in train.py, and is useful for fixing parameters and preventing them from evolving.
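
To regenerate the plot manually, a minimal sketch (run from the yolov5 repo root; the csv path is illustrative):

from utils.plots import plot_evolve

plot_evolve('runs/evolve/exp/evolve.csv')  # writes evolve.png next to the csv
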