YOLOv5 Hyperparameter Settings and Data Augmentation Analysis
1、Introduction to the YOLOv5 hyperparameter configuration files
YOLOv5 has about 30 hyperparameters used for various training settings. They are defined in yaml files under the ./data directory. A better initial guess produces a better final result, so it is important to initialize these values properly before evolution. If in doubt, simply use the default values, which are optimized for YOLOv5 COCO training from scratch.
The YOLOv5 hyperparameter files are data/hyp.finetune.yaml (for the VOC dataset) and hyp.scratch.yaml (for the COCO dataset).
1、yolov5/data/hyps/hyp.scratch-low.yaml (YOLOv5 COCO training from scratch, low augmentation)
# Hyperparameters for low-augmentation COCO training from scratch
# python train.py --batch 64 --cfg yolov5n6.yaml --weights '' --data coco.yaml --img 640 --epochs 300 --linear
# See tutorials for hyperparameter evolution https://github.com/ultralytics/yolov5#tutorials
lr0: 0.01 # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.01 # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937 # SGD momentum/Adam beta1
weight_decay: 0.0005 # optimizer weight decay 5e-4
warmup_epochs: 3.0 # warmup epochs (fractions ok)
warmup_momentum: 0.8 # warmup initial momentum
warmup_bias_lr: 0.1 # warmup initial bias learning rate
box: 0.05 # box loss gain
cls: 0.5 # cls loss gain
cls_pw: 1.0 # cls BCELoss positive_weight
obj: 1.0 # obj loss gain (scale with pixels)
obj_pw: 1.0 # obj BCELoss positive_weight
iou_t: 0.20 # IoU training threshold
anchor_t: 4.0 # anchor-multiple threshold
# anchors: 3 # anchors per output layer (0 to ignore)
fl_gamma: 0.0 # focal loss gamma (efficientDet default gamma=1.5)
# Color augmentation: hue (Hue), saturation (Saturation), value/brightness (Value)
hsv_h: 0.015 # image HSV-Hue augmentation (fraction)
hsv_s: 0.7 # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4 # image HSV-Value augmentation (fraction)
# Image rotation
degrees: 0.0 # image rotation (+/- deg)
# Image translation
translate: 0.1 # image translation (+/- fraction)
# Affine scaling
scale: 0.5 # image scale (+/- gain)
# Affine shear coefficient
shear: 0.0 # image shear (+/- deg)
# Perspective transform
perspective: 0.0 # image perspective (+/- fraction), range 0-0.001; 0.0 = affine transform, >0 = perspective transform
flipud: 0.0 # image flip up-down (probability)
fliplr: 0.5 # image flip left-right (probability)
mosaic: 1.0 # image mosaic (probability)
mixup: 0.0 # image mixup (probability); only applied when mosaic is enabled (see the sketch below)
copy_paste: 0.0 # segment copy-paste (probability); only applied when mosaic is enabled
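
The mixup entry only takes effect when mosaic is enabled because YOLOv5 applies it by blending two mosaics. A minimal sketch of the blending step, modeled on the mixup function in yolov5/utils/augmentations.py:

import numpy as np

def mixup(im, labels, im2, labels2):
    # MixUp (https://arxiv.org/abs/1710.09412): blend two images, keep both label sets
    r = np.random.beta(32.0, 32.0)  # mixup ratio, alpha = beta = 32.0
    im = (im * r + im2 * (1 - r)).astype(np.uint8)
    labels = np.concatenate((labels, labels2), 0)
    return im, labels
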
2、yolov5/data/hyps/hyp.scratch-med.yaml (medium augmentation)
# YOLOv5 by Ultralytics, GPL-3.0 license
# Hyperparameters for medium-augmentation COCO training from scratch
# python train.py --batch 32 --cfg yolov5m6.yaml --weights '' --data coco.yaml --img 1280 --epochs 300
# See tutorials for hyperparameter evolution https://github.com/ultralytics/yolov5#tutorials
lr0: 0.01 # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.1 # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937 # SGD momentum/Adam beta1
weight_decay: 0.0005 # optimizer weight decay 5e-4
warmup_epochs: 3.0 # warmup epochs (fractions ok)
warmup_momentum: 0.8 # warmup initial momentum
warmup_bias_lr: 0.1 # warmup initial bias lr
box: 0.05 # box loss gain
cls: 0.3 # cls loss gain
cls_pw: 1.0 # cls BCELoss positive_weight
obj: 0.7 # obj loss gain (scale with pixels)
obj_pw: 1.0 # obj BCELoss positive_weight
iou_t: 0.20 # IoU training threshold
anchor_t: 4.0 # anchor-multiple threshold
# anchors: 3 # anchors per output layer (0 to ignore)
fl_gamma: 0.0 # focal loss gamma (efficientDet default gamma=1.5)
hsv_h: 0.015 # image HSV-Hue augmentation (fraction)
hsv_s: 0.7 # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4 # image HSV-Value augmentation (fraction)
degrees: 0.0 # image rotation (+/- deg)
translate: 0.1 # image translation (+/- fraction)
scale: 0.9 # image scale (+/- gain)
shear: 0.0 # image shear (+/- deg)
perspective: 0.0 # image perspective (+/- fraction), range 0-0.001
flipud: 0.0 # image flip up-down (probability)
fliplr: 0.5 # image flip left-right (probability)
mosaic: 1.0 # image mosaic (probability)
mixup: 0.1 # image mixup (probability)
copy_paste: 0.0 # segment copy-paste (probability)
3、yolov5/data/hyps/hyp.scratch-high.yaml (high augmentation)
# YOLOv5 by Ultralytics, GPL-3.0 license
# Hyperparameters for high-augmentation COCO training from scratch
# python train.py --batch 32 --cfg yolov5m6.yaml --weights '' --data coco.yaml --img 1280 --epochs 300
# See tutorials for hyperparameter evolution https://github.com/ultralytics/yolov5#tutorials
lr0: 0.01 # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.1 # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937 # SGD momentum/Adam beta1
weight_decay: 0.0005 # optimizer weight decay 5e-4
warmup_epochs: 3.0 # warmup epochs (fractions ok)
warmup_momentum: 0.8 # warmup initial momentum
warmup_bias_lr: 0.1 # warmup initial bias lr
box: 0.05 # box loss gain
cls: 0.3 # cls loss gain
cls_pw: 1.0 # cls BCELoss positive_weight
obj: 0.7 # obj loss gain (scale with pixels)
obj_pw: 1.0 # obj BCELoss positive_weight
iou_t: 0.20 # IoU training threshold
anchor_t: 4.0 # anchor-multiple threshold
# anchors: 3 # anchors per output layer (0 to ignore)
fl_gamma: 0.0 # focal loss gamma (efficientDet default gamma=1.5)
hsv_h: 0.015 # image HSV-Hue augmentation (fraction)
hsv_s: 0.7 # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4 # image HSV-Value augmentation (fraction)
degrees: 0.0 # image rotation (+/- deg)
translate: 0.1 # image translation (+/- fraction)
scale: 0.9 # image scale (+/- gain)
shear: 0.0 # image shear (+/- deg)
perspective: 0.0 # image perspective (+/- fraction), range 0-0.001
flipud: 0.0 # image flip up-down (probability)
fliplr: 0.5 # image flip left-right (probability)
mosaic: 1.0 # image mosaic (probability)
mixup: 0.1 # image mixup (probability)
copy_paste: 0.1 # segment copy-paste (probability)
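
To train with one of these presets, pass it to train.py via the --hyp flag, e.g. python train.py --data coco.yaml --hyp data/hyps/hyp.scratch-med.yaml. Internally the file is read into a plain dict; a minimal sketch of that loading step (standard repo layout assumed):

import yaml

# Standard repo layout assumed; point the path at any preset or a custom copy
with open('data/hyps/hyp.scratch-low.yaml', errors='ignore') as f:
    hyp = yaml.safe_load(f)  # hyperparameter dict

print(hyp['lr0'], hyp['mosaic'])  # 0.01 1.0
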
2、OneCycleLR learning rate
Sets the learning rate of each parameter group according to the 1cycle policy. The 1cycle policy anneals the learning rate from an initial learning rate up to a maximum learning rate, then from that maximum down to a minimum learning rate far below the initial one (see the paper "Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates").
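
A minimal sketch of PyTorch's built-in scheduler for this policy (the model, loader sizes, and learning rates here are placeholders):

import torch

model = torch.nn.Linear(10, 2)  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.937)
epochs, steps_per_epoch = 300, 100  # placeholder sizes

# OneCycleLR ramps the LR up to max_lr, then anneals it far below the initial LR
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=0.01, epochs=epochs, steps_per_epoch=steps_per_epoch)

for epoch in range(epochs):
    for _ in range(steps_per_epoch):
        optimizer.zero_grad()
        loss = model(torch.randn(4, 10)).sum()  # placeholder loss
        loss.backward()
        optimizer.step()
        scheduler.step()  # OneCycleLR is stepped per batch, not per epoch

Note that YOLOv5 itself builds this one-cycle shape with a LambdaLR over a cosine lambda falling from lr0 to lr0 * lrf, rather than calling torch's OneCycleLR directly.
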
3、Warmup
Warmup is a learning-rate optimization trick that first appeared in the ResNet paper: at the start of training a small learning rate is used, and after a warmup period (e.g. 10 epochs or 10,000 steps) training continues with the preset learning rate.
Why use warmup?
At the start of training the weights are randomly initialized and the model knows nothing about the data. During the first epochs the model adjusts its parameters rapidly to the inputs; with a large learning rate it is very likely to veer off course, and pulling it back costs many extra epochs.
Once the model has trained for a while and has some prior knowledge of the data, a larger learning rate is less likely to make it learn biased representations, so the rate can be raised to speed up training.
After the model has trained with a large learning rate for a while, its parameter distribution is relatively stable and it should no longer chase new features aggressively; keeping the learning rate large would disturb this stability, so annealing to a smaller learning rate is the better choice.
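
YOLOv5 implements warmup itself by linearly interpolating the learning rate (and momentum) over the first warmup_epochs, controlled by the warmup_* hyperparameters above. A simplified sketch of that interpolation (names and step counts are illustrative, not the exact train.py code):

import numpy as np

def warmup_lr(step, warmup_steps, target_lr, warmup_bias_lr=0.1, is_bias=False):
    # Linearly ramp the LR to its target over warmup_steps;
    # bias parameters start high (warmup_bias_lr), all others start at 0.
    start = warmup_bias_lr if is_bias else 0.0
    return float(np.interp(step, [0, warmup_steps], [start, target_lr]))

# e.g. warmup_epochs=3 with 1000 iterations per epoch:
for step in (0, 1500, 3000):
    print(step, warmup_lr(step, warmup_steps=3000, target_lr=0.01))
# 0 0.0 / 1500 0.005 / 3000 0.01
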
PyTorch itself has no dedicated warmup interface; one option is the third-party package pytorch_warmup, which can be installed with pip install pytorch_warmup.
1、When the learning-rate schedule uses the global iteration number, an untuned linear warmup can be used like this:
import torch
import pytorch_warmup as warmup

optimizer = torch.optim.AdamW(params, lr=0.001, betas=(0.9, 0.999), weight_decay=0.01)
num_steps = len(dataloader) * num_epochs
lr_scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_steps)
warmup_scheduler = warmup.UntunedLinearWarmup(optimizer)

for epoch in range(1, num_epochs + 1):
    for batch in dataloader:
        optimizer.zero_grad()
        loss = ...
        loss.backward()
        optimizer.step()
        with warmup_scheduler.dampening():  # scales the LR by the warmup factor w(t)
            lr_scheduler.step()
2、If you use the learning-rate scheduler "chaining" supported by PyTorch 1.4.0 and above, you can simply list the learning-rate schedulers as a suite of the with statement:
lr_scheduler1 = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9)
lr_scheduler2 = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)
warmup_scheduler = warmup.UntunedLinearWarmup(optimizer)

for epoch in range(1, num_epochs + 1):
    for batch in dataloader:
        ...
        optimizer.step()
        with warmup_scheduler.dampening():
            lr_scheduler1.step()
            lr_scheduler2.step()
3、When the learning-rate schedule is stepped per epoch, the warmup schedule can be used like this:
lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[num_epochs // 3], gamma=0.1)
warmup_scheduler = warmup.UntunedLinearWarmup(optimizer)

for epoch in range(1, num_epochs + 1):
    for i, batch in enumerate(dataloader):
        optimizer.zero_grad()
        loss = ...
        loss.backward()
        optimizer.step()
        if i < len(dataloader) - 1:
            with warmup_scheduler.dampening():  # advance the warmup every iteration
                pass
    with warmup_scheduler.dampening():  # on the epoch's last iteration, also step the epoch scheduler
        lr_scheduler.step()
4、Warmup Schedules
1、Manual Warmup
The warmup factor w(t) depends on a warmup period that must be specified manually, for both linear and exponential warmup.
1、 Linear
w(t) = min(1, t / warmup_period)
warmup_scheduler = warmup.LinearWarmup(optimizer, warmup_period=2000)
2、Untuned Exponential
The untuned exponential warmup chooses the warmup period automatically from Adam's beta2 parameter:
warmup_period = 1 / (1 - beta2)
warmup_scheduler = warmup.UntunedExponentialWarmup(optimizer)
3、 RAdam Warmup
The warmup factor depends on Adam’s beta2 parameter for RAdamWarmup. Please see the original paper for the details.
warmup_scheduler = warmup.RAdamWarmup(optimizer)
4、 Apex’s Adam
The Apex library provides an Adam optimizer tuned for CUDA devices, FusedAdam. The FusedAdam optimizer can be used with the warmup schedulers. For example:
optimizer = apex.optimizers.FusedAdam(params, lr=0.001, betas=(0.9, 0.999), weight_decay=0.01)
lr_scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_steps)
warmup_scheduler = warmup.UntunedLinearWarmup(optimizer)
4、YOLOv5 data augmentation (yolov5-v6/utils/datasets.py)
Once training starts, you can inspect the effect of the augmentation policy in the train_batch*.jpg images. These images are saved in your training log directory, usually yolov5/runs/train/exp.
train_batch0.jpg shows the mosaics and labels of training batch 0.
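
Mosaic, the centerpiece of this pipeline, stitches four training images into one around a random center point, exposing the model to objects at varied scales and contexts. A heavily simplified sketch of the stitching (the real load_mosaic in utils/datasets.py also resizes the inputs, translates and clips the labels, and applies random_perspective afterwards):

import random
import numpy as np

def simple_mosaic(imgs, s=640):
    # Stitch four s x s images into a 2s x 2s canvas around a random center
    yc = int(random.uniform(s * 0.5, s * 1.5))  # mosaic center y
    xc = int(random.uniform(s * 0.5, s * 1.5))  # mosaic center x
    canvas = np.full((2 * s, 2 * s, 3), 114, dtype=np.uint8)  # gray background
    corners = [(0, 0), (0, xc), (yc, 0), (yc, xc)]  # top-left corner of each quadrant
    sizes = [(yc, xc), (yc, 2 * s - xc), (2 * s - yc, xc), (2 * s - yc, 2 * s - xc)]
    for img, (y0, x0), (qh, qw) in zip(imgs, corners, sizes):
        h, w = min(qh, img.shape[0]), min(qw, img.shape[1])  # crop to quadrant/image size
        canvas[y0:y0 + h, x0:x0 + w] = img[:h, :w]
    return canvas

# imgs = [np.zeros((640, 640, 3), np.uint8) for _ in range(4)]  # placeholder inputs
# simple_mosaic(imgs).shape  # (1280, 1280, 3)
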
5、Integrating Albumentations into YOLOv5 to add new augmentation methods
To use albumentations simply pip install -U albumentations and then update the augmentation pipeline as you see fit in the new Albumentations class in yolov5/utils/augmentations.py. Note these Albumentations operations run in addition to the YOLOv5 hyperparameter augmentations, i.e. defined in hyp.scratch.yaml.
Here’s an example that applies Blur, MedianBlur and ToGray albumentations in addition to the YOLOv5 hyperparameter augmentations normally applied to your training mosaics:
import logging
import random

import numpy as np

from utils.general import check_version, colorstr  # yolov5 helper functions


class Albumentations:
    # YOLOv5 Albumentations class (optional, used if package is installed)
    def __init__(self):
        self.transform = None
        try:
            import albumentations as A
            check_version(A.__version__, '1.0.3')  # version requirement

            self.transform = A.Compose([
                A.Blur(blur_limit=50, p=0.1),
                A.MedianBlur(blur_limit=51, p=0.1),
                A.ToGray(p=0.3)],
                bbox_params=A.BboxParams(format='yolo', label_fields=['class_labels']))

            logging.info(colorstr('albumentations: ') + ', '.join(f'{x}' for x in self.transform.transforms))
        except ImportError:  # package not installed, skip
            pass
        except Exception as e:
            logging.info(colorstr('albumentations: ') + f'{e}')

    def __call__(self, im, labels, p=1.0):
        if self.transform and random.random() < p:
            new = self.transform(image=im, bboxes=labels[:, 1:], class_labels=labels[:, 0])  # transformed
            im, labels = new['image'], np.array([[c, *b] for c, b in zip(new['class_labels'], new['bboxes'])])
        return im, labels
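
In the YOLOv5 dataset class this transform is built once and then called per image. A usage sketch (here im is an HWC uint8 image and labels an (n, 5) array of [class, x, y, w, h] in normalized YOLO format):

albumentations = Albumentations()  # built once, e.g. in the Dataset __init__
# inside __getitem__, before the image is converted to a tensor:
# im, labels = albumentations(im, labels, p=1.0)
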
You can also integrate additional Albumentations augmentations into YOLOv5. The best place to insert the Albumentations call into the YOLOv5 dataloader is here:
if self.augment:
    # Augment imagespace
    if not mosaic:
        img, labels = random_perspective(img, labels,
                                         degrees=hyp['degrees'],
                                         translate=hyp['translate'],
                                         scale=hyp['scale'],
                                         shear=hyp['shear'],
                                         perspective=hyp['perspective'])

    # Augment colorspace
    augment_hsv(img, hgain=hyp['hsv_h'], sgain=hyp['hsv_s'], vgain=hyp['hsv_v'])

    # Apply cutouts
    # if random.random() < 0.9:
    #     labels = cutout(img, labels)
Here img is the image and labels are its box labels. Note that any Albumentations augmentations you add are applied in addition to the existing automatic YOLOv5 augmentations defined in the hyperparameter file.
6、Defining the evaluation metric (fitness)
Fitness is the value we seek to maximize. In YOLOv5 the default fitness function is defined as a weighted combination of metrics: mAP@0.5 contributes 10% of the weight and mAP@0.5:0.95 contributes the remaining 90%, with Precision P and Recall R absent. You can adjust the weights to your own needs, or keep the default fitness definition (recommended).
yolov5/utils/metrics.py, lines 12 to 16 at commit 4103ce9:
def fitness(x):
    # Model fitness as a weighted combination of metrics
    w = [0.0, 0.0, 0.1, 0.9]  # weights for [P, R, mAP@0.5, mAP@0.5:0.95]
    return (x[:, :4] * w).sum(1)
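
If Precision and Recall matter in your application, you can give them nonzero weight. An illustrative variant (not the repo default):

def fitness(x):
    # Example: grant P and R a small share of the fitness
    w = [0.1, 0.1, 0.1, 0.7]  # weights for [P, R, mAP@0.5, mAP@0.5:0.95]
    return (x[:, :4] * w).sum(1)
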
7、Evolve (hyperparameter evolution)
# Single-GPU
python train.py --epochs 10 --data coco128.yaml --weights yolov5s.pt --cache --evolve

# Multi-GPU
for i in 0 1 2 3 4 5 6 7; do
  sleep $(expr 30 \* $i) &&  # 30-second delay (optional)
  echo 'Starting GPU '$i'...' &&
  nohup python train.py --epochs 10 --data coco128.yaml --weights yolov5s.pt --cache --device $i --evolve > evolve_gpu_$i.log &
done

# Multi-GPU bash-while (not recommended)
for i in 0 1 2 3 4 5 6 7; do
  sleep $(expr 30 \* $i) &&  # 30-second delay (optional)
  echo 'Starting GPU '$i'...' &&
  "$(while true; do nohup python train.py... --device $i --evolve 1 > evolve_gpu_$i.log; done)" &
done
# YOLOv5 Hyperparameter Evolution Results
# Best generation: 287
# Last generation: 300
# metrics/precision, metrics/recall, metrics/mAP_0.5, metrics/mAP_0.5:0.95, val/box_loss, val/obj_loss, val/cls_loss
# 0.54634, 0.55625, 0.58201, 0.33665, 0.056451, 0.042892, 0.013441
lr0: 0.01 # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.2 # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937 # SGD momentum/Adam beta1
weight_decay: 0.0005 # optimizer weight decay 5e-4
warmup_epochs: 3.0 # warmup epochs (fractions ok)
warmup_momentum: 0.8 # warmup initial momentum
warmup_bias_lr: 0.1 # warmup initial bias lr
box: 0.05 # box loss gain
cls: 0.5 # cls loss gain
cls_pw: 1.0 # cls BCELoss positive_weight
obj: 1.0 # obj loss gain (scale with pixels)
obj_pw: 1.0 # obj BCELoss positive_weight
iou_t: 0.20 # IoU training threshold
anchor_t: 4.0 # anchor-multiple threshold
# anchors: 3 # anchors per output layer (0 to ignore)
fl_gamma: 0.0 # focal loss gamma (efficientDet default gamma=1.5)
hsv_h: 0.015 # image HSV-Hue augmentation (fraction)
hsv_s: 0.7 # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4 # image HSV-Value augmentation (fraction)
degrees: 0.0 # image rotation (+/- deg)
translate: 0.1 # image translation (+/- fraction)
scale: 0.5 # image scale (+/- gain)
shear: 0.0 # image shear (+/- deg)
perspective: 0.0 # image perspective (+/- fraction), range 0-0.001
flipud: 0.0 # image flip up-down (probability)
fliplr: 0.5 # image flip left-right (probability)
mosaic: 1.0 # image mosaic (probability)
mixup: 0.0 # image mixup (probability)
copy_paste: 0.0 # segment copy-paste (probability)
We suggest a minimum of 300 generations of evolution for best results. Note that evolution is generally expensive and time-consuming, since the base scenario is trained hundreds of times, possibly requiring hundreds or thousands of GPU hours.
8、Hyperparameter visualization
evolve.csv is plotted as evolve.png by utils.plots.plot_evolve() after evolution finishes with one subplot per hyperparameter showing fitness (y axis) vs hyperparameter values (x axis). Yellow indicates higher concentrations. Vertical distributions indicate that a parameter has been disabled and does not mutate. This is user selectable in the meta dictionary in train.py, and is useful for fixing parameters and preventing them from evolving.
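
To regenerate the plot manually, a minimal sketch (run from the yolov5 repo root; the csv path is illustrative):

from utils.plots import plot_evolve

plot_evolve('runs/evolve/exp/evolve.csv')  # writes evolve.png next to the csv
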