
Image data augmentation for deep learning

2022-06-21 20:22:00 It's Twilight

Data augmentation is a basic and very important step in computer vision (CV). The main ways to implement it are: 1. OpenCV operations; 2. torchvision.transforms; 3. torchvision.transforms.functional; 4. nvidia.dali; 5. the albumentations library. The data being augmented falls into three cases: 1. a single image, as in classification; 2. several images of the same size, as in segmentation and denoising; 3. several images of different sizes, as in super-resolution. The methods are summarized briefly below.

1. Types of data augmentation

Geometric transformations
rotation, scaling, flipping, cropping, translation, affine transformation
Color space
brightness, contrast, saturation, color space conversion, color adjustment, gamma transformation
Other
Gaussian noise, salt-and-pepper noise
random erasing, mixup, image blending, Mosaic, copy-paste, etc.
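
The two noise types listed above are not demonstrated later in this post, so here is a minimal NumPy sketch of both. The function names and the parameters `sigma` and `amount` are illustrative choices, not taken from any library, and the code assumes a uint8 image stored as an [h, w, c] array.

```python
import numpy as np

def add_gaussian_noise(img, sigma=10.0, rng=None):
    """Add zero-mean Gaussian noise to a uint8 image (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(0.0, sigma, size=img.shape)
    noisy = img.astype(np.float32) + noise
    # Clip back to the valid range before converting to uint8.
    return np.clip(noisy, 0, 255).astype(np.uint8)

def add_salt_pepper_noise(img, amount=0.02, rng=None):
    """Set a random fraction `amount` of pixels to black or white (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    out = img.copy()
    h, w = img.shape[:2]
    r = rng.random((h, w))            # one random draw per pixel
    out[r < amount / 2] = 0           # pepper: half of the affected pixels
    out[r > 1 - amount / 2] = 255     # salt: the other half
    return out
```

Both helpers return a new array and leave the input untouched, so they can be chained with the other operations in any order.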

2. OpenCV operations

This approach is the most flexible: it works on a single image as well as on paired data.

import cv2
import numpy as np
pic = cv2.imread("img.jpg")                  # read the image as an [h, w, c] array (BGR order)
pic = pic[startH:endH, startW:endW]          # crop
pic = pic[:, :, ::-1]                        # channel swap (BGR <-> RGB)
pic = cv2.resize(pic, (neww, newh))          # resize; dsize is (width, height)
pic = cv2.flip(pic, 1)                       # horizontal flip
pic = cv2.flip(pic, 0)                       # vertical flip
# Rotation: build the rotation matrix first, then warp (here 45 degrees, scale 0.5)
height, width = pic.shape[:2]
rotationMatrix = cv2.getRotationMatrix2D((width / 2, height / 2), 45, 0.5)
pic = cv2.warpAffine(pic, rotationMatrix, (width, height))
# addWeighted arguments: (image 1, weight 1, image 2, weight 2, offset added to the sum)
pic = cv2.addWeighted(img1, 0.5, img2, 0.5, 0)    # mixup; img1 and img2 must have the same shape
# Contrast: alpha1 > 1 raises the contrast, alpha1 < 1 lowers it
pic = cv2.addWeighted(pic, alpha1, np.zeros(pic.shape, pic.dtype), 0, 0)
# Contrast and brightness together: brightness is an additive offset
pic = cv2.addWeighted(pic, contrast, pic, 0, brightness)
# Gaussian blur with a 7x7 kernel and sigma = 3
pic = cv2.GaussianBlur(pic, (7, 7), 3)
pic[startH:endH, startW:endW] = 0         # erase
pic[startH:endH, startW:endW] = src_pic   # copy-paste; src_pic must match the region's shape

Other operations such as erasing, blending, Mosaic augmentation and copy-paste can be implemented by creating a canvas and then overwriting regions of it with the desired values.
OpenCV is generally used for multiple images of the same size; multiple images of different sizes can be handled by passing different parameters to each function call.
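
As an illustration of the canvas idea, the following is a rough sketch of a simplified Mosaic: four images are resized and written into the four quadrants of one canvas around a random split point. It is an assumption-laden simplification, not the implementation used by any detector: `resize_nearest` is a plain NumPy stand-in for cv2.resize, `simple_mosaic` is a hypothetical name, and bounding boxes or masks are not remapped.

```python
import numpy as np

def resize_nearest(img, new_h, new_w):
    """Nearest-neighbour resize via index arrays (stand-in for cv2.resize)."""
    h, w = img.shape[:2]
    rows = np.arange(new_h) * h // new_h   # source row for every output row
    cols = np.arange(new_w) * w // new_w   # source column for every output column
    return img[rows][:, cols]

def simple_mosaic(imgs, out_h, out_w, rng=None):
    """Tile four [h, w, c] images around a random centre on one canvas."""
    assert len(imgs) == 4
    rng = np.random.default_rng() if rng is None else rng
    # Random split point, kept away from the borders so every tile is visible.
    cy = int(rng.integers(out_h // 4, 3 * out_h // 4))
    cx = int(rng.integers(out_w // 4, 3 * out_w // 4))
    canvas = np.zeros((out_h, out_w, imgs[0].shape[2]), dtype=imgs[0].dtype)
    # (top, bottom, left, right) of each quadrant
    boxes = [(0, cy, 0, cx), (0, cy, cx, out_w),
             (cy, out_h, 0, cx), (cy, out_h, cx, out_w)]
    for img, (t, b, l, r) in zip(imgs, boxes):
        # Replace the canvas values in this quadrant with the resized image.
        canvas[t:b, l:r] = resize_nearest(img, b - t, r - l)
    return canvas
```

Because each tile is resized to fit its quadrant, the four inputs may have different sizes. For detection or segmentation, the labels would need the same scaling and offset as their tile.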
Reference 1 , Reference 2

3. torchvision.transforms

These transforms operate on PIL images and tensors. They are generally suited to single-image processing and are often used together with PIL functions. The official website gives the clearest introduction, and this demonstration example is also recommended; only a few of the transforms are shown here.

from PIL import Image
from torchvision import transforms
pic = Image.open('img.png')                        # a PIL image, not a (C, H, W) tensor
# Get width and height; note that PIL's size attribute does not include the channel count
w, h = pic.size
transform = transforms.Compose([                   # chain several transforms
    transforms.CenterCrop(10),                     # center crop
    transforms.Resize(size),                       # resize
    # randomly change brightness, contrast, saturation and hue
    transforms.ColorJitter(brightness, contrast, saturation, hue),
    transforms.Grayscale(num_output_channels),     # convert to grayscale
    transforms.RandomCrop(size),                   # random crop
    transforms.RandomHorizontalFlip(p=0.5),        # random horizontal flip
    transforms.RandomVerticalFlip(p=0.5),          # random vertical flip
    transforms.RandomRotation(degrees),            # random rotation in (-degrees, +degrees)
    transforms.GaussianBlur(ks, sigma),            # Gaussian blur
    # convert to a tensor; uint8 values are scaled to [0, 1]
    transforms.ToTensor(),
    # works on tensors only; one mean/std per channel (single values assume 1 channel)
    transforms.Normalize((0.1307,), (0.3081,))
])
out = transform(pic)                               # apply the whole pipeline

This approach can also handle multiple images of the same size. Some operations can be applied by concatenating the two images first, or you can write your own random function so that both images receive the same operation, for example:

import random
from torchvision import transforms
def randflip(img1, img2, threshold):
    # threshold is the probability of NOT flipping; each draw is shared by both images
    if random.random() > threshold:   # random number in [0, 1): horizontal flip
        img1 = transforms.RandomHorizontalFlip(p=1)(img1)
        img2 = transforms.RandomHorizontalFlip(p=1)(img2)
    if random.random() > threshold:   # random number in [0, 1): vertical flip
        img1 = transforms.RandomVerticalFlip(p=1)(img1)
        img2 = transforms.RandomVerticalFlip(p=1)(img2)
    return img1, img2

However, flexibility is still limited: multiple images of different sizes, and many augmentations beyond the official ones, cannot be handled this way.
Reference 1

4. torchvision.transforms.functional

These are the lower-level functions behind transforms (abbreviated TvF below). They contain no randomness of their own: every parameter is passed explicitly, which makes them more flexible. They are likewise used together with PIL. They cover the basic functionality of transforms and can additionally handle multiple images of different sizes.
There are two ways to use TvF: 1) To extend torchvision.transforms: build a custom transform class and integrate it into Compose. The official sample is as follows:

import torchvision.transforms.functional as TF
import random
class MyRotationTransform:
    """Rotate by one of the given angles."""
    def __init__(self, angles):
        self.angles = angles

    def __call__(self, x):
        angle = random.choice(self.angles)
        return TF.rotate(x, angle)
rotation_transform = MyRotationTransform(angles=[-30, -15, 0, 15, 30])

2) As standalone functions inside a user-defined augmentation routine. The abbreviation TvF is used here to avoid confusion with F (torch.nn.functional) and TF (TensorFlow).

from PIL import Image
import torchvision.transforms.functional as TvF
img = Image.open('img.jpg')
c, h, w = TvF.get_dimensions(img)                      # returns [channels, height, width]
img = TvF.crop(img, top, left, height, width)          # crop
img = TvF.rotate(img, angle)                           # rotate by the given angle
img = TvF.hflip(img)                                   # horizontal flip
img = TvF.vflip(img)                                   # vertical flip
img = TvF.adjust_brightness(img, brightness_factor)    # brightness adjustment
img = TvF.adjust_contrast(img, contrast_factor)        # contrast adjustment
img = TvF.adjust_gamma(img, gamma)                     # gamma transformation
img = TvF.gaussian_blur(img, ks, sigma)                # Gaussian blur
img = TvF.to_tensor(img)                               # erase and normalize below need a tensor
img = TvF.erase(img, i, j, h, w, v, inplace=False)     # erase the region, filling it with v
img = TvF.normalize(img, mean, std)                    # normalization

In short, there are many more functions; see the official documentation. TvF covers most augmentation scenarios. For example, to crop matching regions from images of different sizes, the crop parameters used for the HR image can be applied to the LR image, scaled accordingly. Some augmentations are less flexible than with cv2, though: for RGB channel swapping and Mosaic, I am still not sure how to implement them with TvF.
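
The matched HR/LR crop mentioned above can be sketched as follows. This is only an illustration under stated assumptions: it uses NumPy slicing rather than TvF so that it runs without torchvision, `paired_random_crop` is a hypothetical name, and the HR image is assumed to be exactly `scale` times larger than the LR image. With PIL images or tensors, the same top, left, height and width values would be passed to TvF.crop.

```python
import random
import numpy as np

def paired_random_crop(lr, hr, lr_patch, scale):
    """Crop matching patches from an LR image and its `scale`x larger HR image."""
    lr_h, lr_w = lr.shape[:2]
    assert hr.shape[0] == lr_h * scale and hr.shape[1] == lr_w * scale
    # Draw the crop position once, in LR coordinates.
    top = random.randint(0, lr_h - lr_patch)
    left = random.randint(0, lr_w - lr_patch)
    lr_crop = lr[top:top + lr_patch, left:left + lr_patch]
    # Reuse the same parameters for HR, multiplied by the scale factor.
    hr_top, hr_left, hr_patch = top * scale, left * scale, lr_patch * scale
    hr_crop = hr[hr_top:hr_top + hr_patch, hr_left:hr_left + hr_patch]
    return lr_crop, hr_crop
```

Drawing the random position only once is the key point: calling a random crop transform separately on each image would produce patches that no longer correspond.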
Function source code
Some function effects are visualized

5. NVIDIA DALI

DALI is the data augmentation approach NVIDIA has been promoting recently. It can run the processing directly on the GPU, which significantly speeds up data augmentation.
Official documents
Hands-on code: to be added once I am more familiar with it.

6. albumentations

A fast training-data augmentation library built on OpenCV. It offers a simple yet powerful interface that works for a variety of tasks (segmentation, detection), and it is easy to customize and to integrate with other frameworks.
The official sample
source

Copyright notice
This article was written by [It's Twilight]. Please include a link to the original when reposting. Thanks.
https://yzsam.com/2022/172/202206211828291834.html