Image data augmentation for deep learning
2022-06-21 20:22:00 【It's Twilight】
Data augmentation is a foundation of computer vision (CV) and a very important step in the pipeline. Broadly, there are five ways to do it: 1、direct OpenCV operations 2、torchvision.transforms 3、torchvision.transforms.functional 4、nvidia.dali 5、the albumentations library. The data to be processed falls into three cases: 1、single-image processing, as in classification. 2、multiple inputs of the same size, as in segmentation and denoising. 3、multiple inputs of different sizes, as in super-resolution. The following is a brief summary of each method.
1、Types of data augmentation
Geometric transformations
Rotation, scaling, flipping, cropping, translation, affine transformation
Color space
Brightness, contrast, saturation, color space conversion, color adjustment, gamma transformation
Other
Gaussian noise, salt-and-pepper noise
Random erasing, mixup, image blending, Mosaic, copy-paste, etc.
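The two noise types in the list are not shown in code elsewhere in this summary, so here is a minimal NumPy sketch. The function names, the `sigma` and `amount` parameters and the dummy gray image are assumptions made for illustration, not part of any library.

```python
import numpy as np

def add_gaussian_noise(img, sigma=10.0, rng=None):
    """Add zero-mean Gaussian noise to a uint8 image."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(0.0, sigma, size=img.shape)
    noisy = img.astype(np.float32) + noise
    # Clip back to the valid range before converting to uint8
    return np.clip(noisy, 0, 255).astype(np.uint8)

def add_salt_pepper_noise(img, amount=0.05, rng=None):
    """Set roughly `amount` of the pixels to pure black or pure white."""
    rng = np.random.default_rng() if rng is None else rng
    out = img.copy()
    r = rng.random(img.shape[:2])      # one random value per pixel
    out[r < amount / 2] = 0            # pepper
    out[r > 1 - amount / 2] = 255      # salt
    return out

# Hypothetical input: a flat gray image instead of a file on disk
img = np.full((64, 64, 3), 128, dtype=np.uint8)
rng = np.random.default_rng(0)
gauss = add_gaussian_noise(img, sigma=10.0, rng=rng)
sp = add_salt_pepper_noise(img, amount=0.1, rng=rng)
```

Passing an explicit `rng` keeps the augmentation reproducible, which is handy when debugging a data pipeline.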
2、OpenCV operations
This approach is the most flexible: it can operate on a single image or on paired data.
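import cv2
import numpy as np
# startH, endH, startW, endW, neww, newh, img1, img2, alpha1, contrast, brightness and src_pic are placeholders
pic = cv2.imread("img.jpg") # Read the image as an array of shape [h, w, c] (BGR channel order)
pic = pic[startH:endH, startW:endW] # Crop
pic = pic[:, :, ::-1] # Reverse the channel order (BGR -> RGB); wrap in np.ascontiguousarray if a later call needs contiguous memory
pic = cv2.resize(pic, (neww, newh)) # Resize; the target size is given as (width, height)
pic = cv2.flip(pic, 1) # Horizontal flip
pic = cv2.flip(pic, 0) # Vertical flip
# Rotation: first build the rotation matrix (center, angle in degrees, scale), then apply it
height, width = pic.shape[:2]
rotationMatrix = cv2.getRotationMatrix2D((width/2, height/2), 45, .5)
pic = cv2.warpAffine(pic, rotationMatrix, (width, height))
# addWeighted arguments: (image 1, weight of image 1, image 2, weight of image 2, gamma: scalar added to the weighted sum)
pic = cv2.addWeighted(img1, 0.5, img2, 0.5, 0) # mixup; img1 and img2 must have the same size and type
# Contrast: alpha1 scales the pixel values, so it controls how high or low the contrast is
pic = cv2.addWeighted(pic, alpha1, np.zeros(pic.shape, pic.dtype), 0, 0)
# Contrast plus brightness: brightness is an offset added after scaling; set it to an integer
pic = cv2.addWeighted(pic, contrast, pic, 0, brightness)
# Gaussian blur with a 7x7 kernel and sigma 3
pic = cv2.GaussianBlur(pic, (7, 7), 3)
pic[startH:endH, startW:endW] = 0 # Erase
pic[startH:endH, startW:endW] = src_pic # Copy-paste; src_pic must match the region's shape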
Other variants of erasing, blending, Mosaic augmentation and copy-paste can all be implemented by creating a canvas and then overwriting regions of it with the desired values.
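As a rough illustration of the canvas idea, here is a simplified Mosaic sketch. It is not the implementation used by any particular detector: it resizes each of four images to fill one quadrant around a random split point and ignores label or bounding-box remapping. The helper `resize_nearest` is a hypothetical plain-NumPy stand-in so the example has no dependency beyond NumPy; in practice `cv2.resize` would take its place.

```python
import numpy as np

def resize_nearest(img, new_h, new_w):
    """Nearest-neighbour resize in plain NumPy (stand-in for cv2.resize)."""
    h, w = img.shape[:2]
    rows = np.arange(new_h) * h // new_h   # source row for each output row
    cols = np.arange(new_w) * w // new_w   # source column for each output column
    return img[rows[:, None], cols]

def mosaic(imgs, out_size=256, rng=None):
    """Paste four images into the quadrants around a random split point."""
    rng = np.random.default_rng() if rng is None else rng
    canvas = np.zeros((out_size, out_size, 3), dtype=np.uint8)
    # Random split point, kept away from the borders
    cy = int(rng.integers(out_size // 4, 3 * out_size // 4))
    cx = int(rng.integers(out_size // 4, 3 * out_size // 4))
    regions = [(0, cy, 0, cx), (0, cy, cx, out_size),
               (cy, out_size, 0, cx), (cy, out_size, cx, out_size)]
    for img, (y0, y1, x0, x1) in zip(imgs, regions):
        # Replace the canvas values in this quadrant with the resized image
        canvas[y0:y1, x0:x1] = resize_nearest(img, y1 - y0, x1 - x0)
    return canvas

# Four dummy images of different sizes, each with a constant gray level
imgs = [np.full((100 + 20 * k, 80, 3), 50 * (k + 1), dtype=np.uint8) for k in range(4)]
out = mosaic(imgs, out_size=256, rng=np.random.default_rng(0))
```

Erasing and copy-paste follow the same pattern: pick a region of the canvas and assign new values to it.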
These operations are usually applied to multiple images of the same size. When the images differ in size, the same effect can be obtained by passing correspondingly adjusted parameters to each function call.
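For the different-size case, a paired crop for super-resolution data might look like the sketch below. It assumes the HR image is exactly `scale` times the size of the LR image; the function name and its parameters are made up for this example.

```python
import numpy as np

def paired_random_crop(hr, lr, lr_patch, scale, rng=None):
    """Crop matching patches from an HR image and its LR counterpart."""
    rng = np.random.default_rng() if rng is None else rng
    lr_h, lr_w = lr.shape[:2]
    # Draw the crop position once, in LR coordinates
    top = int(rng.integers(0, lr_h - lr_patch + 1))
    left = int(rng.integers(0, lr_w - lr_patch + 1))
    lr_crop = lr[top:top + lr_patch, left:left + lr_patch]
    # Reuse the same position for HR, multiplied by the scale factor
    hr_crop = hr[top * scale:(top + lr_patch) * scale,
                 left * scale:(left + lr_patch) * scale]
    return hr_crop, lr_crop

# Dummy pair: the LR image is the HR image subsampled by a factor of 4
hr = np.arange(64 * 64).reshape(64, 64)
lr = hr[::4, ::4]
hr_c, lr_c = paired_random_crop(hr, lr, lr_patch=8, scale=4,
                                rng=np.random.default_rng(0))
```

The random position is drawn only once, so the two patches always cover the same part of the scene.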
Reference 1 , Reference 2
3、torchvision.transforms
This module operates on PIL images and tensors. It is generally suited to single-image processing and is often used together with PIL functions. The Official website gives the clearest introduction to it, and This demonstration example is also recommended. Only a few of its functions are shown here.
from PIL import Image
from torchvision import transforms
# size, brightness, contrast, saturation, hue, num_output_channels, degrees, ks and sigma are placeholders
pic = Image.open('img.png') # A PIL image; it only becomes a (C, H, W) tensor after ToTensor
# Get the width and height; note that PIL's size attribute does not include the number of channels
w, h = pic.size
transform = transforms.Compose([ # Chain several transforms together
    transforms.CenterCrop(10), # Center crop
    transforms.Resize(size), # Resize
    # Randomly change the brightness, contrast, saturation and hue of the image
    transforms.ColorJitter(brightness, contrast, saturation, hue),
    transforms.Grayscale(num_output_channels), # Convert to grayscale
    transforms.RandomCrop(size), # Random crop
    transforms.RandomHorizontalFlip(p=0.5), # Random horizontal flip
    transforms.RandomVerticalFlip(p=0.5), # Random vertical flip
    transforms.RandomRotation(degrees), # Random rotation, angle drawn from (-degrees, +degrees)
    transforms.GaussianBlur(ks, sigma), # Gaussian blur
    # Convert to a tensor; uint8 pixel values are scaled to [0, 1] automatically
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)) # Standardize with (mean, std); works on tensors only
])
out = transform(pic) # Apply the whole pipeline to the image
This approach can also handle multiple images of the same size. Some operations can be done by concatenating the two images and transforming them together. Alternatively, you can write your own random function, such as:
import random
from torchvision import transforms
def randflip(img1, img2, threshold):
    # random.random() returns a number in [0, 1), so each flip happens with probability 1 - threshold
    if random.random() > threshold: # Flip both images horizontally
        img1 = transforms.RandomHorizontalFlip(p=1)(img1)
        img2 = transforms.RandomHorizontalFlip(p=1)(img2)
    if random.random() > threshold: # Flip both images vertically
        img1 = transforms.RandomVerticalFlip(p=1)(img1)
        img2 = transforms.RandomVerticalFlip(p=1)(img2)
    return img1, img2
However, flexibility is still limited: processing multiple images of different sizes, and many augmentations beyond the official ones, cannot be implemented this way.
Reference 1
4、torchvision.transforms.functional
These are the lower-level functions underneath transforms. They contain no randomness of their own, which makes them more flexible. They are likewise used together with PIL. Besides providing the basic functionality of transforms, they can handle multiple images of different sizes.
There are two ways to use TvF: 1) To extend torchvision.transforms: build a transform class and integrate it into Compose. The official sample is as follows:
import torchvision.transforms.functional as TF
import random
class MyRotationTransform:
    """Rotate by one of the given angles."""
    def __init__(self, angles):
        self.angles = angles
    def __call__(self, x):
        angle = random.choice(self.angles)
        return TF.rotate(x, angle)
rotation_transform = MyRotationTransform(angles=[-30, -15, 0, 15, 30])
2) As building blocks for user-defined augmentation functions. The alias TvF is used here to avoid confusion with F (torch.nn.functional) and TF (tensorflow).
from PIL import Image
import torchvision.transforms.functional as TvF
# top, left, height, width, angle, the *_factor values, gamma, ks, sigma, i, j, v, mean and std are placeholders
img = Image.open('img.jpg')
c, h, w = TvF.get_dimensions(img) # Returns [channels, height, width]
img = TvF.crop(img, top, left, height, width) # Crop
img = TvF.rotate(img, angle) # Rotate by a given angle in degrees
img = TvF.hflip(img) # Horizontal flip
img = TvF.vflip(img) # Vertical flip
img = TvF.adjust_brightness(img, brightness_factor) # Brightness adjustment
img = TvF.adjust_contrast(img, contrast_factor) # Contrast adjustment
img = TvF.adjust_gamma(img, gamma) # Gamma transformation
img = TvF.gaussian_blur(img, ks, sigma) # Gaussian blur
img = TvF.to_tensor(img) # erase and normalize below accept tensors only
img = TvF.erase(img, i, j, h, w, v) # Erase: fill the given region with v (optional inplace argument)
img = TvF.normalize(img, mean, std) # Standardization
In short, there are many more functions; see the official documentation for the full list. TvF covers most augmentation scenarios. For example, to crop matching patches from images of different sizes, the crop parameters chosen for the HR image can simply be reused for the LR image, scaled by the resolution ratio. Some augmentations are still less flexible than with cv2, though: I am not yet sure how to implement RGB channel swapping or Mosaic with TvF.
Function source code
Visualizations of the effects of some functions
5、NVIDIA DALI
DALI is the data augmentation library that NVIDIA has been promoting recently. It can run directly on the GPU, which significantly speeds up data augmentation.
Official documents
I am still learning the code as I go, and will add more once I am more practiced with it.
6、albumentations
A fast training-data augmentation library based on OpenCV. It offers a simple yet powerful interface that works for a variety of tasks (segmentation, detection), is easy to customize, and is easy to integrate with other frameworks.
The official sample
source