TACo: A Data Augmentation Technique for Text Recognition
2022-06-28 03:55:00 【陈壮实的编程生活】
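1. Introduction
TACo is a data augmentation technique that corrupts vertical or horizontal strips (tiles) of the input image so that the recognition model generalizes better. Four corruption types are supported, [random, black, white, mean], and two corruption orientations, [vertical, horizontal].
Source code: https://github.com/kartikgill/taco-box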
2. Illustration
(1) Original image:
(2) Corrupted image:
3. Corruption steps (using vertical and random as the example)
Step 1: Check that the input is a 2-D grayscale image, since only 2-D grayscale images are corrupted.
if len(image.shape) < 2 or len(image.shape) > 3:  # make sure the input is a 2-D grayscale image
    raise Exception("Input image with Invalid Shape!")
if len(image.shape) == 3:
    raise Exception("Only Gray Scale Images are supported!")
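The two checks amount to a single test on the number of array dimensions. The following is a minimal sketch, not part of taco-box: the helper name `require_gray_2d` is made up for illustration, and it raises `ValueError` instead of the bare `Exception` used above.

```python
import numpy as np


def require_gray_2d(image):
    """Return (height, width) of a 2-D grayscale array, or raise ValueError.

    Hypothetical helper for illustration; not part of taco-box.
    """
    if image.ndim != 2:
        raise ValueError(
            f"expected a 2-D grayscale image, got shape {image.shape}"
        )
    return image.shape


gray = np.zeros((32, 128), dtype=np.uint8)      # H x W, accepted
color = np.zeros((32, 128, 3), dtype=np.uint8)  # H x W x C, rejected

print(require_gray_2d(gray))
try:
    require_gray_2d(color)
except ValueError as err:
    print("rejected:", err)
```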
Step 2: Pick a random integer between the preset minimum and maximum tile widths and use it as the tile width.
if orientation == 'vertical':
    tiles = []
    start = 0
    tile_width = random.randint(min_tw, max_tw)
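`random.randint(a, b)` includes both endpoints, so the width can be exactly `min_tw` or `max_tw`. The width is drawn once per call, so all tiles of one augmented image share it (only the last tile may be narrower). A small sketch that checks the range; the values 20 and 100 are the vertical defaults of the `Taco` class in Section 4, and the seed is only there to make the run repeatable:

```python
import random

min_tw, max_tw = 20, 100  # vertical defaults of the Taco class

random.seed(0)  # fixed seed only so the sketch is repeatable
widths = [random.randint(min_tw, max_tw) for _ in range(1000)]

# Every draw lies in the closed interval [min_tw, max_tw].
print(min(widths) >= min_tw, max(widths) <= max_tw)
```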
Step 3: Slice the image into tiles of that width and, for each tile, decide whether to corrupt it according to the preset corruption probability.
    while start < (img_w - 1):
        tile = image[:, start:start+min(img_w-start-1, tile_width)]
        if random.random() <= self.corruption_probability_vertical:  # corrupt the tile when the draw does not exceed the preset probability
            tile = self._corrupted_tile(tile, corruption_type)
        tiles.append(tile)
        start = start + tile_width
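To see which columns each tile covers, the slice arithmetic of the loop can be replayed without an image. In this sketch the function name `tile_bounds` is illustrative and not part of taco-box. Because the stop index is capped at `start + (img_w - start - 1)`, that is `img_w - 1`, the tiles cover columns `0` to `img_w - 2`, so the loop as written appears to return an image one column narrower than the input.

```python
def tile_bounds(img_w, tile_width):
    """Replay the slicing of the vertical loop and return (start, stop) pairs."""
    bounds = []
    start = 0
    while start < (img_w - 1):
        stop = start + min(img_w - start - 1, tile_width)
        bounds.append((start, stop))
        start += tile_width
    return bounds


bounds = tile_bounds(img_w=10, tile_width=4)
print(bounds)                         # [(0, 4), (4, 8), (8, 9)]
print(sum(b - a for a, b in bounds))  # 9 of the 10 columns are covered
```

If the output has to keep the input width, one possible adjustment is to loop with `while start < img_w` and cap the tile with `min(img_w - start, tile_width)`.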
Step 4: Concatenate the tiles and return the result, which is the augmented image.
    augmented_image = np.hstack(tiles)
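`np.hstack` joins the tiles along the column axis, so all tiles must have the same height. One detail worth checking: the corrupted tiles in the Section 4 source are built with `np.random.random`, `np.ones` and `np.zeros`, which return `float64` arrays, so stacking them with `uint8` tiles yields a `float64` image. A minimal sketch with made-up tile sizes, including a cast back to `uint8`:

```python
import numpy as np

clean = np.full((4, 3), 200, dtype=np.uint8)  # untouched uint8 tile
corrupted = np.ones((4, 2)) * 255             # "white" tile, float64

stacked = np.hstack([clean, corrupted])
print(stacked.shape, stacked.dtype)           # (4, 5) float64

# Cast back if downstream code expects 8-bit images.
as_uint8 = np.clip(stacked, 0, 255).astype(np.uint8)
print(as_uint8.dtype)                         # uint8
```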
4. Source code
import random

import matplotlib.pyplot as plt
import numpy as np


class Taco:
    def __init__(self,
                 cp_vertical=0.25,
                 cp_horizontal=0.25,
                 max_tw_vertical=100,
                 min_tw_vertical=20,
                 max_tw_horizontal=50,
                 min_tw_horizontal=10
                 ):
        """
        -: Creating Taco object and setting up parameters:-
        -------Arguments--------
        :cp_vertical: corruption probability of vertical tiles
        :cp_horizontal: corruption probability for horizontal tiles
        :max_tw_vertical: maximum possible tile width for vertical tiles in pixels
        :min_tw_vertical: minimum tile width for vertical tiles in pixels
        :max_tw_horizontal: maximum possible tile width for horizontal tiles in pixels
        :min_tw_horizontal: minimum tile width for horizontal tiles in pixels
        """
        self.corruption_probability_vertical = cp_vertical
        self.corruption_probability_horizontal = cp_horizontal
        self.max_tile_width_vertical = max_tw_vertical
        self.min_tile_width_vertical = min_tw_vertical
        self.max_tile_width_horizontal = max_tw_horizontal
        self.min_tile_width_horizontal = min_tw_horizontal

    def apply_vertical_taco(self, image, corruption_type='random'):
        """
        Only applies taco augmentations in vertical direction.
        Default corruption type is 'random', other supported types are [black, white, mean].
        -------Arguments-------
        :image: A gray scaled input image that needs to be augmented.
        :corruption_type: Type of corruption needs to be applied [one of- black, white, random or mean]
        -------Returns--------
        A TACO augmented image.
        """
        if len(image.shape) < 2 or len(image.shape) > 3:  # make sure the input is a 2-D grayscale image
            raise Exception("Input image with Invalid Shape!")
        if len(image.shape) == 3:
            raise Exception("Only Gray Scale Images are supported!")
        img_h, img_w = image.shape[0], image.shape[1]
        image = self._do_taco(image, img_h, img_w,
                              self.min_tile_width_vertical,
                              self.max_tile_width_vertical,
                              orientation='vertical',
                              corruption_type=corruption_type)
        return image

    def apply_horizontal_taco(self, image, corruption_type='random'):
        """
        Only applies taco augmentations in horizontal direction.
        Default corruption type is 'random', other supported types are [black, white, mean].
        -------Arguments-------
        :image: A gray scaled input image that needs to be augmented.
        :corruption_type: Type of corruption needs to be applied [one of- black, white, random or mean]
        -------Returns--------
        A TACO augmented image.
        """
        if len(image.shape) < 2 or len(image.shape) > 3:
            raise Exception("Input image with Invalid Shape!")
        if len(image.shape) == 3:
            raise Exception("Only Gray Scale Images are supported!")
        img_h, img_w = image.shape[0], image.shape[1]
        image = self._do_taco(image, img_h, img_w,
                              self.min_tile_width_horizontal,
                              self.max_tile_width_horizontal,
                              orientation='horizontal',
                              corruption_type=corruption_type)
        return image

    def apply_taco(self, image, corruption_type='random'):
        """
        Applies taco augmentations in both directions (vertical and horizontal).
        Default corruption type is 'random', other supported types are [black, white, mean].
        -------Arguments-------
        :image: A gray scaled input image that needs to be augmented.
        :corruption_type: Type of corruption needs to be applied [one of- black, white, random or mean]
        -------Returns--------
        A TACO augmented image.
        """
        image = self.apply_vertical_taco(image, corruption_type)
        image = self.apply_horizontal_taco(image, corruption_type)
        return image

    def visualize(self, image, title='example_image'):
        """
        A function to display images with given title.
        """
        plt.figure(figsize=(5, 2))
        plt.imshow(image, cmap='gray')
        plt.title(title)
        plt.tight_layout()
        plt.show()
    def _do_taco(self, image, img_h, img_w, min_tw, max_tw, orientation, corruption_type):
        """
        apply taco algorithm on image and return augmented image.
        """
        if orientation == 'vertical':
            tiles = []
            start = 0
            tile_width = random.randint(min_tw, max_tw)
            while start < (img_w - 1):
                tile = image[:, start:start+min(img_w-start-1, tile_width)]
                # corrupt the tile when the draw does not exceed the preset probability
                if random.random() <= self.corruption_probability_vertical:
                    tile = self._corrupted_tile(tile, corruption_type)
                tiles.append(tile)
                start = start + tile_width
            augmented_image = np.hstack(tiles)
        else:
            tiles = []
            start = 0
            tile_width = random.randint(min_tw, max_tw)
            while start < (img_h - 1):
                tile = image[start:start+min(img_h-start-1, tile_width), :]
                # horizontal tiles use the horizontal corruption probability
                if random.random() <= self.corruption_probability_horizontal:
                    tile = self._corrupted_tile(tile, corruption_type)
                tiles.append(tile)
                start = start + tile_width
            augmented_image = np.vstack(tiles)
        return augmented_image
    def _corrupted_tile(self, tile, corruption_type):
        """
        Return a corrupted tile with given shape and corruption type.
        """
        tile_shape = tile.shape
        if corruption_type == 'random':
            corrupted_tile = np.random.random(tile_shape)*255  # uniform noise in [0, 255)
        elif corruption_type == 'white':
            corrupted_tile = np.ones(tile_shape)*255
        elif corruption_type == 'black':
            corrupted_tile = np.zeros(tile_shape)
        elif corruption_type == 'mean':
            corrupted_tile = np.ones(tile_shape)*np.mean(tile)  # fill with the tile's own mean
        else:
            # an unknown type would otherwise leave corrupted_tile undefined
            raise ValueError("Unsupported corruption type: " + str(corruption_type))
        return corrupted_tile
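The class draws the tile width and the per-tile decision from Python's `random` module, while the 'random' fill comes from `np.random`. To make an augmentation run repeatable, both generators therefore need a seed. The sketch below is a condensed stand-in written for illustration (the names `vertical_random_taco` and `seeded_run` are not part of taco-box); it mirrors the vertical pass with 'random' corruption on a synthetic all-white image.

```python
import random

import numpy as np


def vertical_random_taco(image, min_tw=20, max_tw=100, p=0.25):
    """Condensed stand-in for a vertical pass with 'random' corruption.

    Illustrative only: mirrors the tiling logic of the class above but is
    not the taco-box API.
    """
    img_w = image.shape[1]
    tile_width = random.randint(min_tw, max_tw)        # stdlib RNG
    tiles = []
    start = 0
    while start < (img_w - 1):
        tile = image[:, start:start + min(img_w - start - 1, tile_width)]
        if random.random() <= p:                       # stdlib RNG
            tile = np.random.random(tile.shape) * 255  # NumPy RNG
        tiles.append(tile)
        start += tile_width
    return np.hstack(tiles)


def seeded_run(image, seed):
    # Both generators must be seeded to make a run repeatable.
    random.seed(seed)
    np.random.seed(seed)
    return vertical_random_taco(image)


image = np.full((32, 256), 255, dtype=np.uint8)  # synthetic white "text line"
a = seeded_run(image, seed=42)
b = seeded_run(image, seed=42)
print(a.shape, np.array_equal(a, b))             # (32, 255) True
```

Seeding only one of the two generators would fix either the tile layout or the noise pattern, but not both.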