Machine learning notes - HaGRID - Introduction to a hand gesture recognition image dataset
2022-06-22 16:26:00 【Sit and watch the clouds rise】
This paper introduces HaGRID (HAnd Gesture Recognition Image Dataset), a large dataset for hand gesture recognition (HGR) systems. The dataset contains 552,992 samples divided into 18 gesture classes. The annotations consist of hand bounding boxes with gesture labels and a mark indicating the leading hand. The proposed dataset makes it possible to build HGR systems that can be used in video conferencing services, home automation systems, the automotive industry, services for people with speech and hearing impairments, and so on. We focus in particular on interaction with devices in order to manage them, which is why all 18 chosen gestures are functional, familiar to most people, and can serve as an incentive to take some action. In addition, we used crowdsourcing platforms to collect the dataset and took various parameters into account to ensure data diversity. We describe the challenges of using existing HGR datasets for our task and provide a detailed overview of them. Furthermore, baselines for the hand detection and gesture classification tasks are proposed. HaGRID and the pretrained models are publicly available.
Gestures play an important role in human communication: they can emotionally reinforce a statement or replace it entirely. Moreover, hand gesture recognition (HGR) can be a part of human-computer interaction. Such systems have a wide range of practical applications in the automotive sector, home automation systems, various video and streaming platforms (Zoom, Skype, Discord, Jazz, etc.), and other areas. In addition, such a system can be part of a virtual assistant or service for active sign language users (people with hearing and speech impairments). These areas require the system to work online and to be robust to background, scene, subject, and lighting conditions.

In this paper, we present the HaGRID dataset for designing HGR systems. It contains more than 500,000 images divided into 18 classes of gesture signs (Figure 1), which are not tied to any particular language. The gestures were chosen for designing device control systems and serve a symbolic functional role [18]. Symbolic gestures help people communicate with each other; in our case, they are used for human-computer interaction. Section 3 describes how the selected static gestures can be used to design dynamic gestures, that is, how to create active gestures (another functional role, corresponding to the ability to manipulate objects) from symbolic ones. The small vocabulary of functional gestures in the dataset is intended to reduce the complexity of HGR systems and to avoid placing an unnecessary cognitive burden on device users. When using a gesture control system, the movements must be comfortable to perform by design, and all the presented gestures were selected as the most suitable in this respect. We also added an extra class containing samples of natural hand movements and called it "no gesture". The background, lighting, scene, and subject vary across images. This heterogeneity was achieved by using two crowdsourcing platforms, Yandex.Toloka and ABC Elementary. All samples in the dataset are high-resolution and were collected in RGB format.
The combination of features such as (1) high-resolution images, (2) heterogeneity of scenes, subjects, their age and gender, lighting, and distance from the camera to the subject, and (3) the number of samples was the motivation for creating HaGRID. The dataset consists of approximately 500,000 FullHD (1920 × 1080) RGB images with 18 gesture classes and a "no gesture" class. It contains at least 34,730 unique scenes. Note that the proposed dataset contains some gestures in two positions: the front and the back of the hand. This allows a dynamic gesture to be interpreted from two static gestures. For example, using the gestures "stop" and "stop inverted", you can design the dynamic gestures "swipe up" (a thumb-down "stop", i.e. "stop" rotated by 180 degrees, as the start of the sequence and "stop inverted" as the end) and "swipe down" ("stop" as the start of the sequence and a thumb-down "stop inverted", i.e. "stop inverted" rotated by 180 degrees, as the end). In addition, two more dynamic gestures, "swipe right" and "swipe left", can be obtained with 90-degree rotation augmentation. Examples of all designed dynamic gestures are shown in Figure 2.
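The idea of composing a dynamic gesture from two static poses can be sketched as a rule lookup over per-frame classifier outputs. This is only a minimal illustration under stated assumptions, not the authors' implementation: the label names (`stop_rot180`, `stop_inverted_rot180`, `no_gesture`), the `max_gap` threshold, and the function `detect_dynamic_gestures` are all hypothetical and introduced here for the example.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical label names: "<name>_rot180" stands for the same static pose
# rotated by 180 degrees. The dataset's real class identifiers may differ.
SWIPE_RULES: Dict[Tuple[str, str], str] = {
    ("stop_rot180", "stop_inverted"): "swipe_up",
    ("stop", "stop_inverted_rot180"): "swipe_down",
}


def detect_dynamic_gestures(
    frames: Sequence[str],
    rules: Dict[Tuple[str, str], str] = SWIPE_RULES,
    max_gap: int = 15,  # assumed limit on frames between the two poses
) -> List[Tuple[int, str]]:
    """Return (frame_index, dynamic_gesture) events found in a label stream.

    A dynamic gesture fires when a start pose is followed by a matching
    end pose within max_gap frames; "no_gesture" frames are skipped.
    """
    events: List[Tuple[int, str]] = []
    last_label = None
    last_index = -1
    for index, label in enumerate(frames):
        if label == "no_gesture":
            continue
        if last_label is not None and label != last_label:
            name = rules.get((last_label, label))
            if name is not None and index - last_index <= max_gap:
                events.append((index, name))
        # Remember the most recent frame that showed a static gesture.
        last_label, last_index = label, index
    return events
```

For instance, a stream in which `stop` is held for two frames and then followed by `stop_inverted_rot180` yields a single `swipe_down` event at the frame where the end pose first appears. In practice, the per-frame labels would come from a gesture classifier, and some temporal smoothing would probably be needed before applying rules like these.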

Besides gesture classification, HaGRID can also be used for the hand detection problem (each image with n hands has n corresponding bounding boxes) and for two binary classification problems: (1) gesture / no gesture and (2) right / left hand. Figure 3 shows an example of the annotations for a sample in the dataset.
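To make the relationship between one annotation and these tasks concrete, here is a small sketch that derives detection boxes and the two binary targets from a single annotated image. The record layout is an assumption made for illustration (normalized `(x, y, width, height)` boxes, a `leading_hand` field, and a `no_gesture` label); the actual annotation files in the repository may be organized differently.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class HandAnnotation:
    # Assumed layout: top-left corner plus size, normalized to [0, 1].
    bbox: Tuple[float, float, float, float]
    label: str  # a gesture class name, or "no_gesture"


@dataclass
class ImageAnnotation:
    hands: List[HandAnnotation]  # n hands in the image -> n bounding boxes
    leading_hand: str  # assumed to be "left" or "right"


def to_pixel_box(
    bbox: Tuple[float, float, float, float], width: int, height: int
) -> Tuple[int, int, int, int]:
    """Convert a normalized (x, y, w, h) box to pixel (x1, y1, x2, y2)."""
    x, y, w, h = bbox
    return (
        round(x * width),
        round(y * height),
        round((x + w) * width),
        round((y + h) * height),
    )


def binary_targets(annotation: ImageAnnotation) -> Tuple[List[int], int]:
    """Derive the two binary targets from one annotated image.

    Returns a per-hand gesture / no-gesture flag (1 = gesture) and a
    single right / left flag for the leading hand (1 = right).
    """
    is_gesture = [int(hand.label != "no_gesture") for hand in annotation.hands]
    is_right = int(annotation.leading_hand == "right")
    return is_gesture, is_right
```

With a FullHD image (1920 × 1080), a normalized box `(0.25, 0.5, 0.125, 0.25)` maps to the pixel box `(480, 540, 720, 810)`.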

https://github.com/hukenovs/hagrid
Model training results on HaGRID: https://github.com/hukenovs/hagrid

HaGRID is designed for gesture recognition systems. In addition, the dataset can be used for leading hand detection. Taking the leading hand into account can also double the number of gestures, so that they can be mapped to a larger number of computer responses. Our follow-up work on this topic includes increasing the size of the dataset by adding new static gestures and samples with natural hand behavior similar to the target gestures.
There are also plans to extend the markup with additional annotations, such as gender, segmentation masks, and key points. In addition, we are about to release new datasets for image and video recognition and for some popular computer vision tasks.
In this paper, we introduced a new hand gesture recognition dataset called HaGRID, one of the largest and most diverse datasets in terms of subjects and collection conditions. The gesture dataset is primarily intended for device control systems, but its application potential is considerably broader. Compared with other datasets, HaGRID is the most complex, because it comes from about 35,000 scenes with different lighting and distances from the camera. In addition, baselines for evaluating HGR tasks are provided.
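The doubling effect of the leading hand can be illustrated with a short sketch: if a system distinguishes which hand performs a gesture, each (gesture, hand) pair can be bound to its own response. The gesture names below are placeholders and the function `build_command_table` is hypothetical; only the counts (18 gestures, 2 hands) come from the text.

```python
from itertools import product
from typing import Dict, Sequence, Tuple

# Placeholder names standing in for the 18 gesture classes.
GESTURES = [f"gesture_{i:02d}" for i in range(18)]
HANDS = ("left", "right")


def build_command_table(
    gestures: Sequence[str], hands: Sequence[str] = HANDS
) -> Dict[Tuple[str, str], str]:
    """Map every (gesture, leading hand) pair to a distinct command name."""
    return {
        (gesture, hand): f"{hand}:{gesture}"
        for gesture, hand in product(gestures, hands)
    }


# 18 gestures x 2 hands gives 36 distinct (gesture, hand) commands.
command_table = build_command_table(GESTURES)
```

Whether a real system should use all 36 combinations is a design question; a smaller vocabulary may be easier for users to remember, in line with the motivation for keeping the gesture dictionary small.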









