Machine learning notes - HaGRID - Introduction to the hand gesture recognition image dataset

2022-06-22 16:26:00 Sit and watch the clouds rise

         In this paper, we introduce HaGRID (HAnd Gesture Recognition Image Dataset), a very large dataset for hand gesture recognition (HGR) systems. The dataset contains 552,992 samples divided into 18 gesture classes. The annotations consist of hand bounding boxes with gesture labels and a markup of the leading hand. The proposed dataset makes it possible to build HGR systems that can be used in video conferencing services, home automation systems, the automotive sector, services for people with speech and hearing impairments, and so on. We focus in particular on interaction with devices in order to control them. That is why all 18 chosen gestures are functional, familiar to most people, and can serve as a prompt to take some action. In addition, we used crowdsourcing platforms to collect the dataset and took various parameters into account to ensure data diversity. We describe the challenges of using existing HGR datasets for our task and provide a detailed overview of them. Baselines for the hand detection and gesture classification tasks are also proposed. HaGRID and the pretrained models are publicly available.

         Gestures play an important role in human communication: they can emotionally reinforce a statement or replace it completely. Moreover, hand gesture recognition (HGR) can be a part of human-computer interaction. Such systems have a wide range of practical applications in the automotive sector, home automation systems, various video and streaming platforms (Zoom, Skype, Discord, Jazz, etc.), and other fields. In addition, the system can be part of a virtual assistant or a service for active sign language users (people with hearing and speech impairments). These areas require the system to work online and to be robust to background, scene, subject, and lighting conditions.

Figure 1. HaGRID contains 18 gesture classes ("inv." is an abbreviation of "inverted").

          In this paper, we present the HaGRID dataset for designing HGR systems. It contains more than 500,000 images divided into 18 classes of sign-like gestures (Figure 1) that are not tied to any particular language. The gestures were chosen for designing device control systems and serve a symbolic functional role [18]. Symbolic gestures help people communicate with each other; in our case, they are used for human-computer interaction. Section 3 describes how the selected static gestures can be used to design dynamic gestures, that is, to create active gestures from symbolic ones (the active role is another functional role, corresponding to the ability to manipulate objects). The dataset's small vocabulary of functional gestures is intended to reduce the complexity of the HGR system and to avoid unnecessary cognitive load on device users. A gesture control system must be built around movements that are comfortable to perform, and all of the gestures presented were selected as the most useful ones for this purpose. We also added an extra class that contains samples of natural hand movements and called it "no gesture". The background, lighting, scene, and subject vary across images. This heterogeneity was achieved by using two crowdsourcing platforms, Yandex.Toloka and ABC Elementary. All samples in the dataset are high resolution and were collected in RGB format.

         The motivation for creating HaGRID was to combine several features: (1) high-resolution images; (2) heterogeneity in image scenes, subjects, their age and gender, lighting, and the distance from the camera to the subject; and (3) a large number of samples. The dataset consists of approximately 550,000 FullHD (1920 × 1080) RGB images with 18 gesture classes and a "no gesture" class, collected in at least 34,730 unique scenes. Note that the proposed dataset contains some gestures in two positions: the front and the back of the hand. This allows dynamic gestures to be interpreted using two static gestures. For example, with the gestures "stop" and "stop inverted", one can design the dynamic gestures "swipe up" ("stop" rotated by 180 degrees as the start of the sequence and "stop inverted" as the end) and "swipe down" ("stop" as the start of the sequence and "stop inverted" rotated by 180 degrees as the end). Two more dynamic gestures, "swipe right" and "swipe left", can be obtained with a 90-degree rotation augmentation. Examples of all the designed dynamic gestures are shown in Figure 2.

Figure 2. Examples of the designed swipes.
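
         The idea of building a dynamic gesture from two static ones can be sketched as a lookup over consecutive per-frame predictions. This is only an illustration under stated assumptions, not the authors' implementation: the label names `stop_rot180` and `stop_inverted_rot180` are hypothetical names for the rotated poses (they are not dataset classes), and a per-frame classifier that emits such labels is assumed to exist.

```python
from typing import Iterable, List

# Hypothetical (start pose, end pose) pairs mapped to a dynamic gesture.
# The rotated label names are assumptions made for this sketch.
SWIPES = {
    ("stop_rot180", "stop_inverted"): "swipe_up",
    ("stop", "stop_inverted_rot180"): "swipe_down",
}


def collapse_runs(labels: Iterable[str]) -> List[str]:
    """Drop 'no_gesture' frames and merge consecutive duplicate labels."""
    keyframes: List[str] = []
    for label in labels:
        if label == "no_gesture":
            continue
        if not keyframes or keyframes[-1] != label:
            keyframes.append(label)
    return keyframes


def detect_swipes(labels: Iterable[str]) -> List[str]:
    """Return the dynamic gestures found in a stream of per-frame labels."""
    keyframes = collapse_runs(labels)
    pairs = zip(keyframes, keyframes[1:])
    return [SWIPES[pair] for pair in pairs if pair in SWIPES]


frames = ["no_gesture", "stop", "stop", "stop_inverted_rot180", "no_gesture"]
print(detect_swipes(frames))
```

         A real system would likely also check timing and hand position between the two poses; the sketch only shows the start-pose and end-pose pairing.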

          In addition to gesture classification, HaGRID can be used for the hand detection problem (an image with n hands in the frame has n corresponding bounding boxes) and for two binary classification problems: (1) gesture / no gesture and (2) right / left hand. Figure 3 provides an example of the annotations for samples in the dataset.

Figure 3. Samples and examples of their annotations.
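
         To make the annotation structure concrete, here is a minimal sketch of how one image's markup could be represented and converted to pixel coordinates. The class and field names are hypothetical, and the box layout (normalized `x, y, width, height` measured from the top-left corner) is an assumption for illustration; check the dataset's own documentation for the actual format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class HandBox:
    # Assumed layout: normalized (x, y, width, height) from the top-left corner.
    bbox: Tuple[float, float, float, float]
    label: str  # a gesture class name or "no_gesture"


@dataclass
class ImageAnnotation:
    hands: List[HandBox]  # n hands in the frame give n bounding boxes
    leading_hand: str     # "left" or "right"


def to_pixels(bbox: Tuple[float, float, float, float],
              width: int = 1920, height: int = 1080) -> Tuple[int, int, int, int]:
    """Convert a normalized (x, y, w, h) box to pixel corners (x1, y1, x2, y2)."""
    x, y, w, h = bbox
    return (round(x * width), round(y * height),
            round((x + w) * width), round((y + h) * height))


def is_gesture(box: HandBox) -> bool:
    """Binary gesture / no-gesture target derived from the class label."""
    return box.label != "no_gesture"


ann = ImageAnnotation(
    hands=[HandBox((0.25, 0.5, 0.125, 0.25), "stop"),
           HandBox((0.5, 0.5, 0.125, 0.25), "no_gesture")],
    leading_hand="right",
)
for hand in ann.hands:
    print(hand.label, to_pixels(hand.bbox), is_gesture(hand))
```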

https://github.com/hukenovs/hagrid

        Model training results on HaGRID

Table 3. Model training results on HaGRID. The F1 score is used as the classification metric. Inference time was measured on an Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz.
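
         For readers unfamiliar with the metric, the F1 score for a class is the harmonic mean of precision and recall, which equals 2TP / (2TP + FP + FN). The sketch below computes per-class F1 and an unweighted (macro) average in plain Python. The text does not say which averaging the reported scores use, so the macro average here is an assumption, and the labels in the example are made up.

```python
from collections import Counter
from typing import Dict, Sequence


def f1_per_class(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Per-class F1 = 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[cls] + fp[cls] + fn[cls]
        scores[cls] = 2 * tp[cls] / denom if denom else 0.0
    return scores


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of the per-class F1 scores (0.0 for empty input)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores) if scores else 0.0


y_true = ["like", "like", "stop", "stop"]
y_pred = ["like", "stop", "stop", "stop"]
print(f1_per_class(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```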

        HaGRID is designed for gesture recognition systems. In addition, the dataset can be used for leading-hand search. Taking the leading hand into account can also double the number of distinguishable gestures, so that they can be mapped to a larger set of computer responses. Our future work on this topic includes increasing the size of the dataset by adding new static gestures and samples with natural hand behavior that resembles the target gestures.
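
        The doubling follows from treating each (gesture, leading hand) pair as a separate command key: with 18 gestures and 2 hands there are 18 × 2 = 36 possible keys. A minimal sketch of such a lookup table is shown below; the gesture subset and the command names are made up for illustration.

```python
from itertools import product
from typing import Dict, List, Tuple

# Illustrative subset of gesture names; command names below are placeholders.
GESTURES = ["stop", "stop_inverted", "like", "dislike"]
HANDS = ["left", "right"]


def build_command_table(gestures: List[str],
                        hands: List[str]) -> Dict[Tuple[str, str], str]:
    """Assign one placeholder command to every (gesture, hand) combination."""
    return {
        (gesture, hand): f"command_{index}"
        for index, (gesture, hand) in enumerate(product(gestures, hands))
    }


table = build_command_table(GESTURES, HANDS)
print(len(GESTURES), "gestures ->", len(table), "commands")
print(table[("stop", "left")], table[("stop", "right")])
```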

        We also plan to extend the markup with additional annotations such as gender, segmentation masks, and key points. In addition, we are about to release new datasets for image and video recognition and for some popular computer vision tasks.

          In this paper, we introduced a new hand gesture recognition dataset called HaGRID, which is one of the largest and most diverse datasets in terms of subjects and collection conditions. The dataset is mainly intended for device control systems, but its potential applications are much broader. Compared with other datasets, HaGRID is the most complex, because it was collected in about 35,000 scenes with different lighting and distances from the camera. In addition, baselines for evaluating HGR tasks are provided.

Copyright notice
This article was written by [Sit and watch the clouds rise]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/173/202206221512099694.html