AICon 2021 | AI technology supports content security and promotes the healthy development of the Internet environment
2022-06-23 22:56:00 【Youtu Laboratory】
In recent years, as deep learning has matured and computing power has grown, artificial intelligence has been rapidly adopted and deployed in business scenarios across industries. As this deployment deepens, the changes and technical innovations that AI will bring to industry have become a topic of common concern.
On November 25 and 26, the Beijing stop of AICon, the global conference on artificial intelligence and machine learning technology, was successfully held with "AI technology evolution under commercialization" as its main theme. The Beijing conference featured 14 technical tracks, including "Frontier artificial intelligence technology", "Computer vision in practice", "Combining intelligent financial technology with business", and "Frontier exploration of cognitive intelligence", and invited more than 50 senior industry experts to share their latest AI innovations and applied practice.
At this conference, Yan Ke, head of content security algorithms at Tencent Youtu Lab, was invited to join the "Computer vision in practice" track. In the keynote "Research and practice of Tencent Youtu in the field of visual content understanding", Yan shared Tencent Youtu's research results and application examples in content security, offering experience and ideas for technical innovation and deployment.
01
The application of visual content understanding in the field of content security
Technical features and challenges
With the rapid development of the Internet, both the forms and the volume of online content have grown explosively. Hidden within this growth is a large amount of harmful material, such as pornography and gore, which not only damages the content ecosystem of Internet platforms but can also cause safety problems. As content security issues increase, information technologies such as AI and big data can assist traditional manual review and play an important role in content security.
Building on its research in visual AI, Tencent Youtu has created a one-stop content security solution covering pornography, advertising, and violations of laws and regulations. With integrated access, customization on demand, a detailed label system, and an automated training platform, the solution can assist human reviewers in community, UGC, live streaming, and video-on-demand scenarios, improving the efficiency of content security review.
While bringing visual AI into business scenarios, Tencent Youtu has also summarized the technical characteristics and challenges of visual content understanding:
First, content security review is used across industries and businesses in many countries, and review scenarios differ greatly from one business to another. Take game live streaming as an example: the screen is typically a game image in an anime (two-dimensional) style, and because the image quality of overseas mobile phones differs from that of domestic ones, many images are blurry and low quality. This variety of scenes is a serious test of the stability and generalization ability of AI algorithms.
Second, different customers define the standards for the same content very differently. Developing a label and standard system that fully covers customer needs places higher demands on the completeness of the technology.
Finally, diverse review scenarios require the solution to combine multi-label recognition, object detection, fine-grained image recognition, OCR, and other technologies. No single technique or general-purpose model can solve every problem, which also places higher demands on refining and rapidly optimizing model capabilities.
02
Tencent Youtu Lab's main research directions
in visual content understanding scenarios
At present, Tencent Youtu's main research directions in content security include fine-grained recognition, multi-label recognition, object detection, object localization, adversarial attacks, and image captioning.
Rotated object detection
Object detection plays an important role in content understanding. In most scenes it is ordinary axis-aligned (horizontal) box detection, but remote sensing imagery, densely packed products, and text in natural scenes call for multi-angle object detection. Tencent Youtu proposed the DRN network to improve multi-angle detection.
First, an adaptive receptive field adjustment module is designed within the feature selection module (FSM), so that the model can adapt its receptive field to the target's shape and rotation angle, easing the mismatch between a single receptive field and highly variable targets. Then, for the classification and regression tasks respectively, a dynamic refinement classifier (DRHC) and a dynamic refinement regressor (DRHR) are designed, so that the model learns both sample-independent general knowledge and sample-sensitive specific knowledge, and can adjust itself to each sample. In addition, the work designs a unified dynamic refinement network, so that the model can learn the rotated object detection task end to end.
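To make the difference between axis-aligned and multi-angle boxes concrete, here is a minimal sketch of one common way to parameterize a rotated box as center, size, and angle. It does not reproduce DRN itself; the function names and the angle convention (degrees, mathematical x-y orientation) are assumptions chosen for illustration.

```python
import math


def rotated_box_corners(cx, cy, w, h, angle_deg):
    """Return the four corners of a w x h box centred at (cx, cy),
    rotated by angle_deg around its centre (assumed convention)."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Corner offsets of the unrotated box, relative to the centre.
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Apply a 2D rotation to each offset, then translate to the centre.
    return [
        (cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
        for dx, dy in offsets
    ]


def enclosing_axis_aligned_box(corners):
    """Return (x_min, y_min, x_max, y_max) of the smallest axis-aligned
    box that contains all corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)
```

For an elongated 10 x 2 target rotated by 45 degrees, the rotated box has area 20, while the enclosing axis-aligned box has an area of about 72. Most of the horizontal box is background, which is why dense or slanted targets benefit from multi-angle detection.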
Weakly supervised object localization
Fully supervised object detection is widely used in content understanding tasks because it performs well, but its annotation cost is very high. Statistics show that weakly supervised annotation, which provides only image-level category labels without bounding boxes (bbox), can be several times faster than bbox-level annotation.
To improve efficiency and reduce cost, Tencent Youtu has carried out in-depth research on weakly supervised object detection and localization. The work proposes that preserving target structure is the key problem in weakly supervised localization. It first designs a restricted activation module to mitigate the model's loss of structural information, then redefines the concept of high-order similarity and proposes a self-correlation map generation module, which significantly improves localization accuracy.
The restricted activation module has two main parts. First, a coarse pseudo mask is obtained by computing the variance of each feature position across the class response maps, in order to separate foreground from background. Next, a Sigmoid operation normalizes the class response maps, and the proposed restricted activation loss guides the model to focus on the target foreground region. In the self-correlation map generation module, the CAM localization results are treated as seed nodes, similarity maps are extracted for the foreground and the background separately, and the final localization result is obtained by aggregating the foreground and background similarity maps.
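The two steps of the restricted activation module can be sketched roughly as follows. This is only an illustration under stated assumptions: the article does not say how the variance is thresholded or how the loss is defined, so the mean-variance threshold, the loss form (mean background activation minus mean foreground activation), and all function names below are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def coarse_pseudo_mask(response_maps):
    """response_maps: list of C class response maps, each an H x W list of lists.
    Returns an H x W mask with 1 where the variance across classes is high."""
    num_classes = len(response_maps)
    height, width = len(response_maps[0]), len(response_maps[0][0])
    variance = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            values = [response_maps[c][i][j] for c in range(num_classes)]
            mean = sum(values) / num_classes
            variance[i][j] = sum((v - mean) ** 2 for v in values) / num_classes
    # Assumption: positions above the mean variance count as foreground.
    threshold = sum(sum(row) for row in variance) / (height * width)
    return [[1 if variance[i][j] > threshold else 0 for j in range(width)]
            for i in range(height)]


def restricted_activation_loss(class_map, mask):
    """Illustrative loss (not the published one): Sigmoid-normalised activation
    should be high on the pseudo foreground and low on the background."""
    foreground, background = [], []
    for row_values, row_mask in zip(class_map, mask):
        for value, is_foreground in zip(row_values, row_mask):
            (foreground if is_foreground else background).append(sigmoid(value))
    fg_mean = sum(foreground) / len(foreground) if foreground else 0.0
    bg_mean = sum(background) / len(background) if background else 0.0
    return bg_mean - fg_mean
```

A lower value of this illustrative loss means the response map is concentrated on the pseudo foreground, which is the behavior the module is described as encouraging.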
Multi-label recognition
Multi-label recognition is a very common technical problem in content understanding. Its main goal is to recognize multiple objects in an image at the same time. Most existing work strengthens the semantic expressiveness of features by learning co-occurrence dependencies among labels. Tencent Youtu proposed that, in addition to co-occurrence dependency, spatial dependency is also an important factor in multi-label recognition.
Accordingly, Tencent Youtu proposed a Transformer-based dual complementary relationship learning framework that jointly learns spatial dependency and co-occurrence dependency. For spatial dependency, a cross-scale Transformer models long-range spatial context. Specifically, cross-scale enhancement of CNN-extracted features yields image features with richer spatial information, and Transformer layers with shared weights then model spatial dependency in those features, boosting category responses according to spatial association. For co-occurrence dependency, the work proposes category-aware constraints and spatial association guidance, and jointly models dynamic semantic associations with a graph neural network. Finally, the two complementary relationships are combined for collaborative learning to obtain robust multi-label predictions.
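As a small illustration of what "co-occurrence dependency" means, the sketch below estimates conditional co-occurrence probabilities from training labels, a statistic that is often used to initialize label graphs. It is not the framework described above, which models dynamic associations with a graph neural network; the function name and the integer label encoding are assumptions.

```python
def cooccurrence_probabilities(label_sets, num_classes):
    """Estimate P(label j present | label i present) from a list of
    per-image label lists. Returns a num_classes x num_classes matrix."""
    counts = [0] * num_classes
    pair_counts = [[0] * num_classes for _ in range(num_classes)]
    for labels in label_sets:
        present = set(labels)  # ignore duplicate labels within one image
        for i in present:
            counts[i] += 1
            for j in present:
                if i != j:
                    pair_counts[i][j] += 1
    # Rows for labels that never occur are left as zeros.
    return [
        [pair_counts[i][j] / counts[i] if counts[i] else 0.0
         for j in range(num_classes)]
        for i in range(num_classes)
    ]
```

Note that the matrix is asymmetric: if label 1 only ever appears together with label 0, then P(0 | 1) is 1.0 even though P(1 | 0) may be much lower.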
Fine-grained recognition
Fine-grained image recognition is a hot topic in computer vision. It aims to classify objects with highly similar appearance into different subclasses. Existing fine-grained recognition algorithms usually use high-order features between channels to obtain discriminative representations, but they ignore the relationships between spatial locations and different semantic features, so misjudgments are more pronounced when the background is complex or the inter-class distance is small. To address this, Tencent Youtu proposed a high-order feature relationship modeling method, which mines spatial and semantic associations between features to model high-order relationships and obtains highly discriminative features by aggregating the similarity relationships among them.
First, the method builds a high-dimensional feature bank through semantic and positional awareness among features, while applying regularization constraints. Second, a graph-based semantic grouping method (graph grouping) maps the high-dimensional features into a low-dimensional space while retaining the highly discriminative ones. During training, a group-wise learning strategy constrains the feature cluster centers. With these three modules working together, the method learns more discriminative information between fine-grained categories. In addition, a balanced grouping strategy is designed for training: samples are drawn according to their centrality, and grouping constraints are applied iteratively so that image features move toward the cluster prototypes and the representation of abnormal samples is suppressed.
03
Practical applications of
Tencent Youtu visual content understanding
At present, practical applications of Tencent Youtu visual content understanding include ACG sensitive content recognition and image sentiment tendency analysis.
ACG sensitive content recognition
In content security, the style of ACG scenes differs greatly from that of general scenes, so general-purpose models are relatively weak at recognizing animation and comic content and are prone to many misses and false positives. To solve this, Tencent Youtu proposed a progressive domain adaptation method. First, the feature distributions of the source and target domains are estimated, and MMD is used to reduce the distance between the general and ACG feature distributions. Then a dynamic progressive learning strategy, PAS, is proposed, which learns from easy to hard and reduces the difficulty of transfer. Finally, semi-supervised learning is used to iterate the model quickly, yielding a recognition model specialized for ACG scenes.
In practice, compared with the general model, the ACG model significantly improves recall and greatly improves the efficiency and effectiveness of review in this scenario.
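The MMD step can be illustrated with the standard biased estimate of squared maximum mean discrepancy. The sketch below assumes a Gaussian (RBF) kernel with a fixed bandwidth; the article does not state which kernel or estimator is used, and the function names are chosen for illustration.

```python
import math


def rbf_kernel(a, b, sigma=1.0):
    """Gaussian kernel between two feature vectors (lists of floats)."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-squared_distance / (2 * sigma ** 2))


def mmd_squared(source, target, sigma=1.0):
    """Biased estimate of squared MMD between two sets of feature vectors:
    E[k(s, s')] + E[k(t, t')] - 2 * E[k(s, t)]."""
    def mean_kernel(xs, ys):
        total = sum(rbf_kernel(x, y, sigma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2 * mean_kernel(source, target))
```

The estimate is zero when the two feature sets are identical and grows as the distributions move apart, so minimizing it as a training loss term pulls general-scene and ACG features toward a shared distribution.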
Image tendency analysis
In content review today, the review system recalls images containing sensitive elements such as RMB banknotes. In real scenarios, however, a large share of images containing RMB elements are normal, which adds considerable burden to the human review process.
Therefore, based on image captioning (Image Caption) technology, Tencent Youtu proposed an image tendency analysis framework that can accurately identify malicious sensitive elements. Compared with general element detection and recognition schemes, it can significantly reduce the human review cost of the relevant capabilities.
Event notice
The 2021 Tencent Light Forum will be held in Xiamen on December 23. We will invite technology experts, public welfare representatives, and leading academics to come together and look for a direction in which technology can keep doing good.
We look forward to more "innovators" who want to use technology for a better future following and taking part, witnessing future innovation together, and letting good things keep happening!