Peking University and Tencent jointly build Angel4.0, integrating the self-developed deep learning framework "Hetu" into its ecosystem

2022-06-26 15:59:00 Tencent Big Data (official)

Recently, the Peking University-Tencent Collaborative Innovation Laboratory (hereinafter "the laboratory") announced that Peking University and the Tencent big data team will jointly build Angel4.0, a new-generation distributed deep learning platform. It targets deep learning training scenarios with massive training data and very large model parameters, and aims to give the industry a new way forward for large-scale deep learning.

The laboratory was established in 2017 and mainly conducts frontier research and talent training in artificial intelligence, big data, and related scientific fields. Professor Cui Bin, deputy director of the computer department of Peking University, serves as director of the laboratory, and Jiang Jie, vice president of Tencent and general manager of its data platform department, serves as deputy director.

The laboratory's Angel distributed machine learning platform (https://github.com/Angel-ML) was open-sourced as version 1.0 in 2017. Version 2.0 followed in 2018, when the project also officially announced that it was joining the LF AI Foundation. In 2019, Angel released version 3.0, upgrading to a full-stack platform that covers the whole machine learning workflow. Soon afterwards, Angel graduated from the LF AI Foundation, becoming the first top-level open source project from China to graduate from the foundation.

In deep learning, distributed training has become the trend. Distributed systems are complex to design, however, and the deep learning frameworks commonly used in industry today fall short in distributed training: for example, hybrid parallelism does not scale flexibly, and domain-specific model libraries are not rich enough. This poses a challenge that practitioners cannot ignore. The laboratory will therefore upgrade the Angel platform, extending its deep learning capabilities to build an industrial-grade distributed deep learning platform that is fully compatible with the surrounding ecosystem, delivers industry-leading performance, and offers rich functionality, in order to support the development of the AI industry and promote the broad adoption of AI.

Notably, the laboratory independently developed the Hetu ("River Map") deep learning engine, which addresses automatic parallelism in the training of very large models and is designed to be general, efficient, agile, flexible, and extensible.

Existing distributed deep learning systems have three kinds of problems: (1) functionality: the supported communication architectures, parallel strategies, and consistency protocols are limited; (2) usability: distributed execution and deployment are complex, and the learning cost is high; (3) complexity: computation and communication are tightly coupled, which hinders extension and optimization.

Hetu has been designed to address each of these problems. First, Hetu supports all mainstream communication architectures, parallel modes, synchronization protocols, and common optimization schemes, so it offers more functionality and greater generality. Second, Hetu supports both semi-automatic and automatic parallel modes and adapts to the hardware to find the optimal distributed deployment plan, which makes deployment easier and improves usability. Finally, Hetu supports a unified intermediate representation for distributed computation graphs that, after compilation, adapts to a variety of communication operators, significantly reducing the complexity of the system architecture.

In addition, the laboratory has carried out a number of system optimizations and academic innovations based on Hetu. The results have been published in top international conferences and journals, including SIGMOD, VLDB, ICDE, and TKDE. Further results will continue to be released and, through the Angel4.0 ecosystem, applied to Tencent business scenarios. Stay tuned.

Hetu is now open source (https://github.com/PKU-DAIR/Hetu). At the 2021 ACM China Turing Conference, held from July 30 to August 1, Professor Cui Bin of Peking University, who leads Hetu's research and development, gave a special report sharing Hetu's design philosophy and system highlights with the attending experts and scholars, and it was widely recognized. Recently, Hetu was also set as a challenge topic in the fourth China Open Source Software Innovation Competition, organized by the National Natural Science Foundation of China and other bodies, inviting more developers to take part in Hetu's development.

Beyond Hetu, Angel4.0 will also be compatible with ecosystems such as TensorFlow and PyTorch, as well as with more of Tencent's self-developed components, greatly reducing the complexity of use. It aims to be a business-friendly, efficient, and easy-to-use industrial-grade distributed deep learning framework.
