
On This Year's Gaokao English Exam, CMU's reStructured Pre-training Scores a High 134, Significantly Surpassing GPT-3

2022-06-23 14:16:00 · Zhiyuan Community

The reStructured Pre-training (RST) approach proposed in this paper not only performs brilliantly on a variety of NLP tasks, but also turned in a satisfying result on the Gaokao (college entrance examination) English exam.

 

The way we store data is changing, from biological neural networks to artificial neural networks. The most familiar case is using the brain itself to store data. As the amount of available data keeps growing, people have turned to external devices, such as hard disk drives or cloud storage. With the rise of deep learning, another promising storage technology has emerged: artificial neural networks that store the information contained in data.
Researchers believe the ultimate goal of data storage is to better serve human life, and that how data is accessed matters as much as how it is stored. However, storage and access methods differ, and throughout history people have tried to bridge this gap in order to make better use of the information that exists in the world. As shown in Figure 3:
  • In biological neural networks (such as the human brain), humans are educated from a very young age so that they can extract the specific knowledge needed to cope with a complex and changing life.
  • For external-device storage, people usually structure the data according to a certain pattern (for example, tables), and then use a specialized language (for example, SQL) to efficiently retrieve the required information from a database.
  • For storage based on artificial neural networks, researchers use self-supervised learning to store data from large corpora (pre-training), and then apply the network to various downstream tasks (for example, sentiment classification).
Researchers from CMU have proposed a new method for accessing data that contains various types of information, which can serve as pre-training signals to guide the model in optimizing its parameters. The study structures data in units of signals. This is similar to using a database to store data: the data is first organized into tables or JSON format, so that a specialized language (such as SQL) can retrieve exactly the information needed.
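To make the database analogy concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `signals` table, its schema, and the example rows are hypothetical illustrations, not part of the paper; the point is only that once data is structured, a query language retrieves precisely the piece of information needed.

```python
import sqlite3

# Hypothetical schema: each row is one "signal" mined from a data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signals (source TEXT, signal_type TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO signals VALUES (?, ?, ?)",
    [
        ("wikipedia", "named_entity", "Carnegie Mellon University"),
        ("wikipedia", "summary", "RST is a pre-training paradigm."),
        ("rotten_tomatoes", "sentiment", "positive"),
    ],
)

# A specialized language (SQL) pulls out exactly the signal type we need.
rows = conn.execute(
    "SELECT value FROM signals WHERE signal_type = 'named_entity'"
).fetchall()
print(rows)  # [('Carnegie Mellon University',)]
```

The paper's argument is that a pre-trained model should play an analogous role: store many kinds of signals, and surface the right one on demand.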
In addition, the study argues that valuable signals are abundant in all kinds of data in the world, rather than existing only in manually curated supervised datasets. What researchers need to do is (a) identify the data, (b) restructure the data in a unified language, and (c) integrate and store it in a pre-trained language model. The study calls this learning paradigm reStructured Pre-training (RST). The researchers liken the process to a treasure hunt in a mine: different data sources, such as Wikipedia, are mines rich in gems. They contain a wealth of information, for example named entities from hyperlinks, which can provide signals for model pre-training. A good pre-trained language model (PLM) should clearly understand the composition of the various signals in the data, so that it can supply accurate information according to the different needs of downstream tasks.
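The identify-restructure-store loop above can be sketched in a few lines. This is a toy illustration, not the paper's actual pipeline: it assumes wiki-style `[[...]]` hyperlink markup as the signal to identify, and the `extract_entity_signals` / `restructure` helper names and the prompt template are invented here for clarity.

```python
import re

def extract_entity_signals(text: str) -> list[str]:
    # (a) Identify signals: treat wiki-style [[hyperlink]] anchors as
    # named-entity labels mined "for free" from the raw text.
    return re.findall(r"\[\[([^\]]+)\]\]", text)

def restructure(text: str) -> list[dict]:
    # (b) Restructure: reorganize each signal into a unified (prompt, answer)
    # pair that a text-to-text model could be pre-trained on.
    plain = re.sub(r"\[\[([^\]]+)\]\]", r"\1", text)
    return [
        {"prompt": f"TEXT: {plain} QUERY: named entities?", "answer": entity}
        for entity in extract_entity_signals(text)
    ]

pairs = restructure("[[CMU]] researchers proposed [[reStructured Pre-training]].")
for p in pairs:
    print(p["answer"])  # CMU / reStructured Pre-training
```

Step (c), storing the signals, would then amount to ordinary pre-training of a language model on such pairs.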
Paper: https://arxiv.org/pdf/2206.11147.pdf

Copyright notice
This article was created by [Zhiyuan Community]; please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/174/202206231323479959.html