CMU proposes a new NLP paradigm, reStructured Pre-training, and scores a high 134 on the college entrance exam English test
2022-06-28 02:56:00 【Zhiyuan community】
The reStructured Pre-training (RST) approach proposed in this paper not only performs brilliantly on a variety of NLP tasks but also delivers a satisfying result on the college entrance exam English test.
The way we store data is changing. From biological neural networks to artificial neural networks, the most common approach has in fact been to store data in the brain. As the amount of available data keeps growing, people have turned to external devices such as hard disk drives or cloud storage. With the rise of deep learning, another promising storage technology has emerged: using artificial neural networks to store the information contained in data.
The researchers argue that the ultimate goal of data storage is to better serve human life, so the way data is accessed matters as much as the way it is stored. However, the ways data is stored and accessed often differ. Throughout history, people have tried to bridge this gap in order to make better use of the information that exists in the world. As shown in Figure 3:
For biological neural networks (such as the human brain), people receive education (that is, knowledge) from a very young age so that they can retrieve specific data to cope with a complex and changing life.
For external device storage, people usually structure the data according to a certain schema (for example, tables) and then use a specialized language (for example, SQL) to retrieve the required information efficiently from the database.
For storage based on artificial neural networks, researchers use self-supervised learning to store data from large corpora (pre-training) and then use the network for various downstream tasks (for example, sentiment classification).
Researchers from CMU have proposed a new way to access data that contains various types of information, which can serve as pre-training signals to guide the model in optimizing its parameters. The study presents data in a structured way, with the signal as the unit. This is similar to storing data in a database: the data is first structured as tables or JSON so that the required information can be retrieved precisely with a specialized language such as SQL.
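The database analogy can be made concrete with a small sketch. It is not code from the paper: the table layout, the `signal_type` names, and the sample records are all hypothetical, chosen only to show signals stored as JSON payloads and retrieved with SQL.

```python
import json
import sqlite3

# Hypothetical signal records: (source, signal type, text, JSON payload).
SIGNALS = [
    ("wikipedia", "named_entity",
     "Pittsburgh is home to Carnegie Mellon University.",
     {"entity": "Carnegie Mellon University", "type": "ORG"}),
    ("movie_reviews", "sentiment",
     "A warm, funny film.",
     {"label": "positive"}),
    ("wikipedia", "named_entity",
     "Alan Turing was born in London.",
     {"entity": "Alan Turing", "type": "PER"}),
]


def build_store(records):
    """Store each signal as one row; the payload is kept as a JSON string."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE signals "
        "(source TEXT, signal_type TEXT, text TEXT, payload TEXT)"
    )
    conn.executemany(
        "INSERT INTO signals VALUES (?, ?, ?, ?)",
        [(src, kind, text, json.dumps(payload))
         for src, kind, text, payload in records],
    )
    return conn


def fetch_signals(conn, signal_type):
    """Retrieve exactly the signals of one type, in insertion order."""
    rows = conn.execute(
        "SELECT text, payload FROM signals "
        "WHERE signal_type = ? ORDER BY rowid",
        (signal_type,),
    ).fetchall()
    return [(text, json.loads(payload)) for text, payload in rows]


conn = build_store(SIGNALS)
entities = fetch_signals(conn, "named_entity")
for text, payload in entities:
    print(payload["entity"], "<-", text)
```

The point of the sketch is only that structuring comes first: once every signal has a type and a payload, retrieving "all named-entity signals" is a one-line query rather than a search through raw text.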
In addition, the study argues that valuable signals are abundant in all kinds of data in the world rather than existing only in manually curated supervised datasets. What researchers need to do is (a) identify the data, (b) restructure the data in a unified language, and (c) integrate and store it in a pre-trained language model. The study calls this learning paradigm "reStructured Pre-training" (RST). The researchers liken the process to a "treasure hunt in a mine". Different data sources, such as Wikipedia, are like mines rich in gems. They contain a wealth of information, for example named entities from hyperlinks, that can provide signals for model pre-training. A good pre-trained language model (PLM) should clearly understand the composition of the various signals in the data so that it can provide accurate information according to the different needs of downstream tasks.
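As a rough illustration of steps (a) and (b), the sketch below pulls named-entity signals out of wiki-style hyperlinks and rewrites them as text-to-text pairs. It is a minimal sketch under stated assumptions, not the paper's pipeline: the `[[target|anchor]]` link syntax, the `TEXT:`/`QUERY:` prompt template, and all function names are assumptions made for the example. Step (c), the actual pre-training on the collected pairs, is left out.

```python
import re

# Assumed wiki-style link syntax: [[target]] or [[target|anchor text]].
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]+))?\]\]")


def identify_signals(wikitext):
    """(a) Identify: return the plain text and the linked entity mentions."""
    mentions = []

    def _strip(match):
        target, anchor = match.group(1), match.group(2)
        surface = anchor or target  # text as it appears to the reader
        mentions.append({"mention": surface, "entity": target})
        return surface

    plain = WIKILINK.sub(_strip, wikitext)
    return plain, mentions


def restructure(plain, mentions):
    """(b) Restructure: express the signal as one text-to-text pair."""
    answer = ", ".join(m["mention"] for m in mentions)
    prompt = f"TEXT: {plain} QUERY: Which entities are mentioned in the text?"
    return {"input": prompt, "output": answer}


def build_corpus(documents):
    """Collect the pairs that a later pre-training step would consume."""
    pairs = []
    for doc in documents:
        plain, mentions = identify_signals(doc)
        if mentions:  # documents without links carry no entity signal
            pairs.append(restructure(plain, mentions))
    return pairs


docs = [
    "[[Carnegie Mellon University|CMU]] is in [[Pittsburgh]].",
    "This sentence has no links.",
]
corpus = build_corpus(docs)
print(corpus)
```

Using one textual template for every signal type is what "a unified language" means in this sketch: other signals, such as sentiment labels, would be written into the same input/output shape with a different query.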
Paper title:
reStructured Pre-training
Paper link:
https://arxiv.org/pdf/2206.11147.pdf
▲ Pre-trained language model treasure hunt
This study proposes a new learning paradigm for natural language processing tasks, namely RST. The paradigm re-emphasizes the role of data and treats model pre-training and fine-tuning on downstream tasks as the processes of storing and accessing data. On this basis, the study follows a simple principle: a good storage mechanism should not only be able to cache a large amount of data but should also make that data easy to access.
After overcoming some engineering challenges, the study achieves this by pre-training on restructured data (which consists of all kinds of valuable information rather than raw data). Experiments show that RST models not only significantly outperform the best existing systems (for example, T0) on 52 of 55 popular datasets across various NLP tasks (for example, classification, information extraction, fact retrieval, and text generation) without any fine-tuning on downstream tasks, but also achieve excellent results on the Gaokao, China's most authoritative college entrance exam, which millions of students take every year.
Specifically, the Gaokao AI (Qin) scores 40 points higher than the student average and 15 points higher than GPT-3 while using 1/16 of its parameters. In particular, Qin scored a high 138.5 (out of 150) on the 2018 English exam.
In addition, the study has released an online submission platform for the Gaokao Benchmark, which so far contains 10 annotated English exam papers from 2018 to 2021 (and will be expanded every year), so that more AI models can take the college entrance exam. The study has also established a relatively fair test bed for competition between humans and AI, helping us better understand where we stand. Moreover, on the 2022 college entrance exam English test held a few days ago (2022.06.08), the AI system scored a solid 134, while GPT-3 scored only 108.
The main contributions of this study include :
1. Proposing an evolution hypothesis for NLP methods. The study attempts to explore the inner connections in the development of modern NLP technology and, from a global perspective, puts forward the "NLP technology evolution hypothesis". In short, the core idea of the hypothesis is that technology always iterates in one direction: developers need to do less to design better, more general-purpose systems.

reStructured pre-training


reStructuring engineering


Experiments on 55 commonly used NLP datasets












