CMU proposes a new NLP paradigm, reStructured Pre-training, and scores 134 on the college entrance examination English test
2022-06-28 02:56:00 【Zhiyuan community】
This article presents reStructured Pre-training (RST), which not only performs brilliantly on a variety of NLP tasks but also delivers a satisfying result on the college entrance examination English test.
The way we store data is changing, from biological neural networks to artificial neural networks. The most common case is still to store data in the brain. As the amount of available data grows, people turn to external devices such as hard disk drives or cloud storage. With the rise of deep learning, another promising storage technology has emerged: artificial neural networks that store the information contained in data.
The researchers argue that the ultimate goal of data storage is to better serve human life, so the way data is accessed matters as much as the way it is stored. However, how data is stored and how it is accessed often differ. Throughout history, people have tried to bridge this gap in order to make better use of the information that exists in the world. As shown in Figure 3:
For biological neural networks (such as the human brain), people are educated from a very young age so that they can retrieve specific knowledge to cope with a complex and changing life.
For external storage devices, people usually structure the data according to a certain schema (for example, tables) and then use a specialized language (for example, SQL) to efficiently retrieve the required information from the database.
For storage based on artificial neural networks, researchers use self-supervised learning to store data from large corpora (pre-training) and then use the network for various downstream tasks (for example, sentiment classification).
Researchers from CMU propose a new way of accessing data that contains many types of information, which can serve as pre-training signals that guide the model in optimizing its parameters. The study presents data in structured form, with the signal as the unit. This is similar to storing data in a database: the data is first organized into tables or JSON so that a specialized language (such as SQL) can accurately retrieve the information you need.
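The database analogy above can be made concrete with a minimal sketch. It stores a few signal records as JSON in an in-memory SQLite table and retrieves them by type with SQL. The record fields (`source`, `signal_type`, `text`, `target`) and the function names are illustrative assumptions, not the schema used in the paper.

```python
import json
import sqlite3

# Hypothetical signal records; the field names are assumptions for illustration.
signals = [
    {"source": "wikipedia", "signal_type": "entity",
     "text": "CMU is located in Pittsburgh.", "target": "CMU"},
    {"source": "reviews", "signal_type": "sentiment",
     "text": "The movie was wonderful.", "target": "positive"},
]

def build_store(records):
    """Store each record as a JSON payload, indexed by source and signal type."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE signals (source TEXT, signal_type TEXT, payload TEXT)"
    )
    conn.executemany(
        "INSERT INTO signals VALUES (?, ?, ?)",
        [(r["source"], r["signal_type"], json.dumps(r)) for r in records],
    )
    conn.commit()
    return conn

def fetch_signals(conn, signal_type):
    """Retrieve all records of one signal type with a parameterized SQL query."""
    rows = conn.execute(
        "SELECT payload FROM signals WHERE signal_type = ? ORDER BY rowid",
        (signal_type,),
    ).fetchall()
    return [json.loads(payload) for (payload,) in rows]

conn = build_store(signals)
entity_signals = fetch_signals(conn, "entity")
```

The point of the sketch is only the access pattern: once data is organized by signal, a downstream consumer can ask for exactly the kind of information it needs instead of scanning raw text.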
Furthermore, the study argues that valuable signals are abundant in all kinds of data in the world rather than existing only in manually curated supervised datasets. What researchers need to do is (a) identify the data, (b) reorganize the data in a unified language, and (c) integrate and store it in a pre-trained language model. The study calls this learning paradigm reStructured Pre-training (RST). The researchers liken the process to "treasure hunting in a mine". Data sources such as Wikipedia are like mines rich in gems: they contain a wealth of information, for example named entities from hyperlinks, that can provide signals for model pre-training. A good pre-trained language model (PLM) should clearly understand the composition of the various signals in the data so that it can provide accurate information according to the different needs of downstream tasks.
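As a rough illustration of steps (a) and (b), the sketch below pulls named-entity signals out of wiki-style hyperlinks and rewrites them as a text-to-text pair. The `[[Target|anchor]]` markup, the prompt wording, and the function names are assumptions made for this example; they are not the templates used in the paper.

```python
import re

# Matches wiki-style links: [[Target]] or [[Target|anchor text]].
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]+))?\]\]")

def extract_entity_signals(wikitext):
    """Return the plain text and the list of linked mentions (the entity signal)."""
    def visible(match):
        # Prefer the anchor text when present, else the link target.
        return match.group(2) or match.group(1)

    entities = [visible(m) for m in WIKILINK.finditer(wikitext)]
    plain = WIKILINK.sub(visible, wikitext)
    return plain, entities

def restructure(plain, entities):
    """Rewrite the signal as an input/output text pair (hypothetical template)."""
    return {
        "input": f"Text: {plain}\nQuestion: Which entities are mentioned?",
        "output": ", ".join(entities),
    }

wikitext = "[[Carnegie Mellon University|CMU]] is located in [[Pittsburgh]]."
plain, entities = extract_entity_signals(wikitext)
example = restructure(plain, entities)
```

Pairs of this shape could then be mixed with pairs built from other signals, which corresponds to step (c); how the paper actually does this is not shown here.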
Paper title:
reStructured Pre-training
Paper link:
https://arxiv.org/pdf/2206.11147.pdf
▲ Treasure hunting with pre-trained language models
This study proposes a new learning paradigm for natural language processing tasks, namely RST. The paradigm re-emphasizes the role of data and treats model pre-training and downstream fine-tuning as a process of storing and accessing data. On this basis, the study follows a simple principle: a good storage mechanism should not only be able to cache a large amount of data but should also be easy to access.
After overcoming some engineering challenges, the study achieves this by pre-training on restructured data (which consists of many kinds of valuable information rather than raw data). Experiments show that RST models significantly outperform the best existing systems (for example, T0) on 52 of 55 popular datasets across various NLP tasks (for example, classification, information extraction, fact retrieval, and text generation), without fine-tuning on downstream tasks. They also achieve excellent results on the Gaokao, China's most authoritative college entrance examination, which millions of students take every year.
Specifically, the Gaokao AI (Qin) scores 40 points above the average student and 15 points above GPT3 while using 1/16 of its parameters. In particular, Qin scored 138.5 out of 150 on the 2018 English test.
In addition, the study released an online submission platform for the Gaokao Benchmark, which contains 10 annotated English test papers from 2018 to 2021 (to be expanded every year), so that more AI models can take the college entrance examination. The platform also provides a relatively fair testbed for competition between humans and AI, helping us better understand where we are. Moreover, on the 2022 college entrance examination English test held a few days ago (2022.06.08), the AI system scored 134, while GPT3 scored only 108.
The main contributions of this study include:
1. Proposing an evolutionary hypothesis for NLP methods. The study attempts to explore the inner connections among modern NLP techniques and, from a global perspective, puts forward the "NLP technology evolution hypothesis". In short, the core idea of the hypothesis is that technology always iterates in one direction: developers need to do less to design better, more general-purpose systems.

reStructured Pre-training


reStructuring engineering


Experiments on 55 commonly used NLP datasets












