[AI4Code] "IntelliCode Compose: Code Generation using Transformer" (ESEC/FSE 2020)
2022-07-25 11:11:00 【chad_lee】
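Instead of predicting a single token, the model generates a whole line of code. It is based on GPT-2. The dataset contains 1.2 billion lines of Python, C#, JavaScript, and TypeScript code.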
Byte-Pair Encoding (BPE)
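Tokenization involves two steps: splitting tokens into subtokens to shrink the vocabulary, and masking string literals to prevent sensitive data from leaking.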
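The two preprocessing ideas above can be sketched roughly as follows. This is a simplified illustration, not the paper's pipeline: the placeholder `"<STR>"` and the helper names `mask_strings` and `split_identifier` are hypothetical, the regex only handles single-line quoted literals, and the identifier split is a rule-based stand-in for subtokens rather than learned BPE merges.

```python
import re

# Hypothetical placeholder; the notes do not say which token the paper uses.
STRING_PLACEHOLDER = '"<STR>"'

# Single-line '...' or "..." literals, allowing backslash escapes.
_STRING_RE = re.compile(r"""("|')(?:\\.|(?!\1)[^\\\n])*\1""")

# Runs of capitals (acronyms), capitalized or lowercase words, and digit runs.
_SUBTOKEN_RE = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def mask_strings(code: str) -> str:
    """Replace every string literal so its content never reaches the model."""
    return _STRING_RE.sub(lambda match: STRING_PLACEHOLDER, code)


def split_identifier(name: str) -> list:
    """Split an identifier on underscores and camelCase boundaries."""
    return _SUBTOKEN_RE.findall(name)


if __name__ == "__main__":
    print(mask_strings('password = "hunter2"'))
    print(split_identifier("parse_HTTPResponse"))
```

Splitting identifiers this way means rare names such as `parse_HTTPResponse` are covered by a handful of common pieces, which is the same motivation behind a BPE vocabulary.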
IntelliCode Compose
The model is a GPT. At inference time, sequence decoding is treated as a tree search that continues until a terminating token appears.
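The tree is expanded with beam search, with beam width K. If the final generated sequence has length L, the model has to make K*L predictions in total, but because the K candidates at each step can be run as one batch, only L batched forward passes are needed.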
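A minimal sketch of that batching argument, assuming a toy scoring function in place of the real GPT: `toy_model`, `VOCAB`, and the `"<END>"` terminator are all hypothetical names invented for this illustration. The search counts how many batched calls it makes and how many calls an unbatched implementation would have made.

```python
import math

VOCAB = ["x", "=", "1", "<END>"]  # toy vocabulary; "<END>" is a hypothetical terminator


def toy_model(prefixes):
    """Stand-in for one batched forward pass: log-probs over VOCAB for each prefix."""
    batch = []
    for prefix in prefixes:
        # The toy distribution favours the continuation x = 1 <END>.
        target = VOCAB[min(len(prefix), len(VOCAB) - 1)]
        probs = [0.7 if tok == target else 0.1 for tok in VOCAB]
        batch.append([math.log(p) for p in probs])
    return batch


def beam_search(model, beam_width, max_len):
    beams = [([], 0.0)]  # (tokens, cumulative log-probability)
    finished = []
    batched_calls = 0    # one per decoding step
    single_calls = 0     # what an unbatched implementation would need
    for _ in range(max_len):
        log_probs = model([tokens for tokens, _ in beams])
        batched_calls += 1
        single_calls += len(beams)
        candidates = []
        for (tokens, score), row in zip(beams, log_probs):
            for tok, lp in zip(VOCAB, row):
                candidates.append((tokens + [tok], score + lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_width]:
            (finished if tokens[-1] == "<END>" else beams).append((tokens, score))
        # Scores only decrease, so stop once a finished beam beats every live one.
        if finished and (not beams or
                         max(s for _, s in finished) >= max(s for _, s in beams)):
            break
    best_tokens, _ = max(finished + beams, key=lambda c: c[1])
    return best_tokens, batched_calls, single_calls


if __name__ == "__main__":
    print(beam_search(toy_model, beam_width=2, max_len=6))
```

With `beam_width=2` the toy search returns `x = 1 <END>`, a sequence of length 4, after 4 batched calls, while the per-candidate counter stays at or below K*L = 8.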
Multilingual model
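The paper compares four ways of modeling multiple languages:
1) Ignore the differences between languages and train a single model on all of them. (Experiments show this performs worse than training a separate model for each language.)
2) Add a language type embedding: each language is represented by a vector, which is combined with the usual token embedding and the other input embeddings.
3) Prepend a control sequence to every training sample, in the form "lang * remaining token sequence", where $lang \in \{Python, C\#, JavaScript, TypeScript\}$.
4) Add a language type classification task during pre-training, i.e. an extra head that predicts the language of each sample.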
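Options 2 and 3 differ only in where the language signal enters the model, which a small sketch can make concrete. Everything here is an assumption for illustration: the helper names, the toy embedding size, reading "lang * remaining token sequence" as the language name followed by a literal `*` separator, and combining the embeddings by element-wise addition.

```python
import random

LANGUAGES = ("Python", "C#", "JavaScript", "TypeScript")
EMBED_DIM = 4  # toy size, chosen only for this sketch


def add_control_code(lang, tokens):
    """Option 3: put the language name and a separator in front of the tokens."""
    if lang not in LANGUAGES:
        raise ValueError(f"unsupported language: {lang}")
    return [lang, "*"] + list(tokens)


# Option 2: one vector per language (random here, learned in a real model).
_rng = random.Random(0)
lang_embedding = {
    lang: [_rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)] for lang in LANGUAGES
}


def embed_with_language(token_vectors, lang):
    """Add the language vector to every token vector (addition is an assumption)."""
    lang_vec = lang_embedding[lang]
    return [[t + l for t, l in zip(vec, lang_vec)] for vec in token_vectors]


if __name__ == "__main__":
    print(add_control_code("C#", ["int", "x", "=", "1", ";"]))
    print(embed_with_language([[0.0] * EMBED_DIM], "Python"))
```

The control-code variant needs no architectural change, since the language is just extra input tokens, whereas the embedding variant adds a small lookup table to the model's input layer.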