【AI4Code】《IntelliCode Compose: Code Generation using Transformer》 ESEC/FSE 2020
2022-07-25 13:08:00 【chad_ lee】
The goal is not to complete a single token but to generate a whole line of code. The model is based on GPT-2. The training set is 1.2 billion lines of code in Python, C#, JavaScript, and TypeScript.
Byte-Pair Encoding (BPE)
The input sequence is tokenized with two goals in mind: splitting identifiers into subtokens to keep the vocabulary small, and masking string literals to prevent leakage of sensitive data.
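The two preprocessing steps above can be sketched as follows. This is a toy illustration, not the paper's tokenizer: the placeholder name `<STR_LIT>`, the regular expressions, and the helper names are all assumptions made for the example, and the BPE learner works on whole words rather than on a real code corpus.

```python
import re
from collections import Counter

# Hypothetical placeholder; the post does not say what replaces a string literal.
STRING_PLACEHOLDER = "<STR_LIT>"

# Simplified patterns for double- and single-quoted literals (no triple quotes).
DOUBLE_QUOTED = r'"(?:\\.|[^"\\])*"'
SINGLE_QUOTED = r"'(?:\\.|[^'\\])*'"
STRING_PATTERN = re.compile(DOUBLE_QUOTED + "|" + SINGLE_QUOTED)


def mask_string_literals(code):
    """Replace every quoted string literal with a placeholder token."""
    return STRING_PATTERN.sub(STRING_PLACEHOLDER, code)


def apply_merge(symbols, pair):
    """Merge every adjacent occurrence of `pair` in a tuple of symbols."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe_merges(words, num_merges):
    """Learn up to `num_merges` merges by repeatedly fusing the most frequent pair."""
    vocab = Counter(tuple(w) for w in words)  # each word starts as characters
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        merged_vocab = Counter()
        for symbols, freq in vocab.items():
            merged_vocab[apply_merge(symbols, best)] += freq
        vocab = merged_vocab
    return merges


def encode(word, merges):
    """Split a word into subtokens by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return list(symbols)
```

For example, `mask_string_literals('password = "hunter2"')` hides the literal before the line is tokenized, and merges learned from `["ab", "ab", "ab", "cd"]` fuse the frequent pair `a`, `b` first.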
IntelliCode Compose
The model is a GPT-style Transformer. At inference time, sequence decoding is treated as a search over a tree of candidate tokens, which continues until a stopping token appears.
The tree is built with beam search of beam width K. If the final generated sequence has length L, the model has to make K*L predictions, but the K candidates at each step can be run as one batch, so only L forward passes are needed.
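A minimal sketch of batched beam search, assuming a stand-in model, may make the K*L versus L argument concrete. The `toy_model` probability table and the stop token `<EOL>` are hypothetical and only exist for this example; the counter `calls` records how many batched forward passes were made.

```python
import math

EOL = "<EOL>"  # hypothetical stop token, chosen for this example only


def toy_model(batch):
    """Stand-in for the language model: scores a whole batch of prefixes in one call.

    Returns one dict per prefix, mapping next token -> log-probability.
    """
    table = {
        (): {"x": 0.6, "y": 0.4},
        ("x",): {"=": 0.9, EOL: 0.1},
        ("y",): {"=": 0.7, EOL: 0.3},
        ("x", "="): {"1": 0.7, EOL: 0.3},
        ("y", "="): {"1": 0.6, EOL: 0.4},
    }
    default = {EOL: 1.0}  # any other prefix is forced to stop
    return [
        {tok: math.log(p) for tok, p in table.get(tuple(prefix), default).items()}
        for prefix in batch
    ]


def beam_search(model, k, max_len):
    """Return (hypotheses sorted by score, number of batched model calls)."""
    beams = [((), 0.0)]  # live hypotheses as (prefix, log-probability)
    finished = []
    calls = 0
    for _ in range(max_len):
        if not beams:
            break
        outputs = model([prefix for prefix, _ in beams])  # one pass for all beams
        calls += 1
        candidates = []
        for (prefix, score), dist in zip(beams, outputs):
            for tok, logp in dist.items():
                candidates.append((prefix + (tok,), score + logp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, score in candidates[:k]:  # keep only the K best expansions
            if prefix[-1] == EOL:
                finished.append((prefix, score))
            else:
                beams.append((prefix, score))
    finished.extend(beams)  # hypotheses cut off at max_len
    finished.sort(key=lambda c: c[1], reverse=True)
    return finished, calls
```

With `k=2` and `max_len=4`, the best hypothesis has four tokens and the model is called four times, once per generated position, rather than once per beam per position.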
Multilingual model
Four ways of modeling multiple programming languages are compared:
1) Ignore the differences between languages and train a single unified model on all of them. (Experiments show that this is worse than training a separate monolingual model for each language.)
2) Add a language-type embedding: each language is represented by a vector, which is combined with the original token embedding.
3) Prepend a control sentence to each training sample, "lang * remaining token sequence", where $lang \in \{Python, C\#, JavaScript, TypeScript\}$.
4) During pre-training, add a language-type classification task, that is, one extra head that predicts the language of each sample.
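Approaches 2) and 3) from the list above can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's implementation: the embedding size, the random vectors, the use of element-wise addition to combine the language vector with the token vector, and the treatment of `*` as a separate separator token are all choices made for the example.

```python
import random

LANGUAGES = ["Python", "C#", "JavaScript", "TypeScript"]
EMBED_DIM = 4  # tiny dimension, for illustration only

_rng = random.Random(0)


def _random_vector():
    return [_rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]


# One vector per language (approach 2); values are random placeholders.
language_embedding = {lang: _random_vector() for lang in LANGUAGES}
token_embedding = {}  # filled lazily with random placeholder vectors


def embed_with_language(tokens, lang):
    """Approach 2: combine the language vector with each token embedding.

    Addition is an assumption here; the post only says the two are combined.
    """
    lang_vec = language_embedding[lang]
    rows = []
    for tok in tokens:
        if tok not in token_embedding:
            token_embedding[tok] = _random_vector()
        rows.append([t + l for t, l in zip(token_embedding[tok], lang_vec)])
    return rows


def prepend_control_code(tokens, lang):
    """Approach 3: build 'lang * remaining token sequence' for a training sample."""
    if lang not in LANGUAGES:
        raise ValueError("unknown language: " + lang)
    return [lang, "*"] + list(tokens)
```

For instance, `prepend_control_code(["import", "os"], "Python")` yields a sample whose first tokens tell the model which language follows, with no change to the model architecture, whereas approach 2) changes the input embeddings instead of the token sequence.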