Cheat sheet of commonly used NLP backbone models (1)
2022-08-03 01:50:00 【Andy Dennis】
Foreword
Since the Transformer appeared in 2017, it has been used in all major NLP tasks. Recently, Stanford also opened a course, CS25, dedicated to transformers: [Stanford] CS25 Transformers United | Fall 2021
Readers who are new to NLP can read an article I wrote earlier: Research 0_NLPer set off
For a given model, you can look in Hugging Face's transformers library under transformers/models (GitHub) to find its source code implementation.
Today the mainstream approach is dynamic (contextual) word vector encoding, which takes the surrounding context into account; static word vector lookups such as word2vec and GloVe are rarely used.
A Bilibili video: Blow up! Doctor of Computer [NLP Natural Language Processing] is worthy of being a professor of Tsinghua University! 5 hours got me done with NLP Natural Language Processing! (The title is a bit, emm... but judging from the table of contents it looks decent.)
Papers
MASS
BART
T5
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
BERT
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT uses an encoder-only structure. The BERT family has many members, such as the distilled version DistilBERT and the variant RoBERTa.
Word vector input composition:
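As a rough illustration of the input composition: in BERT, each input vector is the element-wise sum of a token embedding, a segment embedding, and a position embedding. The sketch below uses NumPy with random toy tables; the sizes, the token ids, and the function name `bert_input_embeddings` are assumptions made for illustration, not values from the paper or from any library. The real model also applies layer normalization and dropout after the sum, which is omitted here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (assumptions for illustration; BERT-base is far larger).
vocab_size, max_len, num_segments, hidden = 100, 16, 2, 8

# Three lookup tables; learned during pre-training in the real model.
token_emb = rng.normal(size=(vocab_size, hidden))
segment_emb = rng.normal(size=(num_segments, hidden))
position_emb = rng.normal(size=(max_len, hidden))


def bert_input_embeddings(token_ids, segment_ids):
    """Sum token, segment, and position embeddings for one sequence."""
    token_ids = np.asarray(token_ids)
    segment_ids = np.asarray(segment_ids)
    positions = np.arange(len(token_ids))  # 0, 1, 2, ... for each token
    return (
        token_emb[token_ids]
        + segment_emb[segment_ids]
        + position_emb[positions]
    )


# Hypothetical ids standing in for "[CLS] a b [SEP] c d [SEP]".
token_ids = [1, 7, 8, 2, 9, 10, 2]
segment_ids = [0, 0, 0, 0, 1, 1, 1]  # sentence A = 0, sentence B = 1

x = bert_input_embeddings(token_ids, segment_ids)
print(x.shape)  # (sequence length, hidden size)
```

Note that the same token id (2, standing in for [SEP]) gets a different input vector at each of its two positions, because the segment and position terms differ.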

Transformer
The famous self-attention mechanism comes from this paper.
Attention Is All You Need
I have reproduced this model before: Transformer structure reproduction__attention is all you need (pytorch)
encoder-decoder structure:
Attention module:
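The core of the attention module is scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, as defined in Attention Is All You Need. Below is a minimal NumPy sketch of that formula for a single head. The function names, the optional `mask` argument, and the toy shapes are assumptions for illustration; this is not the code from the reproduction linked above.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def scaled_dot_product_attention(q, k, v, mask=None):
    """Compute softmax(q k^T / sqrt(d_k)) v.

    q: (len_q, d_k), k: (len_k, d_k), v: (len_k, d_v).
    mask: optional boolean array (len_q, len_k); False entries are blocked.
    """
    d_k = q.shape[-1]
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(d_k)
    if mask is not None:
        # Blocked positions get a large negative score, so softmax ~ 0.
        scores = np.where(mask, scores, -1e9)
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))   # 4 query positions
k = rng.normal(size=(6, 8))   # 6 key positions
v = rng.normal(size=(6, 16))  # one value vector per key

out, weights = scaled_dot_product_attention(q, k, v)
print(out.shape, weights.shape)
```

Each row of `weights` sums to 1, so every output row is a weighted average of the value vectors. In self-attention, `q`, `k`, and `v` are all linear projections of the same sequence; multi-head attention runs several such heads in parallel and concatenates the results.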