My Brief Summary of BERT
2022-06-25 17:37:00 【Green Lantern swordsman】
I have read a lot of material on BERT and feel I have gained some insight, but in two years I never organized it. Here is a start:
One、The modeling file in Google's BERT source code
The modeling file (modeling.py in Google's repository) is where BERT originates, so it is best to understand it first. The following material from other experts is a useful reference:
1. A code walkthrough. The analysis by the blogger credited as "a three-year-old brother" is very clear.
2. The BERT paper. The first one to read is "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"; this link has a good explanation in Chinese.
3. The second important paper is "Pre-Training with Whole Word Masking for Chinese BERT". The idea was proposed by Google, and the Chinese models were trained by Harbin Institute of Technology (HIT); HIT's repository is this GitHub link. Related supporting material: BERT-WWM notes, BERT-wwm, BERT-wwm-ext.
4. I also came across an article that collects BERT resources; see this link. However, I think it covers too much, which means not all of it is necessarily useful.
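To make the whole word masking idea from item 3 concrete, here is a minimal sketch. It assumes WordPiece-style tokens in which a `##` prefix marks a continuation of the previous word; the function name `whole_word_mask` and its parameters are hypothetical, not taken from either repository. It is also simplified: it omits the 80/10/10 replacement rule and the cap on the number of masked positions used in BERT pre-training. For Chinese text the word boundaries would come from a word segmenter rather than from `##` prefixes, but the principle is the same: all pieces of a selected word are masked together.

```python
import random


def whole_word_mask(tokens, mask_prob=0.15, seed=0):
    """Mask whole words: every sub-token of a selected word becomes [MASK].

    Illustrative only; `tokens` is assumed to be WordPiece output where
    "##" marks a continuation piece.
    """
    rng = random.Random(seed)

    # Group token indices into words: a "##" piece joins the previous word.
    words = []
    for i, tok in enumerate(tokens):
        if tok.startswith("##") and words:
            words[-1].append(i)
        else:
            words.append([i])

    # Decide once per word, then apply the decision to all of its pieces.
    masked = list(tokens)
    for indices in words:
        if rng.random() < mask_prob:
            for i in indices:
                masked[i] = "[MASK]"
    return masked


tokens = ["the", "phil", "##am", "##mon", "sang"]
print(whole_word_mask(tokens, mask_prob=0.5, seed=1))
```

The key difference from per-token masking is that the random draw happens once per word, so "phil", "##am", "##mon" are never partially masked.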
Two、Get a solid understanding of the Transformer
2.1 Start with Wang Yudi's PDF, which is really good. After reading it, go through the paper "Attention Is All You Need" together with the TensorFlow code.
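As a companion to that reading, the core operation of the paper, scaled dot-product attention, softmax(QK^T / sqrt(d_k))V, can be sketched in a few lines. This is a plain-Python illustration using nested lists, not the TensorFlow code from the BERT repository; the function names are my own, and masking and multiple heads are left out.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of row vectors.

    Returns (outputs, weights); weights[i][j] is how much query i
    attends to key j.
    """
    d_k = len(K[0])
    outputs, weights = [], []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        w = softmax(scores)
        # Weighted average of the value vectors.
        out = [sum(wj * v[c] for wj, v in zip(w, V)) for c in range(len(V[0]))]
        weights.append(w)
        outputs.append(out)
    return outputs, weights


Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
outputs, weights = scaled_dot_product_attention(Q, K, V)
print(weights)
print(outputs)
```

Each row of `weights` sums to 1, and each output row is a weighted average of the rows of `V`.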
Three、How to load BERT in an application?
(1) Loading with Keras is simple: there is a tool developed by Su Jianlin's team. For its usage, see here: introduction, GitHub address.
(2) For Hugging Face, see its GitHub here; it is the PyTorch BERT implementation officially recommended by Google. For examples, see this example by a graduate student on Bilibili; you can also work through this hands-on BERT text classification example.
(3) The official Google code can also be used for loading, and it seems to work well.
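As a rough illustration of option (2), the sketch below loads a pretrained BERT with the Hugging Face `transformers` library and encodes one sentence. It assumes `transformers` and `torch` are installed and that the `bert-base-chinese` checkpoint can be downloaded; the helper names `load_bert` and `encode` are hypothetical, and the choice of checkpoint is mine, not something the examples above prescribe.

```python
def load_bert(model_name="bert-base-chinese"):
    """Load a tokenizer and encoder; downloads the weights on first use."""
    # Imported lazily so that defining this helper does not require the library.
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(model_name)
    model = BertModel.from_pretrained(model_name)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


def encode(text, tokenizer, model):
    """Return the last-layer hidden states for one piece of text."""
    import torch

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # Shape: (1, sequence_length, hidden_size)
    return outputs.last_hidden_state
```

A typical call would be `tokenizer, model = load_bert()` followed by `encode("今天天气很好", tokenizer, model)`; for text classification, a task-specific head is usually placed on top of these hidden states.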
Four、Other points to note
(1) The optimizer used is AdamW. For how it improves on conventional Adam, see here.
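The difference can be sketched for a single scalar parameter. In Adam with L2 regularization, the decay term is added to the gradient and therefore gets rescaled by the adaptive denominator; in AdamW, the decay is applied directly to the weight, decoupled from the adaptive step. The function below is a minimal illustration under that description; its name and signature are hypothetical, not the optimizer code from the BERT repository.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=0.0, decoupled=True):
    """One update of a scalar parameter.

    decoupled=True  -> AdamW-style decay applied directly to the weight.
    decoupled=False -> Adam with L2 penalty folded into the gradient.
    """
    if not decoupled:
        grad = grad + weight_decay * theta

    # Exponential moving averages of the gradient and its square.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad

    # Bias correction for the zero-initialized averages.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)

    new_theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    if decoupled:
        new_theta -= lr * weight_decay * theta
    return new_theta, m, v


# With a zero gradient, only the weight decay term moves the parameter.
w_adamw, _, _ = adam_step(2.0, 0.0, 0.0, 0.0, t=1, lr=0.1, weight_decay=0.01, decoupled=True)
w_l2, _, _ = adam_step(2.0, 0.0, 0.0, 0.0, t=1, lr=0.1, weight_decay=0.01, decoupled=False)
print(w_adamw, w_l2)
```

In this toy case the decoupled update shrinks the weight by `lr * weight_decay * theta`, while the L2 variant takes a step of roughly `lr` because the decay term is normalized like any other gradient.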