My summary of BERT
2022-06-25 17:37:00 【Green Lantern swordsman】
I have read a lot of material on BERT and feel I have gained some insight, but in two years I never organized it. I am starting to sort it out now:
One. The modeling file in Google's BERT source code
The modeling file is where BERT starts, so it is best to understand it first. You can refer to material written by other experts:
1. Code walkthrough: the analysis by the blogger "three-year-old brother" is very clear.
2. The BERT papers. The first one to read is "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding". This link has a good explanation in Chinese.
3. The second important paper is "Pre-Training with Whole Word Masking for Chinese BERT". The idea was proposed by Google, and the Chinese version was trained by Harbin Institute of Technology (HIT). HIT's repository is this GitHub link. Related supporting material: BERT-WWM notes, BERT-wwm, BERT-wwm-ext.
4. I came across an article that collects BERT resources; see this link. However, I think it covers too much, which means not all of it is necessarily useful.
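The whole word masking idea from item 3 can be illustrated with a small sketch. It is not taken from the Google or HIT code. It assumes WordPiece-style tokens where a "##" prefix marks the continuation of a word, and it always substitutes "[MASK]" (the random-replacement and keep-as-is cases of BERT's masking are left out). The function name `whole_word_mask`, the `mask_rate` default, and the sample tokens are hypothetical. For Chinese text the word grouping would come from a word segmenter rather than from the "##" prefix.

```python
import random

def whole_word_mask(tokens, mask_rate=0.15, seed=0):
    """Mask whole words: a word is a token plus its following '##' pieces."""
    rng = random.Random(seed)
    # Group token indices into words using the '##' continuation prefix.
    words = []
    for i, tok in enumerate(tokens):
        if tok.startswith("##") and words:
            words[-1].append(i)
        else:
            words.append([i])
    # Target number of masked tokens (at least one). Because whole words
    # are masked together, the final count may slightly exceed this target.
    budget = max(1, round(len(tokens) * mask_rate))
    rng.shuffle(words)
    masked = list(tokens)
    covered = 0
    for word in words:
        if covered >= budget:
            break
        for i in word:
            masked[i] = "[MASK]"
        covered += len(word)
    return masked

tokens = ["the", "phil", "##am", "##mon", "sang", "loud", "##ly"]
print(whole_word_mask(tokens, mask_rate=0.3))
```

The point of the grouping step is that "phil", "##am", "##mon" are either all masked or all left intact, so the model never sees part of a word as a hint for the rest.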
Two. You should understand the Transformer well
2.1 Start with Wang Yudi's PDF, which is really good. After reading it, go through the paper "Attention Is All You Need" alongside the TensorFlow code.
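As a reading aid for "Attention Is All You Need", here is a minimal sketch of scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, written in plain Python with lists instead of TensorFlow tensors. It handles a single head with no masking; the function names and the toy inputs are my own and do not come from the BERT source.

```python
import math

def softmax(row):
    # Subtract the max for numerical stability before exponentiating.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(Q, K, V):
    """Q: n_q x d_k, K: n_k x d_k, V: n_k x d_v (lists of lists)."""
    d_k = len(K[0])
    outputs, all_weights = [], []
    for q in Q:
        # Similarity of this query with every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        all_weights.append(weights)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(weights, V))
                        for j in range(len(V[0]))])
    return outputs, all_weights

Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [1.0, 0.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
out, weights = scaled_dot_product_attention(Q, K, V)
print(out, weights)
```

With two identical keys the query attends to both equally, so the output is simply the mean of the two value vectors.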
Three. How do you load BERT in application code?
(1) Loading with Keras is simple: there is a tool developed by Su Jianlin's team. For usage, see the introduction and the GitHub address.
(2) For Hugging Face's GitHub, see here; it is the PyTorch BERT implementation officially recommended by Google. For examples, see the example by a graduate student on Bilibili; you can also learn hands-on from this BERT text classification example.
(3) The official Google code also seems to load fine, so it can be used as well.
Four. Other points to note
(1) The optimizer used is AdamW. For how it differs from conventional Adam and what it improves, see here.
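The difference can be shown with a single-parameter sketch. It assumes the usual description of the two variants: Adam with L2 regularization adds the decay term to the gradient before the adaptive scaling, while AdamW applies the decay directly to the weight, decoupled from the adaptive step. The function name `adam_step`, its `decoupled` flag, and the hyperparameter defaults are hypothetical and are not copied from BERT's optimization code.

```python
import math

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=0.01, decoupled=True):
    """One update of a scalar parameter; t is the 1-based step count."""
    theta_old = theta
    if not decoupled:
        # Adam + L2: the decay is folded into the gradient, so it is
        # rescaled by the adaptive denominator below.
        grad = grad + weight_decay * theta
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)  # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)  # bias-corrected second moment
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    if decoupled:
        # AdamW: shrink the weight directly, outside the adaptive step.
        theta = theta - lr * weight_decay * theta_old
    return theta, m, v

# Zero gradient isolates the effect of weight decay alone.
theta_l2, _, _ = adam_step(1.0, 0.0, 0.0, 0.0, t=1, decoupled=False)
theta_w, _, _ = adam_step(1.0, 0.0, 0.0, 0.0, t=1, decoupled=True)
print(theta_l2, theta_w)
```

In this toy case the L2 variant moves the weight by roughly the full learning rate, because the adaptive normalization cancels the size of the decay term, while the decoupled variant shrinks it by only lr * weight_decay.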