How to add tokens to RoBERTa in Transformers
2022-06-24 23:04:00 【Vincy_King】
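1. Background
I recently needed to add special tokens to a RoBERTa model, but every run on the GPU failed with an error (preceded by a long list of "block" messages).
Running on the CPU failed with an error as well.
Many posts online say that if you add special tokens or modify vocab.txt, you need to call model.resize_token_embeddings(len(tokenizer)), otherwise the embedding dimensions will not match. It was never clear to me where that call belongs, though. I first put it in the dataset-processing code, and the error persisted.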
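The dimension mismatch can be reproduced without any pretrained model. The sketch below is an illustration only, not the original training code: it uses a plain `torch.nn.Embedding` with a hypothetical base vocabulary of 10 entries and looks up an id outside that range, which is what happens when the tokenizer emits ids for added tokens that the embedding matrix does not cover. On the CPU, PyTorch raises an `IndexError`; on the GPU the same lookup typically shows up as a CUDA device-side assert, often preceded by many per-block assertion messages, which is consistent with the GPU symptom described above.

```python
import torch
import torch.nn as nn

BASE_VOCAB = 10  # hypothetical size of the original vocabulary
NUM_ADDED = 3    # hypothetical number of added tokens
HIDDEN = 4       # hypothetical embedding width

embedding = nn.Embedding(BASE_VOCAB, HIDDEN)
# The last id belongs to an added token and has no row in the table.
ids = torch.tensor([1, 2, BASE_VOCAB + NUM_ADDED - 1])

try:
    embedding(ids)
    lookup_failed = False
except IndexError:
    lookup_failed = True

# Grow the table so every id has a row, keeping the old rows.
# This is, in essence, what resize_token_embeddings does for the model.
resized = nn.Embedding(BASE_VOCAB + NUM_ADDED, HIDDEN)
with torch.no_grad():
    resized.weight[:BASE_VOCAB] = embedding.weight
output_shape = tuple(resized(ids).shape)
```

After the table is enlarged, the same ids can be looked up without error.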
2. Steps
First, here are the relevant files in the roberta folder.
added_tokens.json holds the tokens to be added:
{
  "“": 21128,
  "”": 21129,
  "</s>": 21130,
  "[CH-0]": 21131,
  "[CH-1]": 21132,
  "[CH-2]": 21133,
  "[CH-3]": 21134,
  "[CH-4]": 21135,
  "[CH-5]": 21136,
  "[CH-6]": 21137,
  "[CH-7]": 21138,
  "[CH-8]": 21139,
  "[CH-9]": 21140
}
special_tokens_map.json holds the special tokens:
{
  "unk_token": "[UNK]",
  "sep_token": "[SEP]",
  "pad_token": "[PAD]",
  "cls_token": "[CLS]",
  "mask_token": "[MASK]"
}
tokenizer_config.json holds the tokenizer configuration:
{
  "do_lower_case": true,
  "do_basic_tokenize": true,
  "never_split": null,
  "unk_token": "[UNK]",
  "sep_token": "[SEP]",
  "pad_token": "[PAD]",
  "cls_token": "[CLS]",
  "mask_token": "[MASK]",
  "tokenize_chinese_chars": true,
  "strip_accents": null,
  "special_tokens_map_file": "special_tokens_map.json",
  "name_or_path": "chinese-roberta-wwm-ext",
  "use_fast": true,
  "tokenizer_file": "tokenizer.json",
  "tokenizer_class": "BertTokenizer"
}
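Then, in the BERT model code, add self.bert.resize_token_embeddings(len(self.tokenizer)) right after the model and tokenizer are loaded:
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class Model(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.bert = BertModel.from_pretrained(config['bert_path'])
        # The tokenizer is loaded from the same folder, which contains added_tokens.json
        self.tokenizer = BertTokenizer.from_pretrained(config['bert_path'])
        # self.tokenizer.add_tokens(self.new_tokens, special_tokens=True)
        # Grow the embedding matrix so that every tokenizer id has a row
        self.bert.resize_token_embeddings(len(self.tokenizer))
        # Fine-tune all BERT parameters
        for param in self.bert.parameters():
            param.requires_grad = True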
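To see the effect of the resize call in isolation, here is a minimal sketch that runs without downloading any weights. Everything in it is an assumption made for illustration: the seven-entry vocabulary, the tiny randomly initialised `BertConfig`, and the two new tokens stand in for the real chinese-roberta-wwm-ext checkpoint and the `[CH-*]` tokens above. It adds tokens with `add_tokens`, resizes the embeddings, and checks that a forward pass over text containing a new token works.

```python
import os
import tempfile

from transformers import BertConfig, BertModel, BertTokenizer

# Hypothetical tiny vocabulary standing in for vocab.txt
base_vocab = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]", "你", "好"]
new_tokens = ["[CH-0]", "[CH-1]"]

with tempfile.TemporaryDirectory() as tmp:
    vocab_path = os.path.join(tmp, "vocab.txt")
    with open(vocab_path, "w", encoding="utf-8") as f:
        f.write("\n".join(base_vocab) + "\n")
    tokenizer = BertTokenizer(vocab_path)  # vocabulary is read at construction

# Randomly initialised model sized for the base vocabulary only
config = BertConfig(
    vocab_size=len(base_vocab),
    hidden_size=16,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=32,
)
model = BertModel(config)

num_added = tokenizer.add_tokens(new_tokens, special_tokens=True)
rows_before = model.get_input_embeddings().weight.shape[0]

# Resize after the tokenizer has its final size and before any forward pass
model.resize_token_embeddings(len(tokenizer))
rows_after = model.get_input_embeddings().weight.shape[0]

encoded = tokenizer("你[CH-0]好", return_tensors="pt")
hidden = model(**encoded).last_hidden_state
```

The ordering is the point: the resize has to happen on the model object itself, once the tokenizer has reached its final length, which is why placing the call in the dataset code had no effect.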
And that's it!
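As a final sanity check, the ids in added_tokens.json can be compared against the base vocabulary size to work out how many embedding rows the model needs. The sketch below uses only the standard library. The helper name `required_embedding_rows` is hypothetical, and the base size of 21128 is an assumption inferred from the lowest added id shown earlier; in practice it would come from the line count of vocab.txt.

```python
import json

# Contents of added_tokens.json as shown earlier in this post
added_tokens_json = """
{"“": 21128, "”": 21129, "</s>": 21130,
 "[CH-0]": 21131, "[CH-1]": 21132, "[CH-2]": 21133, "[CH-3]": 21134,
 "[CH-4]": 21135, "[CH-5]": 21136, "[CH-6]": 21137, "[CH-7]": 21138,
 "[CH-8]": 21139, "[CH-9]": 21140}
"""

def required_embedding_rows(added_tokens: dict, base_vocab_size: int) -> int:
    """Return the embedding row count needed to cover all added ids.

    Raises ValueError if the added ids do not directly follow the
    base vocabulary without gaps or duplicates.
    """
    ids = sorted(added_tokens.values())
    expected = list(range(base_vocab_size, base_vocab_size + len(ids)))
    if ids != expected:
        raise ValueError("added token ids are not contiguous after the base vocabulary")
    return base_vocab_size + len(ids)

BASE_VOCAB_SIZE = 21128  # assumed size of the original vocab.txt
added_tokens = json.loads(added_tokens_json)
rows_needed = required_embedding_rows(added_tokens, BASE_VOCAB_SIZE)
```

Under these assumptions, `rows_needed` is the value that `len(tokenizer)` should report, and therefore the size the embedding matrix must have after resizing.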