Making a Chatbot Based on GPT-2
2022-06-24 04:02:00 【Goose】
1. Background
Most of us have had a good friend who, for one reason or another, no longer chats with us. Can we use that friend's WeChat chat history to roughly recreate their chatting habits, their tone, and even the stickers they like to send?
This post, based on GPT2-Chinese, describes how to train a chatbot on a friend's chat records. The final quality still depends on whether there is enough training data, on the choice of model, and on parameter tuning. Getting it to run is not hard; tuning it until it imitates someone well is. If you are interested, try other modeling approaches or corpora.
The second half of the article will try to cover the principles and tuning of GPT-2.
Without further ado, let's start with what is probably the most complicated part of this article: setting up the development and runtime environment. Once that is done, the demo is about half finished.
2. Introduction to GPT-2 principles
An earlier blog post reviews models such as Transformer, BERT, and GPT.
GPT-2 is built from Transformer decoder blocks, whereas BERT is built from Transformer encoder blocks. A key difference between the two is that GPT-2, like a traditional language model, outputs only one token at a time.
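To make "one token at a time" concrete, here is a minimal sketch of greedy autoregressive decoding. The score table and the function names are hypothetical stand-ins, not part of GPT-2 or of the project code. A real GPT-2 scores the next token from the whole prefix through masked self-attention, while this toy only looks at the last token.

```python
from typing import Callable, Dict, List

# Hypothetical toy "model": scores for the next token given the last token.
# A real GPT-2 would compute these scores from the entire prefix.
TOY_SCORES: Dict[str, Dict[str, float]] = {
    "<bos>": {"I": 0.9, "you": 0.1},
    "I": {"miss": 0.8, "you": 0.2},
    "miss": {"you": 0.9, "I": 0.1},
    "you": {"<eos>": 0.7, "I": 0.3},
}


def next_token_scores(prefix: List[str]) -> Dict[str, float]:
    """Return next-token scores given everything generated so far."""
    return TOY_SCORES[prefix[-1]]


def generate(score_fn: Callable[[List[str]], Dict[str, float]],
             max_len: int = 10) -> List[str]:
    """Greedy decoding: pick the best next token, append it, repeat."""
    tokens = ["<bos>"]
    for _ in range(max_len):
        scores = score_fn(tokens)
        best = max(scores, key=scores.get)  # greedy choice
        if best == "<eos>":
            break
        tokens.append(best)  # the new token becomes part of the next input
    return tokens[1:]  # drop the <bos> marker


reply = generate(next_token_scores)
print(reply)
```

Each generated token is fed back in as input for the next step, which is what distinguishes this decoder-style generation from an encoder such as BERT that processes the whole sequence at once.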
3. Environment setup
Reference environment:
CentOS 7, Python 3.6
Run the following commands:
yum -y install python36-devel
git clone
cd gpt2-chatbot
pip3 install -r requirements.txt
4.1 Corpus preparation and preprocessing
In the data folder under the project root, store the raw training corpus as train.txt. Training corpora can be downloaded from
https://github.com/codemayq/chinese_chatbot_corpus
In train.txt, each utterance is on its own line and conversations are separated by a blank line, for example:
I really want to go to the movies with you
Suddenly miss you very much
I miss you, too

I want to see your beautiful photos
Kiss me and I'll show you
I kiss two
I hate people beating you on the chest with small fists
Process all of the colleague's chat records and append them to the training file (it is unclear whether their sample weight can be adjusted appropriately this way).
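One simple way to experiment with sample weight is to append the personal conversations more than once. The sketch below assumes the blank-line-separated corpus format described in this section. The function name and the repeat parameter are hypothetical, and repeating data is only one possible weighting strategy, not something the project prescribes.

```python
from typing import List


def append_conversations(base_corpus: str,
                         extra: List[List[str]],
                         repeat: int = 1) -> str:
    """Append extra conversations to a blank-line-separated corpus.

    Each conversation in `extra` is a list of utterances. `repeat` copies the
    extra conversations several times as a crude form of oversampling.
    """
    # Existing conversations are the non-empty blocks between blank lines.
    blocks = [b for b in base_corpus.strip().split("\n\n") if b.strip()]
    for _ in range(repeat):
        blocks.extend("\n".join(conversation) for conversation in extra)
    return "\n\n".join(blocks) + "\n"


base = "hi\nhello\n"
friend_chats = [["are you there", "yes"]]
merged = append_conversations(base, friend_chats, repeat=2)
print(merged)
```

Oversampling a very small set of conversations can make the model memorize them, so it is worth comparing a few values of repeat on a held-out set.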
Run preprocess.py to tokenize the dialogue corpus in data/train.txt, then serialize and save it to data/train.pkl. The object serialized in train.pkl is a List of conversations, and each conversation holds its tokens.
python3 preprocess.py --train_path data/train.txt --save_path data/train.pkl
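The preprocessing step described above can be sketched roughly as follows. This is not the code of preprocess.py: the character-level vocabulary, the [CLS]/[SEP] layout, and all function names are assumptions chosen to keep the example self-contained. The real script uses the project's own tokenizer and writes the result to data/train.pkl.

```python
import pickle
from typing import Dict, List


def split_conversations(raw: str) -> List[List[str]]:
    """Split corpus text into conversations separated by blank lines."""
    conversations: List[List[str]] = []
    current: List[str] = []
    for line in raw.splitlines():
        line = line.strip()
        if line:
            current.append(line)
        elif current:  # a blank line closes the current conversation
            conversations.append(current)
            current = []
    if current:
        conversations.append(current)
    return conversations


def build_vocab(conversations: List[List[str]]) -> Dict[str, int]:
    """Toy character-level vocabulary (assumption, for illustration only)."""
    vocab = {"[CLS]": 0, "[SEP]": 1}
    for conversation in conversations:
        for utterance in conversation:
            for ch in utterance:
                vocab.setdefault(ch, len(vocab))
    return vocab


def encode(conversation: List[str], vocab: Dict[str, int]) -> List[int]:
    """Encode one conversation as [CLS] utt1 [SEP] utt2 [SEP] ... (assumed layout)."""
    ids = [vocab["[CLS]"]]
    for utterance in conversation:
        ids.extend(vocab[ch] for ch in utterance)
        ids.append(vocab["[SEP]"])
    return ids


raw = "hi\nhello\n\nbye\nsee you\n"
conversations = split_conversations(raw)
vocab = build_vocab(conversations)
dialogue_ids = [encode(c, vocab) for c in conversations]

# Serialize the list of token-id lists, then load it back to check the round trip.
blob = pickle.dumps(dialogue_ids)
restored = pickle.loads(blob)
print(restored)
```

The result is a list with one entry per conversation, each entry being a flat list of token ids, which matches the shape described for train.pkl.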
4.2 Training the model
Run train.py to train the model autoregressively on the preprocessed data. The model is saved in the model folder under the project root.
During training, you can enable early stopping with the patience parameter. With patience=n, training stops early if the model's loss on the validation set has not decreased for n consecutive epochs. With patience=0, early stopping is disabled.
python3 train.py --epochs 40 --batch_size 8 --device 0,1 --train_path data/train.pkl
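The patience rule described above can be sketched as follows. This is an illustration, not the logic of train.py: the function name is hypothetical, the list of losses stands in for the validation loss measured after each epoch, and "no decrease" is interpreted here as "no improvement over the best loss seen so far", which is an assumption.

```python
from typing import List


def epochs_run(val_losses: List[float], patience: int) -> int:
    """Return how many epochs run before early stopping triggers.

    val_losses[i] is the validation loss after epoch i + 1.
    patience=0 disables early stopping.
    """
    best = float("inf")
    bad_epochs = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
        if patience > 0 and bad_epochs >= patience:
            return epoch  # stop early
    return len(val_losses)


losses = [1.0, 0.8, 0.9, 0.85, 0.7]
print(epochs_run(losses, patience=2))
print(epochs_run(losses, patience=0))
```

With this loss curve, patience=2 stops before the late improvement at the last epoch, which shows the trade-off: a small patience saves time but can stop too soon on a noisy validation loss.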
5. Other
There is still a lot that could be improved, and I have not fully understood the project code yet. Here are a few ideas to play with when I have time:
- Online learning
- Try other pretrained models
- Comment on the Weibo hot search topics every day
With these improvements, the group chat would probably be a little less quiet.
Ref
- https://zhuanlan.zhihu.com/p/96755231
- https://github.com/yangjianxin1/GPT2-chitchat
- https://github.com/sfyc23/EverydayWechat
- https://github.com/Morizeyao/GPT2-Chinese
- https://zhuanlan.zhihu.com/p/57251615
- paper https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf
- https://github.com/openai/gpt-2
- Jiqizhixin (机器之心) walkthrough of the model https://www.sohu.com/a/336262203_129720