TextCNN paper interpretation: Convolutional Neural Networks for Sentence Classification
2022-06-26 01:37:00 【Green Lantern swordsman】
1. Abstract
A CNN on top of static pre-trained word vectors performs well on sentence classification, and fine-tuning those vectors for the specific task (task-specific vectors) performs even better.
2. Model structure

It is worth noting that the model input has 2 channels. In the first, the word vectors are kept static during training; in the second, the word vectors are fine-tuned via backpropagation.
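As a minimal sketch of how two channels can feed one feature, the pure-Python snippet below slides a filter of width h over both channels, adds the two responses at each position, and then takes the maximum over time. The function names, the ReLU nonlinearity, and the toy numbers are illustrative assumptions, not code from the paper or from any library.

```python
from typing import List

Vector = List[float]
Sentence = List[Vector]  # one word vector per token


def dot_window(window: Sentence, filt: Sentence) -> float:
    """Sum of element-wise products between a window of word vectors and a filter."""
    return sum(w * f for wv, fv in zip(window, filt) for w, f in zip(wv, fv))


def conv_feature_map(channels: List[Sentence], filters: List[Sentence],
                     bias: float = 0.0) -> List[float]:
    """Slide one filter per channel over the sentence and add the channel responses.

    channels: the same sentence embedded twice (static and fine-tuned vectors).
    filters:  one filter per channel, each spanning h consecutive words.
    """
    h = len(filters[0])
    n = len(channels[0])
    feature_map = []
    for i in range(n - h + 1):
        total = bias
        for sent, filt in zip(channels, filters):
            total += dot_window(sent[i:i + h], filt)
        feature_map.append(max(0.0, total))  # ReLU: an assumed choice of nonlinearity
    return feature_map


def max_over_time(feature_map: List[float]) -> float:
    """Keep only the strongest response of a filter across all positions."""
    return max(feature_map)


# Toy example (hypothetical values): 4 words, 2-dimensional embeddings, h = 2.
static_channel = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
tuned_channel = [[0.5, 0.0], [0.0, 0.5], [1.0, 0.0], [0.0, 1.0]]
filt = [[1.0, 0.0], [0.0, 1.0]]  # the same toy filter is reused for both channels

fmap = conv_feature_map([static_channel, tuned_channel], [filt, filt])
feature = max_over_time(fmap)
```

A real model would use many filters of several widths, so each sentence ends up as a vector with one pooled value per filter.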
2.1 Regularization
(1) Dropout is applied on the penultimate layer.
(2) An L2-norm constraint is placed on the weight vectors of the penultimate layer.
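The two regularizers can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the helper names `clip_l2_norm` and `dropout`, the threshold `max_norm`, and the drop probability `p` are hypothetical choices for the example, and the dropout shown is the training-time masking step only.

```python
import math
import random
from typing import List


def clip_l2_norm(weights: List[float], max_norm: float) -> List[float]:
    """Rescale a weight vector so that its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(w * w for w in weights))
    if norm <= max_norm or norm == 0.0:
        return list(weights)
    scale = max_norm / norm
    return [w * scale for w in weights]


def dropout(features: List[float], p: float, rng: random.Random) -> List[float]:
    """Zero each feature independently with probability p (training time only)."""
    return [0.0 if rng.random() < p else x for x in features]


# Hypothetical values: a weight vector of norm 5 is rescaled to norm 1.
w = clip_l2_norm([3.0, 4.0], max_norm=1.0)

# Penultimate-layer features with roughly half of the units masked.
penultimate = [0.2, 0.7, 0.1, 0.9]
z = dropout(penultimate, p=0.5, rng=random.Random(0))
```

The norm constraint would typically be applied after each gradient step, so the weights never drift far beyond the chosen threshold.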
3. Data and experiments
3.1 Hyperparameters and training
(1) Hyperparameters are chosen by grid search.
(2) Early stopping is performed on the validation set.
When a dataset has no validation set, 10% of the training set is randomly selected as the validation set. The optimizer is SGD.
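A small sketch of the two procedures above: holding out 10% of the training data as a validation set, and picking the stopping epoch from validation scores. The function names, the fixed seed, and the `patience` parameter are assumptions made for the example; the paper does not spell out its early-stopping criterion at this level of detail.

```python
import random
from typing import List, Sequence, Tuple


def holdout_split(examples: Sequence, fraction: float = 0.1,
                  seed: int = 0) -> Tuple[list, list]:
    """Shuffle the examples and split them into (train, validation)."""
    items = list(examples)
    random.Random(seed).shuffle(items)
    n_val = max(1, int(len(items) * fraction))
    return items[n_val:], items[:n_val]


def early_stopping_epoch(val_scores: List[float], patience: int = 2) -> int:
    """Return the index of the best epoch seen before training is stopped.

    Training stops once the validation score has failed to improve for
    `patience` consecutive epochs (an assumed criterion).
    """
    best_epoch, best_score, waited = 0, float("-inf"), 0
    for epoch, score in enumerate(val_scores):
        if score > best_score:
            best_epoch, best_score, waited = epoch, score, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch


train, val = holdout_split(range(100))
# Hypothetical validation accuracies per epoch; the last one is never reached.
best = early_stopping_epoch([0.70, 0.74, 0.73, 0.72, 0.80])
```

In this toy run the score peaks at epoch 1 and then fails to improve twice, so training stops and the epoch-1 model would be kept.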
3.2 Pre-trained word vectors
When a large amount of training data is not available, initializing with the publicly available word2vec vectors is a popular way to improve performance. Words that are not in word2vec get randomly initialized vectors.
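The lookup-or-randomize rule can be sketched as follows. The `pretrained` dictionary is a toy stand-in for real word2vec vectors (no loading code is shown), and the uniform range `a = 0.25`, the seed, and the function name `build_embeddings` are assumptions for the example.

```python
import random
from typing import Dict, List


def build_embeddings(vocab: List[str], pretrained: Dict[str, List[float]],
                     dim: int, a: float = 0.25,
                     seed: int = 0) -> Dict[str, List[float]]:
    """Copy pre-trained vectors where available; sample the rest from U[-a, a]."""
    rng = random.Random(seed)
    table = {}
    for word in vocab:
        if word in pretrained:
            table[word] = list(pretrained[word])
        else:
            # Out-of-vocabulary word: each dimension is drawn uniformly at random.
            table[word] = [rng.uniform(-a, a) for _ in range(dim)]
    return table


pretrained = {"movie": [0.1, -0.2, 0.3]}  # toy stand-in for word2vec
emb = build_embeddings(["movie", "zzyzx"], pretrained, dim=3)
```

Here "movie" keeps its pre-trained vector, while the made-up word "zzyzx" receives a random vector of the same dimension.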
3.3 Model variants
CNN-rand: the word vectors of all words are initialized randomly and then updated during training.
CNN-static: the word vectors come from word2vec and are kept static during training.
CNN-non-static: the word vectors come from word2vec and are fine-tuned during training.
CNN-multichannel: two sets of word vectors, both from word2vec. One set is kept static; the other is fine-tuned during training.
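The four variants differ only in how each embedding channel is initialized and whether it is updated. One way to write that down is the small configuration table below; the `Channel` dataclass, the `VARIANTS` mapping, and the helper are hypothetical names introduced for this sketch.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Channel:
    pretrained: bool  # True: initialized from word2vec; False: random initialization
    trainable: bool   # True: updated by backpropagation during training


VARIANTS: Dict[str, Tuple[Channel, ...]] = {
    "CNN-rand": (Channel(pretrained=False, trainable=True),),
    "CNN-static": (Channel(pretrained=True, trainable=False),),
    "CNN-non-static": (Channel(pretrained=True, trainable=True),),
    "CNN-multichannel": (
        Channel(pretrained=True, trainable=False),
        Channel(pretrained=True, trainable=True),
    ),
}


def trainable_channels(variant: str) -> int:
    """Number of embedding channels that receive gradient updates."""
    return sum(ch.trainable for ch in VARIANTS[variant])
```

For example, `trainable_channels("CNN-static")` is 0, while the multichannel variant has one frozen and one trainable channel.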
4. Results and analysis
CNN-rand does not perform well; CNN-static performs very well, but CNN-non-static performs better.
4.1 Multichannel vs. single-channel models
The authors expected the multichannel architecture to prevent overfitting, but the results are mixed and more research is needed. For example, instead of adding another channel, one could keep a single channel and increase the dimension of the word vectors, with the added dimensions allowed to change during training.
4.2 Static vs. non-static representations
Non-static vectors are fine-tuned during training, so they become more specific to the task at hand.
4.3 Further observations
- Other researchers also experimented with a CNN and obtained much worse results. The authors found that: (1) that architecture is similar to the single-channel model here; (2) the difference is that this model has a larger capacity, i.e. multiple filter widths and feature maps.
- Dropout combined with a larger-than-necessary network contributes a lot to performance.
- For words not in word2vec, sampling each dimension from the distribution U[-a, a] also gave a small improvement.
- Adagrad and Adadelta give similar results, but Adadelta needs fewer epochs.
5. Conclusion
Word vectors obtained by unsupervised pre-training (word2vec) really do work well.