On the output representation of a bidirectional LSTM in PyTorch
2022-06-28 11:38:00 【Full stack programmer webmaster】
Hello everyone, and welcome back. I'm your friend Quan Jun.
While using a bidirectional LSTM in PyTorch, a question occurred to me.
The last time step of a bidirectional LSTM's outputs and its hidden state must be related,
but what exactly is the relationship? Does hidden simply store the last time step of outputs?
If so, wouldn't hidden fail to represent the bidirectional information of the whole sequence? With this question in mind, I ran an experiment. I won't post the experimental code here; the results are shown directly below.
output_size: torch.Size([14, 32, 100])
hidden_size: torch.Size([2, 32, 50])
output_first: tensor([-0.0690, -0.0778, 0.0967, -0.0504, 0.1404, 0.0873, 0.1073, -0.1513,
-0.1217, 0.0537, 0.0757, 0.0448, -0.0561, -0.0421, -0.0794, -0.0940,
-0.0649, -0.1796, 0.0847, 0.0254, -0.1643, -0.0526, -0.0008, 0.0073,
-0.0754, 0.0036, -0.0565, 0.0092, 0.0123, -0.0529, -0.1597, -0.0077,
-0.0999, -0.0776, -0.0958, 0.0742, -0.0728, 0.0029, -0.0870, 0.0563,
0.0162, -0.0016, 0.0380, -0.0483, -0.0513, -0.0948, 0.1770, 0.0280,
0.0937, 0.0464, -0.0423, -0.1260, 0.0138, -0.0270, -0.2708, 0.0970,
-0.0236, 0.1324, 0.0953, -0.0506, -0.2078, 0.1213, -0.0621, 0.0084,
0.0217, -0.0931, -0.0561, -0.1457, -0.1096, -0.0949, 0.0167, -0.0168,
0.0812, -0.1475, 0.2290, 0.0154, 0.1291, 0.0186, 0.1038, -0.0363,
-0.1291, -0.0569, -0.0428, -0.0890, -0.0827, 0.0394, -0.2272, -0.0080,
0.1731, -0.0880, -0.0652, -0.1453, -0.0914, 0.0498, 0.0831, 0.0824,
0.1725, 0.1072, 0.0176, -0.0160], device='cuda:0',
grad_fn=<SelectBackward>)
output_end: tensor([-0.1091, 0.0208, 0.0523, -0.1922, 0.1080, -0.0460, 0.0918, -0.0320,
0.1930, -0.1266, 0.1744, -0.0021, -0.1772, 0.1128, -0.1105, -0.0486,
-0.1082, 0.0427, -0.2161, -0.0804, -0.1955, -0.0580, 0.1070, 0.0856,
0.0544, 0.1932, 0.0318, -0.1977, -0.1417, -0.1977, -0.0027, -0.1575,
0.0047, -0.0164, 0.1221, 0.0331, -0.1921, 0.0210, 0.0123, 0.1483,
0.0109, 0.0044, -0.1512, -0.1795, 0.0544, 0.1051, -0.2025, -0.1051,
-0.0342, 0.1321, -0.0305, -0.0173, 0.0664, -0.0764, -0.1054, -0.0213,
0.0215, -0.0251, -0.0674, 0.0949, -0.0855, 0.0422, 0.0701, -0.1804,
0.1247, 0.0426, 0.0778, -0.0756, -0.0747, -0.1250, 0.0706, 0.0458,
-0.0114, -0.0088, 0.0573, -0.0144, -0.0143, -0.0633, 0.1355, -0.0049,
0.0091, 0.0533, -0.0889, -0.0338, -0.0654, 0.0491, -0.0809, -0.0311,
0.1278, -0.0765, -0.0682, -0.1066, 0.0538, -0.1175, -0.0171, 0.0496,
0.0258, -0.0646, 0.1396, 0.0468], device='cuda:0',
grad_fn=<SelectBackward>)
hidden tensor([[-0.1091, 0.0208, 0.0523, -0.1922, 0.1080, -0.0460, 0.0918, -0.0320,
0.1930, -0.1266, 0.1744, -0.0021, -0.1772, 0.1128, -0.1105, -0.0486,
-0.1082, 0.0427, -0.2161, -0.0804, -0.1955, -0.0580, 0.1070, 0.0856,
0.0544, 0.1932, 0.0318, -0.1977, -0.1417, -0.1977, -0.0027, -0.1575,
0.0047, -0.0164, 0.1221, 0.0331, -0.1921, 0.0210, 0.0123, 0.1483,
0.0109, 0.0044, -0.1512, -0.1795, 0.0544, 0.1051, -0.2025, -0.1051,
-0.0342, 0.1321],
[-0.0423, -0.1260, 0.0138, -0.0270, -0.2708, 0.0970, -0.0236, 0.1324,
0.0953, -0.0506, -0.2078, 0.1213, -0.0621, 0.0084, 0.0217, -0.0931,
-0.0561, -0.1457, -0.1096, -0.0949, 0.0167, -0.0168, 0.0812, -0.1475,
0.2290, 0.0154, 0.1291, 0.0186, 0.1038, -0.0363, -0.1291, -0.0569,
-0.0428, -0.0890, -0.0827, 0.0394, -0.2272, -0.0080, 0.1731, -0.0880,
-0.0652, -0.1453, -0.0914, 0.0498, 0.0831, 0.0824, 0.1725, 0.1072,
0.0176, -0.0160]], device='cuda:0', grad_fn=<SliceBackward>)
In the results above, the first output is the shape of outputs: sequence length, batch size, and hidden size * 2. The last dimension is 100, twice the hidden size of 50.
The second output is the shape of hidden: number of directions (left-to-right and right-to-left), batch size, and hidden size.
The third output is the representation vector of the first word, counting from left to right, for the first sample in the batch. It is the concatenation of "the first hidden state of the left-to-right pass" and "the last hidden state of the right-to-left pass".
The fourth output is the representation vector of the last word, counting from left to right, for the same sample. It is the concatenation of "the last hidden state of the left-to-right pass" and "the first hidden state of the right-to-left pass".
The fifth output is hidden for the same sample. Its two rows are "the last hidden state of the left-to-right pass" and "the last hidden state of the right-to-left pass".
The numbers confirm this: the first row of hidden equals the first 50 values of output_end, and the second row equals the last 50 values of output_first. So hidden is not simply the last time step of outputs. Each direction contributes the state it reaches after reading the whole sequence, which means hidden does carry information from both directions.
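The original experimental code is not shown, so the following is only a minimal sketch of how such a check could be reproduced. The sequence length, batch size, and hidden size are taken from the printed shapes; the input size of 20, the random input, the CPU device, and all variable names are assumptions made for illustration, so the printed values will differ from those above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# seq_len, batch_size and hidden_size follow the shapes printed above;
# input_size is an arbitrary assumption.
seq_len, batch_size, input_size, hidden_size = 14, 32, 20, 50

lstm = nn.LSTM(input_size, hidden_size, num_layers=1, bidirectional=True)
x = torch.randn(seq_len, batch_size, input_size)  # (seq_len, batch, input_size)

outputs, (hidden, cell) = lstm(x)
print("output_size:", outputs.size())  # torch.Size([14, 32, 100])
print("hidden_size:", hidden.size())   # torch.Size([2, 32, 50])

# Look at the first sample in the batch, as in the results above.
output_first = outputs[0, 0]   # first time step: (2 * hidden_size,)
output_end = outputs[-1, 0]    # last time step:  (2 * hidden_size,)
hidden_sample = hidden[:, 0]   # row 0: left-to-right, row 1: right-to-left

# Left-to-right final state = first half of the last time step.
fwd_matches = torch.allclose(hidden_sample[0], output_end[:hidden_size])
# Right-to-left final state = second half of the first time step.
bwd_matches = torch.allclose(hidden_sample[1], output_first[hidden_size:])
print("forward matches:", fwd_matches, "backward matches:", bwd_matches)

# The two directions can also be separated explicitly:
# index 0 on the third axis is left-to-right, index 1 is right-to-left.
by_direction = outputs.reshape(seq_len, batch_size, 2, hidden_size)
```

If a single vector summarizing the whole sequence is needed, one common choice is to concatenate the two rows of hidden, for example torch.cat([hidden[0], hidden[1]], dim=-1), rather than taking outputs[-1].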
Publisher: Full stack programmer webmaster. When reprinting, please credit the source: https://javaforall.cn/151100.html Link to the original text: https://javaforall.cn