Relation extraction: TPLinker
2022-06-26 08:27:00 【xuanningmeng】
I have recently been studying and organizing models for relation extraction, and along the way I have gained a deeper understanding of the task. As a beginner, my long learning journey starts here. On to today's topic.
TPLinker innovations
(1) TPLinker is a new paradigm for relation extraction.
(2) TPLinker is a single-stage extraction model.
(3) In TPLinker, entities and relations share the same decoding scheme and are extracted at the same time, rather than extracting entities first and relations afterwards, which would accumulate entity extraction errors. This avoids exposure bias and keeps training and prediction consistent.
(4) TPLinker can handle SingleEntityOverlap (SEO) and EntityPairOverlap (EPO), and it can also deal with nested entities.
TPLinker uses BERT and a Handshaking Kernel; the Handshaking Kernel is the core.
[Figure: model structure of TPLinker]
Tagging
TPLinker needs each relational triple (subject, relation, object) to be converted into tags. The tagging is divided into three parts:
(1) entity head to entity tail (EH-to-ET)
(2) subject head to object head (SH-to-OH)
(3) subject tail to object tail (ST-to-OT)
In the tagging example figure, EH-to-ET is shown in purple, SH-to-OH in red, and ST-to-OT in blue.
This tagging scheme is called Handshaking Tagging. Because the tag matrix is sparse, only the upper triangle is kept: a tag of 1 in the lower triangle is mirrored to the symmetric position in the upper triangle and marked 2 instead. The scheme can also handle nested entities, for example New York City and New York.
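The tagging rules above can be sketched in a few lines of Python. This is only an illustration, not the TPLinker implementation: the function name `build_handshaking_tags`, the sparse dictionary layout, and the example sentence are all assumptions made for this sketch.

```python
def build_handshaking_tags(entities, triples):
    """Build sparse EH-to-ET, SH-to-OH and ST-to-OT tags (upper triangle only).

    entities: list of (head, tail) token indices, tail inclusive.
    triples:  list of (subject_span, relation, object_span).
    """
    eh_to_et, sh_to_oh, st_to_ot = {}, {}, {}
    for head, tail in entities:
        # An entity head never comes after its tail, so the cell is always
        # in the upper triangle and the tag is simply 1.
        eh_to_et[(head, tail)] = 1
    for (s_head, s_tail), rel, (o_head, o_tail) in triples:
        for tags, s, o in ((sh_to_oh, s_head, o_head), (st_to_ot, s_tail, o_tail)):
            if s <= o:
                tags[(rel, s, o)] = 1  # already in the upper triangle
            else:
                tags[(rel, o, s)] = 2  # lower-triangle cell mirrored, marked 2
    return eh_to_et, sh_to_oh, st_to_ot


# Hypothetical sentence with the nested entities "New York" and "New York City".
tokens = ["New", "York", "City", "mayor", "De", "Blasio"]
entities = [(0, 1), (0, 2), (4, 5)]
triples = [
    ((0, 2), "mayor", (4, 5)),    # subject before object -> tag 1
    ((4, 5), "born_in", (0, 1)),  # subject after object  -> tag 2
]
eh_to_et, sh_to_oh, st_to_ot = build_handshaking_tags(entities, triples)
print(sh_to_oh)
print(st_to_ot)
```

Both nested entities get their own EH-to-ET cell, which is why nesting does not cause a conflict in this scheme.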
Decoding
The decoding process extracts the relations between entities. The steps are as follows:
(1) Decode EH-to-ET and build a dictionary D, whose key is an entity head and whose value is the entities starting at that head.
(2) Decode ST-to-OT and build a dictionary E; then decode SH-to-OH and look up in D all entities whose heads match, as candidate subjects and objects.
(3) For each candidate subject and object pair found above, check whether the pair of entity tails is in E. If it is, output the subject, the object, and the relational triple.
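The three decoding steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the original code: the function name `decode_triples`, the sparse tag dictionaries keyed by `(relation, i, j)`, and the use of a set for E are choices made for this sketch. A tag of 2 means the cell was mirrored from the lower triangle, so the two positions are swapped back.

```python
from collections import defaultdict


def decode_triples(eh_to_et, sh_to_oh, st_to_ot):
    # Step 1: D maps an entity head to every entity (head, tail) starting there.
    head_to_entities = defaultdict(list)
    for head, tail in eh_to_et:
        head_to_entities[head].append((head, tail))

    # Step 2a: E holds (relation, subject_tail, object_tail); undo the mirroring.
    tail_pairs = set()
    for (rel, i, j), tag in st_to_ot.items():
        tail_pairs.add((rel, i, j) if tag == 1 else (rel, j, i))

    # Step 2b and 3: candidates from SH-to-OH, verified against E.
    triples = set()
    for (rel, i, j), tag in sh_to_oh.items():
        s_head, o_head = (i, j) if tag == 1 else (j, i)
        for subj in head_to_entities.get(s_head, []):
            for obj in head_to_entities.get(o_head, []):
                if (rel, subj[1], obj[1]) in tail_pairs:
                    triples.add((subj, rel, obj))
    return triples


# Hypothetical tags for "New York City mayor De Blasio" (token indices 0..5).
eh_to_et = {(0, 1): 1, (0, 2): 1, (4, 5): 1}
sh_to_oh = {("mayor", 0, 4): 1, ("born_in", 0, 4): 2}
st_to_ot = {("mayor", 2, 5): 1, ("born_in", 1, 5): 2}
decoded = decode_triples(eh_to_et, sh_to_oh, st_to_ot)
print(sorted(decoded))
```

Two entities share head index 0 here, and the tail check in step (3) is what picks the right one for each relation.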
Handshaking decoder (TPLinker-plus)
This decoding process extracts both entities and relations, and it also takes the type into account. The steps are as follows:
(1) From the output of the handshaking kernel, collect all handshaking tagging results, namely [(start, end, idstag)].
(2) From these handshaking tagging results, decode EH-to-ET to get (entity_start, entity_end, entity_type), and at the same time build the dictionary head_ind2entities, whose key is the entity head index and whose value is the entity token span.
(3) From the handshaking tagging results, decode ST-to-OT or OT-to-ST to get (sub_tail, obj_tail, rel_type).
(4) From the handshaking tagging results, decode SH-to-OH or OH-to-SH to get (sub_head, obj_head, rel_type). Check whether sub_head and obj_head are both in head_ind2entities. If they are, check whether (sub_tail, obj_tail, rel_type) is among the results of step (3). If it is, return the relational triple.
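The four steps above can be sketched like this. It is a rough illustration only: the function name `decode_plus`, the representation of a tag as a `(label, link)` pair, and the link names `EH2ET`, `SH2OH`, `OH2SH`, `ST2OT`, `OT2ST` are assumptions for this sketch and may not match how the real code encodes its tags.

```python
def decode_plus(spots):
    """spots: list of (start, end, (label, link)) with start <= end."""
    # Step 2: entity head index -> list of (entity_start, entity_end, entity_type).
    head_ind2entities = {}
    for start, end, (label, link) in spots:
        if link == "EH2ET":
            head_ind2entities.setdefault(start, []).append((start, end, label))

    # Step 3: remember every (sub_tail, obj_tail, rel_type).
    tail_links = set()
    for start, end, (label, link) in spots:
        if link == "ST2OT":
            tail_links.add((start, end, label))
        elif link == "OT2ST":
            tail_links.add((end, start, label))

    # Step 4: head links propose candidates, tail links confirm them.
    triples = []
    for start, end, (label, link) in spots:
        if link == "SH2OH":
            sub_head, obj_head = start, end
        elif link == "OH2SH":
            sub_head, obj_head = end, start
        else:
            continue
        if sub_head not in head_ind2entities or obj_head not in head_ind2entities:
            continue
        for subj in head_ind2entities[sub_head]:
            for obj in head_ind2entities[obj_head]:
                if (subj[1], obj[1], label) in tail_links:
                    triples.append((subj, label, obj))
    return triples


# Hypothetical spots: the object (tokens 0..2) appears before the subject (4..5).
spots = [
    (0, 2, ("LOC", "EH2ET")),
    (4, 5, ("PER", "EH2ET")),
    (0, 4, ("born_in", "OH2SH")),
    (2, 5, ("born_in", "OT2ST")),
]
plus_triples = decode_plus(spots)
print(plus_triples)
```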
Model
Handshaking Tagging produces entity pairs, and the feature of each pair is represented as:
The model's prediction of the relation that an entity pair belongs to is:
The loss function of the model is:
where N is the length of the sentence, E, H and T denote EH-to-ET, SH-to-OH and ST-to-OT respectively, and $\hat{l}^{*}$ is the true tag.
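For reference, the three formulas this section refers to can be written as follows. This is a sketch following the notation of the TPLinker paper, so the parameter names $W_h$, $b_h$, $W_o$, $b_o$ and the token encodings $h_i$ are assumptions that should be checked against the paper:

```latex
\begin{align}
h_{i,j} &= \tanh\left(W_h \cdot [h_i; h_j] + b_h\right), \quad j \ge i \\
P(y_{i,j}) &= \mathrm{Softmax}\left(W_o \cdot h_{i,j} + b_o\right) \\
\mathrm{link}(w_i, w_j) &= \arg\max_{l} P(y_{i,j} = l) \\
L_{link} &= -\frac{1}{N} \sum_{i=1,\, j \ge i}^{N} \sum_{* \in \{E, H, T\}} \log P\left(y^{*}_{i,j} = \hat{l}^{*}\right)
\end{align}
```

The first line is the pair representation, the second and third lines give the predicted tag of a pair, and the last line is the loss, with N, E, H, T and $\hat{l}^{*}$ as defined above.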
Experimental results
The comparison between TPLinker and the experimental results of other models is shown in the following figure:
I have recently been using the TPLinker model for Chinese relation extraction. The data is processed into the following format (the character and token spans are offsets into the original Chinese text):
{
"id": 9,
"text": " In terms of the mentor lineup , Ying Da is expected to join hands with 《 The king of Chinese comedy 》 to select a new generation of comedians ",
"relation_list": [
{
"subject": " The king of Chinese comedy ",
"object": " Ying Da ",
"subj_char_span": [
15,
20
],
"obj_char_span": [
8,
10
],
"predicate": " The guest ",
"subj_tok_span": [
15,
20
],
"obj_tok_span": [
8,
10
]
}
],
"entity_list": [
{
"text": " Ying Da ",
"type": " figure ",
"char_span": [
8,
10
],
"tok_span": [
8,
10
]
},
{
"text": " The king of Chinese comedy ",
"type": " TV variety ",
"char_span": [
15,
20
],
"tok_span": [
15,
20
]
}
]
}
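A quick sanity check for data in this format is to confirm that every span really slices out the entity it is attached to. The sketch below assumes spans are end-exclusive character offsets, which is what the sample suggests (a two-character name has span [8, 10]); the function name `find_span_mismatches` and the small English record are made up for the illustration.

```python
def find_span_mismatches(record):
    """Return (kind, expected_text, sliced_text) for every span that does not match."""
    text = record["text"]
    mismatches = []
    for ent in record.get("entity_list", []):
        start, end = ent["char_span"]
        if text[start:end] != ent["text"]:
            mismatches.append(("entity", ent["text"], text[start:end]))
    for rel in record.get("relation_list", []):
        for role, key in (("subject", "subj_char_span"), ("object", "obj_char_span")):
            start, end = rel[key]
            if text[start:end] != rel[role]:
                mismatches.append((role, rel[role], text[start:end]))
    return mismatches


# Hypothetical record in the same layout as the sample above.
record = {
    "text": "Ying Da hosts Comedy King",
    "entity_list": [
        {"text": "Ying Da", "char_span": [0, 7]},
        {"text": "Comedy King", "char_span": [14, 25]},
    ],
    "relation_list": [
        {
            "subject": "Comedy King",
            "object": "Ying Da",
            "predicate": "guest",
            "subj_char_span": [14, 25],
            "obj_char_span": [0, 7],
        }
    ],
}
print(find_span_mismatches(record))
```

An empty list means all spans are consistent with the text; running this over the whole training file before training can catch preprocessing mistakes early.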
Recently I have been studying relation extraction, mainly three joint extraction models: TPLinker, CasRel, and Su Shen's bert4keras model. I ran a simple comparative experiment on the baidurelation2020 competition dataset (GPU memory is relatively small, so the batch size and max length were set fairly small). None of the three models was trained for enough rounds; each ran for 20 epochs. The experiment shows that Su Shen's bert4keras model and CasRel give similar results, while the three models differ on some samples. Here is just one set of comparison examples.
TPLinker model results
{
"text": " Except for the superhero cast of the blind , There are also several well-known behind the scenes workers , such as “ Father of Marvel ” Stan · Li ( There are also many guest stars in front of the screen ),《 Galaxy escort 》 James, the director of the series · Goon ( He was also “ Raccoon rocket ”), Kevin, chairman of Marvel pictures · Fitch, wait ",
"id": "test_13",
"relation_list": [
{
"subject": " Galaxy escort ",
"object": " James · Goon ",
"subj_tok_span": [
56,
61
],
"obj_tok_span": [
67,
73
],
"subj_char_span": [
56,
61
],
"obj_char_span": [
67,
73
],
"predicate": " The director "
},
{
"subject": " Raccoon rocket ",
"object": " James · Goon ",
"subj_tok_span": [
80,
84
],
"obj_tok_span": [
67,
73
],
"subj_char_span": [
80,
84
],
"obj_char_span": [
67,
73
],
"predicate": " The director "
}
]
}
{
"text": " C.T.Hsia graduated from Shanghai Hujiang University ( This is an American missionary school ), The brothers have neither background nor backing in Peking University ",
"id": "test_15",
"relation_list": [
{
"subject": " C.T.Hsia ",
"object": " Shanghai Hujiang University ",
"subj_tok_span": [
0,
3
],
"obj_tok_span": [
6,
12
],
"subj_char_span": [
0,
3
],
"obj_char_span": [
6,
12
],
"predicate": " University one is graduated from "
}
]
}
{
"text": "《 Forget 》 It is a song sung by Taiwanese singer Teresa Teng , This song was originally written in 1979 year 9 month 20 Day included in the album 《 An unforgettable day 》 Issued in Taiwan and other places , Same year 11 month 15 Day included in the album 《 Sweet honey 》 Issued in Hong Kong and other places ",
"id": "test_24",
"relation_list": [
{
"subject": " Forget ",
"object": " An unforgettable day ",
"subj_tok_span": [
1,
3
],
"obj_tok_span": [
42,
47
],
"subj_char_span": [
1,
3
],
"obj_char_span": [
46,
51
],
"predicate": " The album "
},
{
"subject": " Forget ",
"object": " Sweet honey ",
"subj_tok_span": [
1,
3
],
"obj_tok_span": [
69,
72
],
"subj_char_span": [
1,
3
],
"obj_char_span": [
75,
78
],
"predicate": " The album "
},
{
"subject": " Forget ",
"object": " Teresa Teng ",
"subj_tok_span": [
1,
3
],
"obj_tok_span": [
10,
13
],
"subj_char_span": [
1,
3
],
"obj_char_span": [
10,
13
],
"predicate": " singer "
}
]
}
CasRel model results
{
"text": " Except for the superhero cast of the blind , There are also several well-known behind the scenes workers , such as “ Father of Marvel ” Stan · Li ( There are also many guest stars in front of the screen ),《 Galaxy escort 》 James, the director of the series · Goon ( He was also “ Raccoon rocket ”), Kevin, chairman of Marvel pictures · Fitch, wait ",
"relation": [
[
" Galaxy escort ",
" The director ",
" James · Goon "
]
]
}
{
"text": " C.T.Hsia graduated from Shanghai Hujiang University ( This is an American missionary school ), The brothers have neither background nor backing in Peking University ",
"relation": [
[
" C.T.Hsia ",
" University one is graduated from ",
" Hujiang University "
],
[
" C.T.Hsia ",
" University one is graduated from ",
" Shanghai Hujiang University "
]
]
}
{
"text": "《 Forget 》 It is a song sung by Taiwanese singer Teresa Teng , This song was originally written in 1979 year 9 month 20 Day included in the album 《 An unforgettable day 》 Issued in Taiwan and other places , Same year 11 month 15 Day included in the album 《 Sweet honey 》 Issued in Hong Kong and other places ",
"relation": [
[
" Forget ",
" singer ",
" Teresa Teng "
],
[
" Forget ",
" The album ",
" An unforgettable day "
]
]
}
Su Shen's bert4keras model results
{
"text": " Except for the superhero cast of the blind , There are also several well-known behind the scenes workers , such as “ Father of Marvel ” Stan · Li ( There are also many guest stars in front of the screen ),《 Galaxy escort 》 James, the director of the series · Goon ( He was also “ Raccoon rocket ”), Kevin, chairman of Marvel pictures · Fitch, wait ",
"bert4keras_relation": [
[
" Galaxy escort ",
" The director ",
" James · Goon "
]
]
}
{
"text": " C.T.Hsia graduated from Shanghai Hujiang University ( This is an American missionary school ), The brothers have neither background nor backing in Peking University ",
"bert4keras_relation": [
[
" C.T.Hsia ",
" University one is graduated from ",
" Hujiang University "
],
[
" C.T.Hsia ",
" University one is graduated from ",
" Shanghai Hujiang University "
]
]
}
{
"text": "《 Forget 》 It is a song sung by Taiwanese singer Teresa Teng , This song was originally written in 1979 year 9 month 20 Day included in the album 《 An unforgettable day 》 Issued in Taiwan and other places , Same year 11 month 15 Day included in the album 《 Sweet honey 》 Issued in Hong Kong and other places ",
"bert4keras_relation": [
[
" Sweet honey ",
" singer ",
" Teresa Teng "
],
[
" Forget ",
" The album ",
" An unforgettable day "
],
[
" Forget ",
" The album ",
" Sweet honey "
],
[
" Forget ",
" singer ",
" Teresa Teng "
],
[
" An unforgettable day ",
" singer ",
" Teresa Teng "
]
]
}
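Since the three models write their predictions in different layouts, comparing them is easier after normalizing everything to sets of (subject, predicate, object) tuples. The sketch below is one way to do that; the helper names are hypothetical, and the two records reuse the triples from the first example above.

```python
def triples_from_relation_list(record):
    """TPLinker-style output: a list of dicts under "relation_list"."""
    return {
        (r["subject"].strip(), r["predicate"].strip(), r["object"].strip())
        for r in record["relation_list"]
    }


def triples_from_triple_list(record, key):
    """CasRel / bert4keras-style output: a list of [subject, predicate, object]."""
    return {tuple(part.strip() for part in triple) for triple in record[key]}


tplinker_record = {
    "relation_list": [
        {"subject": " Galaxy escort ", "predicate": " The director ", "object": " James · Goon "},
        {"subject": " Raccoon rocket ", "predicate": " The director ", "object": " James · Goon "},
    ]
}
casrel_record = {
    "relation": [[" Galaxy escort ", " The director ", " James · Goon "]]
}

tplinker_triples = triples_from_relation_list(tplinker_record)
casrel_triples = triples_from_triple_list(casrel_record, "relation")

# Triples predicted by only one of the two models.
only_in_tplinker = tplinker_triples - casrel_triples
only_in_casrel = casrel_triples - tplinker_triples
print(only_in_tplinker, only_in_casrel)
```

For the bert4keras output the same helper can be called with `key="bert4keras_relation"`.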
If you find any mistakes, please point them out.