
Relation extraction: TPLinker

2022-06-26 08:27:00 xuanningmeng


I have recently been studying and organizing relation extraction models, and along the way I have gained a deeper understanding of relation extraction. A beginner's long learning journey starts here. On to today's topic.

TPLinker innovations

(1) TPLinker is a new paradigm for relation extraction.
(2) TPLinker is a single-stage extraction model.
(3) In TPLinker, entities and relations share the same decoding scheme and are extracted at the same time, rather than extracting entities first and relations afterwards, which would let entity extraction errors accumulate. This avoids exposure bias and keeps training and prediction consistent.
(4) TPLinker can handle SingleEntityOverlap (SEO) and EntityPairOverlap (EPO), and it can also handle nested entities.
TPLinker uses BERT and the Handshaking Kernel; the Handshaking Kernel is the core.
Model structure diagram of TPLinker:
[Figure: TPLinker model structure]

Tagging

TPLinker requires manual Tagging of the relation triples (subject, relation, object). The process is divided into three parts:
(1)entity head to entity tail (EH-TO-ET)
(2)subject head to object head (SH-to-OH)
(3)subject tail to object tail (ST-to-OT)
The figure below shows a tagging example: EH-TO-ET is in purple, SH-to-OH in red and ST-to-OT in blue.
[Figure: Tagging example]
This tagging scheme is called Handshaking Tagging. The tag matrix is sparse, so only the upper triangle is kept: a tag of 1 that falls in the lower triangle is mirrored to the symmetric position in the upper triangle and marked 2 instead. This scheme can handle nested entities, for example New York City and New York.
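
To make the mirroring rule concrete, here is a minimal sketch of how the three kinds of tags could be built for one sentence. It is not the official TPLinker code: the function name `handshaking_tags`, the inclusive (head, tail) token spans and the relation names in the example are my own assumptions for illustration.

```python
def handshaking_tags(n_tokens, entities, relations):
    """Build handshaking tags over the upper triangle of the token-pair matrix.

    entities:  list of (head, tail) token indices, tail inclusive (assumed format).
    relations: list of ((s_head, s_tail), rel, (o_head, o_tail)).
    Returns (index, eh2et, sh2oh, st2ot); sh2oh and st2ot map rel -> tag list.
    """
    # Enumerate only the pairs (i, j) with i <= j, i.e. the upper triangle.
    pairs = [(i, j) for i in range(n_tokens) for j in range(i, n_tokens)]
    index = {pair: k for k, pair in enumerate(pairs)}

    # EH-TO-ET: an entity head never comes after its tail, so only tag 1 occurs.
    eh2et = [0] * len(pairs)
    for head, tail in entities:
        eh2et[index[(head, tail)]] = 1

    # SH-to-OH and ST-to-OT: one tag sequence per relation type.
    sh2oh, st2ot = {}, {}
    for (s_head, s_tail), rel, (o_head, o_tail) in relations:
        for table, s, o in ((sh2oh, s_head, o_head), (st2ot, s_tail, o_tail)):
            seq = table.setdefault(rel, [0] * len(pairs))
            if s <= o:
                seq[index[(s, o)]] = 1   # subject first: ordinary tag
            else:
                seq[index[(o, s)]] = 2   # object first: mirrored, marked 2
    return index, eh2et, sh2oh, st2ot


tokens = ["Liu", "Xiang", "was", "born", "in", "Shanghai"]
entities = [(0, 1), (5, 5)]
relations = [
    ((0, 1), "born_in", (5, 5)),        # subject before object
    ((5, 5), "birthplace_of", (0, 1)),  # subject after object
]
index, eh2et, sh2oh, st2ot = handshaking_tags(len(tokens), entities, relations)
print(sh2oh["born_in"][index[(0, 5)]])        # tag for the head pair (0, 5)
print(sh2oh["birthplace_of"][index[(0, 5)]])  # same cell, mirrored relation
```

Both relations land on the same cell (0, 5) of their own tag sequence; the only difference is whether the cell holds 1 or 2, which is what lets the lower triangle be dropped.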

Decoding

Decoding extracts the relations between entities. The process is as follows:
(1) Decode EH-TO-ET and build a dictionary D whose key is the entity head and whose value is the entities that start at that head.
(2) Decode ST-to-OT and build E, the collection of (subject tail, object tail) pairs; then decode SH-to-OH and look up in D all entities that begin at the decoded heads, as candidate subjects and objects.
(3) Check whether the tail pair of each candidate subject and object found above is in E. If it is, output the subject, the object and the relation triple.
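
The three steps above can be sketched in a few lines. This is my own simplified version, not the official decoder: it assumes the tags have already been turned into position pairs (with tag 2 swapped back to subject-first order), and the function name `decode_triples` and the relation name `located_in` are hypothetical.

```python
def decode_triples(eh2et, sh2oh, st2ot):
    """Decode relation triples from handshaking links.

    eh2et: iterable of (head, tail) entity links, tail inclusive.
    sh2oh: dict rel -> iterable of (subject head, object head).
    st2ot: dict rel -> iterable of (subject tail, object tail).
    """
    # Step 1: dictionary D, entity head -> all entities starting at that head.
    D = {}
    for head, tail in eh2et:
        D.setdefault(head, []).append((head, tail))

    triples = []
    for rel, head_pairs in sh2oh.items():
        # Step 2: E holds the (subject tail, object tail) pairs of this relation.
        E = set(st2ot.get(rel, ()))
        for s_head, o_head in head_pairs:
            # All entities that begin at the decoded heads are candidates.
            for subj in D.get(s_head, []):
                for obj in D.get(o_head, []):
                    # Step 3: keep the pair only if its tails are linked in E.
                    if (subj[1], obj[1]) in E:
                        triples.append((subj, rel, obj))
    return triples


# Tokens: New(0) York(1) City(2) is(3) in(4) New(5) York(6) State(7)
eh2et = [(0, 1), (0, 2), (5, 7)]     # "New York", "New York City", "New York State"
sh2oh = {"located_in": [(0, 5)]}
st2ot = {"located_in": [(2, 7)]}
result = decode_triples(eh2et, sh2oh, st2ot)
print(result)
```

In this example the head position 0 matches both "New York" and "New York City", and only the tail check against E picks the nested entity that really takes part in the relation.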

Handshaking decoder (TPLinker-plus)

This decoding extracts both entities and relations, and it also takes the entity type and relation type into account. The process is as follows:
(1) From the Handshaking Kernel output, collect all handshaking tagging results, namely [(start, end, idstag)].
(2) From those results, decode EH-TO-ET, namely (entity_start, entity_end, entity_type), and at the same time build the head_ind2entities dictionary, whose key is the entity head index and whose value is the entity token span.
(3) From those results, decode ST-TO-OT or OT-TO-ST, namely (sub_tail, obj_tail, rel_type).
(4) From those results, decode SH-TO-OH, namely (sub_head, obj_head, rel_type), and check whether sub_head and obj_head are in head_ind2entities. If they are, check whether (sub_tail, obj_tail, rel_type) is among the results of (3). If it is, return a relation triple.
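
Here is a rough sketch of the four steps, written from my understanding rather than copied from the TPLinker-plus code. The tag representation `(label, link)` is a simplification I made up for readability, and the `OH2SH` link (the head-side counterpart of OT-TO-ST) is my assumption; only `head_ind2entities` is a name taken from the steps above.

```python
def decode_plus(spots):
    """spots: list of (start, end, (label, link)) with start <= end.

    link is one of "EH2ET", "SH2OH", "OH2SH", "ST2OT", "OT2ST" (assumed names);
    label is the entity type for EH2ET and the relation type otherwise.
    """
    # Step 2: typed entities and the head index -> entities dictionary.
    entities, head_ind2entities = [], {}
    for start, end, (label, link) in spots:
        if link == "EH2ET":
            entity = (start, end, label)
            entities.append(entity)
            head_ind2entities.setdefault(start, []).append(entity)

    # Step 3: (sub_tail, obj_tail, rel_type); OT2ST stores the object first.
    tail_links = set()
    for start, end, (label, link) in spots:
        if link == "ST2OT":
            tail_links.add((start, end, label))
        elif link == "OT2ST":
            tail_links.add((end, start, label))

    # Step 4: match head links against entities, then verify the tails.
    triples = []
    for start, end, (label, link) in spots:
        if link == "SH2OH":
            sub_head, obj_head = start, end
        elif link == "OH2SH":
            sub_head, obj_head = end, start
        else:
            continue
        for subj in head_ind2entities.get(sub_head, []):
            for obj in head_ind2entities.get(obj_head, []):
                if (subj[1], obj[1], label) in tail_links:
                    triples.append((subj, label, obj))
    return entities, triples


# Object at tokens 8..9, subject at tokens 15..19 (tails inclusive),
# so the subject comes after the object and the reversed links are used.
spots = [
    (8, 9, ("person", "EH2ET")),
    (15, 19, ("TV variety", "EH2ET")),
    (8, 15, ("guest", "OH2SH")),
    (9, 19, ("guest", "OT2ST")),
]
entities, triples = decode_plus(spots)
print(triples)
```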

Model

Handshaking Tagging works on token pairs. The feature representation of a pair is:
[Formula image: feature representation of a pair]
The model's prediction of the link for a pair is:
[Formula image: link prediction for a pair]
Model loss function:
[Formula image: model loss]
where N is the length of the sentence, E, H and T stand for EH-TO-ET, SH-to-OH and ST-to-OT respectively, and $\hat{l}^{*}$ is the true tag.
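
Since the formula images are missing here, I have written the formulas out below as I remember them from the TPLinker paper. Treat this as a reconstruction to be checked against the paper, not as a copy of the original figures. The notation is my assumption: $h_i$ is the encoder output for token $i$, $W_h, b_h, W_o, b_o$ are trainable parameters, and $y_{i,j}$ is the tag of the token pair $(w_i, w_j)$.

```latex
\begin{align}
h_{i,j} &= \tanh\left(W_h \cdot [h_i; h_j] + b_h\right), \quad j \ge i \\
P(y_{i,j}) &= \mathrm{Softmax}\left(W_o \cdot h_{i,j} + b_o\right) \\
\mathrm{link}(w_i, w_j) &= \arg\max_{l} P(y_{i,j} = l) \\
L_{link} &= -\frac{1}{N} \sum_{i=1,\, j \ge i}^{N} \; \sum_{* \in \{E, H, T\}} \log P\left(y^{*}_{i,j} = \hat{l}^{*}\right)
\end{align}
```

The condition $j \ge i$ is what restricts the pairs to the upper triangle of the matrix, matching the Handshaking Tagging scheme.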

Experimental results

The comparison between TPLinker and the experimental results of other models is shown in the figure below:
[Figure: Comparison of experimental results]
I have recently been using the TPLinker model for Chinese relation extraction. The data is processed into the following format:

{
    "id": 9,
    "text": " In terms of tutor lineup , Ying Da is expected to join hands 《 The king of Chinese comedy 》 Select a new generation of comedians ",
    "relation_list": [
      {
        "subject": " The king of Chinese comedy ",
        "object": " Ying Da ",
        "subj_char_span": [
          15,
          20
        ],
        "obj_char_span": [
          8,
          10
        ],
        "predicate": " The guest ",
        "subj_tok_span": [
          15,
          20
        ],
        "obj_tok_span": [
          8,
          10
        ]
      }
    ],
    "entity_list": [
      {
        "text": " Ying Da ",
        "type": " figure ",
        "char_span": [
          8,
          10
        ],
        "tok_span": [
          8,
          10
        ]
      },
      {
        "text": " The king of Chinese comedy ",
        "type": " TV variety ",
        "char_span": [
          15,
          20
        ],
        "tok_span": [
          15,
          20
        ]
      }
    ]
  }
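
When preparing data in this format, a span mistake is easy to make, so I find a small consistency check useful. The sketch below is my own helper, not part of TPLinker: `char_span` and `check_sample` are hypothetical names, and it assumes that spans are [start, end) character offsets into "text". The tok_span fields are left out because they depend on the tokenizer.

```python
def char_span(text, mention, start=0):
    """Return the [start, end) character span of the first occurrence of mention."""
    begin = text.find(mention, start)
    if begin < 0:
        raise ValueError(f"{mention!r} not found in text")
    return [begin, begin + len(mention)]


def check_sample(sample):
    """Return a list of messages for spans that do not match their strings."""
    text = sample["text"]
    problems = []
    for ent in sample.get("entity_list", []):
        begin, end = ent["char_span"]
        if text[begin:end] != ent["text"]:
            problems.append(f"entity {ent['text']!r} != text[{begin}:{end}]")
    for rel in sample.get("relation_list", []):
        for role, key in (("subject", "subj_char_span"), ("object", "obj_char_span")):
            begin, end = rel[key]
            if text[begin:end] != rel[role]:
                problems.append(f"{role} {rel[role]!r} != text[{begin}:{end}]")
    return problems


# A made-up English sentence, only to exercise the helpers.
text = "Ying Da joins The King of Chinese Comedy"
subject, obj = "The King of Chinese Comedy", "Ying Da"
sample = {
    "id": 0,
    "text": text,
    "relation_list": [{
        "subject": subject,
        "object": obj,
        "subj_char_span": char_span(text, subject),
        "obj_char_span": char_span(text, obj),
        "predicate": "guest",
    }],
    "entity_list": [
        {"text": obj, "type": "person", "char_span": char_span(text, obj)},
        {"text": subject, "type": "TV variety", "char_span": char_span(text, subject)},
    ],
}
print(check_sample(sample))
```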

Recently I have been studying relation extraction, mainly three joint extraction models: TPLinker, CasRel and Su Shen's bert4keras. I ran a simple comparative experiment on the baidurelation2020 competition dataset. GPU memory was limited, so batch size and max length were set fairly small, and the models were not trained for enough rounds: all three used 20 epochs. The experiment shows that Su Shen's bert4keras and CasRel give similar results, while the three models differ on some samples. Here is just one set of comparative examples:

TPLinker model results
{
    "text": " Except for the superhero cast of the blind , There are also several well-known behind the scenes workers , such as “ Father of Marvel ” Stan · Li ( There are also many guest stars in front of the screen ),《 Galaxy escort 》 James, the director of the series · Goon ( He was also “ Raccoon rocket ”), Kevin, chairman of Marvel pictures · Fitch, wait ",
    "id": "test_13",
    "relation_list": [
      {
        "subject": " Galaxy escort ",
        "object": " James · Goon ",
        "subj_tok_span": [
          56,
          61
        ],
        "obj_tok_span": [
          67,
          73
        ],
        "subj_char_span": [
          56,
          61
        ],
        "obj_char_span": [
          67,
          73
        ],
        "predicate": " The director "
      },
      {
        "subject": " Raccoon rocket ",
        "object": " James · Goon ",
        "subj_tok_span": [
          80,
          84
        ],
        "obj_tok_span": [
          67,
          73
        ],
        "subj_char_span": [
          80,
          84
        ],
        "obj_char_span": [
          67,
          73
        ],
        "predicate": " The director "
      }
    ]
  }
  {
    "text": " C.T.Hsia graduated from Shanghai Hujiang University ( This is an American missionary school ), The brothers have neither background nor backing in Peking University ",
    "id": "test_15",
    "relation_list": [
      {
        "subject": " C.T.Hsia  ",
        "object": " Shanghai Hujiang University ",
        "subj_tok_span": [
          0,
          3
        ],
        "obj_tok_span": [
          6,
          12
        ],
        "subj_char_span": [
          0,
          3
        ],
        "obj_char_span": [
          6,
          12
        ],
        "predicate": " University one is graduated from "
      }
    ]
  }
  {
    "text": "《 Forget 》 It is a song sung by Taiwanese singer Teresa Teng , This song was originally written in 1979 year 9 month 20 Day included in the album 《 An unforgettable day 》 Issued in Taiwan and other places , Same year 11 month 15 Day included in the album 《 Sweet honey 》 Issued in Hong Kong and other places ",
    "id": "test_24",
    "relation_list": [
      {
        "subject": " Forget ",
        "object": " An unforgettable day ",
        "subj_tok_span": [
          1,
          3
        ],
        "obj_tok_span": [
          42,
          47
        ],
        "subj_char_span": [
          1,
          3
        ],
        "obj_char_span": [
          46,
          51
        ],
        "predicate": " The album "
      },
      {
        "subject": " Forget ",
        "object": " Sweet honey ",
        "subj_tok_span": [
          1,
          3
        ],
        "obj_tok_span": [
          69,
          72
        ],
        "subj_char_span": [
          1,
          3
        ],
        "obj_char_span": [
          75,
          78
        ],
        "predicate": " The album "
      },
      {
        "subject": " Forget ",
        "object": " Teresa Teng ",
        "subj_tok_span": [
          1,
          3
        ],
        "obj_tok_span": [
          10,
          13
        ],
        "subj_char_span": [
          1,
          3
        ],
        "obj_char_span": [
          10,
          13
        ],
        "predicate": " singer "
      }
    ]
  }

CasRel model results

{
    "text": " Except for the superhero cast of the blind , There are also several well-known behind the scenes workers , such as “ Father of Marvel ” Stan · Li ( There are also many guest stars in front of the screen ),《 Galaxy escort 》 James, the director of the series · Goon ( He was also “ Raccoon rocket ”), Kevin, chairman of Marvel pictures · Fitch, wait ",
    "relation": [
      [
        " Galaxy escort ",
        " The director ",
        " James · Goon "
      ]
    ]
  }
{
    "text": " C.T.Hsia graduated from Shanghai Hujiang University ( This is an American missionary school ), The brothers have neither background nor backing in Peking University ",
    "relation": [
      [
        " C.T.Hsia  ",
        " University one is graduated from ",
        " Hujiang University "
      ],
      [
        " C.T.Hsia  ",
        " University one is graduated from ",
        " Shanghai Hujiang University "
      ]
    ]
  }
  {
    "text": "《 Forget 》 It is a song sung by Taiwanese singer Teresa Teng , This song was originally written in 1979 year 9 month 20 Day included in the album 《 An unforgettable day 》 Issued in Taiwan and other places , Same year 11 month 15 Day included in the album 《 Sweet honey 》 Issued in Hong Kong and other places ",
    "relation": [
      [
        " Forget ",
        " singer ",
        " Teresa Teng "
      ],
      [
        " Forget ",
        " The album ",
        " An unforgettable day "
      ]
    ]
  }

Su Shen's bert4keras results

{
    "text": " Except for the superhero cast of the blind , There are also several well-known behind the scenes workers , such as “ Father of Marvel ” Stan · Li ( There are also many guest stars in front of the screen ),《 Galaxy escort 》 James, the director of the series · Goon ( He was also “ Raccoon rocket ”), Kevin, chairman of Marvel pictures · Fitch, wait ",
    "bert4keras_relation": [
      [
        " Galaxy escort ",
        " The director ",
        " James · Goon "
      ]
    ]
  }
{
    "text": " C.T.Hsia graduated from Shanghai Hujiang University ( This is an American missionary school ), The brothers have neither background nor backing in Peking University ",
    "bert4keras_relation": [
      [
        " C.T.Hsia  ",
        " University one is graduated from ",
        " Hujiang University "
      ],
      [
        " C.T.Hsia  ",
        " University one is graduated from ",
        " Shanghai Hujiang University "
      ]
    ]
  }
  {
    "text": "《 Forget 》 It is a song sung by Taiwanese singer Teresa Teng , This song was originally written in 1979 year 9 month 20 Day included in the album 《 An unforgettable day 》 Issued in Taiwan and other places , Same year 11 month 15 Day included in the album 《 Sweet honey 》 Issued in Hong Kong and other places ",
    "bert4keras_relation": [
      [
        " Sweet honey ",
        " singer ",
        " Teresa Teng "
      ],
      [
        " Forget ",
        " The album ",
        " An unforgettable day "
      ],
      [
        " Forget ",
        " The album ",
        " Sweet honey "
      ],
      [
        " Forget ",
        " singer ",
        " Teresa Teng "
      ],
      [
        " An unforgettable day ",
        " singer ",
        " Teresa Teng "
      ]
    ]
  }
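
Comparing the three outputs by eye gets tedious, so here is a small sketch of how the predictions could be normalized to sets of triples and diffed. The helper names are my own, and the two inputs are just the first test sentence from the TPLinker and CasRel results above, reduced to the fields the comparison needs.

```python
def tplinker_triples(sample):
    """Triples from the TPLinker output format ("relation_list" of dicts)."""
    return {
        (r["subject"].strip(), r["predicate"].strip(), r["object"].strip())
        for r in sample["relation_list"]
    }


def list_triples(sample, key="relation"):
    """Triples from the list format used by the CasRel and bert4keras outputs."""
    return {tuple(part.strip() for part in triple) for triple in sample[key]}


def compare(a, b):
    """Split two triple sets into shared and model-specific predictions."""
    return {"both": a & b, "only_a": a - b, "only_b": b - a}


tplinker_out = {"relation_list": [
    {"subject": " Galaxy escort ", "predicate": " The director ", "object": " James · Goon "},
    {"subject": " Raccoon rocket ", "predicate": " The director ", "object": " James · Goon "},
]}
casrel_out = {"relation": [[" Galaxy escort ", " The director ", " James · Goon "]]}

diff = compare(tplinker_triples(tplinker_out), list_triples(casrel_out))
print(diff["only_a"])  # triples predicted by TPLinker only
```

For the bert4keras output the same `list_triples` helper can be used with `key="bert4keras_relation"`.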

If you spot any mistakes, please point them out.


Copyright notice
This article was written by [xuanningmeng]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/02/202202170557267240.html