BiLSTM and CRF

2022-06-25 17:38:00 Green Lantern swordsman

I recently watched Mr. Huang's video class and suddenly realized that this material felt unfamiliar, so I went looking for references. Here are my notes.
One. Framework
For the framework, I think Lao Huang used the same diagram. Its author has more worth reading, so I attach the link directly [link].
One remark: the diagram there is very important. Note that in the LSTM output, each word gets a probability (score) for each label.
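To make "a probability for each label, per word" concrete, here is a minimal sketch in plain Python. It assumes the network has already produced one raw score per label for every word; the score values and the three-label tag set are made up for illustration, and a softmax is just one common way to turn scores into probabilities.

```python
import math

LABELS = ["B", "I", "O"]  # hypothetical tag set for illustration


def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_probabilities(sentence_scores):
    """Apply softmax independently to each word's score vector."""
    return [softmax(word_scores) for word_scores in sentence_scores]


# Made-up scores for a 3-word sentence: one row per word, one column per label.
scores = [
    [2.0, 0.1, 0.5],
    [0.2, 1.8, 0.3],
    [0.1, 0.2, 2.5],
]
probs = label_probabilities(scores)

# Picking the most probable label for each word independently.
best = [LABELS[row.index(max(row))] for row in probs]
print(best)
```

Choosing each word's label independently like this ignores which label sequences are valid, which is the gap the CRF layer is meant to close.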
Two. LSTM parameters

  1. Parameter calculation
  2. The official LSTM documentation
  3. The LSTM structure diagram is as follows
     [Figure: LSTM structure]
    The LSTM formulas are:
     [Figure: LSTM formulas]
  4. The GRU structure diagram is:
     [Figure: GRU structure]
    In the figure, zt and rt are the update gate and the reset gate, respectively. The update gate controls how much of the previous state is carried into the current state. Taking the update as ht = (1 - zt) * h(t-1) + zt * h~t, the larger the update gate value, the less of the previous state is carried in (some references swap zt and 1 - zt, which reverses this). The reset gate controls how much of the previous state is written into the current candidate h~t: the smaller the reset gate, the less of the previous state is written.
     [Figure]
    The update gate is the core of the GRU. When analyzing the formulas, focus mainly on how the update gate is written.
    Note: rt and zt are both computed from h(t-1) and xt, so they in fact capture the relationship between the two.
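For reference, here is the standard textbook formulation of both cells. This is a sketch of the common notation, not necessarily what the original figures showed: the weight names W, U, b are assumptions, \sigma is the logistic sigmoid, and \odot is element-wise multiplication. The GRU update is written in the convention where a larger z_t brings in less of h_{t-1}.

```latex
% LSTM cell
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
g_t &= \tanh(W_g x_t + U_g h_{t-1} + b_g) \\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}

% GRU cell
\begin{aligned}
z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z) \\
r_t &= \sigma(W_r x_t + U_r h_{t-1} + b_r) \\
\tilde{h}_t &= \tanh(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h) \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
\end{aligned}
```

Both z_t and r_t take x_t and h_{t-1} as inputs, which is the point of the note above.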
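The parameter calculation in item 1 can be sketched as a small function. This assumes the usual layout of four weight blocks for an LSTM (three gates plus the candidate) and three for a GRU, each with an input matrix, a recurrent matrix, and a bias. The function names and the `bias_vectors` argument are hypothetical; some libraries keep one bias vector per block and others keep two (PyTorch's `nn.LSTM` stores separate input and recurrent biases), so the count depends on that choice.

```python
def _per_block(input_size, hidden_size, bias_vectors):
    """Parameters of one gate/candidate block."""
    input_weights = hidden_size * input_size        # maps x_t
    recurrent_weights = hidden_size * hidden_size   # maps h_{t-1}
    biases = bias_vectors * hidden_size
    return input_weights + recurrent_weights + biases


def lstm_param_count(input_size, hidden_size, bias_vectors=1, bidirectional=False):
    """Parameters of a single LSTM layer (4 blocks: i, f, o, candidate)."""
    count = 4 * _per_block(input_size, hidden_size, bias_vectors)
    return 2 * count if bidirectional else count


def gru_param_count(input_size, hidden_size, bias_vectors=1, bidirectional=False):
    """Parameters of a single GRU layer (3 blocks: z, r, candidate)."""
    count = 3 * _per_block(input_size, hidden_size, bias_vectors)
    return 2 * count if bidirectional else count


# Example with made-up sizes: 100-dim word embeddings, 128 hidden units.
print(lstm_param_count(100, 128))                      # one bias per block
print(lstm_param_count(100, 128, bias_vectors=2))      # two biases per block
print(lstm_param_count(100, 128, bidirectional=True))  # BiLSTM: two directions
print(gru_param_count(100, 128))
```

A BiLSTM simply runs two independent LSTMs, one per direction, so its count is twice that of a single direction.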

Three. CRF in detail

  1. Following an expert's blog post [blog link], I found the most impressive English explanation [link], which deepened my understanding of the material in section One.

  2. Besides, as I recall, a CRF relies on feature templates (the U and B templates, i.e. unigram and bigram templates); once the templates are added, the CRF can learn BIO labeling by itself. My notes cover this.

  3. The BiLSTM produces the emission probability (score) of each label (BIO) for each word; the CRF models the transition probability (score) between the labels of adjacent words.

  4. Loss function: it is built from the score of the true label path and the total score of all paths, namely the negative log of their ratio.

  5. Prediction: the Viterbi algorithm.

  6. Why does a CRF decoded with the Viterbi algorithm not suffer from the label bias problem of the maximum entropy (Markov) model?

    Answer: the CRF is normalized globally over all paths, and Viterbi decodes over those globally scored paths. The maximum entropy (Markov) model is normalized locally, at each step starting from the previous state. Local normalization causes a local problem, namely the label bias problem. For details, see https://www.bbsmax.com/A/D854D91p5E/.
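Items 3 to 5 above can be sketched together in plain Python. This is a minimal illustration under stated assumptions, not a production CRF layer: the emission and transition scores are made up, labels are indexed B=0, I=1, O=2, and start/stop transitions are left out for brevity. `crf_nll` is the negative log of (true path score / total score of all paths), with the total computed by the forward algorithm, and `viterbi` finds the highest-scoring path. All function names are hypothetical.

```python
import math
from itertools import product


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def path_score(emissions, transitions, path):
    """Score of one label path: emissions plus label-to-label transitions."""
    score = emissions[0][path[0]]
    for t in range(1, len(path)):
        score += transitions[path[t - 1]][path[t]] + emissions[t][path[t]]
    return score


def log_partition(emissions, transitions):
    """Forward algorithm: log of the summed exp-scores of all label paths."""
    n = len(emissions[0])
    alpha = list(emissions[0])
    for t in range(1, len(emissions)):
        alpha = [
            log_sum_exp([alpha[i] + transitions[i][j] for i in range(n)]) + emissions[t][j]
            for j in range(n)
        ]
    return log_sum_exp(alpha)


def crf_nll(emissions, transitions, gold_path):
    """Negative log-likelihood: log Z minus the score of the true path."""
    return log_partition(emissions, transitions) - path_score(emissions, transitions, gold_path)


def viterbi(emissions, transitions):
    """Return (best label path, its score) by dynamic programming."""
    n = len(emissions[0])
    delta = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_delta, pointers = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: delta[i] + transitions[i][j])
            pointers.append(best_i)
            new_delta.append(delta[best_i] + transitions[best_i][j] + emissions[t][j])
        delta = new_delta
        backpointers.append(pointers)
    last = max(range(n), key=lambda j: delta[j])
    path = [last]
    for pointers in reversed(backpointers):  # follow pointers back to the first word
        path.append(pointers[path[-1]])
    path.reverse()
    return path, delta[last]


# Made-up scores: rows of `emissions` are words, columns are labels B, I, O.
emissions = [[2.0, 0.1, 0.5], [0.2, 1.8, 0.3], [0.1, 0.2, 2.5]]
transitions = [
    [0.0, 1.0, 0.0],   # from B
    [0.0, 0.5, 0.5],   # from I
    [0.5, -5.0, 0.5],  # from O: O -> I is heavily penalized
]

best_path, best_score = viterbi(emissions, transitions)
loss = crf_nll(emissions, transitions, gold_path=[0, 1, 2])

# Brute-force check over all 27 paths; only feasible for tiny examples.
all_paths = list(product(range(3), repeat=len(emissions)))
brute_log_z = log_sum_exp([path_score(emissions, transitions, p) for p in all_paths])
brute_best = max(all_paths, key=lambda p: path_score(emissions, transitions, p))
print(best_path, loss)
```

Because the normalizer sums over whole paths rather than over the next label at each step, a low-scoring transition lowers the probability of every path that uses it, which is the global normalization referred to in item 6.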
