
Meaning of jieba Part-of-Speech Tags

2022-06-25 08:28:00 Happy little yard farmer

Part-of-speech tagging in jieba word segmentation

The default mode uses jieba.posseg.cut() and covers 24 POS tags (lowercase letters).
paddle mode adds 4 proper-name category labels (uppercase letters).
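A lookup table makes the tags easier to read. The sketch below maps some of the lowercase ICTCLAS-style tags and the four uppercase paddle labels to short English glosses. It is an illustrative subset, not the full tag set, and the names TAG_GLOSSES, describe_tag and is_paddle_entity_label are hypothetical helpers, not part of jieba.

```python
# Illustrative subset of tag glosses; these names are hypothetical, not jieba API.
TAG_GLOSSES = {
    "n": "common noun",
    "nr": "person name",
    "ns": "place name",
    "nt": "organization name",
    "nz": "other proper noun",
    "v": "verb",
    "vn": "verbal noun",
    "a": "adjective",
    "d": "adverb",
    "m": "numeral",
    "q": "measure word",
    "r": "pronoun",
    "p": "preposition",
    "c": "conjunction",
    "u": "particle",
    "w": "punctuation",
    # Uppercase labels are the proper-name categories added in paddle mode.
    "PER": "person name",
    "LOC": "place name",
    "ORG": "organization name",
    "TIME": "time expression",
}


def describe_tag(flag):
    """Return an English gloss for a tag, or a fallback for unlisted tags."""
    return TAG_GLOSSES.get(flag, "unlisted tag")


def is_paddle_entity_label(flag):
    """Uppercase tags are the paddle-mode proper-name categories."""
    return flag.isupper()


# Sample (word, flag) pairs of the kind jieba.posseg.cut() yields.
for word, flag in [("我", "r"), ("爱", "v"), ("北京", "ns"), ("天安门", "ns")]:
    print(word, flag, describe_tag(flag))
```

Tags missing from the table fall back to "unlisted tag", so the helper stays usable if the tagger emits a label that is not listed here.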

  • jieba.posseg.POSTokenizer(tokenizer=None) creates a new custom POS tagger; the tokenizer parameter specifies the jieba.Tokenizer used internally. jieba.posseg.dt is the default POS tagger.
  • It tags each word with its part of speech after segmenting the sentence, using a notation compatible with ICTCLAS.
  • Besides jieba's default segmentation mode, POS tagging is also available in paddle mode. paddle mode is lazily loaded: enable_paddle() installs paddlepaddle-tiny and imports the related code.
  • Usage example:
>>> import jieba
>>> import jieba.posseg as pseg
>>> words = pseg.cut("我爱北京天安门")  # jieba default mode; the sentence means "I love Beijing Tiananmen"
>>> jieba.enable_paddle()  # enable paddle mode; supported since version 0.40, not in earlier versions
>>> words = pseg.cut("我爱北京天安门", use_paddle=True)  # paddle mode
>>> for word, flag in words:
...    print('%s %s' % (word, flag))
...
我 r
爱 v
北京 ns
天安门 ns
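
The custom-tagger option from the list above can be sketched as follows. This is a minimal sketch under stated assumptions: keep_flags and tag_with_private_tokenizer are hypothetical helper names, and extra_words is an illustrative parameter. jieba is imported lazily inside the function, so the filtering helper can be tried on plain (word, flag) tuples without jieba installed.

```python
def keep_flags(pairs, prefixes=("n",)):
    """Keep words whose POS flag starts with one of the given prefixes.

    `pairs` is any iterable of (word, flag) items, such as the output
    of jieba.posseg.cut().
    """
    return [word for word, flag in pairs if flag.startswith(tuple(prefixes))]


def tag_with_private_tokenizer(sentence, extra_words=()):
    """Tag `sentence` with a POSTokenizer that wraps its own Tokenizer.

    `extra_words` is a hypothetical list of (word, tag) entries added only
    to this tokenizer, leaving the default jieba.posseg.dt untouched.
    """
    import jieba
    import jieba.posseg as pseg

    tokenizer = jieba.Tokenizer()          # separate dictionary state
    for word, tag in extra_words:
        tokenizer.add_word(word, tag=tag)
    tagger = pseg.POSTokenizer(tokenizer)  # wraps the custom tokenizer
    return [(pair.word, pair.flag) for pair in tagger.cut(sentence)]


# The filter works on plain tuples; no jieba import is needed for this part.
sample = [("我", "r"), ("爱", "v"), ("北京", "ns"), ("天安门", "ns")]
print(keep_flags(sample))  # ['北京', '天安门']
```

With jieba installed, keep_flags(tag_with_private_tokenizer("我爱北京天安门")) would apply the same noun filter to real tagger output. Keeping a separate Tokenizer is useful when custom dictionary entries should not leak into the module-level default.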


Copyright notice
This article was written by [Happy little yard farmer]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/02/202202200556381142.html