IK Analyzer (word segmenter)
2022-06-26 18:33:00 【cc_ nanke dream】
1、 The IK analyzer offers two segmentation modes: ik_max_word and ik_smart.
【1】ik_max_word (the commonly used one) splits the text at the finest granularity
POST _analyze
{
  "analyzer": "ik_max_word",
  "text": "Beijing Changchun bridge subway station"
}
【2】ik_smart splits the text at the coarsest granularity
POST _analyze
{
  "analyzer": "ik_smart",
  "text": "Beijing Changchun bridge subway station"
}
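To compare the two modes from a script rather than the console, the same `_analyze` call can be sketched in Python with only the standard library. The address `http://localhost:9200` and the helper names `build_analyze_request` and `analyze` are assumptions for illustration. A node with security enabled would also need credentials, which this sketch leaves out.

```python
import json
import urllib.request
from typing import List

# Assumption: a local node without authentication.
ES_URL = "http://localhost:9200"


def build_analyze_request(analyzer: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST /_analyze request for the given analyzer."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{ES_URL}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(analyzer: str, text: str) -> List[str]:
    """Send the request and return only the token strings from the response."""
    with urllib.request.urlopen(build_analyze_request(analyzer, text)) as resp:
        payload = json.load(resp)
    return [tok["token"] for tok in payload["tokens"]]
```

Calling `analyze("ik_max_word", text)` and `analyze("ik_smart", text)` on the same Chinese sentence makes the difference in granularity easy to see: the first call normally returns more, and shorter, tokens.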
2、 Using an extension dictionary
Extension words are words that should not be split apart but kept together as a single token, for example "Changchun bridge"
1、 Create a custom extension dictionary
Go to config/analysis-ik (if IK was installed with the plugin command) or /usr/elasticsearch/plugins/analysis-ik/config/ and add a custom dictionary file in that directory
vi cc_ext_dict.dic   then enter: Changchun bridge

2、 Register the custom extension dictionary file in IKAnalyzer.cfg.xml
vi IKAnalyzer.cfg.xml
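The exact edit the author made to the file is not shown here. As a hedged sketch, the layout below follows the IKAnalyzer.cfg.xml that ships with the IK plugin, where the `ext_dict` entry lists extension dictionaries and `ext_stopwords` lists stop-word dictionaries, with several files separated by `;`. The helper `render_ik_config` is a hypothetical name, and the paths are taken to be relative to the directory that holds IKAnalyzer.cfg.xml.

```python
from xml.sax.saxutils import escape

# Layout modelled on the stock IKAnalyzer.cfg.xml (a Java XML properties file).
IK_CONFIG_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <!-- custom extension dictionaries, separated by ';' -->
    <entry key="ext_dict">{ext_dict}</entry>
    <!-- custom stop-word dictionaries, separated by ';' -->
    <entry key="ext_stopwords">{ext_stopwords}</entry>
</properties>
"""


def render_ik_config(ext_dicts, stop_dicts=()):
    """Return the file content with the given dictionary file names filled in."""
    return IK_CONFIG_TEMPLATE.format(
        ext_dict=escape(";".join(ext_dicts)),
        ext_stopwords=escape(";".join(stop_dicts)),
    )


# Register only the extension dictionary created in the previous step.
print(render_ik_config(["cc_ext_dict.dic"]))
```

Editing the file by hand works just as well. The point of the sketch is only to show which entry the dictionary file name goes into.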

3、 Restart Elasticsearch
/usr/elasticsearch/bin/elasticsearch
3、 Using a stop-word dictionary
Stop words are words that occur very frequently in text but contribute little to its meaning, for example Chinese particles and interjections such as 的 ("of"), 了, "oh" and "well". They are usually filtered out and not indexed
1、 Create a custom stop-word dictionary
Go to config/analysis-ik (if IK was installed with the plugin command) or /usr/elasticsearch/plugins/analysis-ik/config/ and add a custom dictionary file in that directory
vi cc_stop_dict.dic   then enter the stop words (的, 了, and so on), one per line
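IK dictionary files, for both extension words and stop words, are plain UTF-8 text with one word per line. If the word list comes from a script rather than an editor, a minimal sketch could look like this. The function name `write_ik_dictionary` and the sample words are illustrative assumptions.

```python
from pathlib import Path
from typing import Iterable


def write_ik_dictionary(path: Path, words: Iterable[str]) -> int:
    """Write one word per line as UTF-8 and return how many words were written.

    Blank entries are skipped and duplicates are dropped, keeping first-seen order.
    """
    cleaned = list(dict.fromkeys(w.strip() for w in words if w.strip()))
    # The "utf-8" codec writes no byte-order mark; newline="\n" keeps Unix line endings.
    with open(path, "w", encoding="utf-8", newline="\n") as fh:
        for word in cleaned:
            fh.write(word + "\n")
    return len(cleaned)


# Example: write two stop words into the current directory.
count = write_ik_dictionary(Path("cc_stop_dict.dic"), ["的", "了"])
print(count, "words written")
```

The same helper would serve for cc_ext_dict.dic, since only the file name and the words differ.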

2、 Register the stop-word dictionary file in IKAnalyzer.cfg.xml (the ext_stopwords entry)

3、 Restart Elasticsearch
/usr/elasticsearch/bin/elasticsearch
4、 Using synonyms
Words with the same meaning should also be found when searching, for example two different Chinese words that both mean "steamed bread". This is called a synonym query
Note: extension words and stop words take effect at index time, whereas synonyms take effect at search time
1、 Create synonym.txt
vi synonym.txt   then enter the synonyms
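The content of synonym.txt is not shown here. Elasticsearch's `synonym` filter reads the Solr format by default: a comma-separated line such as `a,b` declares equivalent terms, `a,b => c` rewrites the terms on the left to the term on the right, and lines starting with `#` are comments. The toy parser below only illustrates that syntax and is not Elasticsearch's own parser. The sample words 馒头 and 馍 are assumed stand-ins for the author's "steamed bread" pair.

```python
from typing import Dict, List


def parse_synonym_rules(text: str) -> Dict[str, List[str]]:
    """Map every input term to the list of terms it is rewritten to."""
    rules: Dict[str, List[str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if "=>" in line:
            # Explicit mapping: terms on the left are replaced by terms on the right.
            left, right = line.split("=>", 1)
            sources = [t.strip() for t in left.split(",") if t.strip()]
            targets = [t.strip() for t in right.split(",") if t.strip()]
        else:
            # Equivalence: every term expands to the whole group.
            # Note that the separator is the ASCII comma, not the full-width one.
            sources = targets = [t.strip() for t in line.split(",") if t.strip()]
        for source in sources:
            bucket = rules.setdefault(source, [])
            for target in targets:
                if target not in bucket:
                    bucket.append(target)
    return rules


sample = "# assumed example content for synonym.txt\n馒头,馍\n"
print(parse_synonym_rules(sample))
```

With an equivalence line like the one in `sample`, a query for either word expands to both, which is why a search can match documents that contain only the other term.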

2、 Restart Elasticsearch
/usr/elasticsearch/bin/elasticsearch
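3、 Reference synonym.txt when creating the index
synonyms_path is resolved relative to /usr/elasticsearch/config/
analysis is a directory you create there yourself, so the file lives at /usr/elasticsearch/config/analysis/synonym.txt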

PUT /cc003
{
  "settings": {
    "analysis": {
      "filter": {
        "word_sync": {
          "type": "synonym",
          "synonyms_path": "analysis/synonym.txt"
        }
      },
      "analyzer": {
        "ik_sync_max_word": {
          "filter": [
            "word_sync"
          ],
          "type": "custom",
          "tokenizer": "ik_max_word"
        },
        "ik_sync_smart": {
          "filter": [
            "word_sync"
          ],
          "type": "custom",
          "tokenizer": "ik_smart"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "ik_sync_max_word",
        "search_analyzer": "ik_sync_max_word"
      }
    }
  }
}
4、 Index a document
POST /cc003/_doc/1
{
  "name": "What do you call steamed bread"
}
5、 Search
POST /cc003/_search
{
  "query": {
    "match": {
      "name": "Steamed bread"
    }
  }
}