当前位置:网站首页>AC automata
AC automata
2022-07-24 22:09:00 【Harrison who likes to knock code】
AC Problems solved by automata
Play on the prefix tree KMP,
Suppose a string array contains several sensitive words , Given a big article ,AC Automata is to find out which sensitive words are included in a big article
AC Automata is one of the most basic things in many similar matching algorithms
step
First, build some sensitive words into a prefix tree ,( Prefix tree ,,)
The organization between sensitive words is a prefix tree , But we need to upgrade the prefix tree

need Add a line on each node fail The pointer (fail The pointer is indicated by a dotted line ), The first node fail The pointer points to null , And the subordinate direct nodes of the head node are artificially specified fail The pointers all point to the header node ,

Then the bottom fail How to set the pointer ?
First, finish building the prefix tree , Set up fail The pointer , Set the... Of each node according to the order of width priority fail The pointer ,
The following figure ,X The parent node of a node has a path to its own b, however X Of the parent node of fail Pointer refers to the head node , The head node has a trend a node 、b node 、c Path of node , That is to say, the head node does not go b Direction of the road ; So the head node follows fail The pointer goes down , Came to null,null Of course, the node does not go b Direction of the road , therefore x Of fail The pointer points directly to the header node

Then set the following figure X Nodal fail The pointer , Same as above , The path from the parent node to its own direction is d, But from the parent node fail The pointer cannot find the direction d Direction of the road , therefore X Nodal fail The pointer points directly to the header node

Then set the following figure X Nodal fail The pointer ( Set in width first order ),X My father goes to his own way c, Paternal fail Pointer to head node , But the head node has a trend c Direction of the road , therefore X Of fail The pointer points to this point and goes to c Node of direction

fail Pointer setting rules
The first 3) This situation is more complicated ,



AC automata fail The essence of pointer
If the match fails , In the remaining string , Which string prefix matches the longest suffix that must contain the current character ,




package com.harrison.class22;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;
/** * @author Harrison * @create 2022-04-02-10:21 * @motto looking for him for thousand times in the crowd , Suddenly look back , The man was in the dim light . */
public class Code03_ACAutomaton {
// The node of the prefix tree
public static class Node{
// If one Node,end It's empty , Description is not the end
// If end Not empty , Indicates that this point is the end of a string ,end The value of is this string
public String end;
// Only on the top end When the variable is not empty ,endUse It makes sense
// Express , This string has no answer before , Prevent repeated collection of answers
public boolean endUse;
public Node fail;
// The path below the prefix tree , Hypothetical article , All sensitive words are lowercase , You can fix it 26 length
public Node[] nexts;
public Node(){
endUse=false;
end=null;
fail=null;
nexts=new Node[26];
}
}
public static class ACAutomaton{
private Node root;
// about AC For automata, characters must be on the road , Not on the node , Just like prefix tree
public ACAutomaton(){
root=new Node();
}
// The process of building prefix tree , Call as many sensitive words as you have insert Method
// After the prefix tree is built , Call again build Method to set all fail The pointer
public void insert(String s){
char[] str=s.toCharArray();
Node cur=root;
int index=0;
for(int i=0; i<str.length; i++){
index=str[i]-'a';
if(cur.nexts[index]==null){
cur.nexts[index]=new Node();
}
cur=cur.nexts[index];
}
cur.end=s;
}
// Be careful : Not set the pop-up node itself fail The pointer
// Instead, pop up the node and set all its children fail The pointer
public void build(){
Queue<Node> queue=new LinkedList<>();
queue.add(root);
Node cur=null;
Node cfail=null;
while(!queue.isEmpty()){
// The current node pops up , All descendants of the current node join the queue , The current node gives its children settings fail The pointer
// Some father cur
cur=queue.poll();
for(int i=0; i<26; i++){
// All the way
// cur-> father i No. 1 son , Must take i Son No fail The pointer is set
if(cur.nexts[i]!=null){
// If there is i No. 1 son
cur.nexts[i].fail=root;// Set it to root, If found, set it to someone else
cfail=cur.fail;
while (cfail!=null){
if(cfail.nexts[i]!=null){
cur.nexts[i].fail=cfail.nexts[i];
break;
}
cfail=cfail.fail;
}
queue.add(cur.nexts[i]);
}
}
}
}
// Big article
public List<String> containWords(String content){
char[] str=content.toCharArray();
Node cur=root;
Node follow=null;
int index=0;
List<String> ans=new ArrayList<>();
for(int i=0; i<str.length; i++){
index=str[i]-'a';// road
// If the current character is not matched on this road , Just follow fail Direction to the next path
while(cur.nexts[index]==null && cur!=root){
cur=cur.fail;
}
// 1) Now come to the path , Can continue to match
// 2) Now come to the node , Is the root node of the prefix tree
cur=cur.nexts[index]!=null?cur.nexts[index]:root;
follow=cur;
while(follow!=root){
if(follow.endUse){
break;
}
// Different needs , Revise between this paragraph
if(follow.end!=null){
ans.add(follow.end);
follow.endUse=true;
}
// Different needs , Revise between this paragraph
follow=follow.fail;
}
}
return ans;
}
}
public static void main(String[] args) {
ACAutomaton ac = new ACAutomaton();
ac.insert("dhe");
ac.insert("he");
ac.insert("abcdheks");
// Set up fail The pointer
ac.build();
List<String> contains = ac.containWords("abcdhekskdjfafhasldkflskdjhwqaeruv");
for (String word : contains) {
System.out.println(word);
}
}
}
边栏推荐
- Esp32485 air temperature and humidity test
- Description of differences between esp32c3 led PWM use and esp32
- C # review the entrustment and event
- Shallow copy deep copy
- Gradle学习集合整合
- [postgraduate entrance examination English vocabulary training camp] day 11 - offer, form, maintain, critical
- Integrated swagger learning
- How to drain the applet correctly? Three positions of whole network drainage!
- 一种兼容、更小、易用的WEB字体API
- What is a database password?
猜你喜欢

吾爱第二课-去除网页弹窗
![[good question with two points]](/img/a2/8c0610c4aba6ace4b003efd92c31cd.png)
[good question with two points]

CAD break command

Web3 security go + security

Gradle learning set integration
![[combination of classes (define a class in a class)]](/img/ae/a8226e1795bb45171a11c65d35bcac.png)
[combination of classes (define a class in a class)]

VScode默认输出到调试控制台如何调整到终端以及两者中的乱码问题
![[e-commerce operation] teach you these tips to bid farewell to invalid preset replies](/img/5b/6682c613305deb3dc15401077d38a0.png)
[e-commerce operation] teach you these tips to bid farewell to invalid preset replies
![Mathematical derivation in [pumpkin Book ml] (task4) neural network](/img/11/8d0f7254c2a22d46f2407b5f773c5c.png)
Mathematical derivation in [pumpkin Book ml] (task4) neural network

工程项目管理软件排名
随机推荐
Applet location interface application
ACL 2022 | comparative learning based on optimal transmission to achieve interpretable semantic text similarity
腾讯+360+搜狗校招笔试题+知识点总结
Gradle learning - getting started with gradle
What should I pay attention to when choosing the self built database access method on ECs?
[e-commerce operation] teach you these tips to bid farewell to invalid preset replies
Deep understanding of affairs
Tencent +360+ Sogou school recruitment test + summary of knowledge points
Makefile basics -- extensions
ESP32C3 LED PWM使用和ESP32差异说明
[icml2022] climate change and machine learning: opportunities, challenges and considerations, 121 ppt
中移链(基于EOS)测试环境搭建
PCL点云处理之pcd文件转txt文件(单个或多个批量转换)(六十三)
After reading this article, I also understand this
LED digital display driver IC and anti-interference LED digital tube display driver ic-vk1s68c ssop24 are applicable to finger clip pulse oximeter, arm electronic sphygmomanometer, thermometer, fetal
Diou and ciou loss of loss function
Feeding Program Source Code to ZK VMs
Kubernetes v1.24 is deployed based on containerd
通过企业微信自建应用向微信推送信息
【考研词汇训练营】Day 12 —— native,separate,figure,contribute,species,assumption,suppose