当前位置:网站首页>AC automata
AC automata
2022-07-24 22:09:00 【Harrison who likes to knock code】
AC Problems solved by automata
Play on the prefix tree KMP,
Suppose a string array contains several sensitive words , Given a big article ,AC Automata is to find out which sensitive words are included in a big article
AC Automata is one of the most basic things in many similar matching algorithms
step
First, build some sensitive words into a prefix tree ,( Prefix tree ,,)
The organization between sensitive words is a prefix tree , But we need to upgrade the prefix tree

need Add a line on each node fail The pointer (fail The pointer is indicated by a dotted line ), The first node fail The pointer points to null , And the subordinate direct nodes of the head node are artificially specified fail The pointers all point to the header node ,

Then the bottom fail How to set the pointer ?
First, finish building the prefix tree , Set up fail The pointer , Set the... Of each node according to the order of width priority fail The pointer ,
The following figure ,X The parent node of a node has a path to its own b, however X Of the parent node of fail Pointer refers to the head node , The head node has a trend a node 、b node 、c Path of node , That is to say, the head node does not go b Direction of the road ; So the head node follows fail The pointer goes down , Came to null,null Of course, the node does not go b Direction of the road , therefore x Of fail The pointer points directly to the header node

Then set the following figure X Nodal fail The pointer , Same as above , The path from the parent node to its own direction is d, But from the parent node fail The pointer cannot find the direction d Direction of the road , therefore X Nodal fail The pointer points directly to the header node

Then set the following figure X Nodal fail The pointer ( Set in width first order ),X My father goes to his own way c, Paternal fail Pointer to head node , But the head node has a trend c Direction of the road , therefore X Of fail The pointer points to this point and goes to c Node of direction

fail Pointer setting rules
The first 3) This situation is more complicated ,



AC automata fail The essence of pointer
If the match fails , In the remaining string , Which string prefix matches the longest suffix that must contain the current character ,




package com.harrison.class22;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;
/** * @author Harrison * @create 2022-04-02-10:21 * @motto looking for him for thousand times in the crowd , Suddenly look back , The man was in the dim light . */
public class Code03_ACAutomaton {
// The node of the prefix tree
public static class Node{
// If one Node,end It's empty , Description is not the end
// If end Not empty , Indicates that this point is the end of a string ,end The value of is this string
public String end;
// Only on the top end When the variable is not empty ,endUse It makes sense
// Express , This string has no answer before , Prevent repeated collection of answers
public boolean endUse;
public Node fail;
// The path below the prefix tree , Hypothetical article , All sensitive words are lowercase , You can fix it 26 length
public Node[] nexts;
public Node(){
endUse=false;
end=null;
fail=null;
nexts=new Node[26];
}
}
public static class ACAutomaton{
private Node root;
// about AC For automata, characters must be on the road , Not on the node , Just like prefix tree
public ACAutomaton(){
root=new Node();
}
// The process of building prefix tree , Call as many sensitive words as you have insert Method
// After the prefix tree is built , Call again build Method to set all fail The pointer
public void insert(String s){
char[] str=s.toCharArray();
Node cur=root;
int index=0;
for(int i=0; i<str.length; i++){
index=str[i]-'a';
if(cur.nexts[index]==null){
cur.nexts[index]=new Node();
}
cur=cur.nexts[index];
}
cur.end=s;
}
// Be careful : Not set the pop-up node itself fail The pointer
// Instead, pop up the node and set all its children fail The pointer
public void build(){
Queue<Node> queue=new LinkedList<>();
queue.add(root);
Node cur=null;
Node cfail=null;
while(!queue.isEmpty()){
// The current node pops up , All descendants of the current node join the queue , The current node gives its children settings fail The pointer
// Some father cur
cur=queue.poll();
for(int i=0; i<26; i++){
// All the way
// cur-> father i No. 1 son , Must take i Son No fail The pointer is set
if(cur.nexts[i]!=null){
// If there is i No. 1 son
cur.nexts[i].fail=root;// Set it to root, If found, set it to someone else
cfail=cur.fail;
while (cfail!=null){
if(cfail.nexts[i]!=null){
cur.nexts[i].fail=cfail.nexts[i];
break;
}
cfail=cfail.fail;
}
queue.add(cur.nexts[i]);
}
}
}
}
// Big article
public List<String> containWords(String content){
char[] str=content.toCharArray();
Node cur=root;
Node follow=null;
int index=0;
List<String> ans=new ArrayList<>();
for(int i=0; i<str.length; i++){
index=str[i]-'a';// road
// If the current character is not matched on this road , Just follow fail Direction to the next path
while(cur.nexts[index]==null && cur!=root){
cur=cur.fail;
}
// 1) Now come to the path , Can continue to match
// 2) Now come to the node , Is the root node of the prefix tree
cur=cur.nexts[index]!=null?cur.nexts[index]:root;
follow=cur;
while(follow!=root){
if(follow.endUse){
break;
}
// Different needs , Revise between this paragraph
if(follow.end!=null){
ans.add(follow.end);
follow.endUse=true;
}
// Different needs , Revise between this paragraph
follow=follow.fail;
}
}
return ans;
}
}
public static void main(String[] args) {
ACAutomaton ac = new ACAutomaton();
ac.insert("dhe");
ac.insert("he");
ac.insert("abcdheks");
// Set up fail The pointer
ac.build();
List<String> contains = ac.containWords("abcdhekskdjfafhasldkflskdjhwqaeruv");
for (String word : contains) {
System.out.println(word);
}
}
}
边栏推荐
- [postgraduate entrance examination English vocabulary training camp] day 11 - offer, form, maintain, critical
- SQL语言的通用语法及分类(二)
- 【考研英语词汇训练营】Day 11 —— offer ,form ,maintain ,critical
- Apipost签约中国电信!携手加速企业数字化变革
- C# 使用SQLite
- Gradle 学习 ----Gradle 进阶说明
- CAD text styles
- 损失函数之Diou和Ciou loss
- Ue5 reports an error using the plug-in quixel Bridge
- How much does it cost to build your own personal server
猜你喜欢

Maixll dock QR code recognition

SVM - for linear separability (Part 2)

Kubernetes v1.24 is deployed based on containerd

Gradle学习集合整合

Web3 security go + security

【考研英语词汇训练营】Day 11 —— offer ,form ,maintain ,critical

Apipost signs up with Chinatelecom! Work together to accelerate the digital transformation of enterprises

从A76到A78——在变化中学习ARM微架构

Gradle 学习 ----Gradle 与Idea整合

Machine learning kmeans
随机推荐
Description of differences between esp32c3 led PWM use and esp32
ACL 2022 | comparative learning based on optimal transmission to achieve interpretable semantic text similarity
Use of templates
集成Swagger 学习
How does novice Xiaobai build a personal server?
2018mysql technology Q & a collection, hoping to give some help to students who like MySQL
Both Chen Chunhua and Mo Yan have words of suffering
"Iruntime": undeclared identifier
PR 2022 22.5 Chinese version
模板的使用
一键编译安装redis6.2.4
实现redis哨兵,模拟master故障场景
[e-commerce operation] teach you these tips to bid farewell to invalid preset replies
Tencent +360+ Sogou school recruitment test + summary of knowledge points
第二十周作业
RISC0:Towards a Unified Compilation Framework for Zero Knowledge
CAD copy commands
[combination of classes (define a class in a class)]
通过企业微信自建应用向微信推送信息
How to refine the top products of the website in the top 10 of the list for three consecutive months?