Create an automaton with Aho-Corasick

The Aho-Corasick string search algorithm is a string searching algorithm invented by Alfred V. Aho and Margaret J. Corasick. It is an algorithm that searches for elements (patterns) from a finite set of strings (dictionary) within a text.

The Aho-Corasick algorithm is a technique used in text string searching. It allows searching for multiple patterns in a single pass and is particularly useful in pattern matching within large amounts of data. Now, you can create the automaton with the Aho-Corasick algorithm online here!

To create an automaton with the Aho-Corasick algorithm online, you need to follow these steps:

Steps to create an Aho-Corasick automaton:

  1. Build the suffix tree: The first step is to create a suffix tree for the patterns you want to search for. This tree is known as a suffix tree and is constructed by sequentially inserting all the patterns into the tree.
  2. Failure link: After constructing the suffix tree, a Failure link is assigned to each node. This link points to the closest valid suffix tree node for the suffix that follows the current node.
  3. Assign the output function: For each node in the suffix tree, an output function is assigned. The output function indicates which pattern has been found in the input when the corresponding node is reached.
  4. Search for patterns: Once the suffix tree has been constructed and the Failure links and output functions have been assigned, the automaton can be used to search for patterns in the input text. To do this, start at the root of the trie and follow the input text.
  5. If a node with an output function is encountered, it means that a pattern has been found in the input text.
  6. If a pattern is not found at the current node, follow the Failure link until reaching a node that has an output function or reaching the root of the suffix tree.

Aho example 1

pattern (he, her, his)

Aho example 2

pattern (ababa, bab, bb)

Aho example 3

pattern (gct, tt, tact, agc, acc, ct)

Aho example 4

pattern (ab, about, at, ate, bed, edge)

Inputs
Suffix Tree (format .jff)
Table F
No data
Main Table
ptaoqf(q)=g(f(p),tao)search in treeallocation