trie
{{short description|Search tree data structure}}
{{Hatnote group|
{{About|a specific type of tree data structure|tree data structures generally|tree (data structure)}}
{{Other uses of|tree}}
}}
{{Distinguish|text=tri, try, or tray}}
{{good article}}
{{Infobox data structure
| name = Trie
| invented_by = Edward Fredkin, Axel Thue, and René de la Briandais
| caption = {{math|n}} corresponds to length of the keys.
| invented_year = 1960
| space_avg = {{math|O(n)}}
| space_worst = {{math|O(n)}}
| search_avg = {{math|O(n)}}
| search_worst = {{math|O(n)}}
| insert_avg = {{math|O(n)}}
| insert_worst = {{math|O(n)}}
| delete_avg = {{math|O(n)}}
| delete_worst = {{math|O(n)}}
| type = Tree
}}
In computer science, a trie ({{IPAc-en|ˈ|t|r|aɪ}}, {{IPAc-en|ˈ|t|r|iː|audio=LL-Q1860 (eng)-Naomi Persephone Amethyst (NaomiAmethyst)-trie.wav}}), also known as a digital tree or prefix tree,{{cite web|url=https://bioinformatics.cvr.ac.uk/trie-data-structure/|publisher=CVR, University of Glasgow|title=Trie Data Structure|first=Maha|last=Maabar|date=17 November 2014|access-date=17 April 2022|archive-date=27 January 2021|url-status=live|archive-url=https://web.archive.org/web/20210127130913/https://bioinformatics.cvr.ac.uk/trie-data-structure/}} is a specialized search tree data structure used to store and retrieve strings from a dictionary or set. Unlike a binary search tree, nodes in a trie do not store their associated key. Instead, each node's position within the trie determines its associated key, with the connections between nodes defined by individual characters rather than the entire key.
Tries are particularly effective for tasks such as autocomplete, spell checking, and IP routing, offering advantages over hash tables due to their prefix-based organization and lack of hash collisions. Every child node shares a common prefix with its parent node, and the root node represents the empty string. While basic trie implementations can be memory-intensive, various optimization techniques such as compression and bitwise representations have been developed to improve their efficiency. A notable optimization is the radix tree, which provides more efficient prefix-based storage.
While tries commonly store character strings, they can be adapted to work with any ordered sequence of elements, such as permutations of digits or shapes. A notable variant is the bitwise trie, which uses individual bits from fixed-length binary data (such as integers or memory addresses) as keys.
History, etymology, and pronunciation
The idea of a trie for representing a set of strings was first abstractly described by Axel Thue in 1912.{{cite journal|last=Thue|first=Axel|title=Über die gegenseitige Lage gleicher Teile gewisser Zeichenreihen|year=1912|pages=1–67|url=https://archive.org/details/skrifterutgitavv121chri/page/n11/mode/2up|journal=Skrifter Udgivne Af Videnskabs-Selskabet I Christiania|volume=1912|number=1}} Cited by Knuth. Tries were first described in a computer context by René de la Briandais in 1959.{{cite conference |first=René |last=de la Briandais |year=1959 |title=File searching using variable length keys |conference=Proc. Western J. Computer Conf. |pages=295–298 |doi=10.1145/1457838.1457895 |s2cid=10963780 |url=https://pdfs.semanticscholar.org/3ce3/f4cc1c91d03850ed84ef96a08498e018d18f.pdf |archive-url=https://web.archive.org/web/20200211163605/https://pdfs.semanticscholar.org/3ce3/f4cc1c91d03850ed84ef96a08498e018d18f.pdf |url-status=dead |archive-date=2020-02-11 }} Cited by Brass and by Knuth.{{cite book|last=Brass|first=Peter|title=Advanced Data Structures|publisher=Cambridge University Press|date=8 September 2008|isbn= 978-0521880374|location=UK|doi=10.1017/CBO9780511800191|url=https://www.cambridge.org/core/books/advanced-data-structures/D56E2269D7CEE969A3B8105AD5B9254C}}{{rp|p=336}}
The idea was independently described in 1960 by Edward Fredkin, who coined the term trie, pronouncing it {{IPAc-en|ˈ|t|r|iː}} (as "tree"), after the middle syllable of retrieval.{{cite web|url=https://xlinux.nist.gov/dads/HTML/trie.html|title=trie|first=Paul E.|last=Black|date=2009-11-16|work=Dictionary of Algorithms and Data Structures|publisher=National Institute of Standards and Technology|archive-url=https://web.archive.org/web/20110429080033/http://xlinux.nist.gov/dads/HTML/trie.html|url-status=live|archive-date=2011-04-29}} However, other authors pronounce it {{IPAc-en|ˈ|t|r|aɪ}} (as "try"), in an attempt to distinguish it verbally from "tree".{{cite book|last=Knuth|first=Donald|author-link=Donald Knuth|title=The Art of Computer Programming Volume 3: Sorting and Searching|edition=2nd|year=1997|publisher=Addison-Wesley|isbn=0-201-89685-0|page=492|chapter=6.3: Digital Searching}}
Overview
Tries are a form of string-indexed look-up data structure, which is used to store a dictionary list of words that can be searched on in a manner that allows for efficient generation of completion lists.{{cite web|url=https://ds.cs.rutgers.edu/assignment-trie/|title=Trie|year=2022|publisher=School of Arts and Science, Rutgers University|archive-url=https://ghostarchive.org/archive/20220417170426/https://ds.cs.rutgers.edu/assignment-trie/|url-status=live|archive-date=17 April 2022|access-date=17 April 2022}}{{cite journal|publisher=Syracuse University|url=https://surface.syr.edu/eecs_techreports/162/ |doi=10.1017/S0960129500000803|first1=Richard H.|last1=Connelly|first2=F. Lockwood|last2=Morris|year=1993|title= A generalization of the trie data structure|journal= Mathematical Structures in Computer Science|volume=5 |issue=3 |pages=381–418 |s2cid=18747244 }}{{rp|p=1}} A prefix trie is an ordered tree data structure used in the representation of a set of strings over a finite alphabet set, which allows efficient storage of words with common prefixes.
Compared with binary search trees, tries can be more effective for string-searching applications such as predictive text, approximate string matching, and spell checking.{{r|reema18|p=358}} A trie can be seen as a tree-shaped deterministic finite automaton.{{cite conference|conference= International Conference on Implementation and Application of Automata |title=Comparison of Construction Algorithms for Minimal, Acyclic, Deterministic, Finite-State Automata from Sets of Strings|first=Jan|last=Daciuk|date=24 June 2003|doi=10.1007/3-540-44977-9_26|url=https://link.springer.com/chapter/10.1007/3-540-44977-9_26|isbn= 978-3-540-40391-3|publisher=Springer Publishing|pages=255–261}}
Operations
Tries support various operations: insertion, deletion, and lookup of a string key. Tries are composed of nodes that contain links, which either point to other suffix child nodes or are null. As in every tree, each node but the root is pointed to by exactly one other node, called its parent. Each node contains as many links as the number of characters in the applicable alphabet (although tries tend to have a substantial number of null links). In some cases, the alphabet used is simply that of the character encoding, resulting in, for example, a size of 256 in the case of (unsigned) ASCII.{{cite book|title=Algorithms|edition=4|first1=Robert|last1=Sedgewick|first2=Kevin|last2=Wayne|author1-link= Robert Sedgewick (computer scientist) |publisher=Addison-Wesley, Princeton University|date=3 April 2011|isbn= 978-0321573513 |url=https://algs4.cs.princeton.edu/home/}}{{rp|p=732}}
The null links within the children of a node emphasize the following characteristics:{{r|robert11|p=734}}{{r|brass|p=336}}
- Characters and string keys are implicitly stored in the trie, and include a character sentinel value indicating string termination.
- Each node contains one possible link to a prefix of string keys of the set.
A basic structure type of nodes in the trie is as follows; {{mono|Node}} may contain an optional {{mono|Value}}, which is associated with each key stored in the last character of the string, or terminal node.
<pre>
structure Node
    Children Node[Alphabet-Size]
    Is-Terminal Boolean
    Value Data-Type
end structure
</pre>
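As a rough illustration, the node structure above could be written in Python as follows. This is only a sketch: the names <code>Node</code>, <code>ALPHABET_SIZE</code>, <code>children</code>, <code>is_terminal</code>, and <code>value</code> are chosen here for the example, and the 256-entry alphabet is an assumption matching the byte-sized alphabet mentioned earlier.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

ALPHABET_SIZE = 256  # assumption: one slot per possible byte value


@dataclass
class Node:
    # One link per symbol of the alphabet; None plays the role of a null link.
    children: list = field(default_factory=lambda: [None] * ALPHABET_SIZE)
    is_terminal: bool = False     # True if a stored key ends at this node
    value: Optional[Any] = None   # optional payload associated with that key


root = Node()
# A fresh node has only null links, which is why naive tries are memory-hungry.
null_links = sum(child is None for child in root.children)
```

Each node allocates its own list of links, so most of the space in a sparsely populated trie is taken up by null entries.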
= Searching =
Searching for a value in a trie is guided by the characters in the search string key, as each node in the trie contains a corresponding link to each possible character in the given string. Thus, following the string within the trie yields the associated value for the given string key. A null link during the search indicates the absence of the key.{{r| robert11|p=732-733}}
The following pseudocode implements the search procedure for a given string {{mono|key}} in a rooted trie {{mono|x}}.{{r|gonnet91|p=135}}
<pre>
Trie-Find(x, key)
    for 0 ≤ i < key.length do
        if x.Children[key[i]] = nil then
            return false
        end if
        x := x.Children[key[i]]
    repeat
    return x.Value
</pre>
In the above pseudocode, {{mono|x}} and {{mono|key}} correspond to the pointer to the trie's root node and the string key, respectively. The search operation, in a standard trie, takes {{math|O(dm)}} time, where {{mvar|m}} is the size of the string parameter {{mono|key}}, and {{mvar|d}} corresponds to the alphabet size.{{cite book|first=Varsha H.|last=Patil|date=10 May 2012|isbn= 9780198066231|publisher=Oxford University Press|url=https://global.oup.com/academic/product/data-structures-using-c-9780198066231|title=Data Structures using C++}}{{rp|p=754}} Binary search trees, on the other hand, take {{math|O(m log n)}} in the worst case, since the search depends on the height of the tree ({{math|log n}}) of the BST (in case of balanced trees), where {{mvar|n}} and {{mvar|m}} are the number of keys and the length of the keys, respectively.{{r|reema18|p=358}}
The trie occupies less space in comparison with a BST in the case of a large number of short strings, since nodes share common initial string subsequences and store the keys implicitly.{{r|reema18|p=358}} The terminal node of the tree contains a non-null value, and it is a search hit if the associated value is found in the trie, and search miss if it is not.{{r|robert11|p=733}}
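The search procedure can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: it assumes a dictionary of children per node instead of a fixed-size array, and the names <code>Node</code> and <code>trie_find</code> are hypothetical. Unlike the pseudocode, it also checks the terminal flag, so a key that is only a prefix of a stored key counts as a search miss.

```python
class Node:
    def __init__(self):
        self.children = {}        # symbol -> child Node; a missing entry is a null link
        self.is_terminal = False
        self.value = None


def trie_find(root, key):
    """Return the value stored under key, or None on a search miss."""
    node = root
    for symbol in key:
        node = node.children.get(symbol)
        if node is None:          # null link: the key is not in the trie
            return None
    return node.value if node.is_terminal else None


# Build the path for "to" by hand so the example stays focused on lookup.
root = Node()
t = root.children.setdefault("t", Node())
o = t.children.setdefault("o", Node())
o.is_terminal, o.value = True, 7

hit = trie_find(root, "to")     # search hit
miss = trie_find(root, "tea")   # search miss: null link after "t"
```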
= Insertion =
Insertion into a trie is guided by using the characters of the string key as indexes into the children array until the last character of the key is reached.{{r|robert11|p=733-734}} Each node in the trie corresponds to one call of the radix sorting routine, as the trie structure reflects the execution pattern of the top-down radix sort.{{r| gonnet91|p=135}}
<pre>
1  Trie-Insert(x, key, value)
2      for 0 ≤ i < key.length do
3          if x.Children[key[i]] = nil then
4              x.Children[key[i]] := Node()
5          end if
6          x := x.Children[key[i]]
7      repeat
8      x.Value := value
9      x.Is-Terminal := True
</pre>
If a null link is encountered prior to reaching the last character of the string key, a new node is created (lines 3–4).{{r|robert11|p=745}} The input value is then assigned to the terminal node; therefore, if the node's value was non-null at the time of insertion, it is replaced with the new value.
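A possible Python rendering of the insertion procedure is sketched below. It assumes dictionary-based children rather than a fixed-size array, and the names <code>Node</code> and <code>trie_insert</code> are chosen for illustration only.

```python
class Node:
    def __init__(self):
        self.children = {}        # symbol -> child Node
        self.is_terminal = False
        self.value = None


def trie_insert(root, key, value):
    """Insert key with value, creating nodes along the path where links are missing."""
    node = root
    for symbol in key:
        if symbol not in node.children:      # null link: create the missing node
            node.children[symbol] = Node()
        node = node.children[symbol]
    node.value = value                       # overwrites any previous value
    node.is_terminal = True


root = Node()
trie_insert(root, "tea", 1)
trie_insert(root, "ten", 2)
trie_insert(root, "tea", 3)   # same key again: the old value 1 is replaced
```

The keys "tea" and "ten" share the nodes for "t" and "e", so only one extra node is created for the second key.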
= Deletion =
Deletion of a key–value pair from a trie involves finding the terminal node with the corresponding string key and setting its terminal indicator and value to false and null, respectively.{{r|robert11|p=740}}
The following is a recursive procedure for removing a string {{mono|key}} from rooted trie ({{mono|x}}).
<pre>
 1  Trie-Delete(x, key)
 2      if key = nil then
 3          if x.Is-Terminal = True then
 4              x.Is-Terminal := False
 5              x.Value := nil
 6          end if
 7          for 0 ≤ i < x.Children.length
 8              if x.Children[i] != nil
 9                  return x
10              end if
11          repeat
12          return nil
13      end if
14      x.Children[key[0]] := Trie-Delete(x.Children[key[0]], key[1:])
15      return x
</pre>
The procedure begins by examining the {{mono|key}}; a null key denotes arrival at a terminal node or the end of a string key. If the node is terminal and has no children, it is removed from the trie (line 14). However, reaching the end of the string key at a node that is not terminal indicates that the key does not exist, so the procedure does not modify the trie. The recursion proceeds by incrementing {{mono|key}}'s index.
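The recursive deletion can be sketched in Python as follows. This is an illustrative variant rather than a line-by-line translation: it assumes dictionary-based children, guards against a missing path (which the pseudocode does not do), and also prunes ancestors that are left with no children and no key of their own. The names <code>Node</code>, <code>trie_insert</code>, and <code>trie_delete</code> are hypothetical.

```python
class Node:
    def __init__(self):
        self.children = {}        # symbol -> child Node
        self.is_terminal = False
        self.value = None


def trie_insert(root, key, value):
    node = root
    for symbol in key:
        node = node.children.setdefault(symbol, Node())
    node.is_terminal, node.value = True, value


def trie_delete(node, key):
    """Delete key below node. Returns node, or None if node became removable."""
    if node is None:                      # the path for key does not exist
        return None
    if not key:                           # end of key: unmark this node
        node.is_terminal = False
        node.value = None
    else:
        symbol, rest = key[0], key[1:]
        if trie_delete(node.children.get(symbol), rest) is None:
            node.children.pop(symbol, None)   # drop the emptied subtree
    # Keep the node only if it still carries a key or leads to one.
    return node if (node.children or node.is_terminal) else None


root = Node()
for index, word in enumerate(["tea", "ten", "to"]):
    trie_insert(root, word, index)
trie_delete(root, "tea")   # removes only the "a" node; "ten" and "to" remain
```

Because the function returns <code>None</code> when the whole trie becomes empty, a caller that wants to keep using the root would typically ignore the return value at the top level or replace it with a fresh node.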
Replacing other data structures
= Replacement for hash tables =
A trie can be used to replace a hash table, over which it has the following advantages:{{r|reema18|p=358}}
- Searching for a node with an associated key of size {{mvar|m}} has a complexity of {{math|O(m)}}, whereas an imperfect hash function may have numerous colliding keys, and the worst-case lookup speed of such a table would be {{math|O(N)}}, where {{mvar|N}} denotes the total number of nodes within the table.
- Tries do not need a hash function for the operation, unlike a hash table; there are also no collisions of different keys in a trie.
- Buckets in a trie, which are analogous to hash table buckets that store key collisions, are necessary only if a single key is associated with more than one value.
- String keys within the trie can be sorted using a predetermined alphabetical ordering.
However, tries are less efficient than a hash table when the data is directly accessed on a secondary storage device such as a hard disk drive that has higher random access time than the main memory.{{cite journal | author=Edward Fredkin| author-link=Edward Fredkin| title=Trie Memory| journal=Communications of the ACM| year=1960| volume=3| issue=9| pages=490–499| doi=10.1145/367390.367400 | s2cid=15384533| doi-access=free}} Tries are also disadvantageous when the key value cannot be easily represented as a string, such as floating-point numbers, for which multiple representations are possible (e.g. 1 is equivalent to 1.0, +1.0, 1.00, etc.);{{r|reema18|p=359}} however, such a number can be unambiguously represented as a binary number in IEEE 754 format, as opposed to two's complement format.{{cite web|publisher=Department of Mathematics and Computer Science, Emory University|title=The IEEE 754 Format|url=http://mathcenter.oxford.emory.edu/site/cs170/ieee754/|access-date=17 April 2022|author1=S. Orley|author2=J. Mathews|url-status=live|archive-date=28 March 2022|archive-url=https://web.archive.org/web/20220328093853/http://mathcenter.oxford.emory.edu/site/cs170/ieee754/}}
Implementation strategies
[[File:Pointer implementation of a trie.svg|thumb|Pointer implementation of a trie: vertical arrows are {{mono|child}} pointers, dotted horizontal arrows are {{mono|next}} pointers. The set of strings stored in this trie is {{mono|{baby, bad, bank, box, dad, dance}}}. The lists are sorted to allow traversal in lexicographic order.]]
Tries can be represented in several ways, corresponding to different trade-offs between memory use and speed of the operations.{{r| brass|p=341}} Using a vector of pointers for representing a trie consumes enormous space; however, memory space can be reduced at the expense of running time if a singly linked list is used for each node vector, as most entries of the vector contain {{mono|nil}}.{{r| KnuthVol3|p=495}}
Techniques such as alphabet reduction may reduce the large space requirements by reinterpreting the original string as a longer string over a smaller alphabet; for example, a string of {{mvar|n}} bytes can alternatively be regarded as a string of {{math|2n}} four-bit units and stored in a trie with 16 instead of 256 pointers per node. Although this can reduce memory usage by up to a factor of eight, lookups need to visit twice as many nodes in the worst case.{{r| brass|p= 347–352}} Other techniques include storing a vector of 256 ASCII pointers as a bitmap of 256 bits representing the ASCII alphabet, which reduces the size of individual nodes dramatically.{{cite book|last1=Bellekens|first1=Xavier|title=Proceedings of the 7th International Conference on Security of Information and Networks - SIN '14|chapter=A Highly-Efficient Memory-Compression Scheme for GPU-Accelerated Intrusion Detection Systems|date=2014|publisher=ACM|location=Glasgow, Scotland, UK|isbn=978-1-4503-3033-6|pages=302:302–302:309|doi=10.1145/2659651.2659723|arxiv=1704.02272|s2cid=12943246}}
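The alphabet reduction described above amounts to a simple re-encoding of the key before it is used to walk the trie. The following Python sketch shows one way to do it; the function name <code>to_nibbles</code> and the choice of emitting the high four bits first are assumptions made for this example.

```python
def to_nibbles(data):
    """Re-express each byte as two 4-bit symbols (high nibble first)."""
    symbols = []
    for byte in data:
        symbols.append(byte >> 4)      # upper four bits
        symbols.append(byte & 0x0F)    # lower four bits
    return symbols


key = b"ab"                  # bytes 0x61 and 0x62
nibbles = to_nibbles(key)    # [6, 1, 6, 2]
```

Every symbol now lies in the range 0 to 15, so a node needs only 16 links, while the path for a key of {{mvar|n}} bytes becomes {{math|2n}} nodes long.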
= Bitwise tries =
{{see also| x-fast trie| Bitwise trie with bitmap}}
Bitwise tries are used to address the enormous space requirement for the trie nodes in naive pointer-vector implementations. Each character in the string key set is represented via individual bits, which are used to traverse the trie over a string key. Implementations of this type of trie use dedicated CPU instructions to find the first set bit in a fixed-length key input (e.g. GCC's {{mono|__builtin_clz()}} intrinsic function). Accordingly, the set bit is used to index the first item, or child node, in the 32- or 64-entry based bitwise tree. Search then proceeds by testing each subsequent bit in the key.{{cite journal|title=Log-logarithmic worst-case range queries are possible in space O(n)|doi=10.1016/0020-0190(83)90075-3|url=https://www.sciencedirect.com/science/article/abs/pii/0020019083900753|volume=17|issue=2|date=27 January 1983|pages=81–84|first=Dan E.|last=Willard|journal=Information Processing Letters}}
This procedure is also cache-local and highly parallelizable due to register independence, and thus performs well on out-of-order execution CPUs.
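A much simplified bitwise trie can be sketched in Python as follows. The sketch assumes 8-bit unsigned integer keys (<code>KEY_BITS</code>) to keep the example small and walks one bit per level, most significant bit first; it does not reproduce the 32- or 64-entry first level described above. All names are hypothetical, and <code>int.bit_length()</code> is used as a software stand-in for a count-leading-zeros instruction.

```python
KEY_BITS = 8  # assumption: small fixed key width to keep the example readable


class BitNode:
    def __init__(self):
        self.children = [None, None]   # index 0: next bit is 0, index 1: next bit is 1
        self.is_terminal = False


def bits_msb_first(key):
    """Yield the bits of key from the most significant to the least significant."""
    return [(key >> shift) & 1 for shift in range(KEY_BITS - 1, -1, -1)]


def insert(root, key):
    node = root
    for bit in bits_msb_first(key):
        if node.children[bit] is None:
            node.children[bit] = BitNode()
        node = node.children[bit]
    node.is_terminal = True


def contains(root, key):
    node = root
    for bit in bits_msb_first(key):
        node = node.children[bit]
        if node is None:               # null link: key is absent
            return False
    return node.is_terminal


def leading_zeros(key):
    """Number of zero bits before the first set bit within KEY_BITS."""
    return KEY_BITS - key.bit_length()


root = BitNode()
insert(root, 5)      # 0b00000101
insert(root, 200)    # 0b11001000
```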
= Compressed tries =
{{main|Radix tree}}
A radix tree, also known as a compressed trie, is a space-optimized variant of a trie in which any node with only one child gets merged with its parent; elimination of branches of the nodes with a single child results in better metrics in both space and time.{{cite web|url=https://www.cise.ufl.edu/~sahni/dsaac/enrich/c16/tries.htm|publisher=University of Florida|access-date=17 April 2022|archive-url=https://web.archive.org/web/20160703161316/http://www.cise.ufl.edu/~sahni/dsaac/enrich/c16/tries.htm|archive-date=3 July 2016|url-status=live|author=Sartaj Sahni|title=Data Structures, Algorithms, & Applications in C++: Tries|year=2004}}{{cite book|title=Handbook of Data Structures and Applications|first1=Dinesh P.|last1=Mehta|first2=Sartaj|last2=Sahni|isbn= 978-1498701853 |publisher=Chapman & Hall, University of Florida|url=https://www.routledge.com/Handbook-of-Data-Structures-and-Applications/Mehta-Sahni/p/book/9780367572006|edition=2|date=7 March 2018|chapter=Tries}}{{rp|p=452}} This works best when the trie remains static and the set of keys stored is very sparse within its representation space.{{cite journal|title=Incremental Construction of Minimal Acyclic Finite-State Automata|volume=26|issue=1|date=1 March 2000|author1=Jan Daciuk |author2=Stoyan Mihov |author3=Bruce W. Watson |author4=Richard E. Watson |journal = Computational Linguistics |pages=3–16|publisher=MIT Press|doi=10.1162/089120100561601|arxiv=cs/0007009|bibcode=2000cs........7009D|url=https://direct.mit.edu/coli/article/26/1/3/1628/Incremental-Construction-of-Minimal-Acyclic-Finite|doi-access=free}}{{rp|p=3–16}}
Another approach is to "pack" the trie: a space-efficient implementation of a sparse packed trie has been applied to automatic hyphenation, in which the descendants of each node may be interleaved in memory.{{cite thesis|degree=Doctor of Philosophy|title=Word Hy-phen-a-tion By Com-put-er|url=http://www.tug.org/docs/liang/liang-thesis.pdf|author=Franklin Mark Liang|year=1983|publisher=Stanford University|access-date=2010-03-28|archive-url=https://web.archive.org/web/20051111105124/http://www.tug.org/docs/liang/liang-thesis.pdf|url-status=live|archive-date=2005-11-11}}
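The merging of single-child nodes that characterizes a radix tree can be illustrated with a short Python sketch. It builds an ordinary trie and then compresses it in place; the names <code>Node</code>, <code>insert</code>, and <code>compress</code> are hypothetical, and the sketch assumes that a node marking the end of a key is never merged away. It does not cover the packed representation.

```python
class Node:
    def __init__(self):
        self.children = {}        # edge label (string) -> child Node
        self.is_terminal = False


def insert(root, key):
    node = root
    for symbol in key:
        node = node.children.setdefault(symbol, Node())
    node.is_terminal = True


def compress(node):
    """Merge chains of non-terminal single-child nodes into one labelled edge."""
    for label in list(node.children):
        child = node.children.pop(label)
        # Absorb the child while it is a pure pass-through node.
        while len(child.children) == 1 and not child.is_terminal:
            next_label, grandchild = next(iter(child.children.items()))
            label += next_label
            child = grandchild
        compress(child)
        node.children[label] = child
    return node


root = Node()
for word in ["in", "integer", "interval", "string", "structure"]:
    insert(root, word)
compress(root)
top_level_edges = sorted(root.children)   # ['in', 'str']
```

After compression, the edges carry whole substrings such as "str" and "ucture" instead of single characters, so the number of nodes drops considerably for this key set.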
== Patricia trees ==
{{multiple image
| direction = vertical
| image1 = Patricia tree.png
| image2 = Patricia tree ASCII to binary.png
| footer = Patricia tree representation of the string set
{{mono|{{(}}in, integer, interval, string, structure{{)}}}}.
| width = 400
}}
Patricia trees are a particular implementation of the compressed binary trie that uses the binary encoding of the string keys in its representation.{{cite web|url=https://xlinux.nist.gov/dads/HTML/patriciatree.html|publisher=National Institute of Standards and Technology|archive-date=14 February 2022|archive-url=https://web.archive.org/web/20220214182428/https://xlinux.nist.gov/dads/HTML/patriciatree.html|url-status=live|access-date=17 April 2022|title=Patricia tree}}{{cite book|title=Handbook of algorithms and data structures: in Pascal and C|edition=2|date=January 1991|isbn=978-0-201-41607-7|publisher=Addison-Wesley|location=Boston, United States|first1=G. H.|last1=Gonnet|first2=R. Baeza|last2=Yates|url=https://dl.acm.org/doi/book/10.5555/103324}}{{rp|p=140}} Every node in a Patricia tree contains an index, known as a "skip number", that stores the node's branching index to avoid empty subtrees during traversal.{{r|gonnet91|p=140-141}} A naive implementation of a trie consumes immense storage due to the large number of leaf nodes caused by a sparse distribution of keys; Patricia trees can be efficient in such cases.{{r|gonnet91|p=142}}{{r|maxime09|p=3}}
A representation of a Patricia tree is shown to the right. Each index value adjacent to the nodes represents the "skip number"—the index of the bit with which branching is to be decided.{{cite book|title=Encyclopedia of Database Systems|first1=Maxime|last1=Crochemore|first2=Thierry|last2=Lecroq|url=https://link.springer.com/referencework/10.1007/978-0-387-39940-9|doi=10.1007/978-0-387-39940-9|isbn=978-0-387-49616-0|publisher=Springer Publishing|location=Boston, United States|year=2009|chapter=Trie|bibcode=2009eds..book.....L |via=HAL (open archive)}}{{rp|p=3}} The skip number 1 at node 0 corresponds to the position 1 in the binary encoded ASCII where the leftmost bit differed in the key set {{mvar|X}}.{{r|maxime09|p=3-4}} The skip number is crucial for search, insertion, and deletion of nodes in the Patricia tree, and a bit masking operation is performed during every iteration.{{r|gonnet91|p=143}}
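The branching index used by a Patricia tree is the position of the leftmost bit at which keys differ. The Python sketch below computes that position for two keys. It is only an illustration of the idea: the function name <code>first_differing_bit</code> is hypothetical, positions are counted from 0 starting at the most significant bit of the first byte (the figure may use a different convention), and shorter keys are padded with zero bytes, which is an assumption of this example.

```python
def first_differing_bit(a, b):
    """Return the 0-based index of the leftmost differing bit, or None if equal."""
    length = max(len(a), len(b))
    a = a.ljust(length, b"\0")     # assumption: pad the shorter key with zero bytes
    b = b.ljust(length, b"\0")
    for byte_index, (x, y) in enumerate(zip(a, b)):
        diff = x ^ y               # set bits mark positions where the bytes differ
        if diff:
            return byte_index * 8 + (8 - diff.bit_length())
    return None


# 'i' is 01101001 and 's' is 01110011, so the first difference is at bit 3.
split_root = first_differing_bit(b"in", b"string")          # 3
# "integer" and "interval" agree on "inte" (32 bits); 'g' and 'r' differ at bit 3.
split_deep = first_differing_bit(b"integer", b"interval")   # 35
```

A bit mask built from such an index is what a Patricia tree tests at each node to decide which branch to follow.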
Applications
Trie data structures are commonly used in predictive text or autocomplete dictionaries, and approximate matching algorithms.{{Cite journal|last1=Aho|first1=Alfred V.|last2=Corasick|first2=Margaret J.|date=Jun 1975|title=Efficient String Matching: An Aid to Bibliographic Search|journal=Communications of the ACM|volume=18|issue=6|pages=333–340|doi=10.1145/360825.360855|s2cid=207735784|doi-access=free}} Tries enable faster searches and occupy less space, especially when the set contains a large number of short strings, and are therefore used in spell checking, hyphenation applications, and longest prefix match algorithms.{{cite book|title= Data Structures Using C|date=13 October 2018|edition=2|first=Reema|last=Thareja|publisher=Oxford University Press|url=https://global.oup.com/academic/product/data-structures-using-c-9780198099307|isbn= 9780198099307|url-access=subscription|chapter=Hashing and Collision}}{{rp|p=358}} However, if storing dictionary words is all that is required (i.e. there is no need to store metadata associated with each word), a minimal deterministic acyclic finite state automaton (DAFSA) or radix tree would use less storage space than a trie. This is because DAFSAs and radix trees can compress identical branches from the trie which correspond to the same suffixes (or parts) of different words being stored. String dictionaries are also utilized in natural language processing, such as finding the lexicon of a text corpus.{{cite journal|journal=Information Systems|first1=Miguel A.|last1=Martinez-Prieto|first2=Nieves|last2=Brisaboa|first3=Rodrigo|last3=Canovas|first4=Francisco|last4=Claude|first5=Gonzalo|last5=Navarro|publisher=Elsevier|volume=56|doi=10.1016/j.is.2015.08.008|url=https://www.sciencedirect.com/science/article/abs/pii/S0306437915001672|date=March 2016|title=Practical compressed string dictionaries|pages=73–108|issn= 0306-4379 }}{{rp|p=73}}
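Autocomplete is a direct use of the prefix organization: the trie is walked along the typed prefix, and every key below the node reached is a completion. The following Python sketch illustrates this under the assumption of dictionary-based children; the names <code>Node</code>, <code>insert</code>, and <code>complete</code> are hypothetical.

```python
class Node:
    def __init__(self):
        self.children = {}        # symbol -> child Node
        self.is_terminal = False


def insert(root, key):
    node = root
    for symbol in key:
        node = node.children.setdefault(symbol, Node())
    node.is_terminal = True


def complete(root, prefix):
    """Return all stored keys that start with prefix, in lexicographic order."""
    node = root
    for symbol in prefix:
        node = node.children.get(symbol)
        if node is None:              # no stored key has this prefix
            return []
    results = []
    stack = [(node, prefix)]
    while stack:
        current, word = stack.pop()
        if current.is_terminal:
            results.append(word)
        # Push children in reverse order so the smallest symbol is popped first.
        for symbol in sorted(current.children, reverse=True):
            stack.append((current.children[symbol], word + symbol))
    return results


root = Node()
for word in ["baby", "bad", "bank", "box", "dad", "dance"]:
    insert(root, word)
suggestions = complete(root, "ba")    # ['baby', 'bad', 'bank']
```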
= Sorting =
Lexicographic sorting of a set of string keys can be implemented by building a trie for the given keys and traversing the tree in pre-order fashion;{{cite web |url=https://www.cs.helsinki.fi/u/tpkarkka/opetus/12s/spa/lecture02.pdf |title=Lecture 2 |first=Juha |last=Kärkkäinen |quote="The preorder of the nodes in a trie is the same as the lexicographical order of the strings they represent assuming the children of a node are ordered by the edge labels." |publisher=University of Helsinki}} this is also a form of radix sort.{{Cite web|url=https://www.ifi.uzh.ch/dam/jcr:27d15f69-2a44-40f9-8b41-6d11b5926c67/ReportKallisMScBasis.pdf|title=The Adaptive Radix Tree (Report #14-708-887)|last=Kallis|first=Rafael|date=2018|website=University of Zurich: Department of Informatics, Research Publications}} Tries are also fundamental data structures for burstsort, which is notable for being the fastest string sorting algorithm as of 2007,{{cite journal | url=https://people.eng.unimelb.edu.au/jzobel/fulltext/acmjea06.pdf | doi=10.1145/1187436.1187439 | author=Ranjan Sinha and Justin Zobel and David Ring | title=Cache-Efficient String Sorting Using Copying | journal=ACM Journal of Experimental Algorithmics | volume=11 | pages=1–32 | date=Feb 2006 | s2cid=3184411 }} accomplished by its efficient use of CPU cache.{{cite book | doi=10.1007/978-3-540-89097-3_3 | author=J. Kärkkäinen and T. Rantala | chapter=Engineering Radix Sort for Strings | editor=A. Amir and A. Turpin and A. Moffat | title=String Processing and Information Retrieval, Proc. SPIRE | publisher=Springer | series=Lecture Notes in Computer Science | volume=5280 | pages=3–14 | year=2008 | isbn=978-3-540-89096-6 }}
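Sorting by pre-order traversal can be sketched in Python as follows. The example assumes dictionary-based children that are visited in sorted symbol order, and it keeps a per-node counter so that duplicate keys survive; the names <code>Node</code> and <code>trie_sort</code> are hypothetical. It illustrates the basic idea only and is not burstsort.

```python
class Node:
    def __init__(self):
        self.children = {}   # symbol -> child Node
        self.count = 0       # number of inserted keys that end at this node


def trie_sort(keys):
    """Return keys in lexicographic order by building a trie and walking it in pre-order."""
    root = Node()
    for key in keys:
        node = root
        for symbol in key:
            node = node.children.setdefault(symbol, Node())
        node.count += 1

    def walk(node, prefix):
        # Visit the node itself first (pre-order), then its children by edge label.
        yield from [prefix] * node.count
        for symbol in sorted(node.children):
            yield from walk(node.children[symbol], prefix + symbol)

    return list(walk(root, ""))


words = ["box", "bad", "dance", "baby", "dad", "bank", "bad"]
ordered = trie_sort(words)
```

Because a node is emitted before its descendants, a key always appears before any longer key that it prefixes, which matches lexicographic order.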
= Full-text search =
A special kind of trie, called a suffix tree, can be used to index all suffixes in a text to carry out fast full-text searches.{{cite journal|journal=SIAM Journal on Computing|doi=10.1137/S0097539792231982|volume=24|issue=3|url=https://epubs.siam.org/doi/abs/10.1137/S0097539792231982|title=A Generalization of the Suffix Tree to Square Matrices, with Applications|pages=520–562|issn= 0097-5397 |publisher=Society for Industrial and Applied Mathematics|date=28 May 1992|last1=Giancarlo|first1=Raffaele}}
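The idea of indexing all suffixes can be shown with a naive suffix trie in Python. This is a deliberately simple sketch: it inserts every suffix character by character into nested dictionaries, which takes quadratic time and space, whereas a real suffix tree is a compressed structure built far more efficiently. The function names are hypothetical.

```python
def build_suffix_trie(text):
    """Insert every suffix of text into a trie of nested dictionaries."""
    root = {}
    for start in range(len(text)):
        node = root
        for symbol in text[start:]:
            node = node.setdefault(symbol, {})
    return root


def contains_substring(root, pattern):
    """A pattern occurs in the text exactly when it is a prefix of some suffix."""
    node = root
    for symbol in pattern:
        if symbol not in node:
            return False
        node = node[symbol]
    return True


index = build_suffix_trie("banana")
found = contains_substring(index, "nan")     # True
missing = contains_substring(index, "nab")   # False
```

Once the index is built, each query costs time proportional to the pattern length, regardless of the length of the text.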
= Web search engines =
A specialized kind of trie, called a compressed trie, is used in web search engines for storing the index, a collection of all searchable words.{{cite journal|title=An enhanced dynamic hash TRIE algorithm for lexicon search|first1=Lai|last1=Yang|first2=Lida|last2=Xu|first3=Zhongzhi|last3=Shi|doi=10.1080/17517575.2012.665483|date=23 March 2012|pages=419–432|volume=6|issue=4|journal=Enterprise Information Systems|bibcode=2012EntIS...6..419Y |s2cid=37884057 }} Each terminal node is associated with a list of URLs, called an occurrence list, of pages that match the keyword. The trie is stored in main memory, whereas the occurrence list is kept in external storage, frequently in large clusters; alternatively, the in-memory index points to documents stored in an external location.{{cite journal|first1=Frederik|last1=Transier|first2=Peter|last2=Sanders|volume=29|issue=1|date=December 2010|pages=1–37|doi=10.1145/1877766.1877768|title=Engineering basic algorithms of an in-memory text search engine|url=https://dl.acm.org/doi/10.1145/1877766.1877768|publisher=Association for Computing Machinery|journal=ACM Transactions on Information Systems|s2cid=932749 }}
= Bioinformatics =
Tries are used in bioinformatics, notably in sequence alignment software applications such as BLAST, which indexes all the different substrings of length k (called k-mers) of a text by storing the positions of their occurrences in compressed-trie sequence databases.{{r|prieto16|p=75}}
= Internet routing =
{{see also|Luleå algorithm}}
Compressed variants of tries, such as those in databases that manage the forwarding information base (FIB), are used to store IP address prefixes within routers and bridges for prefix-based lookup to resolve mask-based operations in IP routing.{{r|prieto16|p=75}}
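Prefix-based lookup in routing is usually a longest-prefix match: among all stored prefixes that cover an address, the most specific one wins. The Python sketch below illustrates this with a plain (uncompressed) binary trie over IPv4 addresses, using the standard <code>ipaddress</code> module to parse the inputs. The names <code>Node</code>, <code>add_route</code>, and <code>lookup</code>, as well as the example routes and next-hop labels, are hypothetical.

```python
import ipaddress


class Node:
    def __init__(self):
        self.children = [None, None]   # one link per bit value
        self.next_hop = None           # set only where a stored prefix ends


def add_route(root, cidr, next_hop):
    """Store next_hop under the bits of the given IPv4 prefix."""
    network = ipaddress.ip_network(cidr)
    address = int(network.network_address)
    node = root
    for i in range(network.prefixlen):
        bit = (address >> (31 - i)) & 1
        if node.children[bit] is None:
            node.children[bit] = Node()
        node = node.children[bit]
    node.next_hop = next_hop


def lookup(root, ip):
    """Return the next hop of the longest stored prefix that covers ip."""
    address = int(ipaddress.ip_address(ip))
    node, best = root, root.next_hop
    for i in range(32):
        node = node.children[(address >> (31 - i)) & 1]
        if node is None:               # no longer prefix exists along this path
            break
        if node.next_hop is not None:  # remember the most specific match so far
            best = node.next_hop
    return best


table = Node()
add_route(table, "0.0.0.0/0", "default")
add_route(table, "10.0.0.0/8", "A")
add_route(table, "10.1.0.0/16", "B")
hop = lookup(table, "10.1.2.3")        # 'B', the most specific matching prefix
```

Production routers use compressed variants of this structure so that long runs of single-child nodes do not cost one memory access each.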
See also
{{div col|colwidth=22em}}
{{div col end}}
References
{{reflist|30em}}
External links
{{Commons category}}
{{wiktionary}}
- [https://xlinux.nist.gov/dads/HTML/trie.html NIST's Dictionary of Algorithms and Data Structures: Trie]
{{CS-Trees}}
{{Data structures}}
{{Strings}}
Category:Trees (data structures)
Category:Finite-state machines
Category:Articles with example Python (programming language) code