Suffix Trees: Part II Substring search
OK, so lets recall the substring search problem that we discussed in the previous post. We are now going to see an awesome application of Tries to solve this problem.
What is a suffix Tree?
A suffix tree is a Trie of all possible suffixes of a string.
.
E.g. For the string "test", the suffix tree is a trie constructed from the set of strings(suffixes) "test","est","st" and "t".
How this solve our substring problem?
There is an interesting property to note about substrings. Every substring of a given string, is a prefix of some suffix of the string. Hence, searching for a string q in the suffix tree will take only O(|q|) time(since its a Trie). Isn't that mind blowing?
Implementation
My implementation of suffix trees can be found at
https://github.com/varrunr/algo-lib/tree/master/src/stree
Drawbacks
The drawbacks of Tries apply here as well.
1.The implementation complexity is higher
2. The cost of construction of a suffix tree is of the order O(n^2) in terms of time and space. Of course the space complexity can be reduced by a few tricks, like collapsing a path containing 1-degree nodes into references to the original string. There also do exist sophisticated algorithms that can build the entire tree in linear time.(Weiner's, Ukkonen's, McCreight's). I will not delve into these, they are a topic for another day.
PS: My next couple of posts will cover some extra applications of suffix trees like finding the longest Common subsequence and longest palindrome.









