An Online Algorithm for Finding the Longest Previous Factors
(Citations: 3)
We present a novel algorithm for finding the longest factors in a text, for which the working space is proportional to the
history text size. Moreover, our algorithm is online and exact; in that, unlike the previous batch algorithms [4, 5, 6, 7,
14], which needs to read the entire input beforehand, our algorithm reports the longest match just after reading each character.
This algorithm can be directly used for data compression, pattern analysis, and data mining. Our algorithm also supports the
window buffer, in that we can bound the working space by discarding the history from the oldest character. Using the dynamic
rank/select dictionary [17], our algorithm requires n logσ + O(n logσ) + O(n) bits of working space, and O(log3
n) time per character, O(n log3
n) total time, n is the length of the history, and σ is the alphabet size. We implemented our algorithm and compared it with the recent algorithms [4, 5, 14] in terms of speed
and the working space. We found that our algorithm can work with a smaller working space, less than 1/2 of those for the previous
methods in real-world data, and with a reasonable decline in speed.