Abstract: Data compression is an indispensable technology for the handling of large-scale data available today. The traditional objective of compression has been to save storage and communication costs, whereas actually using the data normally requires a decompression step which can require enormous computational resources. However, recent advances in compressed string processing algorithms give us an intriguing new perspective in which compression can be regarded as a form of pre-processing which not only reduces space requirements for storage, but allows efficient processing of the strings, including compressed pattern matching, string indices, edit distance and its variants, and various other applications. These methods assume a compressed representation of the text as input, and process them without explicit decompression. An interesting property of these methods is that they can be theoretically – and sometimes even practically – faster than algorithms which work on an uncompressed representation of the same data. In this talk we give an overview of recent progress in this area.
|