Sub-linear algorithms for shortest unique substring and maximal unique matches

Loading...
Thumbnail Image

Authors

Senanayaka, Sanjeewa Bandara

License

DOI

Type

Thesis

Journal Title

Journal ISSN

Volume Title

Publisher

University of Wisconsin--Whitewater

Grantor

Abstract

Living in an era where we produce quintillions of data on a daily basis means that there are many intriguing (possibly unrevealed) facts behind them. Some of the data that exists in large quantities is encountered in Computational Biology; our DNA strands are extremely long ( 3.2 billion characters). Genomic problems such as comparing similarities between two strands of DNA or finding the uniqueness between two similar strands of DNA can hugely benefit in the understanding of evolution and possibly create medical advances. A couple of avenues of addressing a problem of similar flavor (known as gene alignment) are using Shortest Unique Substrings and Maximal Unique Matches. Existing techniques for these problems either necessitate the use of high performance computers, or using techniques that have large memory footprints, or use of (theoretical) techniques that are complicated to construct. Prior to our work, Ganguly et al. [TCS’ 17] showed that these problems can be solved in O(n2 log n ) time and O(n=) space for any parameter = !(1) and = o(n). However, these techniques require the use of complex data structures and techniques, which makes them practically intractable. Presented here are sub-linear algorithms for the Shortest Unique Substring problem and the Maximal Unique Matches problem. Besides using minimal amounts of work space and being (reasonably) fast, our algorithms are a combination of two simple techniques – Rabin-Karp fingerprint and Bloom Filters; this makes our work easily amenable to implementations. However, our algorithms suffer from a bounded error probability, which can be reduced using (slightly) more space.

Description

This file was last viewed in Adobe Acrobat Pro.

Related Material and Data

Citation

Sponsorship

Endorsement

Review

Supplemented By

Referenced By