|Discrete Mathematics and Optimization Seminar
Monday September 19th at 4.30pm
Title. Ubiquitous pattern matching and its applications
Repeated patterns and related phenomena in words are known to play a
central role in many facets of computer science with applications in
multimedia compression, information security, and computational biology.
One of the most fundamental questions arising in such studies is the
frequency of pattern occurrences in a string known as the text. Pattern matching
comes in many flavors: In the string matching problem, for a given string
(viewed as a consecutive sequence of symbols) one counts the number of pattern
(string) occurrences in the text. In subsequence pattern matching,
also known as the hidden word problem, we search for a given subsequence
rather than a string. Finally, in self-repetitive pattern matching we
aim to determine when a prefix of the text will appear again.
We apply probabilistic and analytic tools of combinatorics and
analysis of algorithms to discover general laws of pattern occurrences.
An immediate consequence of our results is the possibility to set thresholds
at which the appearance of a pattern in given text starts being `meaningful'.
In this talk, we first demonstrate the application of string matching
methodology to biological sequence analysis; in particular, to the problem
of finding weak signals and avoiding artifacts. We then use the approach
for hidden words to construct a reliable threshold for intrusion detection
in detecting anomalies. Finally, we present a video compression scheme
based on two-dimensional self-repetitive pattern matching (i.e.,
a lossy extension of the Lempel-Ziv scheme). We conclude this talk with
a demo illustrating our real-time multimedia decoder based on mobile code
that has potential application for wireless communication.