Summary: Inverse Pattern Matching
Amihood Amir* (Georgia Tech and Bar-Ilan University)
Alberto Apostolico† (Purdue and Università di Padova)
Moshe Lewenstein‡ (Bar-Ilan University)
August 4, 1996
Abstract
Let a textstring T of n symbols from some alphabet Σ and an integer m < n be given. A
pattern P of length m over Σ is sought such that P minimizes (alternatively, maximizes) the
total number of pairwise character mismatches generated when P is compared with all
m-character substrings of T. Two additional variants of the problem are obtained by adding the
constraint that P be (respectively, not be) a substring of T. Efficient sequential algorithms are
proposed in this paper for the problem and its variants.
Key Words: Design and analysis of algorithms, combinatorial algorithms on words, pattern
matching, inverse pattern matching, Hamming distance, digital signature.
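
To make the problem statement concrete, the following is a minimal Python sketch of the unconstrained minimization variant. It is an illustration under stated assumptions, not necessarily the algorithm developed in the paper. It relies on the observation that position j of P is compared only with the symbols T[j], T[j+1], ..., T[j+n-m], so each position can be chosen independently as the most frequent symbol in that window. The function names total_mismatches and min_mismatch_pattern are hypothetical.

```python
from collections import Counter


def total_mismatches(text, pattern):
    """Sum of Hamming distances between pattern and every
    len(pattern)-character substring of text."""
    m = len(pattern)
    return sum(
        sum(a != b for a, b in zip(pattern, text[i:i + m]))
        for i in range(len(text) - m + 1)
    )


def min_mismatch_pattern(text, m):
    """Illustrative sketch (hypothetical helper): build a length-m pattern
    that minimizes total_mismatches, with no substring constraint."""
    n = len(text)
    k = n - m + 1                 # number of m-character substrings of text
    counts = Counter(text[:k])    # symbols aligned with pattern position 0
    chosen = []
    for j in range(m):
        # Most frequent symbol in the window; ties go to the smallest symbol.
        best = min(counts, key=lambda c: (-counts[c], c))
        chosen.append(best)
        if j + 1 < m:
            # Slide the window one step: drop text[j], add text[j + k].
            counts[text[j]] -= 1
            counts[text[j + k]] += 1
    return "".join(chosen)


p = min_mismatch_pattern("abab", 2)
print(p, total_mismatches("abab", p))  # ab 2
```

For the maximization variant the same per-position argument applies with the least frequent symbol of Σ (possibly one that does not occur in the window at all). The two variants that require P to be, or not to be, a substring of T do not decompose position by position in this way, so this sketch does not cover them.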
1 Introduction
Inverse pattern matching refers to the task of inferring from a given textstring T a short pattern
string P such that P is, by some measure, most typical (or, alternatively, most anomalous) in the
context of T. This problem arises in a wide variety of applications and takes on numerous flavors,
the most common of which is probably the one based on frequencies of pattern occurrences. When