Summary: Separable Attributes:
a Technique for Solving the
Sub Matrices Character Count Problem
Amihood Amir    Kenneth W. Church†    Emanuel Dar‡
AT&T Labs    AT&T Labs    Bar-Ilan University
and
Bar-Ilan University
Abstract
The subsequence character count problem has as its input an array $S = s_1, \ldots, s_n$ of symbols
over alphabet $\Sigma$ and a natural number $m$. Its output is: for every $i$, $i = 1, \ldots, n - m + 1$,
the number of different alphabet symbols occurring in the subsequence $s_i, s_{i+1}, \ldots, s_{i+m-1}$. The
subsequence character count problem is a natural problem that has many uses. It can be solved
in linear time for fixed finite alphabets and in time $O(n \log m)$ for infinite alphabets.
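To make the one-dimensional problem statement concrete, here is a minimal Python sketch. The function name `window_distinct_counts` is hypothetical, indices are 0-based rather than 1-based, and the use of a hash-based dictionary is an assumption of this sketch (it gives expected linear time); it is not claimed to be the method behind the bounds quoted above.

```python
from collections import defaultdict


def window_distinct_counts(s, m):
    """For each window s[i:i+m], return the number of distinct symbols.

    Hypothetical helper: uses a hash map of per-symbol counts, which is an
    assumption of this sketch rather than the construction from the paper.
    """
    n = len(s)
    if m <= 0 or m > n:
        return []
    counts = defaultdict(int)  # symbol -> occurrences in the current window
    distinct = 0
    result = []
    for i, symbol in enumerate(s):
        # Add the symbol entering the window on the right.
        if counts[symbol] == 0:
            distinct += 1
        counts[symbol] += 1
        # Remove the symbol leaving the window on the left, once it is full.
        if i >= m:
            old = s[i - m]
            counts[old] -= 1
            if counts[old] == 0:
                distinct -= 1
        # Report a count as soon as the window holds exactly m symbols.
        if i >= m - 1:
            result.append(distinct)
    return result
```

For example, with input `"aabca"` and `m = 3` the windows are `aab`, `abc`, `bca`, so the sketch yields one count per window, `n - m + 1` counts in total.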
The character count problem can be generalized to two dimensions and becomes the submatrix
character count problem. Its input is an $n \times n$ matrix $T$ over alphabet $\Sigma$ and a natural
number $m$. Its output is: for every $i, j$, $i, j = 1, \ldots, n - m + 1$, the number of different alphabet
symbols occurring in the submatrix $T[i + k, j + \ell]$, $k = 0, \ldots, m - 1$, $\ell = 0, \ldots, m - 1$.
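The two-dimensional definition can be read directly as a brute-force reference implementation. The sketch below is only an illustration of the problem statement, not an efficient algorithm: the name `submatrix_distinct_counts_naive` is hypothetical, indices are 0-based, and the matrix is assumed to be given as a list of equal-length rows (lists or strings).

```python
def submatrix_distinct_counts_naive(T, m):
    """Count distinct symbols in every m x m submatrix of the n x n matrix T.

    Hypothetical reference implementation of the definition; it inspects
    all m*m cells of each of the (n-m+1)^2 windows.
    """
    n = len(T)
    size = n - m + 1  # number of window positions along each axis
    if m <= 0 or size <= 0:
        return []
    out = [[0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            # Collect the symbols T[i+k][j+l] for k, l = 0, ..., m-1.
            seen = {T[i + k][j + l] for k in range(m) for l in range(m)}
            out[i][j] = len(seen)
    return out
```

Such a direct transcription is mainly useful as a correctness check for faster methods on small inputs.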
The straightforward one-dimensional solution slides a window along the text, adding one
element and deleting one element at every step. The problem in two dimensions is that at
every move of the window there are $m$ elements added and $m$ deleted.
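The cost of carrying the sliding window over to two dimensions can be seen in a short Python sketch. It slides an $m \times m$ window one column at a time and counts every single-symbol insertion and deletion. The function name, the `updates` counter, and the choice to rebuild the window at the start of each row band are assumptions made for illustration; this is the naive extension described above, not the technique the paper proposes.

```python
from collections import defaultdict


def submatrix_distinct_counts_sliding(T, m):
    """Naive 2D sliding window; returns (counts, number of symbol updates)."""
    n = len(T)
    size = n - m + 1
    if m <= 0 or size <= 0:
        return [], 0
    out = [[0] * size for _ in range(size)]
    updates = 0  # total single-symbol insertions plus deletions
    for i in range(size):
        counts = defaultdict(int)  # symbol -> occurrences in the window
        distinct = 0
        # Build the leftmost window of this row band: m*m insertions.
        for k in range(m):
            for l in range(m):
                sym = T[i + k][l]
                if counts[sym] == 0:
                    distinct += 1
                counts[sym] += 1
                updates += 1
        out[i][0] = distinct
        # Each move to the right adds one column (m symbols) and
        # deletes one column (m symbols).
        for j in range(1, size):
            for k in range(m):
                new = T[i + k][j + m - 1]
                if counts[new] == 0:
                    distinct += 1
                counts[new] += 1
                old = T[i + k][j - 1]
                counts[old] -= 1
                if counts[old] == 0:
                    distinct -= 1
                updates += 2
            out[i][j] = distinct
    return out, updates
```

Because every one of the roughly $n^2$ window positions costs about $2m$ updates in this sketch, the total work grows as $n^2 m$ rather than as the input size $n^2$.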
