Summary: Forming grammars for structured documents:
an application of grammatical inference
Helena Ahonen
Heikki Mannila
Department of Computer Science
University of Helsinki
P.O. Box 26 (Teollisuuskatu 23)
FIN-00014 University of Helsinki, Finland
Email: {hahonen,mannila}@cs.helsinki.fi
Erja Nikunen
Research Centre for Domestic Languages
Sörnäisten rantatie 25
FIN-00500 Helsinki, Finland
Email: enikunen@domlang.fi
Abstract
We consider the problem of generating grammars for classes of structured
documents (dictionaries, encyclopedias, user manuals, and so on) from examples.
The examples consist of structures of individual documents, and they can be
collected either by converting typographical tagging of documents prepared for
printing into structural tags, or by using document recognition techniques. Our method