Summary: Disambiguation of SGML content models
Helena Ahonen
Department of Computer Science
University of Helsinki
P.O. Box 26 (Teollisuuskatu 23)
FIN-00014 University of Helsinki, Finland
Email: helena.ahonen@helsinki.fi
Tel. +358070844218
Abstract
A Standard Generalized Markup Language (SGML) document has a document type definition
(DTD) that specifies the allowed structures for the document. The basic components
of a DTD are element declarations that contain for each element a content model, i.e., a regular
expression that defines the allowed content for this element. The SGML standard requires
that the content models of element declarations are unambiguous in the following sense: a
content model is ambiguous if an element or character string occurring in the document
instance can satisfy more than one primitive token in the content model without lookahead.
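
To make the definition concrete, the following minimal sketch tests a content model for this kind of ambiguity by checking that no two competing positions carry the same symbol, i.e., that the position (Glushkov) automaton of the expression is deterministic. It is an illustration under stated assumptions, not the disambiguation algorithm of this paper: the nested-tuple encoding and the names glushkov and is_unambiguous are hypothetical, and the SGML `&` connector and #PCDATA are left out.

```python
from itertools import count


def glushkov(model):
    """Compute the first set, follow sets and position labels of a content model.

    The model is a nested tuple (an assumed encoding, not SGML syntax):
    ("sym", name), ("seq", r1, r2), ("or", r1, r2),
    ("star", r), ("plus", r), ("opt", r).
    """
    symbol_of = {}   # position -> element name
    follow = {}      # position -> positions that may come next
    ids = count(1)

    def walk(node):
        """Return (nullable, first, last) for the subexpression."""
        kind = node[0]
        if kind == "sym":
            p = next(ids)
            symbol_of[p] = node[1]
            follow[p] = set()
            return False, {p}, {p}
        if kind == "seq":
            n1, f1, l1 = walk(node[1])
            n2, f2, l2 = walk(node[2])
            for p in l1:
                follow[p] |= f2
            first = (f1 | f2) if n1 else f1
            last = (l1 | l2) if n2 else l2
            return n1 and n2, first, last
        if kind == "or":
            n1, f1, l1 = walk(node[1])
            n2, f2, l2 = walk(node[2])
            return n1 or n2, f1 | f2, l1 | l2
        if kind in ("star", "plus", "opt"):
            n, f, l = walk(node[1])
            if kind != "opt":          # repetition loops back to the start
                for p in l:
                    follow[p] |= f
            return n or kind != "plus", f, l
        raise ValueError(f"unknown node kind: {kind!r}")

    _, first, _ = walk(model)
    return first, follow, symbol_of


def is_unambiguous(model):
    """True if no set of competing positions contains the same symbol twice."""
    first, follow, symbol_of = glushkov(model)
    for candidates in [first, *follow.values()]:
        symbols = [symbol_of[p] for p in candidates]
        if len(symbols) != len(set(symbols)):
            return False
    return True


a, b, c = (("sym", s) for s in "abc")
print(is_unambiguous(("seq", a, ("or", b, c))))               # a, (b | c)     -> True
print(is_unambiguous(("or", ("seq", a, b), ("seq", a, c))))   # (a, b) | (a, c) -> False
print(is_unambiguous(("seq", ("opt", a), a)))                 # a?, a          -> False
```

In the second and third models, an `a` in the document instance could satisfy either of two `a` tokens, and only the following input would settle which one was meant.
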
Brüggemann-Klein and Wood have studied the unambiguity of content models, and they
have presented an algorithm that decides whether a content model is unambiguous. In this
paper we present a disambiguation algorithm that, based on the work of Brüggemann-Klein
and Wood, transforms an ambiguous content model into an unambiguous one by generalizing