Visual Representation of Text in Web Documents and Its Interpretation

Summary: Visual Representation of Text in Web Documents and Its Interpretation
D. Karatzas and A. Antonacopoulos
PRImA Group, Department of Computer Science, University of Liverpool
Peach Street, Liverpool, L69 7ZF, United Kingdom
This paper examines the uses of text and its representation in Web documents
in terms of the challenges it poses for interpretation. Particular attention is paid to the
significant problem of non-uniform representation of text. This non-uniformity is
mainly due to the presence of semantically important text in image form, as opposed to
standard encoded text. The issues surrounding text representation in Web
documents are discussed in the context of colour perception and spatial
representation. The characteristics of text represented in image form are
examined, and research towards interpreting these images of text is briefly described.
1 Introduction
A Web document, like many other types of documents in electronic form, comprises
two components: the code and the view. The code is typically a file containing
markup language tags, program instructions and various types of text. To be more
precise, text in this instance refers to anything that is not a keyword or part of a


Source: Antonacopoulos, Apostolos - School of Computing, Science and Engineering, University of Salford


Collections: Computer Technologies and Information Sciences