Visual Representation of Text in Web Documents and Its Interpretation
 

Summary: Visual Representation of Text in Web Documents and Its Interpretation
D. Karatzas and A. Antonacopoulos
PRImA Group, Department of Computer Science, University of Liverpool
Peach Street, Liverpool, L69 7ZF, United Kingdom
http://www.csc.liv.ac.uk/~prima
Abstract
This paper examines the uses of text and its representation in Web documents
in terms of the challenges in its interpretation. Particular attention is paid to the
significant problem of non-uniform representation of text. This non-uniformity is
mainly due to the presence of semantically important text in image form as opposed to
the standard encoded text. The issues surrounding text representation in Web
documents are discussed in the context of colour perception and spatial
representation. The characteristics of the representation of text in image form are
examined and research towards interpreting these images of text is briefly described.
1 Introduction
A Web document, like many other types of documents in electronic form, comprises
two components: the code and the view. The code is typically a file containing
markup language tags, program instructions and various types of text. To be more
precise, text in this instance refers to anything that is not a keyword or part of a

  

Source: Antonacopoulos, Apostolos - School of Computing, Science and Engineering, University of Salford

 

Collections: Computer Technologies and Information Sciences