AIDAS: Incremental Logical Structure Discovery in PDF Documents (Anjo Anjewierden)
 

Summary: AIDAS: Incremental Logical Structure Discovery in PDF Documents
Anjo Anjewierden
Social Science Informatics, University of Amsterdam
Roetersstraat 15, 1018 WB Amsterdam, The Netherlands
anjo@swi.psy.uva.nl
Abstract
We describe the approach AIDAS uses to extract the logical
document structure from PDF documents. The approach
is based on the idea that the layout structure contains cues
about the logical structure and that the logical structure can
be discovered incrementally.
1. Introduction
AIDAS is part of a research project in which the aim is to
turn technical manuals into a database of indexed training
material. The role AIDAS plays in this project is to take
a PDF file, extract the logical structure and assign indexes
to each element in this logical structure. The indexes can
either be about the content of the element (e.g. "this section
is about the rear part of a car", "this is a schematic drawing
of the control unit") or about how the element can be used

  

Source: Anjewierden, Anjo - Instituut voor Informatica, Universiteit van Amsterdam

 

Collections: Computer Technologies and Information Sciences