Summary: The PISAB Question Answering System
Giuseppe Attardi and Cristian Burrini
Dipartimento di Informatica
Università di Pisa - Italy
The PISAB Question Answering system is based on a combination of Information Extraction and
Information Retrieval techniques. Knowledge extracted from documents is modeled as a set of entities
found in the text, together with the relations between them.
During the learning phase we index documents by the entities they contain. In the answering
phase we use this index to restrict the search for the answer to the most relevant documents.
As answers to a question we select from these documents the paragraphs
containing the entities most similar to those in the question.
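The two phases described above can be illustrated with a minimal sketch. It is not the PISAB implementation: the entity extractor (capitalized words and numbers), the use of plain entity overlap as the similarity measure, and all function names (extract_entities, build_index, answer) are assumptions made only for illustration, and relations between entities are not modeled.

```python
import re
from collections import defaultdict


def extract_entities(text):
    """Toy stand-in for an entity extractor (assumption): capitalized words and numbers."""
    return {tok.lower() for tok in re.findall(r"\b(?:[A-Z][a-z]+|\d+)\b", text)}


def build_index(docs):
    """Learning phase: map each entity to the ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, paragraphs in docs.items():
        for paragraph in paragraphs:
            for entity in extract_entities(paragraph):
                index[entity].add(doc_id)
    return index


def answer(question, docs, index, top_docs=2):
    """Answering phase: pick the most relevant documents, then rank their paragraphs."""
    question_entities = extract_entities(question)

    # Score each document by the number of question entities it contains.
    scores = defaultdict(int)
    for entity in question_entities:
        for doc_id in index.get(entity, ()):
            scores[doc_id] += 1
    best_docs = sorted(scores, key=lambda d: (-scores[d], d))[:top_docs]

    # Rank paragraphs of the selected documents by entity overlap with the
    # question (a simple assumed similarity measure).
    candidates = []
    for doc_id in best_docs:
        for paragraph in docs[doc_id]:
            overlap = len(question_entities & extract_entities(paragraph))
            if overlap:
                candidates.append((overlap, paragraph))
    candidates.sort(key=lambda c: -c[0])
    return [paragraph for _, paragraph in candidates]
```

In this sketch a document is a list of paragraph strings keyed by a document id. A real system would replace extract_entities with a proper Information Extraction component and use a richer similarity measure than set overlap.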
PISAB was submitted to the TREC-9 Conference, where it achieved encouraging results despite its
current prototype stage of development.
The problem of finding answers to questions on a large document collection could in principle be
solved by creating a knowledge base with the information extracted from documents and then querying
that knowledge base. Unfortunately, this approach is not yet feasible, since it requires advanced
techniques of natural language processing, knowledge extraction, knowledge representation and