Summary: Answer Extraction
Towards better Evaluations of NLP Systems
Rolf Schwitter and Diego Mollá and Rachel Fournier and Michael Hess
Department of Information Technology
Computational Linguistics Group
University of Zurich
CH-8057 Zurich
{schwitter, molla, fournier, hess}@i.unizh.ch
Abstract
We argue that reading comprehension tests are
not particularly suited for the evaluation of
NLP systems. Reading comprehension tests are
specifically designed to evaluate human reading
skills, and these require vast amounts of world
knowledge and common-sense reasoning capa-
bilities. Experience has shown that this kind of
full-fledged question answering (QA) over texts
from a wide range of domains is so difficult for
machines as to be far beyond the present state
of the art of NLP. To advance the field we pro-