Multi-Align: Combining Linguistic and Statistical Techniques to Improve Alignments

Summary: Multi-Align: Combining Linguistic and Statistical Techniques to Improve Alignments for Adaptable MT
Necip Fazil Ayan, Bonnie J. Dorr, Nizar Habash
Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742
{nfa,bonnie,habash}@umiacs.umd.edu
Department of Computer Science, Columbia University, New York, NY 10027
habash@cs.columbia.edu
Abstract. An adaptable statistical or hybrid MT system relies heavily on the quality of word-level alignments of real-world data. Statistical alignment approaches provide a reasonable initial estimate for word alignment. However, they cannot handle certain types of linguistic phenomena such as long-distance dependencies and structural differences between languages. We address this issue in Multi-Align, a new framework for incremental testing of different alignment algorithms and their combinations. Our design allows users to tune their systems to the properties of a particular genre/domain while still benefiting from general linguistic knowledge associated with a language pair. We demonstrate that a combination of statistical and linguistically-informed alignments can resolve translation divergences during the alignment process.


Source: Ayan, Necip Fazil - Speech Technology & Research Laboratory, SRI International


Collections: Computer Technologies and Information Sciences