Large-scale Discriminative n-gram Language Models for Statistical Machine Translation

Summary:
Zhifei Li and Sanjeev Khudanpur
Department of Computer Science and Center for Language and Speech Processing
Johns Hopkins University, Baltimore, MD 21218, USA
zhifei.work@gmail.com and khudanpur@jhu.edu
We extend discriminative n-gram language modeling techniques originally
proposed for automatic speech recognition to a statistical machine
translation task. In this context, we propose a novel data selection method
that leads to good models using a fraction of the training data. We carry
out systematic experiments on several benchmark tests for Chinese to English
translation using a hierarchical phrase-based machine translation system,
and show that a discriminative language model significantly improves upon a
state-of-the-art baseline. The experiments also highlight the


Source: Amir, Yair - Department of Computer Science, Johns Hopkins University


Collections: Computer Technologies and Information Sciences