| | |
Summary: Unsupervised Discriminative Language Model Training
for Machine Translation using Simulated Confusion Sets
Zhifei Li and Ziyuan Wang and Sanjeev Khudanpur and Jason Eisner
Center for Language and Speech Processing
Johns Hopkins University
zhifei.work@gmail.com,{zwang40, khudanpur, eisner}@jhu.edu
Abstract
An unsupervised discriminative training
procedure is proposed for estimating a
language model (LM) for machine trans-
lation (MT). An English-to-English syn-
chronous context-free grammar is derived
from a baseline MT system to capture
translation alternatives: pairs of words,
phrases or other sentence fragments that
potentially compete to be the translation
of the same source-language fragment.
Using this grammar, a set of impostor
sentences is then created for each En-
glish sentence to simulate confusions that
|