Unsupervised Translation Induction for Chinese Abbreviations using Monolingual Corpora
 

Zhifei Li and David Yarowsky
Department of Computer Science and Center for Language and Speech Processing
Johns Hopkins University, Baltimore, MD 21218, USA
zhifei.work@gmail.com and yarowsky@cs.jhu.edu
Abstract
Chinese abbreviations are widely used in modern Chinese texts. Compared with English abbreviations (which are mostly acronyms and truncations), the formation of Chinese abbreviations is much more complex. Due to the richness of Chinese abbreviations, many of them may not appear in available parallel corpora, in which case current machine translation systems simply treat them as unknown words and leave them untranslated. In this paper, we present a novel unsupervised method that automatically extracts the relation between a full-form phrase and its abbrevia-

  

Source: Amir, Yair - Department of Computer Science, Johns Hopkins University

 

Collections: Computer Technologies and Information Sciences