U.S. Department of Energy Office of Science Office of Scientific and Technical Information


MAVIS technology

• Indexing automatic transcripts as text
– Automatic transcription accuracy is only 50-80%
• MAVIS techniques
– Word-level lattice indexing
• index word alternatives – robust to recognizer errors
• 50-140% accuracy improvement
• index timing – navigate to exact point in video
– Vocabulary Adaptation
• Use NLP and Bing Search to expand word dictionary
– Automatic keywords to expose to search engines
• Enables discovery of speech content through search engines
• By-product of vocabulary adaptation
– See http://research.microsoft.com/mavis
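
The word-level lattice indexing idea above can be illustrated with a small sketch. This is not the MAVIS implementation. It assumes a simplified lattice in which each time slot carries a few alternative words with posterior probabilities; the names `lattice`, `build_index`, and `search`, and all the sample words and scores, are hypothetical. The point is that indexing every alternative, together with its start time, lets a query match a word the recognizer did not rank first and jump to that point in the video.

```python
from collections import defaultdict

# Hypothetical lattice: each slot is (start_seconds, [(word, posterior), ...]).
# Real recognizer lattices are graphs; a flat list of slots is an assumption
# made here to keep the sketch short.
lattice = [
    (12.0, [("speech", 0.6), ("peach", 0.4)]),
    (12.5, [("recognition", 0.7), ("wreck", 0.3)]),
    (40.2, [("lattice", 0.55), ("lettuce", 0.45)]),
]


def build_index(lattice):
    """Map every word alternative to its (start time, posterior) occurrences."""
    index = defaultdict(list)
    for start, alternatives in lattice:
        for word, posterior in alternatives:
            index[word.lower()].append((start, posterior))
    return index


def search(index, word):
    """Return (start time, posterior) hits for a word, most confident first."""
    hits = index.get(word.lower(), [])
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


index = build_index(lattice)

# The top-ranked word per slot, i.e. what a plain text transcript would keep.
best_path_only = [max(alts, key=lambda a: a[1])[0] for _, alts in lattice]
```

In this toy data, `search(index, "lettuce")` still returns a hit at 40.2 seconds even though "lettuce" is absent from `best_path_only`, which is the sense in which indexing alternatives is robust to recognizer errors.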