ScienceCinema
Slide 1: ScienceCinema
Workshop jointly hosted by Microsoft and the International Council for Scientific and Technical Information
Behrooz Chitsaz
Microsoft Research
Lorrie Apple Johnson
U.S. Department of Energy
osti.gov
Slide 2: Multimedia Research
- Speech Search
- Face identification
- Object recognition
- Video browsing
- Semantic extraction
- (3D) Segmentation
- (3D) Image search
Slide 3: Speech Applications
Speech as interface
Mobile access
- Directory services
Automation
- PC application
- Web service
Text input
- Dictation
Speech as 1st class content
Indexing
- Search
- Keyword extraction
Transcription
- Meetings
- Voicemails
- Closed Caption
Translation
- Translating phone
Slide 4: Speech recognition
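- Spectral Analysis: $o_1 \ldots o_T$
- Matching (Decoding): alignment over time, most likely hypothesis
  $$(w_1 \ldots w_N)^{\wedge} = \arg\max_{w_1 \ldots w_N} \; p(o_1 \ldots o_T \mid w_1 \ldots w_N) \, P(w_1 \ldots w_N)$$
- Example hypothesis: "Hello World"
- Acoustic Models: $p(o_t \ldots o_\tau \mid \text{phoneme})$
- Dictionary: $P(\text{phonemes} \mid w)$
- Grammar (Language Model): $P(w_1 \ldots w_N)$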
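The decoding rule on this slide can be illustrated with a toy example. The sketch below is not the recognizer's implementation: the candidate hypotheses and their scores are made-up values, and a real decoder searches a far larger space instead of enumerating a short list. It only shows how an acoustic score and a language-model probability combine in the argmax.

```python
import math

# Hypothetical toy scores (assumptions for illustration only).
# log p(o_1..o_T | w_1..w_N): how well each word sequence matches the audio.
ACOUSTIC_LOGLIK = {
    ("hello", "world"): -12.0,
    ("hollow", "word"): -11.5,
    ("yellow", "world"): -13.0,
}
# P(w_1..w_N): how plausible each word sequence is according to the grammar.
LM_PROB = {
    ("hello", "world"): 1e-3,
    ("hollow", "word"): 1e-6,
    ("yellow", "world"): 1e-5,
}


def decode(acoustic_loglik, lm_prob):
    """Return the word sequence maximizing p(o|w) * P(w), computed in log space."""
    def score(words):
        return acoustic_loglik[words] + math.log(lm_prob[words])
    return max(acoustic_loglik, key=score)


best = decode(ACOUSTIC_LOGLIK, LM_PROB)
print(" ".join(best))
```

In this toy data "hollow word" has the best acoustic score, but the language model makes "hello world" the most likely hypothesis overall.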
Slide 5: MAVIS technology
- Indexing automatic transcripts as text
- MAVIS techniques
- index word alternatives – robust to recognizer errors
- 50-140% accuracy improvement
- index timing – navigate to exact point in video
- Use NLP and Bing Search to expand word dictionary
- Enables discovery of speech content through search engines
- By-product of vocabulary adaptation
- Automatic transcription accuracy is only 50-80%
- Word-level lattice indexing
- Vocabulary adaptation
- Automatic keywords to expose to search engines
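The idea of indexing word alternatives together with timing can be sketched as a small inverted index. This is a minimal illustration under stated assumptions, not MAVIS itself: the lattice layout, the `min_posterior` threshold, and the video identifier are hypothetical.

```python
from collections import defaultdict

# Hypothetical recognizer output: for each time slot (start time in seconds),
# several competing word alternatives with posterior probabilities.
LATTICE = {
    "talk-001": [
        (12.4, [("new", 0.55), ("neutrino", 0.45)]),
        (12.9, [("oscillation", 0.80), ("isolation", 0.20)]),
    ],
}


def build_index(lattices, min_posterior=0.1):
    """Index every sufficiently likely alternative, not only the top hypothesis."""
    index = defaultdict(list)
    for video_id, slots in lattices.items():
        for start, alternatives in slots:
            for word, posterior in alternatives:
                if posterior >= min_posterior:
                    index[word].append((video_id, start, posterior))
    return index


def search(index, term):
    """Return (video_id, start_seconds, posterior) hits, most confident first."""
    return sorted(index.get(term.lower(), []), key=lambda hit: -hit[2])


index = build_index(LATTICE)
hits = search(index, "neutrino")
print(hits)
```

A transcript that kept only the top hypothesis would contain "new" at 12.4 seconds, so a search for "neutrino" would miss it. Indexing the alternative keeps the term findable, and the stored start time lets a player jump to that point.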
Slide 6: MAVIS Architecture
Microsoft Azure
- Store content to be processed in temporary Azure storage
- Do vocabulary adaptation using Bing
- Run recognition engine on content
- Store results of recognition process (AIB)
1. Submit audio/video RSS
2. Retrieve AIB
3. Import AIB in SQL
4. Search/Retrieve results
Web server(s)
SQL Server(s)
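Steps 3 and 4 can be sketched as a load-then-query flow. The AIB format is not described in these slides, so the sketch assumes it has already been flattened into rows. The table name, columns, and URLs are hypothetical, and `sqlite3` stands in for SQL Server so the example stays self-contained.

```python
import sqlite3

# Hypothetical flattened AIB contents:
# (video_url, word, start_seconds, confidence).
AIB_ROWS = [
    ("https://example.org/videos/talk-001", "neutrino", 12.4, 0.45),
    ("https://example.org/videos/talk-001", "oscillation", 12.9, 0.80),
    ("https://example.org/videos/talk-002", "neutrino", 301.0, 0.90),
]


def import_aib(conn, rows):
    """Step 3: load recognition results into a SQL table indexed by word."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS word_hits ("
        "video_url TEXT, word TEXT, start_seconds REAL, confidence REAL)"
    )
    conn.execute("CREATE INDEX IF NOT EXISTS idx_word ON word_hits (word)")
    conn.executemany("INSERT INTO word_hits VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def search(conn, term):
    """Step 4: return (video_url, start_seconds) pairs, most confident first."""
    cursor = conn.execute(
        "SELECT video_url, start_seconds FROM word_hits "
        "WHERE word = ? ORDER BY confidence DESC",
        (term.lower(),),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
import_aib(conn, AIB_ROWS)
results = search(conn, "neutrino")
print(results)
```

Each result pairs a video with a time offset, which is what a web server would need to send the user to the point in the video where the term was spoken.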
Slide 7: U.S. Department of Energy Office of Scientific and Technical Information (OSTI) Mission
- DOE invests > $10 billion/year in basic sciences, clean energy technology, and nuclear research.
- The immediate output from this investment is Information…Knowledge… R&D results
- OSTI's mission is to accelerate scientific progress by accelerating access to this information.
Slide 8: OSTI's Core Products
- Information Bridge
- Science Accelerator
- Science.gov
Slide 9: WorldWideScience.org
Slide 10: Emerging Forms of Scientific Information Require New Tools
- Numeric data, multimedia, and social media are emerging forms of scientific information
- Each form presents special opportunities and challenges
Slide 11: Search and Retrieval Challenges with Multimedia Science Information
- Lack of written transcripts, i.e. no "full text" to search
- Metadata, if available, is often minimal
- Scientific, technical, and medical terminology/vocabulary
- Videos can be long, often up to an hour or more
Slide 12: OSTI and Microsoft Research Partnership
- Video files collected from DOE's National Laboratories
- RSS feeds with metadata and URLs sent to Microsoft Research
- Audio indexing performed via MAVIS
- Audio index blob (AIB) returned to OSTI and integrated with SQL servers
- Users can search for a precise term within the video, and be directed to the exact point in the video where the term was spoken
Slide 13: Demonstration of ScienceCinema
Slide 14: Looking to the Future
- Additional content from DOE researchers
- Integration of multimedia searches into WorldWideScience.org by June
- High quality automatic closed captions
- Multilingual translation capabilities


