ScienceCinema

  • Workshop jointly hosted by Microsoft and the International Council for Scientific and Technical Info

    ScienceCinema


    Behrooz Chitsaz

    Microsoft Research

    Lorrie Apple Johnson

    U.S. Department of Energy

    Microsoft® Research

    osti.gov

    Multimedia Research


    • Speech Search
    • Face identification
    • Object recognition
    • Video browsing
    • Semantic extraction
    • (3D) Segmentation
    • (3D) Image search

    Speech Applications


    Speech as interface

    Mobile access

    • Directory services

    Automation

    • PC application

    • Web service

    Text input

    • Dictation

    Speech as first-class content

    Indexing

    • Search

    • Keyword extraction

    Transcription

    • Meetings

    • Voicemails

    • Closed Caption

    Translation

    • Translating phone

    Speech recognition


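    Spectral Analysis: produces the observation sequence $o_1 \ldots o_T$
    Matching (Decoding): alignment over time, yielding the most likely hypothesis
    $\hat{W} = \arg\max_{w_1 \ldots w_N} \; p(o_1 \ldots o_T \mid w_1 \ldots w_N) \, P(w_1 \ldots w_N)$
    Example output: "Hello World"
    Acoustic Models: $p(o_t \ldots o_\tau \mid \text{phoneme})$
    Dictionary: $P(\text{phonemes} \mid w)$
    Grammar (Language Model): $P(w_1 \ldots w_N)$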
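
The decoding rule on this slide can be illustrated with a toy sketch. The candidate word sequences and their log-probability scores below are invented for illustration; a real recognizer derives the acoustic score from the acoustic models and dictionary, and searches a far larger hypothesis space rather than a fixed list.

```python
# Hypothetical candidate word sequences with made-up scores.
# "acoustic" stands for log p(o_1..o_T | w_1..w_N),
# "language" stands for log P(w_1..w_N).
CANDIDATES = {
    ("hello", "world"): {"acoustic": -12.0, "language": -3.0},
    ("hollow", "word"): {"acoustic": -11.5, "language": -7.5},
    ("hello", "word"): {"acoustic": -12.5, "language": -5.0},
}


def decode(candidates):
    """Return the word sequence maximizing log p(O|W) + log P(W)."""
    def total_score(item):
        _, scores = item
        return scores["acoustic"] + scores["language"]

    best_words, _ = max(candidates.items(), key=total_score)
    return best_words


best = decode(CANDIDATES)
print(" ".join(best))
```

Working in log space turns the product of the two probabilities into a sum, which is the usual way to avoid numerical underflow on long utterances. Note that "hollow word" has the best acoustic score here, yet the language model term tips the decision.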

    MAVIS technology


    • Indexing automatic transcripts as text

    – Automatic transcription accuracy is only 50-80%

    • MAVIS techniques

    – Word-level lattice indexing

    • index word alternatives: robust to recognizer errors

    • 50-140% accuracy improvement

    • index timing: navigate to exact point in video

    – Vocabulary Adaptation

    • Use NLP and Bing Search to expand word dictionary

    – Automatic keywords to expose to search engines

    • Enables discovery of speech content through search engines

    • By-product of vocabulary adaptation

    – See http://research.microsoft.com/mavis
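
The idea of word-level lattice indexing can be sketched as follows. This is not the MAVIS implementation: the lattice layout, the words, the probabilities, and the `min_prob` threshold are all assumptions made for illustration. Each time slot keeps several word alternatives instead of only the recognizer's top choice, and every alternative is indexed together with its start time.

```python
from collections import defaultdict

# Hypothetical recognizer output: (start time in seconds, word alternatives
# with made-up posterior probabilities) for each slot of a flattened lattice.
LATTICE = [
    (12.4, {"neutron": 0.55, "neuron": 0.40}),
    (13.1, {"scattering": 0.90, "scatter": 0.08}),
    (95.0, {"neuron": 0.70, "neutron": 0.25}),
]


def build_index(lattice, min_prob=0.05):
    """Map each word alternative to the (start_time, probability) slots it appears in."""
    index = defaultdict(list)
    for start, alternatives in lattice:
        for word, prob in alternatives.items():
            if prob >= min_prob:
                index[word].append((start, prob))
    return index


def search(index, term):
    """Return (start_time, probability) hits for a term, most probable first."""
    return sorted(index.get(term, []), key=lambda hit: -hit[1])


index = build_index(LATTICE)
for start, prob in search(index, "neutron"):
    print(f"{start:.1f}s  p={prob:.2f}")
```

A transcript built from the top choice alone would contain "neutron" only at 12.4 seconds. The lattice index also returns the lower-ranked alternative at 95.0 seconds, and the stored start time is what lets a player jump to that point.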

    MAVIS Architecture


    Microsoft Azure


    • Store content to be processed in temporary Azure storage

    • Do vocabulary adaptation using Bing

    • Run recognition engine on content

    • Store results of recognition process (AIB)


    1. Submit audio/video RSS

    2. Retrieve AIB

    3. Import AIB in SQL

    4. Search/Retrieve results

    Web server(s)

    SQL Server(s)
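
Steps 3 and 4 above can be sketched roughly as follows. The slides do not describe the AIB format, so this sketch assumes the index has already been parsed into simple rows; the table name, column names, and sample rows are hypothetical, and an in-memory SQLite database stands in for the SQL Server(s).

```python
import sqlite3

# Hypothetical rows, as if already parsed from an AIB:
# (video_id, word, start_seconds, confidence)
ROWS = [
    ("video-001", "plasma", 42.0, 0.91),
    ("video-001", "fusion", 43.5, 0.88),
    ("video-002", "plasma", 310.2, 0.47),
]


def import_rows(conn, rows):
    """Step 3: load index rows into a SQL table and index them by word."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS word_hits ("
        "video_id TEXT, word TEXT, start_seconds REAL, confidence REAL)"
    )
    conn.executemany("INSERT INTO word_hits VALUES (?, ?, ?, ?)", rows)
    conn.execute("CREATE INDEX IF NOT EXISTS idx_word ON word_hits (word)")
    conn.commit()


def search_term(conn, term):
    """Step 4: return (video_id, start_seconds, confidence), best match first."""
    cursor = conn.execute(
        "SELECT video_id, start_seconds, confidence FROM word_hits "
        "WHERE word = ? ORDER BY confidence DESC",
        (term.lower(),),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
import_rows(conn, ROWS)
for video_id, start, confidence in search_term(conn, "plasma"):
    print(video_id, start, confidence)
```

Keeping the search side in a local database means the web servers can answer queries without calling back to the recognition service.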

    U.S. Department of Energy Office of Scientific and Technical Information (OSTI) Mission


    • DOE invests > $10 billion/year in basic sciences, clean energy technology, and nuclear research.

    • The immediate output from this investment is Information…Knowledge… R&D results

    • OSTI's mission is to accelerate scientific progress by accelerating access to this information.

    OSTI's Core Products


    • Information Bridge

    • Science Accelerator

    • Science.gov

    WorldWideScience.org

    Emerging Forms of Scientific Information Require New Tools


    • Numeric data, multimedia, and social media are emerging forms of scientific information

    • Each form presents special opportunities and challenges

    Search and Retrieval Challenges with Multimedia Science Information


    • Lack of written transcripts, i.e. no "full text" to search

    • Metadata, if available, is often minimal

    • Scientific, technical, and medical terminology/vocabulary

    • Videos can be long, often up to an hour or more

    OSTI and Microsoft Research Partnership


    • Video files collected from DOE's National Laboratories

    • RSS feeds with metadata and URLs sent to Microsoft Research

    • Audio indexing performed via MAVIS

    • Audio index blob (AIB) returned to OSTI and integrated with SQL servers

    • Users can search for a precise term within the video, and be directed to the exact point in the video where the term was spoken
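
As a rough illustration of the RSS step in this workflow, the sketch below builds a small feed carrying metadata and a media URL per video. The slides do not give the feed layout, so the element choices are assumptions based on plain RSS 2.0, and the titles, descriptions, and URLs are placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical video records; all values are placeholders.
VIDEOS = [
    {
        "title": "Example lecture A",
        "description": "Placeholder abstract.",
        "url": "https://example.org/videos/a.mp4",
    },
]


def build_feed(videos, feed_title="Example video feed"):
    """Return an RSS 2.0 document (as a string) with one item per video."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = "https://example.org/"
    ET.SubElement(channel, "description").text = "Placeholder feed description."
    for video in videos:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = video["title"]
        ET.SubElement(item, "description").text = video["description"]
        # The enclosure carries the media URL; length is unknown here, so 0.
        ET.SubElement(item, "enclosure", url=video["url"], length="0", type="video/mp4")
    return ET.tostring(rss, encoding="unicode")


feed_xml = build_feed(VIDEOS)
print(feed_xml)
```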

    Demonstration of ScienceCinema

    Looking to the Future


    • Additional content from DOE researchers

    • Integration of multimedia searches into WorldWideScience.org by June

    • High quality automatic closed captions

    • Multilingual translation capabilities
