Federated Search (Emphasizing WorldWideScience.org)
as a Transformational Technology
Enabling Knowledge Discovery


Read the ILDS paper: May 2010 – Federated search as a transformational technology enabling knowledge discovery: the role of WorldWideScience.org
published in: Interlending & Document Supply Vol 38 Issue 2 pp. 82-92

  • InterLending and Document Supply Conference, October 20-22, 2009

    Federated Search (Emphasizing WorldWideScience.org)
    as a Transformational Technology Enabling Knowledge Discovery

    InterLending and Document Supply Conference
    October 20-22, Hannover, Germany

    Walt L. Warnick, Ph.D.
    Director
    Office of Scientific & Technical Information
    United States Department of Energy


    OSTI Mission


    To advance science and sustain technological creativity by making R&D findings available and useful to DOE researchers and the public.


    Science progresses as knowledge is shared


    OSTI Corollary:

    If the sharing of knowledge is accelerated, then discovery is accelerated.

    "If I have seen further, it is by standing on the shoulders of giants."

    – Isaac Newton 1676

    Profound implications for everyone in the information business.

    Knowledge Investment Curve


    Pace of Scientific Discovery

    Vertical Axis = the pace of discovery

    Horizontal Axis = the percentage, from 0 to 100, of R&D funding devoted to sharing scientific knowledge.

    Percentage of R&D Funding for Sharing of Scientific Knowledge at 0%.

    If there were no sharing, there would be no progress.

    Percentage of R&D Funding for Sharing of Scientific Knowledge at 100%.

    If all resources went to sharing, there would be no resources for research itself, and no progress.

    Decision makers affect the pace of discovery when they determine the fraction of R&D funding dedicated to sharing.

    Optimum Sharing is between 0% and 100%.
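
The shape of the Knowledge Investment Curve can be illustrated with a toy model. The sketch below assumes, purely for illustration, that the pace of discovery is the product of the fraction of funding spent on sharing and the fraction left for research. The functional form and the names `pace_of_discovery` and `best_sharing_fraction` are assumptions, not part of the talk. The model only reproduces the qualitative argument: zero pace at 0% and at 100%, and a maximum somewhere in between.

```python
def pace_of_discovery(sharing_fraction: float) -> float:
    """Toy model: pace = (fraction spent on sharing) x (fraction left for research).

    The functional form is an illustrative assumption, not the curve from the talk.
    """
    if not 0.0 <= sharing_fraction <= 1.0:
        raise ValueError("sharing_fraction must be between 0 and 1")
    research_fraction = 1.0 - sharing_fraction
    return sharing_fraction * research_fraction


def best_sharing_fraction(steps: int = 100) -> float:
    """Scan 0%..100% in equal steps and return the fraction with the highest pace."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=pace_of_discovery)


if __name__ == "__main__":
    for percent in (0, 25, 50, 75, 100):
        print(f"{percent:3d}% sharing -> pace {pace_of_discovery(percent / 100):.4f}")
    print("optimum in this toy model:", best_sharing_fraction())
```

The location of the maximum in this sketch is an artifact of the assumed formula and says nothing about where the real optimum lies; only the existence of an interior optimum carries over.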


    But before we can accelerate the sharing of knowledge


    … we must dispel the misperception that popular search engines are already doing the job.

    Google

    Yahoo

    MSN


    Much of science is non-Googleable


    The Deep Web is Huge

    In fact, the vast majority of science information is in databases within the deep web – or the non-Googleable Web – where popular search engines cannot go.

    We in the information business need to recognize this gap between availability and need, and seize the opportunity to provide science information consumers with better tools.

    The web is transformational technology for sharing knowledge


    The web is still young and will certainly hold surprises as it evolves.

    Just as another well-known transformational technology held surprises …

    1903, 1918, 2010

    Eclipsing Current Search Technology


    Google is capitalizing on this early era of web technology and is hugely successful, powering more than half of the world’s searches.

    But we must remember that we are only at the beginning of this transformation. Further technological transformations may very well eclipse today’s search technology!

    A new, promising technology is now emerging: federated search.

    We need systems, such as federated search, that probe the deep web


    Federated search drills down to the deep web where scientific databases reside.

    Surface Web

    Deep Web Databases

    Unlike the Google sitemap protocol solution, federated search places no burden on the database owners.
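
The idea of federated search can be sketched in a few lines: one query is sent live to several databases through their own search interfaces, and the hits are merged into a single list. The sketch below is a minimal illustration under stated assumptions, not the WorldWideScience.org implementation. The two in-memory "databases" and the names `make_source`, `SOURCES`, and `federated_search` are hypothetical stand-ins for remote deep-web sources.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical stand-ins for remote deep-web databases. A real connector would
# submit the query to each database's own search interface over the network.
SOURCE_A = ["Neutron scattering in thin films", "Solar cell efficiency limits"]
SOURCE_B = ["Thin film solar cell degradation", "Ocean carbon uptake"]


def make_source(records: List[str]) -> Callable[[str], List[str]]:
    """Build a search function that requires every query word, case-insensitively."""
    def search(query: str) -> List[str]:
        words = query.lower().split()
        return [r for r in records if all(w in r.lower() for w in words)]
    return search


SOURCES: Dict[str, Callable[[str], List[str]]] = {
    "source_a": make_source(SOURCE_A),
    "source_b": make_source(SOURCE_B),
}


def federated_search(query: str) -> List[dict]:
    """Send the query to every source in parallel and merge the hits."""
    with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in SOURCES.items()}
        merged = []
        for name, future in futures.items():
            for title in future.result():
                merged.append({"source": name, "title": title})
    return merged


if __name__ == "__main__":
    for hit in federated_search("solar cell"):
        print(hit["source"], "|", hit["title"])
```

Because each source is queried through the search function it already exposes, nothing has to be exported or re-indexed on the database side, which is the sense in which this approach places no burden on the database owners.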

    Our emerging solution: federated search


    Science Accelerator: Integrates key DOE databases

    Science.gov: Integrates 14 U.S. science agencies – 200 million pages of science information

    WorldWideScience.org: Integrates science information issued by over 60 nations – 400 million pages of global science information

    WorldWideScience.org History


    Concept introduced by OSTI Director, Walt Warnick, June 2006, Bethesda, Maryland

    Bilateral U.S.(DOE)/U.K. (British Library) partnership, January 2007, London

    Demonstration of first prototype, June 2007, Nancy, France

    Multilateral governance structure, the WorldWideScience Alliance, established June 2008, Seoul

    Dr. Jan Brase, German National Library of Science and Technology

    Common ingredient: International Council for Scientific and Technical Information (ICSTI)

    • Searches 61 science databases and portals sponsored by governments and national institutions in 61 countries
    • Covers scientific literature from over three-fourths of the world’s population
    • Includes a vast quantity of science (over 400 million pages), much of which is grey literature
    • A recent analysis shows only 3.5% overlap with Google and Google Scholar, demonstrating the "deep web" value of WorldWideScience.org (WWS)
    • Current research in multilingual translation technologies will enable searching of non-English databases from within applications such as WWS
    • Prototype allows users to select their preferred language. Queries are translated into the languages of the databases being searched and results are then returned in the user's language
    • We are committed to launching Multi-lingual WorldWideScience.org at the ICSTI Meeting in Helsinki in June 2010
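
The translate, search, translate-back flow described for the prototype can be sketched as follows. This is a minimal illustration under stated assumptions, not the prototype itself. The word-for-word dictionaries, the tiny German "database", and the names `translate`, `search_german_db`, and `multilingual_search` are all hypothetical; a real system would call a machine-translation service and a remote database.

```python
from typing import Dict, List

# Hypothetical toy dictionaries; a real system would call a machine-translation service.
EN_TO_DE: Dict[str, str] = {"energy": "energie", "research": "forschung"}
DE_TO_EN: Dict[str, str] = {v: k for k, v in EN_TO_DE.items()}

# Hypothetical German-language database, stored lowercase for simplicity.
GERMAN_DB: List[str] = ["energie forschung bericht", "klima daten"]


def translate(text: str, table: Dict[str, str]) -> str:
    """Translate word by word, leaving unknown words unchanged."""
    return " ".join(table.get(word, word) for word in text.lower().split())


def search_german_db(query_de: str) -> List[str]:
    """Return records containing every word of the (German) query."""
    words = query_de.split()
    return [rec for rec in GERMAN_DB if all(w in rec.split() for w in words)]


def multilingual_search(query_en: str) -> List[str]:
    """Translate the query to German, search, then translate hits back to English."""
    query_de = translate(query_en, EN_TO_DE)
    hits_de = search_german_db(query_de)
    return [translate(hit, DE_TO_EN) for hit in hits_de]


if __name__ == "__main__":
    print(multilingual_search("energy research"))
```

Words missing from the toy dictionary pass through untranslated, which is a visible reminder that the quality of results depends on the quality of the translation step.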

    OSTI, through federated search, ensures access to non-Googleable science


    Volume of Content Made Searchable by OSTI

    WorldWideScience.org:
    400,000,000 pages of Global Scientific and Technical Information (STI)
    These web-available pages would fill 62,000 traditional 2-foot-deep file drawers.

    Science.gov:
    200,000,000 pages of U.S. Government STI
    These web-available pages would fill 33,000 traditional 2-foot-deep file drawers.

    STIP Collection:
    11,400,000 pages of U.S. Department of Energy STI
    These web-available pages would fill 1,900 traditional 2-foot-deep file drawers.
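
The slide does not state how many pages one file drawer is assumed to hold. As a quick check, the implied figure can be computed from the quoted page and drawer counts; it comes out at roughly 6,000 to 6,500 pages per drawer, so the three comparisons appear to use slightly different rounding. The helper name `pages_per_drawer` is hypothetical.

```python
# (collection name, pages, file drawers) exactly as quoted on the slide.
COLLECTIONS = [
    ("WorldWideScience.org", 400_000_000, 62_000),
    ("Science.gov", 200_000_000, 33_000),
    ("STIP Collection", 11_400_000, 1_900),
]


def pages_per_drawer(pages: int, drawers: int) -> float:
    """Implied number of pages held by one file drawer."""
    return pages / drawers


if __name__ == "__main__":
    for name, pages, drawers in COLLECTIONS:
        print(f"{name}: about {pages_per_drawer(pages, drawers):,.0f} pages per drawer")
```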

    Amount of Data Transferred in FY08: 9.95 terabytes

    Through OSTI products, librarians, researchers, and the public can access a volume of science pages comparable to, but not duplicative of, Google's entire science content.

    Is there a better solution for a high quality science search tool just over the horizon?


    We think so…

    Live Federated Search Tools + Crawled Indexes

    For Example:

    WorldWideScience.org + crawled indexes

    The stage is set for the future


    We are ready to scale up our efforts in federated search.

    A billion-page, high quality science search tool may be available soon to spread ideas, increase learning, and further accelerate the progress of science.

    Cognition Budget


    • Making more information available is not enough
    • It must be presented more conveniently – easier and faster to find
    • To this end, relevancy ranking is being reinvented for federated searching
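
Why ranking matters in federated search can be seen in a small sketch: hits arrive from several sources in arbitrary order, and a common score is needed to put the most useful ones first. The scoring rule below (the fraction of query words found in the title) is a deliberately simple assumption for illustration and is not the ranking method being developed for WorldWideScience.org. The names `relevance_score` and `rank_merged` are hypothetical.

```python
from typing import List, Tuple


def relevance_score(query: str, title: str) -> float:
    """Fraction of distinct query words that appear in the title (0.0 to 1.0)."""
    query_words = set(query.lower().split())
    title_words = set(title.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & title_words) / len(query_words)


def rank_merged(query: str, hits: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Sort (source, title) hits from all sources by descending score.

    sorted() is stable, so equally scored hits keep their incoming order.
    """
    return sorted(hits, key=lambda hit: relevance_score(query, hit[1]), reverse=True)


if __name__ == "__main__":
    merged = [
        ("source_a", "Ocean carbon uptake"),
        ("source_b", "Solar cell degradation"),
        ("source_a", "Solar panels"),
    ]
    for source, title in rank_merged("solar cell", merged):
        print(source, "|", title)
```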

    Try WorldWideScience.org!
    Simply put, we intend to make more science accessible to more people more conveniently than has ever been done before.