U.S. Department of Energy

Office of Scientific & Technical Information

www.osti.gov

The Stage Is Set for the Future


Walter L. Warnick, Ph.D., Director
Office of Scientific and Technical Information, U.S. Department of Energy



The Stage Is Set for the Future
CENDI Meeting
September 2007

(Slide 1)


Slide 2: OSTI Mission

To advance science and sustain technological creativity by making R&D findings available to Department of Energy researchers and the American public.

Information fuels discovery.

Superior access to quality information speeds discovery.

Slide 3: Advancing Science Discovery: From the '40s to the Future


From 1947 to 2007 – from Nuclear Science Abstracts to WorldWideScience.org – mission accomplished!

Whether by print or by pixel, OSTI has long been committed to ensuring appropriate access to research results.


Slide 4:

OSTI’s creation 60 years ago signified a sea change from the Secret City of the Manhattan Project toward an openness to share S&T knowledge with the public.

Of course, we continue to this day to keep secret all the information that has military applications. Whether peaceful or national security related, the S&T legacy of this agency is captured in this building.


Slide 5: Science Progresses as Knowledge Is Shared


OSTI corollary: If the sharing of knowledge – or knowledge diffusion – is accelerated, scientific progress is accelerated.

Science can be advanced by hiring more researchers and giving them better equipment; and science can be advanced by accelerating the sharing of knowledge.


Slide 6:

We consciously seek to exploit new technology to accelerate the spread of scientific and technical knowledge.


Slide 7: Larry Page, speaking to scientists, AAAS 2007

"Virtually all economic growth (in the world) was due to technological progress. I think as a society we're not really paying attention to that."

He called on the scientists to make more of their research available digitally. “We have to unlock the wealth of scientific knowledge and get it to everyone.”

Slide 8: The stage is set for the future



We are ready to scale up our efforts in metasearch, or federated search. Simply put, we intend to make science searchable via one portal.




Slide 9: We must ensure access to science information that is Non-Googleable

Google: v., to search for information through Google.
Googleable: adj., information found by Googling.
Non-Googleable: adj., information that cannot be found by Googling.



Slide 10: True or False?

Most useful information is available via familiar search engines such as Google and Yahoo!

The vast majority of science information in databases is not crawled by popular search engines.


Slide 11: Scientific databases stump Google

Systems that crawl the Web do not typically reach below the surface.

Google “crawls” the surface Web, but scientific databases are largely found in the deep Web.




Slide 12:

Google works to solve the problem, but there’s a better way ...

Google moves ahead with plan to open up federal Web sites

Google is making strides on an initiative to make information stored on public government Web sites more accessible to people looking for it, but challenges remain, officials with the search engine company said Wednesday. Three federal organizations recently agreed to structure their sites to make them accessible for nearly all Internet searches, the officials said. Information on the Plain Language Web site aimed at eliminating jargon in government communications, and on sites by the Energy Department's Office of Scientific and Technical Information and the Education Department's National Center for Education Statistics, has been opened up to the three most popular search engines: Google, Yahoo and MSN.



Slide 13:

Federated search drills down to the deep Web where scientific databases reside.

We need systems that probe the deep Web.

Unlike the Google solution, federated search places no burden on the database owners.
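The idea behind federated search can be illustrated with a minimal sketch: one query is sent to several databases at the same time, and the answers are merged into a single result list. This is only an illustration under stated assumptions, not a description of how any OSTI portal is built. The source names, the sample titles, and the functions `search_source` and `federated_search` are all hypothetical, and small in-memory lists stand in for real remote databases.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for remote databases. A real portal
# would forward the query to each database's own search interface.
SOURCES = {
    "source_a": ["Fusion energy review", "Solar cell efficiency"],
    "source_b": ["Fusion reactor materials", "Wind turbine design"],
    "source_c": ["Battery storage survey"],
}


def search_source(name, query):
    """Run the query against one source and tag each hit with its origin."""
    q = query.lower()
    return [(name, title) for title in SOURCES[name] if q in title.lower()]


def federated_search(query):
    """Send the query to every source in parallel and merge the hits."""
    with ThreadPoolExecutor() as pool:
        # pool.map returns results in the order the source names were given.
        per_source = pool.map(lambda n: search_source(n, query), sorted(SOURCES))
        return [hit for hits in per_source for hit in hits]


results = federated_search("fusion")
for source, title in results:
    print(source, "->", title)
```

Because each source is queried at search time through its existing interface, the database owner does not have to restructure anything, and the results reflect what the database holds at that moment.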



Slide 14: Federated search yields one-stop portals

Science.gov: 50 million pages.

ScienceAccelerator: key DOE databases.

WorldWideScience.org: 200 million pages.

19 sources, 17 countries, all inhabited continents.


Slide 15: Harvesting

Harvesting and federated search are useful when full bibliographical control is not feasible.

Analogous to Google – crawls and mines data that does not reside in databases.

but ...

Different from Google – directed, selective crawling.
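One way to picture directed, selective crawling is a crawler that starts from chosen seed pages and follows only links that stay on an approved list of hosts. The sketch below is a simplified illustration, not a description of any actual harvesting system. The URLs, the `PAGES` link graph, and the function `directed_crawl` are hypothetical, and a small dictionary stands in for fetching real pages.

```python
from collections import deque
from urllib.parse import urlparse

# Hypothetical link graph standing in for real pages: each key is a URL,
# each value is the list of links found on that page.
PAGES = {
    "https://lab.example.gov/reports": [
        "https://lab.example.gov/reports/2007",
        "https://ads.example.com/banner",
    ],
    "https://lab.example.gov/reports/2007": [
        "https://lab.example.gov/reports",
        "https://other.example.org/news",
    ],
}


def directed_crawl(seeds, allowed_hosts):
    """Breadth-first crawl that only follows links to allowed hosts."""
    seen = set(seeds)
    queue = deque(seeds)
    visited = []
    while queue:
        url = queue.popleft()
        visited.append(url)
        for link in PAGES.get(url, []):
            # Skip links that leave the selected hosts or were already queued.
            if urlparse(link).hostname in allowed_hosts and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


harvested = directed_crawl(["https://lab.example.gov/reports"], {"lab.example.gov"})
```

A general-purpose crawler would follow every link it finds. Here the allowed-host list keeps the crawl confined to the sites chosen for harvesting.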


Slide 16: Federated Search: Advantages

  • Current, real-time results
  • No burden for database owner
  • Inexpensive to implement
  • No need-to-know for user
  • No searching door-to-door
  • Allows for fielded searching
  • Interoperability is automatically achieved

Slide 17: Additional Points

  • Federated search has limitations
  • Neither crawling nor federated search is a panacea
  • Federated searching does things crawling cannot do, and vice versa. They are complementary technologies
  • Federated searching has advanced rapidly and should continue to advance

 


Slide 18: Science as a noble enterprise