U.S. Department of Energy, Office of Science, Office of Scientific and Technical Information

The Stage Is Set for the Future

Slide01

The Stage Is Set for the Future
CENDI Meeting
September 2007

Slide02

OSTI Mission
To advance science and sustain technological creativity by making R&D findings available to Department of Energy researchers and the American public.
Information fuels discovery.
Superior access to quality information speeds discovery.

Slide03

Advancing Science Discovery: From the '40s to the Future
From 1947 to 2007 – from Nuclear Science Abstracts to WorldWideScience.org – mission accomplished!
Whether by print or by pixel, OSTI has long been committed to ensuring appropriate access to research results.

Slide04

OSTI’s creation 60 years ago signified a sea change from the Secret City of the Manhattan Project toward an openness to share S&T knowledge with the public.
Of course, we continue to this day to keep secret all the information that has military applications. Whether peaceful or national security related, the S&T legacy of this agency is captured in this building.

Slide05

Science Progresses as Knowledge Is Shared
OSTI corollary: If the sharing of knowledge – or knowledge diffusion – is accelerated, scientific progress is accelerated.
Science can be advanced by hiring more researchers and giving them better equipment; and science can be advanced by accelerating the sharing of knowledge.

Slide06

We consciously seek to exploit new technology to accelerate the spread of scientific and technical knowledge.

Slide07

Larry Page, speaking to scientists, AAAS 2007
"Virtually all economic growth (in the world) was due to technological progress. I think as a society we're not really paying attention to that."
He called on the scientists to make more of their research available digitally. “We have to unlock the wealth of scientific knowledge and get it to everyone.”

Slide08

The stage is set for the future
We are ready to scale up our efforts in metasearch, or federated search. Simply put, we intend to make science searchable via one portal.

Slide09

We must ensure access to science information that is Non-Googleable
Google: v., to search for information through Google
Googleable: adj., information found by Googling
Non-Googleable: adj., information that cannot be found by Googling

Slide10

True or False?
Most useful information is available via familiar search engines such as Google and Yahoo!
The vast majority of science information in databases is not crawled by popular search engines.

Slide11

Scientific databases stump Google
Systems that crawl the Web do not typically reach below the surface.
Google “crawls” the surface Web, but scientific databases are largely found in the deep Web.

Slide12

Google works to solve the problem, but there’s a better way ...
Google moves ahead with plan to open up federal Web sites.
Google is making strides on an initiative to make information stored on public government Web sites more accessible to people looking for it, but challenges remain, officials with the search engine company said Wednesday. Three federal organizations recently agreed to structure their sites to make them accessible for nearly all Internet searches, the officials said. Information on the Plain Language Web site aimed at eliminating jargon in government communications, and on sites by the Energy Department's Office of Scientific and Technical Information and the Education Department's National Center for Education Statistics, has been opened up to the three most popular search engines: Google, Yahoo and MSN.

Slide13

Federated search drills down to the deep Web where scientific databases reside.
We need systems that probe the deep Web.
Unlike the Google solution, federated search places no burden on the database owners.
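
As a rough illustration of the idea on this slide, the sketch below sends one query to several sources at search time and merges the hits. It is only a minimal sketch under stated assumptions: the source names, the in-memory SOURCES data, and the function names are hypothetical stand-ins, not the interfaces used by any OSTI portal. A real portal would forward the query to each database's own search interface.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for remote databases (assumption for
# illustration only); a real portal would query each database's own
# search interface over the network.
SOURCES = {
    "source_a": ["deep web physics report", "energy storage study"],
    "source_b": ["physics of fusion plasmas", "climate model notes"],
}


def search_source(name, query):
    """Query one source live and return its (source, title) hits."""
    q = query.lower()
    return [(name, title) for title in SOURCES[name] if q in title.lower()]


def federated_search(query):
    """Fan the query out to every source in parallel and merge the hits."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(search_source, name, query) for name in SOURCES]
        hits = [hit for future in futures for hit in future.result()]
    return sorted(hits)


if __name__ == "__main__":
    for source, title in federated_search("physics"):
        print(source, "|", title)
```

Because each source is queried at search time, nothing has to be crawled or indexed in advance, which is why the approach asks nothing of the database owners.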

Slide14

Federated search yields one-stop portals
Science.gov: 50 million pages
ScienceAccelerator: Key DOE databases
WorldWideScience.org: 200 million pages
19 sources, 17 countries, all inhabited continents

Slide15

Harvesting
Harvesting and federated search are useful when full bibliographical control is not feasible.
Analogous to Google – crawls and mines data that does not reside in databases.
but ...
Different from Google – directed, selective crawling.
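
One way to picture "directed, selective crawling" is a crawl that starts from chosen seed pages and only follows links on an approved list of hosts. The sketch below is an assumption-laden illustration, not OSTI's harvester: the LINKS graph, the example URLs, and the harvest function are hypothetical, and the link graph stands in for pages that a real harvester would fetch and parse.

```python
from collections import deque
from urllib.parse import urlparse

# Hypothetical link graph standing in for fetched pages (URL -> outgoing
# links). A real harvester would download and parse each page instead.
LINKS = {
    "https://lab.example.gov/reports": [
        "https://lab.example.gov/reports/r1",
        "https://ads.example.com/banner",
    ],
    "https://lab.example.gov/reports/r1": ["https://lab.example.gov/reports"],
}


def harvest(seeds, allowed_hosts):
    """Breadth-first crawl that only visits pages on approved hosts."""
    seen = set(seeds)
    queue = deque(seeds)
    harvested = []
    while queue:
        url = queue.popleft()
        if urlparse(url).hostname not in allowed_hosts:
            continue  # selective: skip anything off the approved list
        harvested.append(url)
        for link in LINKS.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return harvested


if __name__ == "__main__":
    for page in harvest(["https://lab.example.gov/reports"], {"lab.example.gov"}):
        print(page)
```

The seed list and host list are what make the crawl directed: coverage is decided by the harvester's operator rather than by whatever links happen to be reachable.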

Slide16

Federated Search: Advantages
• Current, real-time results
• No burden for database owner
• Inexpensive to implement
• No need for the user to know which databases exist
• No searching door-to-door
• Allows for fielded searching
• Interoperability is automatically achieved

Slide17

Additional Points
• Federated search has limitations
• Neither crawling nor federated search is a panacea
• Federated searching does things crawling cannot do, and vice versa. They are complementary technologies.
• Federated searching has advanced rapidly and should continue to advance

Slide18

Science as a noble enterprise