Accelerating Science Discovery - Join the Discussion

OSTIblog Articles in the Technology Topic

Surviving a Technological Transformation

by Dr. Walt Warnick 02 Sep, 2008 in Technology

The life of every person in the world today has been shaped by successive technological transformations. The printing press transformed communication and education, beginning in the mid-15th century. Sailing and navigation technology of the late 15th century allowed Europeans to learn about other continents, beginning the global network of trade. Metal tools and firearms technology of the early 17th century enabled Europeans to colonize other continents and spread the fruits of European technology around the world. Railroads transformed transportation beginning in the early 19th century, and the telephone transformed communication in the latter part of that century. The automobile transformed transportation beginning early in the 20th century. These are but a few of the notable transformations that profoundly reshaped the way people live.

Today it is the Internet transformation, especially the Web. As a leader in making the Web work for DOE science, OSTI is embedded in the Internet transformation and OSTI itself is being transformed. Our dual core mission -- getting DOE results out to the scientific community and beyond, and getting the community's results into DOE -- has not changed. But the technology we apply to that mission has changed a lot. By carefully adopting Internet technology, and even pioneering new advances in that technology to meet our needs, OSTI achieves its mission better than ever before and has achieved a series of impressive "firsts."

I think all of us at OSTI would agree that getting this far has not been easy. If there is one word that describes what it has been like to be embedded in the Internet transformation, it is "turbulent." In this regard, the Internet transformation is much like the technological transformations that preceded it. Those embedded in transformations find themselves in a rapidly changing world which challenges them to find their own way through...

Related Topics: DOE Data Explorer (DDE), milestones, Science Conference Proceedings


OSTI and Reference Linking

by Daphne Evans 13 May, 2008 in Technology

OSTI actively supports the practice of Reference Linking. Also referred to as citation linking, reference linking adds value to technical reports and journal articles by hyperlinking the references at the end of the document. Authors frequently cite numerous supporting reports and articles. However, locating these cited works can be difficult. If these references can be hyperlinked to online full text, or availability information, that opens up all kinds of possibilities for the discovery and reuse of related research.

When authors submit technical reports to OSTI, they can request that their references be hyperlinked. We will identify hyperlinks for as many of the references as possible, and create an enhanced file. The authors review the new document, and once approved, it is made available per their specifications. This greatly enhances the technical report and the accessibility of the references. This service is provided at no cost to the author or their organization.

Another great benefit for the author is that DOE technical reports referenced in journal articles can also be hyperlinked. In 2005, OSTI entered into an agreement with CrossRef, a nationally recognized reference-linking service. Now OSTI and CrossRef use Digital Object Identifiers (DOIs) to facilitate access to DOE's vast collection of science research reports. DOIs are persistent links,...
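As a rough illustration of how a DOI can serve as a hyperlink for a reference, here is a minimal Python sketch. It assumes the public resolver at https://doi.org/, to which a DOI can be appended to form a link. The helper names `doi_to_url` and `link_references`, and the DOI `10.1234/example.report`, are hypothetical and are not taken from OSTI's or CrossRef's systems.

```python
from urllib.parse import quote

# Public DOI resolver; a DOI appended to this prefix redirects to the
# current location of the identified document.
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Return a resolver URL for a DOI, accepting a bare DOI or a 'doi:' prefix."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping '/'.
    return DOI_RESOLVER + quote(doi, safe="/")


def link_references(references):
    """Pair each (citation text, DOI or None) with a hyperlink or None."""
    return [(text, doi_to_url(doi) if doi else None) for text, doi in references]
```

References without a known DOI simply stay unlinked, which mirrors the "as many of the references as possible" approach described above.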

Related Topics: CrossRef, doe, hyperlinking, reference linking


Author Notification

by Jim Littlepage 12 May, 2008 in Technology

Authors of DOE scientific and technical reports are getting their research results made electronically available worldwide courtesy of the Office of Scientific and Technical Information.

OSTI is making research results from work performed under DOE-sponsored contracts available via an array of web-based outlets, including powerful federated search products such as Science Accelerator and World Wide Science.

Whenever OSTI receives a scientific and technical report from a facility doing work for the Department of Energy, OSTI processes that report and makes it publicly available. Authors listed on the report are notified that their work is publicly available and are given the URL where they can view their report. Moreover, authors receiving this notification are eligible to request the reference linking that OSTI provides. Using this service, authors can ask OSTI to add hyperlinks, where available, to the references at the end of their report.

Authors are encouraged to submit additional research reports to OSTI in order to increase awareness of their research activities, to provide their findings to a broad and diverse audience of potential beneficiaries, and to add to the body of scientific knowledge in their field of study.

This service is one of the many activities OSTI conducts as a part of its ongoing efforts to ensure that research results from billions of dollars of DOE sponsored research and development contracts are made available to the world's scientific community.

For additional information, contact Debbie Nuchols.

Jim Littlepage


Related Topics: osti, reference linking


Navigating Technological Transformation

by Dr. Walt Warnick 07 Apr, 2008 in Technology

Today, all of OSTI's information products are on the web. This is in sharp contrast to the situation as recently as the mid-1990s, when OSTI had no products on the web.

First becoming popular in 1994, the web quickly emerged as a transformational technology, and its potential for reshaping OSTI was apparent. Recognizing the opportunity to advance its mission, OSTI set out to capitalize on the web as quickly as resources would allow by producing web applications to disseminate all manner of scientific and technical information (STI). A steady progression of new OSTI products addressed the various forms of STI: technical reports, e-prints, conference proceedings, accomplishments, patents, and project descriptions. To make it easy for users who want to search through all these products at once, we introduced the DOE Science Accelerator, which is powered by our special web architecture called federated search. Reaching out beyond DOE, we initiated a collaboration with other agencies to allow users to search their R&D results along with DOE's. Most recently, we took collaboration worldwide by federating the best information sources from governments around the world in WorldWideScience, which makes searchable about the same quantity of science as does Google.

Over the years, OSTI has upgraded each of its products, so that, today, they offer more to users than ever before. Such upgrades are made possible...

Related Topics: federated search, osti


Sophisticated Yet Simple - The Technology Behind OSTI's E-print Network: Part 3

by Sol Lederman 21 Mar, 2008 in Technology

This is the third, and final, article in a series. The first article provided an overview of the E-print Network. The second article discussed the special harvested component of the E-print Network in depth. This article provides a tour of the E-print collections which are federated. Hopefully, once you finish reading this article and this series, you will appreciate the innovation and hard work that has gone into producing the premier federated search application for searching E-prints.

The E-print Network can simultaneously search 52 databases plus the special harvest collection, discussed in Part 2, from a single query. That single search has the effect of searching approximately 4 million documents from the federated sources plus another 1.3 million documents from the harvested collection for a total of roughly 5.3 million documents. This search executes in real time. A user can select all databases to search, individual databases, categories of databases, or combinations of individual databases and categories. The databases are divided into eight categories:

  1. Biology
  2. Computer Technologies & Information Sciences
  3. Environmental Sciences and Ecology
  4. Institutional Repositories and Multidisciplinary Collections
  5. Mathematics
  6. Nonlinear Sciences
  7. Physics
  8. Renewable Energy

The relationship between categories and databases can be seen on the...
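The selection rule described above (all databases, individual databases, categories, or a mix) can be sketched in a few lines of Python. This is only an illustration: the category names are taken from the list above, but the catalog contents, the database names, and the function `resolve_selection` are invented placeholders, not the E-print Network's actual configuration.

```python
# Hypothetical catalog: category name -> databases in it. The category names
# come from the list above; the database names are invented placeholders.
CATALOG = {
    "Mathematics": {"math_db_a", "math_db_b"},
    "Physics": {"physics_db_a", "physics_db_b", "physics_db_c"},
    "Biology": {"bio_db_a"},
}


def resolve_selection(categories=(), databases=(), catalog=CATALOG):
    """Return the sorted list of databases to search.

    An empty selection means "search everything".
    """
    all_dbs = set().union(*catalog.values())
    if not categories and not databases:
        return sorted(all_dbs)
    selected = set()
    for name in categories:
        selected |= catalog[name]  # KeyError for an unknown category
    for db in databases:
        if db not in all_dbs:
            raise KeyError(f"unknown database: {db}")
        selected.add(db)
    return sorted(selected)
```

For example, `resolve_selection(categories=["Mathematics"], databases=["bio_db_a"])` combines one whole category with one individually chosen database.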

Related Topics: E-Print Network (EPN), federated search


Federated Search - The Wave of the Future?: Part 2

by Dr. Walt Warnick 13 Mar, 2008 in Technology

by Walt Warnick and Sol Lederman

This is the second in a three-part series of articles about the deficiencies of web crawling and indexing, the superiority of federated search to the serious researcher, and the value of OSTI federated search applications in advancing science. Part 1 identified a number of serious limitations of Google and the other crawlers. This article shows how federated search overcomes these limitations. The final article in the series highlights a number of federated search applications and databases that OSTI makes available to the public.

In Part 1, we explained that Google, being a surface web crawler, cannot access the deep web, which consists of content that resides in databases. We also noted that the deep web is several hundred times larger than the surface web and that a large percentage of the highly sought-after scientific and technical information resides in the deep web. We also explained that there is no way to determine the quality of any particular document in the surface web. Any web citizen can post a document to the web and it will likely be indexed.

Federated search applications overcome the two aforementioned limitations of surface crawlers - (1) limited access to content, and (2) the difficulty in determining its quality. Limited access is overcome by the federated search engine's specialized knowledge of how to query a database and how to retrieve its documents. The quality concern is overcome by the complementary efforts of database owners and creators of federated search applications. First, databases that are made available to federated search applications are managed by owners, or organizations, who have criteria for...
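To make the idea of querying several databases from a single request concrete, here is a minimal Python sketch. It is not OSTI's implementation: the two sources are in-memory stand-ins for real database search interfaces, and the names `make_source`, `federated_search`, `reports_db`, and `eprints_db` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def make_source(name, records):
    """Build a stand-in for one database's search interface."""
    def search(query):
        q = query.lower()
        return [{"source": name, "title": t} for t in records if q in t.lower()]
    return search


def federated_search(query, sources):
    """Send one query to every source in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        result_lists = list(pool.map(lambda s: s(query), sources))
    merged, seen = [], set()
    for results in result_lists:
        for hit in results:
            key = hit["title"].lower()
            if key not in seen:  # drop duplicates returned by several sources
                seen.add(key)
                merged.append(hit)
    return merged


SOURCES = [
    make_source("reports_db", ["Plasma confinement study", "Solar cell efficiency"]),
    make_source("eprints_db", ["Plasma turbulence models", "Solar cell efficiency"]),
]
```

Because each source is queried at request time, the merged list reflects whatever each database holds at that moment. A real application would also need per-source query translation, timeouts, and relevance ranking, all of which this sketch leaves out.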

Related Topics: doe, federated search, osti, web crawling


Federated Search - The Wave of the Future?: Part 1

by Dr. Walt Warnick 12 Mar, 2008 in Technology

by Walt Warnick and Sol Lederman

The web is growing.

For providing searchable access to the content that matters the most to scientists and researchers, Google and the other web crawlers can't keep up. Instead, growing numbers of scientists, researchers, and science attentive citizens turn to OSTI's federated search applications for high quality research material that Google can't find. And, given fundamental limitations on how web crawlers find content, those conducting research will derive even more benefit from OSTI's innovation and investment in federated search in the coming years.

This is the first of three articles that discuss and compare the strengths and weaknesses of two web search architectures: the crawling and indexing architecture as used today by Google and the federated search architecture used by OSTI's federated search applications. This article points out the limitations of the crawling architecture for serious researchers. The second article explains how federated search overcomes these obstacles. The third article highlights a number of OSTI's federated search offerings that advance science, and suggests that federated search may someday become the dominant web search architecture.

Google is a "surface web" crawler; it discovers content by taking a list of known web pages and following links to new web pages and to documents. This approach finds documents that have links referencing them. It finds none of the majority of web content that is contained in the "deep web."
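The link-following behavior described above can be illustrated with a toy breadth-first crawl in Python. The link graph, the page names, and the function `crawl` are invented for this sketch; it makes no claim about how Google's crawler is actually built.

```python
from collections import deque

# Toy web: page -> pages it links to. "db_record_17" exists but nothing
# links to it; it is only produced in response to a database query.
LINKS = {
    "home": ["about", "papers"],
    "about": ["home"],
    "papers": ["paper1", "paper2"],
    "paper1": [],
    "paper2": ["home"],
    "db_record_17": [],
}


def crawl(seeds, links):
    """Breadth-first crawl: return pages reachable from the seeds, in visit order."""
    seen = set(seeds)
    queue = deque(seeds)
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order
```

Starting from `"home"`, the crawl reaches every linked page but never visits `db_record_17`, because no page links to it. That is the sense in which database-resident content is invisible to a surface crawler.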

The deep web...

Related Topics: doe, federated search, osti, web crawling


Sophisticated Yet Simple - The Technology Behind OSTI's E-print Network: Part 2

by Sol Lederman 05 Mar, 2008 in Technology

In Part 1 of this series I provided an overview of the technology that drives the E-print Network. In this article I will provide some detail about how the harvested collection, the "E-prints on Web Sites" component of the E-print Network, is constructed. In Part 3, I will discuss the technology of the portion of the E-print Network that relies on federated search of databases.

In Part 1 I explained that the E-print Network combines federated sources searched in real-time with harvested content. The harvested content, consisting of over 1.3 million e-prints, is found by directing a crawler to 28,000 web sites belonging to scientists, researchers, and members of the academic community. In OSTI terminology, harvesting is synonymous with conducting a directed crawl of web sites.

Before we look at the technology behind the harvesting, let's consider the question of why the content is harvested at all. Why not search the contributors' web sites in real-time in the same way that other collections are searched in real-time via federated search? There are several reasons for harvesting the content. First, a large number of e-prints are not found in databases. They are predominantly stored as document files in web server directories. Accessing files stored this way is the job of a web crawler, not that of a federated search engine. This is the case because a crawler, once it locates the index page for a set of e-prints, easily harvests all e-prints referenced in that index page. The second reason...
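The step of harvesting every e-print referenced on an index page can be sketched with Python's standard library. This is an assumption-laden illustration rather than OSTI's crawler: the file suffixes, the sample page, the base URL `http://example.org/author/`, and the names `LinkCollector` and `eprint_links` are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

DOCUMENT_SUFFIXES = (".pdf", ".ps", ".doc")  # assumed e-print file types


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def eprint_links(index_html, base_url):
    """Return absolute URLs of document files linked from an index page."""
    collector = LinkCollector()
    collector.feed(index_html)
    urls = [urljoin(base_url, href) for href in collector.hrefs]
    return [u for u in urls if u.lower().endswith(DOCUMENT_SUFFIXES)]


INDEX_PAGE = """
<html><body>
  <a href="papers/turbulence.pdf">Turbulence</a>
  <a href="/people.html">People</a>
  <a href="papers/lattice.ps">Lattice</a>
</body></html>
"""
```

Calling `eprint_links(INDEX_PAGE, "http://example.org/author/")` keeps the two document links and skips the ordinary HTML page. A real harvester would go on to download and index each file.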

Related Topics: doe, E-Print Network (EPN), federated search, osti


Sophisticated Yet Simple - The Technology Behind OSTI's E-print Network: Part 1

by Sol Lederman 26 Feb, 2008 in Technology

The E-print Network is one of OSTI's most popular and powerful research offerings, yet few of its users know about the advanced technology that drives it and makes it simple to use. Professional researchers in basic and applied science are able to access over 5 million e-prints gathered from nearly 28,000 world-wide databases and web-sites. Numerous OSTI innovations ensure that the E-print Network's documents are of extremely high quality, are highly relevant to researchers, and are easy and quick to find. This is the first in a series of articles about the technology behind this very important component of the Science Accelerator. This article serves as an overview; subsequent articles will provide more technical information.

The E-print Network is a federated search application. It federates (aggregates) search results from over 50 content databases in a number of scientific disciplines from a single user query. The E-print Network, however, uses federated search in an innovative way: one of the databases it searches is a special collection formed by harvesting over 1.3 million E-prints from nearly 28,000 hand-picked web-sites. A custom-designed crawler is responsible for performing the harvesting, and custom software is used to build an index of the 1.3 million E-prints so that they can be searched quickly together with the non-harvested databases. Most E-print Network users are unaware that the application is, in fact, a blend of federated search and Google-like crawling technologies. This marriage of the two technologies reflects OSTI's insight in realizing that e-prints not only reside in certain well...
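To show why an index makes a harvested collection quick to search, here is a small inverted-index sketch in Python. The article does not describe OSTI's custom indexing software, so this is a generic technique under stated assumptions: the documents in `HARVESTED` and the functions `tokenize`, `build_index`, and `search` are invented for illustration.

```python
from collections import defaultdict
import re


def tokenize(text):
    """Lowercase a string and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every word of the query."""
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)


HARVESTED = {
    "e1": "Nonlinear dynamics of plasma turbulence",
    "e2": "Plasma diagnostics for fusion devices",
    "e3": "Lattice models in nonlinear optics",
}
```

The index is built once after harvesting, so a query only looks up a few word entries instead of scanning every document. An index of this kind could then be exposed as one more source alongside the live databases.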

Related Topics: doe, E-Print Network (EPN), federated search, osti, Science Accelerator


The Role of Federated Search at OSTI

by Sol Lederman 19 Feb, 2008 in Technology

Federated search is very much at the heart of OSTI's ability to realize its mission. OSTI provides a simple description of what federated search is and how it works in the OSTI environment. The best way to experience the tremendous value of federated search at OSTI is to try several of OSTI's flagship applications.

These, like all federated search applications, search databases "live," which means there is no delay or "lag time" between when a collection is updated by its owner and when the new content can be searched. Science Accelerator provides searchable access to a number of science databases that OSTI manages. Its aim is to accelerate science discovery by greatly reducing the time and effort required for researchers to find relevant science information. Another application was OSTI's breakthrough federated search product; the first version was launched in December 2002. It provides access to more than 50 million pages of science information from 17 scientific and technical organizations via the collaboration of 13 federal agencies. WorldWideScience is a global science gateway to national and international scientific databases.

The technology used to mine content from the deep web is called "federated search." While federated search is not the only search technology...

Related Topics: federated search, osti, Science Accelerator, WorldWideScience (WWS)