by Dr. Walt Warnick on Thu, Oct 9, 2008
by Walt Warnick and Sol Lederman
While federated search is a core technology that OSTI employs to tackle the challenges of sharing knowledge, the technology isn't perfect. OSTI aggressively uses federated search because it does what no other search technology can do: it inexpensively makes dozens of non-Googleable databases searchable via a single query. Two nagging limitations of federated search are that a search can take 30 seconds to execute--which seems slow in the digital age--and that hit lists are not exhaustive.
To drive progress, we at OSTI are constantly striving to design affordable information systems that work as well as we can make them. We call this the "art of the possible." At the same time, we confidently recognize that tomorrow's information technology will make our systems work better. By deploying first-of-a-kind systems, we not only advance our mission, we also call attention to the need for new technology to address our limitations. OSTI thus makes an important contribution to technological progress, even if it is not OSTI itself that develops tomorrow's technology. By pushing the state-of-the-art today, we highlight needs and hasten the arrival of tomorrow's information technology.
History shows that an unwillingness to be deterred by the limitations of the day leads to ultimate success. Consider the case of the first general purpose electronic computer, ENIAC, unveiled in 1946:
ENIAC contained 17,468 vacuum tubes, 7,200 crystal diodes, 1,500 relays, 70,000 resistors, 10,000 capacitors and around 5 million hand-soldered joints. It weighed 30 short tons (27 t), was roughly 8.5 feet by 3 feet by 80 feet (2.6 m by 0.9 m by 26 m), took up 680 square feet (63 m²), and consumed 150 kW of power.
Looking back, one could make the obvious observation that ENIAC faced daunting technological limitations--size, power demands, unreliable components, cost--that had to be overcome before it could ever have important applications. But did that stop ENIAC's designers from building it? No. They persevered, and today ENIAC is remembered not so much for solving difficult problems as for providing a proof-of-principle for electronic computing. The technological limitations were overcome by others who came later, so that today computers are central to our daily lives.
Federated search, still in its infancy, suffers from technological limitations, much as ENIAC did, but to a lesser degree. OSTI's pioneering work on Science.gov and WorldWideScience.org breaks new ground in the sharing of knowledge, while showing the need for further progress in federated search technology. The limitations of federated search lie mostly in the speed and completeness of search results. The two are interconnected: because searches are slow, database owners limit the amount of information they provide.
What can OSTI do to speed up federated search? The first step is to compile a list of the many bottlenecks in federated search information flows. Some of the bottlenecks have to do with networks, some have to do with computing power at subordinate databases or at the federated search server, and some have to do with the speed of handing off info from computer to network and back. The next step is to take measurements so we have ground truth about the relative size of the bottlenecks. It will then be a simple matter to determine who owns the big bottlenecks--the database owner, the network owner, or the federated search engine owner. The next step will be to figure out what the various owners can do cost-beneficially to reduce bottlenecks under their control. OSTI cannot remedy all of the bottlenecks, but we can at least identify them and lay out a path for them to be overcome.
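The measure-then-attribute steps above can be sketched in a few lines of Python. This is only an illustration, not OSTI's tooling: the stage names, the source names (db_a, db_b), and every timing value below are made-up placeholders, and real numbers would have to come from instrumenting an actual search.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class StageTiming:
    source: str     # hypothetical database name
    stage: str      # one step in the information flow
    owner: str      # who controls that step
    seconds: float  # measured wall-clock time for the step


def time_by_owner(timings):
    """Sum measured time per owner, largest total first."""
    totals = defaultdict(float)
    for t in timings:
        totals[t.owner] += t.seconds
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


def slowest_stages(timings, n=3):
    """Return the n individual measurements that took longest."""
    return sorted(timings, key=lambda t: t.seconds, reverse=True)[:n]


# Placeholder measurements for one query sent to two hypothetical sources.
SAMPLE = [
    StageTiming("db_a", "send query to source", "network owner", 0.5),
    StageTiming("db_a", "run query at source", "database owner", 6.0),
    StageTiming("db_a", "return results", "network owner", 1.5),
    StageTiming("db_a", "merge and rank", "federated search engine owner", 2.0),
    StageTiming("db_b", "send query to source", "network owner", 0.25),
    StageTiming("db_b", "run query at source", "database owner", 12.0),
    StageTiming("db_b", "return results", "network owner", 0.75),
    StageTiming("db_b", "merge and rank", "federated search engine owner", 1.5),
]

if __name__ == "__main__":
    for owner, total in time_by_owner(SAMPLE).items():
        print(f"{owner}: {total:.2f} s")
    for t in slowest_stages(SAMPLE):
        print(f"{t.source} / {t.stage}: {t.seconds:.2f} s ({t.owner})")
```

Grouping by owner rather than by stage mirrors the question posed above: once the totals are ranked, it is clear which party is best placed to reduce the largest delays.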
OSTI, like the visionaries who created ENIAC, isn't discouraged by the limitations of our core technologies. We push the envelope--we improve our hardware and network infrastructure as best we can, we invest in advancing the state of the art, and we work hard to convince content providers that providing faster access to their content is in their best interest. In doing these things we will move past today's limitations, we may well see federated search emerge as the wave of the future, and we will help to accelerate science.