by Sol Lederman on Fri, Apr 3, 2009
Dr. Walt Warnick, Director of OSTI, recently had the honor of speaking at two events at the Computers In Libraries Conference. I asked Dr. Warnick to share some of his experience and perceptions from the talks through a short interview:
Dr. Warnick, you travel quite a bit and make numerous presentations about OSTI's innovative work. What drew you to speak at the Computers In Libraries (CIL) Conference?
I was invited to make two presentations, which you describe below in your third question. I have visited the Conference in previous years, but this is the first time that I have made presentations. Computers In Libraries is a natural forum for OSTI, as everything we do today is computer-based and librarians are very important customers.
Congratulations! How would you categorize the attendees at CIL? Were they mostly librarians?
My impression is that most CIL attendees were librarians. Sprinkled among them were computer techies. For example, the moderator at my first session was a highly accomplished computer techie from the San Francisco Chronicle.
You had the distinction of co-leading two sessions at the conference, one on information dissemination, the other on the future of federated search as you see it. Let's start with the first. What was the gist of your message on how OSTI is spreading knowledge and advancing science?
My speech is posted at the OSTI site for speeches. It's titled "The Science Knowledge Imperative: Making Non-Googleable Science Findable." The gist is that, with very small investments, enormous collections of science from government agencies and countries around the world have been virtually integrated and have become searchable in a practical way for the first time in human history.
That's a major accomplishment: the integration of global science collections.
The second session was about challenges in federated search and how OSTI is meeting them. What do you see as the major challenges of federated search?
The second session was actually a combination of two back-to-back sessions on challenges. I was surprised at comments made by previous speakers, so I abandoned my prepared remarks in favor of responding to them. As a consequence of this experience, I have concluded that the application of federated search to libraries, while extremely important and powerful if done right, pales in importance compared with its application to geographically dispersed open access databases. For example, WorldWideScience would be a practical impossibility were it not for federated search. It makes searchable about the same quantity of science as Google does, but WorldWideScience content is deemed authoritative by the national governments that post it, and much of that content is non-Googleable. I did agree with previous speakers that federated search faces challenges. I disagreed about the prospects for meeting those challenges. My view is that federated search is a wave of the future, not a temporary stepping stone that is useful only until something else comes along that is not yet defined.
The main challenge with federated search is speed. One speaker contended that his federated search could take 7 minutes. I agree with him that 7 minutes is a show stopper. I pointed out that my federated searches typically take 20 seconds or so, with early results presented sooner.
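To illustrate what "early results presented sooner" can mean in practice, here is a minimal Python sketch. It is not OSTI's implementation: the source names, the simulated latencies, and the functions `query_source`, `federated_search`, and `run_demo` are all hypothetical. It queries every source concurrently and hands partial results to the caller as each source answers, rather than waiting for the slowest one.

```python
import asyncio

# Hypothetical sources with simulated response times in seconds.
# Real sources would be remote databases reached over the network.
SOURCES = {"fast_db": 0.01, "medium_db": 0.05, "slow_db": 0.1}


async def query_source(name, delay, query):
    """Stand-in for one remote search; sleeps to simulate latency."""
    await asyncio.sleep(delay)
    return name, [f"{name}: result for {query!r}"]


async def federated_search(query, on_partial):
    """Query all sources at once and report results as each one finishes."""
    tasks = [
        asyncio.create_task(query_source(name, delay, query))
        for name, delay in SOURCES.items()
    ]
    merged = []
    # as_completed yields awaitables in the order the sources respond.
    for finished in asyncio.as_completed(tasks):
        name, hits = await finished
        merged.extend(hits)
        on_partial(name, list(merged))  # present what we have so far
    return merged


def run_demo(query="solar cells"):
    """Run one search and record how many results were visible at each step."""
    arrivals = []
    merged = asyncio.run(
        federated_search(query, lambda name, so_far: arrivals.append((name, len(so_far))))
    )
    return arrivals, merged
```

In this sketch the total time is still set by the slowest source, but the user sees something as soon as the first source replies, which is the effect described above.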
Stephen Abram, moderator of the two sessions, contended that even 20 seconds is unacceptably slow; in his view, users begin to drop off if search results take more than a second. I am hopeful, but not yet convinced, that the bottlenecks in federated search can be identified and that technology available today or soon can be found to speed it up.
If federated search can be sped up radically, other benefits will accrue. Faster searches would go a long way toward further improving relevance ranking, as greater quantities of material, especially more full text, could be subjected to the ranking algorithm. In addition, still more databases could be searched via a single query.
There is a competing view about how to make federated search faster. It is proposed that future federated searches not include all sources, but rather be selective about which subordinate databases are searched. Software accompanying the federated search engine would analyze the user's query and try to pick those subordinate databases most likely to meet the user's need. In my opinion, this approach would bring with it its own set of problems, namely incompleteness of searches.
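The selective approach and its incompleteness risk can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the catalog, its keyword sets, and the functions `select_sources` and `skipped_sources` are hypothetical, and simple keyword overlap stands in for whatever query analysis a real system would use.

```python
# Hypothetical catalog: each subordinate database is described by topic keywords.
CATALOG = {
    "energy_db": {"solar", "nuclear", "battery", "energy"},
    "medical_db": {"cancer", "genome", "clinical"},
    "physics_db": {"particle", "quantum", "nuclear"},
}


def select_sources(query, catalog, top_k=2):
    """Pick up to top_k sources whose keywords overlap the query terms."""
    terms = set(query.lower().split())
    scores = {name: len(terms & keywords) for name, keywords in catalog.items()}
    # Highest overlap first; ties are broken alphabetically for determinism.
    ranked = sorted(catalog, key=lambda name: (-scores[name], name))
    return [name for name in ranked[:top_k] if scores[name] > 0]


def skipped_sources(query, catalog, top_k=2):
    """Sources that will not be searched at all for this query."""
    chosen = select_sources(query, catalog, top_k)
    return [name for name in catalog if name not in chosen]
```

For example, with `top_k=1` the query "genome of solar bacteria" matches both `energy_db` and `medical_db` equally, but only one is searched, so any relevant records in the other are never seen. That is the incompleteness problem in miniature.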
How is OSTI meeting these challenges?
OSTI is working with other programs in DOE to encourage research into identifying the bottlenecks and devising the technological fixes.
How were your sessions received?
Both of my sessions were very well received. About 150 people attended the first session, and about 250 people attended the second. A number of attendees approached me later.
You speak a lot about federated search. What do you think were your audiences' perceptions of the technology?
I was surprised by the negativity aimed at federated search, both among the audience and among previous speakers. Abram started the session with a show of hands. Who has a federated search system up and running? About half the audience raised their hands. Who is considering adopting a federated search system? The other half of the audience raised their hands. Who is satisfied with their federated search system? I was the only person who raised a hand.
If you had just one message to give to your two audiences about OSTI and its innovative work, what would that message be?
OSTI has proved that federated search can make enormous quantities of authoritative, non-Googleable science searchable, which is a feat that cannot be accomplished in any practical way except via federated search.
Thank you very much for your time and for your very insightful remarks.