Building the Information Superhighway
Summer 1993
By Jeffery Kahn, JBKahn@lbl.gov
In 1989, LBL researcher Bill Johnston was called to Washington for a U.S. Senate hearing. Its purpose: to explore the potential of a national information superhighway.
Johnston and his colleagues showed Washington the future. During the first live computer demonstration ever conducted before a Senate hearing, they exhibited the possibilities of a high-speed, transcontinental computer network. The researchers plugged in a computer and displayed data processed, analyzed, and assembled into animated scientific "movies" by devices and researchers distributed thousands of miles apart. They demonstrated how equipment such as a magnetic resonance imaging unit, supercomputers, data-storage devices, and computer workstations could be temporarily bridged together, linking individuals and resources in ways never before possible.
Four years later, President Bill Clinton and Vice President Al Gore, who as senator had chaired the 1989 hearing, flew to California's Silicon Valley. Meeting at Silicon Graphics Inc., they were briefed on the status of the emerging information superhighway and, in turn, shared their vision for its future. Gore has championed this project since its infancy.
On the eve of the Clinton-Gore visit, Silicon Graphics' Andrew Cherenson sent out a spur-of-the-moment message on a spreading, spiderweb-like computer network that now links many academic and research institutions around the world: Anybody interested in the Clinton/Gore conference? Logging in later, he checked the response. The next morning, Cherenson plugged in a Sony minicamera and tagged along with Clinton and Gore during their brainstorming session with the planners and players who are helping build the information superhighway. Around the world, 200 rapt "participants" sat at their desks in front of computer workstations, watching the events at Silicon Graphics while commenting among themselves. Cherenson's impromptu video conference--a harbinger of the ease with which we will be able to talk, visit together, and share information in the near future--was broadcast in 11 countries and 22 time zones.
A decade ago, such communication was conventional within the world of science fiction but new and alien as a national policy goal. Beginning in 1968, the federal government through its Defense Advanced Research Projects Agency (DARPA) provided seed money to establish first one and then several experimental networks that could move data between research institutions at high speed. These prototype networks have evolved, proliferated, and rapidly linked. Today, a fused infrastructure of more than 11,000 networks known as the Internet, or just "the net," joins upwards of 10 million people around the world.
From the outset, Lawrence Berkeley Laboratory has been one of the principal architects of network computing, helping "to grow the network" since 1972. LBL "went on the air" in 1975, becoming one of a handful of institutions connected to the network. When the Internet overloaded, bogged down, and was on the verge of self-destruction in 1986, LBL researcher Van Jacobson was part of a two-man team that helped rescue it, saving it from those who recommended that it should be abandoned. More recently, Jacobson and his team have made other key contributions, engineering the metamorphosis of what had been a data and e-mail pipeline into a network that now allows many people to speak and interact instantly via audio and video network conferencing. Today, this formerly mail-dominated culture is flourishing, allowing people around the world to interact routinely, just as they did during the Clinton/Gore Silicon Graphics visit. LBL also is a pioneer in distributed scientific computing, forging new links that make the location of expensive computing resources immaterial.
Looking at what has been accomplished and what is in the works, LBL Information and Computing Sciences Division Director Stu Loken says we are at the dawn of a new Information Age.
"The federal government has a long history of investment in the nation's infrastructure," notes Loken. "It built canals in the 18th century, railroads in the 19th century, and interstate highways in the 20th century. Then, about 10 years ago, it began the construction of high-speed computer networks. These networks are the highways of the Information Age."
Loken and almost every other researcher in the field say the information superhighways will result in the inevitable convergence of television, telephone, cable television, computer, consumer electronics, publishing and information enterprises into a single interactive information industry. Vice President Gore predicts this will be "the most important and lucrative marketplace of the 21st century." AT&T says it expects the global information market to be worth $1.4 trillion by 1996; Apple Computer estimates the market will grow to $3.5 trillion by the year 2001.
The vision of information superhighways almost crashed in 1986.
Then almost two decades old, the Internet had 10,000 users. They had come to rely on the network, for it had already become much more than just a means to exchange electronic mail and move data. The network served as a virtual office hallway, intimately connecting distant collaborators.
In October of 1986, the Internet experienced what its many designers diagnosed as "congestion collapse." Communications--a digital data-stream consisting of everything from written messages to raw scientific data--had been flowing through the system at up to 56 kilobits per second (56,000 bits, or about two typed pages, per second). Then one day, this 21st-century information system suddenly slowed to the pace of the telegraph. That day, the transmission rate between Lawrence Berkeley Laboratory and the University of California at Berkeley only a quarter-mile away slowed to 320 bits per second. Users of the system were mystified and dismayed.
Internet users all over the country, as reliant on the network as most of us are on our telephones, puzzled over how to revive it. Van Jacobson from LBL's Engineering Division was among those who became involved.
"The network had slowed by a factor of a thousand," recalls Jacobson. "Mail that had gone through in minutes now took a full day. People started giving up on this. The whole idea of network communication was imperiled.
"I was working with Mike Karels (of the Berkeley Unix development group at the University of California at Berkeley). For six months, we had been asking why the Internet was failing, beating our heads against a brick wall. Then one night in a Berkeley coffee house, there was a moment of enlightenment. We turned the question around. The real question was, `How had the Internet ever worked?'
"Think about it," says Jacobson: A workstation can transmit data at 10 megabits per second (10 million bits) and a router puts it on the Internet, which has a capacity of 56 kilobits per second. You start with this bottleneck and then must contend with thousands of people using the network simultaneously. Given this, he says, a traffic jam on the Internet was inevitable.
As the traffic had increased on the Internet, the system's many users had relied on what amounted to self-destructive behavior in their attempts to break through the network gridlock. Packets of information would be transmitted to the network by a computer and subsequently returned to the sender because of the congestion. Computers had been programmed to deal with this by immediately trying again, repeatedly resending the message until it went through. Jacobson likens the situation to pouring gasoline on a fire.
The solution, he says, was to make the network users more polite.
"If too many people try to communicate at once," explains Jacobson, "the network can't deal with that and rejects the packets, sending them back. When a workstation retransmits immediately, this aggravates the situation. What we did was write polite protocols that require a slight wait before a packet is retransmitted. Everybody has to use these polite protocols or the Internet doesn't work for anybody."
Jacobson and Karels' protocols, now a universal part of the Internet, are called "Slow Start." Slow Start avoids congestion by monitoring the network and, when congestion appears imminent, delaying the transmission of packets anywhere from milliseconds to a second. Slow Start delays transmission rates based on factors that include the current available capacity of the network as well as a multiple of the round-trip transmission time (essentially, distance) between the sender and the chosen destination. Six years after it was introduced, Slow Start continues to avoid network congestion even though both the speed of the network and the number of users have grown a thousand-fold.
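The "polite" waiting described above can be illustrated with a small sketch. This is not the Slow Start code deployed on the Internet. It is a minimal example, under stated assumptions, of one common way to space out retransmissions: wait a multiple of the round-trip time, and double the wait after each failed attempt, up to a ceiling. The function name, the multiplier of 2, and the one-second ceiling (chosen to echo the "milliseconds to a second" range quoted above) are all hypothetical.

```python
def retransmit_delays(rtt_seconds, attempts, multiplier=2.0, max_delay=1.0):
    """Return the wait, in seconds, before each retransmission attempt.

    Illustrative assumptions (not taken from the article):
    the first wait is multiplier * round-trip time, each later wait
    doubles, and no wait exceeds max_delay.
    """
    delays = []
    delay = multiplier * rtt_seconds
    for _ in range(attempts):
        delays.append(min(delay, max_delay))
        delay *= 2  # back off further after every failed attempt
    return delays


# A sender 50 milliseconds (round trip) from its destination:
delays = retransmit_delays(rtt_seconds=0.05, attempts=5)
print(delays)  # [0.1, 0.2, 0.4, 0.8, 1.0]
```

The point of the doubling is that a sender facing persistent congestion backs away quickly instead of adding to the jam, which is the opposite of the immediate-retry behavior that caused the 1986 collapse.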
About two years ago, Jacobson and researchers at Xerox's Palo Alto Research Center (PARC) took on a project to add audio and video conferencing to the Internet. As with telephone systems, audio/video conferencing between multiple parties via computer was an old, yet unattained, vision.
In terms of conferencing, a computer network starts with an inherent advantage over a telephone system. Whereas a telephone line connects two points and carries one conversation, the Internet connects each party on the line and carries multiple simultaneous "conversations." To support this huge flow of information, it breaks communications down into small packets which are mixed into the ongoing stream of packets traversing the network. Each packet is wrapped with shipping and assembly instructions (called protocols), which give the destination, return address, and how the receiving computer can reorder all the packets back into the original communication.
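The packet idea can be sketched in a few lines. The example below is a toy illustration, not an actual Internet protocol: the field names ("src", "dst", "seq"), the eight-character packet size, and the addresses are all hypothetical. It shows only the principle that each packet carries a destination, a return address, and enough ordering information for the receiver to rebuild the original message even if packets arrive out of order.

```python
def packetize(message, source, destination, size=8):
    """Split a message into small packets, each with its own header."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    return [
        {"src": source, "dst": destination, "seq": n, "data": chunk}
        for n, chunk in enumerate(chunks)
    ]


def reassemble(packets):
    """Reorder packets by sequence number and rejoin their contents."""
    ordered = sorted(packets, key=lambda p: p["seq"])
    return "".join(p["data"] for p in ordered)


message = "Anybody interested in the Clinton/Gore conference?"
packets = packetize(message, source="host-a", destination="host-b")

# Simulate packets arriving in a different order than they were sent.
scrambled = list(reversed(packets))
print(reassemble(scrambled) == message)  # True
```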
Because of the slight delays inherent in the Internet, several research groups charged with bringing audio and video conferencing to the network concluded that they had been given an impossible mission. They advised that a new network be built.
"We felt this was ridiculous," recalls Jacobson. "The Internet supported communication between two Cray supercomputers, which transmit at one gigabit per second (one billion bits). It also worked for somebody sitting at a keyboard, typing at 20 bits per second. This robustness and dynamic range seemed too good to abandon. So we looked hard. There should have been no reason why we couldn't do audio and video."
Delays actually are more disruptive to people talking than they are to video conferencing. Conferees can tolerate an occasional still picture during a video transmission whereas voices heard in uneven staccato bursts sound like gibberish. Jacobson and Steve Deering of Xerox PARC concentrated on devising a system that would preserve the global connectivity of the Internet yet also allow a smooth and prompt audio flow.
To allow the listener to hear continuous speech, Jacobson and Deering first added a time-stamp to each audio packet. The receiver reads the time-stamps, chronologically orders the packets, and then plays them while continuing to receive and order additional incoming packets for subsequent replay. That averts inside-out Pig Latin speech, but it does not deal with the uneven nature of network packet-flow and the bursts of audio that result.
To remedy this, the two researchers took advantage of the difference between the incredible speed at which the network moves packets and the relatively long two-tenths to half-second delay that humans can handle without conversation being disrupted. They created an algorithm that computes how long packets are taking to arrive and then slows down the voice replay enough to allow even the slowest packets sufficient time to arrive. The playback delays introduced by the algorithm actually are very short, typically less than one-tenth of a second. Thanks to the imperceptible controlled delays introduced by Jacobson and Deering, voice conferencing between Internet users with microphones and computer speakers now is commonplace.
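The two steps just described, ordering by time-stamp and then delaying playback long enough for the slowest packet, can be sketched as follows. This is a simplified illustration and not Jacobson and Deering's algorithm: it computes one fixed delay from a batch of packets that has already arrived, whereas a live system would have to estimate the delay as packets come in. The function name and the sample times (in milliseconds) are hypothetical.

```python
def schedule_playout(packets):
    """Return (playout_time_ms, payload) pairs in speaking order.

    packets: list of (sent_ms, arrived_ms, payload) tuples.
    Simplifying assumption: the playout delay is the largest
    transit time seen in this batch.
    """
    # Step 1: restore the order in which the audio was spoken.
    ordered = sorted(packets, key=lambda p: p[0])
    # Step 2: delay playback enough for the slowest packet to arrive.
    playout_delay = max(arrived - sent for sent, arrived, _ in packets)
    return [(sent + playout_delay, payload) for sent, _, payload in ordered]


# Three 20-millisecond audio packets; "b" is slow and arrives last.
packets = [(0, 30, "a"), (40, 50, "c"), (20, 90, "b")]
print(schedule_playout(packets))  # [(70, 'a'), (90, 'b'), (110, 'c')]
```

In this example the slowest packet took 70 milliseconds to arrive, so all playback is shifted by 70 milliseconds. The packets then play at an even 20-millisecond spacing, and the delay stays well under the two-tenths of a second that listeners tolerate.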
Seated in his office at LBL, Jacobson demonstrated how he is connected with networkers worldwide. Logging onto the Internet, he called up Lightweight Sessions, a window interface with a simple form to announce or sign up for an audio or video conference. Network users routinely access Lightweight Sessions for notices about upcoming conferences and to sign up for those conferences of interest. Some conferences are voice only, whereas others include video, which was developed by Xerox PARC researchers. During an audio/video conference, tiny, inexpensive cameras, usually connected to the side of a participant's computer, transmit a live picture of each conferee. Jacobson's screen was split into multiple windows with one window showing a picture of the individual speaking. A second window displayed data that was under discussion. A third window is scheduled to debut on the Internet in the near future.
Jacobson calls this new graphic display window a "whiteboard." People will be able to use it much like a conventional computer drawing program to share information or collaborate in a design project. Any conference participant can see and, in turn, modify whatever is being depicted on the whiteboard or page back to a previous version of the image. Any image--for instance, computer-assisted designs or x-ray images--also can be imported into the whiteboard window.
The whiteboard is indicative of the growing importance of visual data in science.
Stu Loken says that still images and video will be integral to most of the research conducted at LBL. "Science is experiencing a proliferation of visual data, everything from skymaps charting the structure of the early universe to medical images showing the neurochemistry in Alzheimer's patients, to images of the human genome," he says.
Loken acknowledges that pictures create both possibilities and problems. Huge amounts of digital information are required to create images: A video camera generates 30 frames per second, or the equivalent of more than 2000 pages of words. But computers are now fast enough and storage devices massive enough so that scientists are beginning to use video cameras, capturing data for analysis by computer. This emerging capacity is causing a shift in how scientists design experiments, opening new windows into what can be learned.
To enable researchers to create data through video, LBL Information and Computing Sciences Division researchers have taken on a multi-faceted mission. Bill Johnston heads a team that is creating new hardware and software for the processing and analysis of a visual data stream over high-speed networks. The idea is to allow a scientist to be able to take output from, for instance, an electron microscope and to connect that video stream to a network just as routinely as a workstation can be connected to a network today.
Connection to the network is just the first step. Johnston's group is dedicated to the development of distributed scientific computing. Up until now, resource location has dictated the course of science. Projects proceed at sites where the right people and the right experimental and computing resources can be brought together. In the distributed scientific computing environments now being pioneered at LBL, machines, databases, and people scattered across the globe can be quickly and temporarily linked. For example, a video stream from an ongoing experiment can be routed to a supercomputer for processing and the instant analysis used to interactively control the experimental apparatus. Or the data flow can be processed, analyzed, and then serve as the input in a companion experiment.
Johnston notes that network video conferencing already is available but cautions that this should not be confused with high-speed network transmission of scientific video data. The difference is the quality of the picture.
For instance, in order to squeeze video conference transmissions through the still narrow Internet pipeline, the standard broadcast rate of 30 frames per second has been reduced to six to 12 frames per second. Additionally, rather than transmit a succession of full frames, compression algorithms are used that transmit only that part of the picture which has changed from the prior frame. These partial pictures are then assembled into full images by software at the receiving end. The net effect is that people can see decent images of one another as they speak even though the digital flow rate is thousands of times lower than that of a standard video broadcast.
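The idea of sending only what has changed can be shown with a toy example. The sketch below is not the compression scheme used by the conferencing tools; it treats a frame as a short list of pixel values, and the function names are hypothetical. The sender transmits only the positions and values that differ from the prior frame, and the receiver patches its copy of the prior frame to rebuild the full image.

```python
def encode_delta(previous, current):
    """Return (position, new_value) pairs for pixels that changed."""
    return [
        (i, new)
        for i, (old, new) in enumerate(zip(previous, current))
        if old != new
    ]


def apply_delta(previous, delta):
    """Rebuild the full current frame from the prior frame and a delta."""
    frame = list(previous)
    for position, value in delta:
        frame[position] = value
    return frame


previous = [0, 0, 5, 5, 9, 9, 9, 9]
current = [0, 1, 5, 7, 9, 9, 9, 9]

delta = encode_delta(previous, current)
print(delta)                                    # [(1, 1), (3, 7)]
print(apply_delta(previous, delta) == current)  # True
```

Here two changed pixels are sent instead of eight. The saving depends on how much of the picture stays still, which is why the approach suits a talking head better than fast-changing scientific imagery.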
While this works for video conferencing -- a sense of presence is created, if not an Ansel Adams-quality picture -- scientific data cannot be concentrated like this and survive.
Explains Johnston, "Video data typically consists of images produced by sensors that are pushing the limits of technology. Often, we have lots of fuzzy, low-contrast images with features that are difficult to distinguish from the background noise. To be able to analyze and extract information, we can't afford to lose any of the original video detail. Video transmitted over the Internet has been compressed to 8-16 kilobits per second. Contrast that with a conventional monochrome instrumentation camera, which generates 120,000 kilobits per second."
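The size of that gap follows directly from the two rates Johnston quotes. The short calculation below uses only those figures; the variable and function names are illustrative.

```python
CAMERA_KBPS = 120_000        # monochrome instrumentation camera
CONFERENCE_KBPS = (8, 16)    # compressed Internet video, low and high end


def reduction_factor(raw_kbps, compressed_kbps):
    """How many times smaller the compressed stream is than the raw one."""
    return raw_kbps / compressed_kbps


factors = [reduction_factor(CAMERA_KBPS, rate) for rate in CONFERENCE_KBPS]
print(factors)  # [15000.0, 7500.0]
```

In other words, conference-quality video carries between 1/7,500 and 1/15,000 of the data that the instrumentation camera produces.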
Johnston's group uses laboratory science as a driver to develop technology ultimately destined for the masses. For example, consider the case of LBL biochemist Marcos Maestre, who is videotaping small strands of DNA vibrating in a microscopic electric grid in order to study DNA's physical chemistry. Currently, researchers vibrate the DNA, create a videotape, and then walk the tape over to an animation system where still images of each frame are painstakingly produced. Hours and hours are required to extract about 200 frames, or seven seconds of data. Then the images are scanned into a computer frame by frame, which tracks and measures the changing shape of the DNA string, yielding new insights into its structure.
In the distributed system under development at LBL, researchers will be able to look at a workstation monitor and see the data culled from a live video even as the experiment is running. The video camera will be connected to a network where a storage device will save the images while also transmitting them to LBL's new supercomputer, a MasPar massively parallel processing system. The MasPar can process and analyze a video input of 30 frames a second, providing an instant display of data at any engineering workstation on the network.
Unfortunately, to make this a reality, more is involved than simply plugging the components of this system into one another. As the digital stream flows from the experiment to its storage, analysis, and display, several bottlenecks occur. Before the raw signal streaming out of the monochrome camera at 120,000 kilobits per second is transmitted onto the network, an intermediary computer must translate the output into a digital packet configured for the network. Digital traffic begins to back up at this stage.
"When the only reason for a computer is to mediate between the network and a camera," comments Johnston, "in effect, you have created a bureaucracy. The computer does the job but not efficiently. A computer is designed to do many tasks rather than this specialized task. What we need is a controller, a stripped- down computer dedicated to just that one job. We're now building a network controller for a video camera in a collaboration with PsiTech Corp. of Fountain Valley, Calif."
Slaying bureaucrats and opening bottlenecks -- Johnston says this is a recurring mission of his distributed computing group. For instance, data being routed through a network must be saved in a digital archive before it is analyzed. To store the high data rates from sources like video, Redundant Arrays of Inexpensive Disks (RAID) were developed on the Berkeley campus. The first generation of RAID required an intermediary computer, which slowed the flow, so a network controller has been built and a new generation, RAID II, has been developed. RAID II has now been attached to HiPPI, an 800-megabit-per-second network typically used for linking supercomputers, and LBL is working to attach it to the Internet. This work is a collaboration with several electrical engineering and computer science groups. In a related Department of Energy-funded project, Johnston's group has worked with Berkeley professors Domenico Ferrari and Randy Katz and LBL's Bob Fink and Ted Sopher to build an optical-fiber gigabit (billion bit) network that connects LBL and the Berkeley campus. All of the high-speed hardware at LBL and on campus has been attached to this network, creating a local spur to the national information superhighway.
At LBL, the new MasPar computer will play a central role in the Lab's distributed computing environment, opening the door to the era of visual data. While it will allow volumes of images to be created and stored, researchers soon will encounter a major obstacle. Even with its 4,096 processors delivering peak performance of 17,000 million instructions per second, nobody yet knows how to instruct the supercomputer to find a particular stored image.
"Searching a text database for a word or string of characters and searching a video database for an object is a very different problem," says Johnston. "A computer easily can find every reference in a text database to `fish' but there is no ready way to look through an archival set of video images and find all those with fish. We're collaborating with the MasPar Computer Corporation to develop technology to do this."
The symbolism of this mission--searching the proverbial haystack of images and, at long last, finding the needle--should not be overlooked. To LBL's computer scientists, this is a time of imminent possibilities.
Traffic on the Internet is accelerating and new spurs are growing and connecting up. Since Clinton and Gore took office, a cascade of multibillion-dollar corporate investments in network infrastructure has been announced. Telephone, cable TV, and cellular phone companies, publishers, and computer makers are rushing to stake a claim. America is being wired for the future.
Speaking for his colleagues, Johnston says: "We are about to experience changes in this country as profound as those experienced by our ancestors at the advent of the Industrial Revolution. Within the decade, computers, communications, and entertainment will be merged. Scientists, doctors, business people, and schoolchildren will be connected not only to their peers but to everybody else. The way we learn and interact is about to be revolutionized."
Note: Immediately after writing this article during the summer of 1993, Jeffery Kahn helped create a Web site for Lawrence Berkeley National Laboratory - one of the first 250 Web sites in the world. Today, he is editor-in-chief of the Web at the Laboratory.