Middleware Case Study: MeDICi

Wynne, Adam S

Book · Thu May 05 00:00:00 EDT 2011

OSTI ID:1040989

In many application domains in science and engineering, data produced by sensors, instruments and networks is naturally processed by software applications structured as a pipeline. Pipelines comprise a sequence of software components that progressively process discrete units of data to produce a desired outcome. For example, in a Web crawler that is extracting semantics from text on Web sites, the first stage in the pipeline might be to remove all HTML tags to leave only the raw text of the document. The second step may parse the raw text to break it down into its constituent grammatical parts, such as nouns, verbs and so on. Subsequent steps may look for names of people or places, interesting events or times so documents can be sequenced on a timeline. Each of these steps can be written as a specialized program that works in isolation from the other steps in the pipeline. In many applications, simple linear software pipelines are sufficient. However, more complex applications require topologies that contain forks and joins, creating pipelines comprising branches where parallel execution is desirable. It is also increasingly common for pipelines to process very large files or high-volume data streams, which impose end-to-end performance constraints. Additionally, processes in a pipeline may have specific execution requirements and hence need to be distributed as services across a heterogeneous computing and data management infrastructure. From a software engineering perspective, these more complex pipelines become problematic to implement. While simple linear pipelines can be built using minimal infrastructure such as scripting languages, complex topologies and large, high-volume data processing require suitable abstractions, run-time infrastructures and development tools to construct pipelines with the desired qualities of service and the flexibility to evolve to handle new requirements.
The above summarizes the reasons we created the MeDICi Integration Framework (MIF), which is designed for creating high-performance, scalable and modifiable software pipelines. MIF exploits a low-friction, robust, open source middleware platform and extends it with component and service-based programmatic interfaces that make implementing complex pipelines simple. The MIF run-time automatically manages queues between pipeline elements to absorb request bursts, and automatically executes multiple instances of pipeline elements to increase pipeline throughput. Distributed pipeline elements are supported using a range of configurable communications protocols, and the MIF interfaces provide efficient mechanisms for moving data directly between two distributed pipeline elements.
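The queueing and multiple-instance behavior described in the abstract can be illustrated with a small, generic sketch. This is not the MIF API: it uses only the Python standard library, and every name in it (`strip_html`, `tokenize`, `start_stage`, `run_pipeline`, the `instances` parameter) is a hypothetical stand-in chosen for illustration. It assumes in-process threads and unbounded queues, whereas a real deployment would distribute elements across machines and protocols.

```python
import queue
import re
import threading

_STOP = object()  # sentinel marking the end of the data stream


def strip_html(text):
    """Hypothetical first stage: drop anything that looks like an HTML tag."""
    return re.sub(r"<[^>]+>", " ", text)


def tokenize(text):
    """Hypothetical second stage: split raw text into alphabetic words."""
    return re.findall(r"[A-Za-z]+", text)


def start_stage(func, inbox, outbox, instances):
    """Run `instances` worker threads that apply `func` to items from `inbox`."""
    def worker():
        while True:
            item = inbox.get()
            if item is _STOP:
                inbox.put(_STOP)  # pass the sentinel on to sibling workers
                break
            outbox.put(func(item))

    threads = [threading.Thread(target=worker) for _ in range(instances)]
    for t in threads:
        t.start()
    return threads


def run_pipeline(items, funcs, instances=2):
    """Chain stages with a queue between each pair and collect the results."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    stages = [start_stage(f, queues[i], queues[i + 1], instances)
              for i, f in enumerate(funcs)]
    for item in items:
        queues[0].put(item)
    queues[0].put(_STOP)
    # Close each stage in order: once its workers finish, signal the next one.
    for i, threads in enumerate(stages):
        for t in threads:
            t.join()
        queues[i + 1].put(_STOP)
    results = []
    while True:
        item = queues[-1].get()
        if item is _STOP:
            break
        results.append(item)
    return results


if __name__ == "__main__":
    docs = ["<p>Hello world</p>", "<b>MeDICi</b> pipelines"]
    print(run_pipeline(docs, [strip_html, tokenize]))
```

Because each stage runs several worker instances, results may arrive in a different order from the inputs; a pipeline that needs ordering would have to tag each unit of data with a sequence number.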

OSTI does not have a digital full text copy available.

Research Organization: Pacific Northwest National Lab. (PNNL), Richland, WA (United States)

Sponsoring Organization: USDOE

DOE Contract Number: AC05-76RL01830

OSTI ID: 1040989

Report Number(s): PNNL-SA-82585; TRN: US201211%%319

Resource Relation: Related Information: Essential Software Architecture, 2nd Edition, 147-164

Country of Publication: United States

Language: English

Similar Records

Engineering High Performance Service-Oriented Pipeline Applications with MeDICi

Conference · Fri Jan 07 00:00:00 EST 2011 · OSTI ID:1040989

Gorton, Ian; Wynne, Adam S; Liu, Yan

41. DISCOVERY, SEARCH, AND COMMUNICATION OF TEXTUAL KNOWLEDGE RESOURCES IN DISTRIBUTED SYSTEMS a. Discovering and Utilizing Knowledge Sources for Metasearch Knowledge Systems

Technical Report · Tue Mar 18 00:00:00 EDT 2008 · OSTI ID:1040989

Zamora, Antonio

The MeDICi Integration Framework: A Platform for High Performance Data Streaming Applications

Conference · Fri Feb 22 00:00:00 EST 2008 · OSTI ID:1040989

Gorton, Ian; Wynne, Adam S; Almquist, Justin P; +1 more

Related Subjects

99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE
COMMUNICATIONS
COMPUTERS
COMPUTER CODES
DATA PROCESSING
FLEXIBILITY
FRICTION
MANAGEMENT
PERFORMANCE
PIPELINES
QUEUES
SENSORS
Data Intensive Computing
