OSTI.GOV | U.S. Department of Energy
Office of Scientific and Technical Information

Title: Exploiting Data Parallelism in the Image Content Engine

Abstract

The Image Content Engine (ICE) is a framework of software and underlying mathematical and physical models that enables scientists and analysts to extract features from terabytes of imagery and search the extracted features for content relevant to their problem domain. The ICE team has developed a set of tools for feature extraction and analysis of image data, based primarily on image content. The scale and volume of imagery that must be searched present a formidable computation and data-bandwidth challenge, and a search of moderate- to large-scale imagery quickly becomes intractable without exploiting high degrees of data parallelism in the feature extraction engine. In this paper we describe the software and hardware architecture developed to build a data-parallel processing engine for ICE. We discuss our highly tunable parallel process and job scheduling subsystem, remote procedure invocation, parallel I/O strategy, and our experience running ICE on a 16-node, 32-processing-element (CPU) Linux cluster. We present performance and benchmark results and describe how we obtain excellent speedup for the imagery searches in our test-bed prototype.
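As a rough illustration of the tile-level data parallelism the abstract describes (many independent image tiles fanned out to worker processes for feature extraction), the following Python sketch shows the general pattern. The tile list, the extract_features placeholder, and the worker count are hypothetical stand-ins; this is not the ICE implementation, which the paper describes as a cluster-wide engine with its own scheduling and parallel I/O subsystems.

```python
# Minimal sketch of tile-level data parallelism: each image tile is an
# independent work unit, so feature extraction fans out across processes.
# extract_features() and the tile paths are hypothetical placeholders,
# not part of the ICE code base.
from multiprocessing import Pool

def extract_features(tile_path):
    """Placeholder per-tile feature extractor; a real engine would run
    edge/texture/shape operators here and write a feature-vector file."""
    # e.g. load the tile, compute descriptors, return (tile id, vector)
    return tile_path, [0.0, 0.0, 0.0]

if __name__ == "__main__":
    tiles = [f"tile_{i:04d}.img" for i in range(128)]   # stand-in tile list
    with Pool(processes=32) as pool:                    # 32 CPUs, as in the test bed
        results = pool.map(extract_features, tiles, chunksize=4)
    print(f"extracted features for {len(results)} tiles")
```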

Authors:
Miller, W M; Garlick, J E; Weinert, G F; Abdulla, G M
Publication Date:
2006-03-09
Research Org.:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
889965
Report Number(s):
UCRL-CONF-219867
TRN: US200620%%170
DOE Contract Number:
W-7405-ENG-48
Resource Type:
Conference
Resource Relation:
Conference: SPIE Defense and Security Symposium, Kissimmee, FL (United States), 17-21 Apr 2006
Country of Publication:
United States
Language:
English
Subject:
99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; ARCHITECTURE; BENCHMARKS; ENGINES; PARALLEL PROCESSING; PERFORMANCE; PROCESSING; SECURITY

Citation Formats

Miller, W M, Garlick, J E, Weinert, G F, and Abdulla, G M. Exploiting Data Parallelism in the Image Content Engine. United States: N. p., 2006. Web.
Miller, W M, Garlick, J E, Weinert, G F, & Abdulla, G M. Exploiting Data Parallelism in the Image Content Engine. United States.
Miller, W M, Garlick, J E, Weinert, G F, and Abdulla, G M. 2006. "Exploiting Data Parallelism in the Image Content Engine". United States. https://www.osti.gov/servlets/purl/889965.
@article{osti_889965,
title = {Exploiting Data Parallelism in the Image Content Engine},
author = {Miller, W M and Garlick, J E and Weinert, G F and Abdulla, G M},
abstractNote = {The Image Content Engine (ICE) is a framework of software and underlying mathematical and physical models that enable scientists and analysts to extract features from Terabytes of imagery and search the extracted features for content relevant to their problem domain. The ICE team has developed a set of tools for feature extraction and analysis of image data, primarily based on the image content. The scale and volume of imagery that must be searched presents a formidable computation and data bandwidth challenge, and a search of moderate to large scale imagery quickly becomes intractable without exploiting high degrees of data parallelism in the feature extraction engine. In this paper we describe the software and hardware architecture developed to build a data parallel processing engine for ICE. We discuss our highly tunable parallel process and job scheduling subsystem, remote procedure invocation, parallel I/O strategy, and our experience in running ICE on a 16 node, 32 processing element (CPU) Linux Cluster. We present performance and benchmark results, and describe how we obtain excellent speedup for the imagery searches in our test-bed prototype.},
place = {United States},
year = {2006},
month = {mar}
}

Other availability
Please see Document Availability for additional information on obtaining the full-text document. Library patrons may search WorldCat to identify libraries that hold this conference proceeding.

  • The Image Content Engine (ICE) is being developed to provide cueing assistance to human image analysts faced with increasingly large and intractable amounts of image data. The ICE architecture includes user-configurable feature extraction pipelines which produce intermediate feature vector and match surface files which can then be accessed by interactive relational queries. Application of the feature extraction algorithms to large collections of images may be extremely time consuming and is launched as a batch job on a Linux cluster. The query interface accesses only the intermediate files and returns candidate hits nearly instantaneously. Queries may be posed for individual objects or collections. The query interface prompts the user for feedback, and applies relevance feedback algorithms to revise the feature vector weighting and focus on relevant search results. Examples of feature extraction and both model-based and search-by-example queries are presented. (A hedged sketch of this relevance-feedback re-weighting appears after this list.)
  • Human analysts are often unable to meet time constraints on analysis and interpretation of large volumes of remotely sensed imagery. To address this problem, the Image Content Engine (ICE) system currently under development is organized into an off-line component for automated extraction of image features followed by user-interactive components for content detection and content-based query processing. The extracted features are vectors that represent attributes of three entities, namely image tiles, image regions and shapes, or suspected matches to models of objects. ICE allows users to interactively specify decision thresholds so that content (consisting of entities whose features satisfy decision criteria) can be detected. ICE presents detected content to users as a prioritized series of thumbnail images. Users can either accept the detection results or specify a new set of decision thresholds. Once accepted, ICE stores the detected content in database tables and semantic graphs. Users can interactively query the tables and graphs for locations at which prescribed relationships between detected content exist. New queries can be submitted repeatedly until a satisfactory series of prioritized thumbnail image cues is produced. Examples are provided to demonstrate how ICE can be used to assist users in quickly finding prescribed collections of entities (both natural and man-made) in a set of large USGS aerial photos retrieved from TerraserverUSA. (A simplified sketch of this threshold-based detection and ranking appears after this list.)
  • In the solution of large-scale numerical problems, parallel computing is becoming simultaneously more important and more difficult. The complex organization of today's multiprocessors with several memory hierarchies has forced the scientific programmer to make a choice between simple but unscalable code and scalable but extremely complex code that does not port to other architectures. This paper describes how the SMARTS runtime system and the POOMA C++ class library for high-performance scientific computing work together to exploit data parallelism in scientific applications while hiding the details of managing parallelism and data locality from the user. We present innovative algorithms, based on the macro-dataflow model, for detecting data parallelism and efficiently executing data-parallel statements on shared-memory multiprocessors. We also describe how these algorithms can be implemented on clusters of SMPs.
  • Opus is a set of Fortran language extensions that provides shared data abstractions (SDAs) as a method for communication and synchronization among coarse-grain data parallel tasks. In this paper we focus on a simplified multidisciplinary optimization application (the design of a full aircraft configuration) and present a version of the code using the features of Opus. We also briefly describe the runtime system needed to support Opus.
  • We propose IMPACC, an MPI+OpenACC framework for heterogeneous accelerator clusters. IMPACC tightly integrates MPI and OpenACC, while exploiting the shared memory parallelism in the target system. IMPACC dynamically adapts the input MPI+OpenACC applications on the target heterogeneous accelerator clusters to fully exploit target system-specific features. IMPACC provides the programmers with the unified virtual address space, automatic NUMA-friendly task-device mapping, efficient integrated communication routines, seamless streamlining of asynchronous executions, and transparent memory sharing. We have implemented IMPACC and evaluated its performance using three heterogeneous accelerator systems, including the Titan supercomputer. Results show that IMPACC can achieve easier programming, higher performance, and better scalability than the current MPI+OpenACC model.
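The first ICE record above mentions relevance feedback that revises the feature-vector weighting from results the user marks relevant. That abstract does not give the update rule, so the sketch below uses a generic Rocchio-style re-weighting purely as an illustration; the function name, parameters, and example values are hypothetical.

```python
import numpy as np

def update_weights(weights, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio-style re-weighting: boost feature dimensions typical of
    results the user marked relevant and damp those typical of non-relevant
    ones. Illustrative only; ICE's actual relevance-feedback rule is not
    specified in the abstract above."""
    w = alpha * np.asarray(weights, dtype=float)
    w += beta * np.asarray(relevant, dtype=float).mean(axis=0)
    w -= gamma * np.asarray(nonrelevant, dtype=float).mean(axis=0)
    return np.clip(w, 0.0, None)  # keep feature weights non-negative

# Hypothetical usage: two feature dimensions, one relevant and one
# non-relevant example fed back by the analyst.
new_w = update_weights([1.0, 1.0], [[0.9, 0.1]], [[0.2, 0.8]])
```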
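The second ICE record above describes user-specified decision thresholds applied to extracted feature vectors, with detected content returned as a prioritized series of thumbnails. The sketch below shows one simplified way such thresholding and ranking could look; scoring hits by their summed margin above threshold is an assumption for illustration, not ICE's actual decision logic.

```python
import numpy as np

def detect_and_rank(feature_vectors, thresholds):
    """Keep entities whose every feature meets its user-chosen threshold,
    then order the hits by summed margin above threshold as a stand-in
    priority score. A simplified illustration, not ICE's decision logic."""
    fv = np.asarray(feature_vectors, dtype=float)
    th = np.asarray(thresholds, dtype=float)
    hits = np.all(fv >= th, axis=1)          # decision criteria per entity
    margin = (fv - th).sum(axis=1)           # crude priority score
    idx = np.flatnonzero(hits)
    return idx[np.argsort(-margin[idx])]     # prioritized hit indices

# Hypothetical usage: three entities, two features, user thresholds (0.5, 0.3).
order = detect_and_rank([[0.9, 0.4], [0.2, 0.9], [0.6, 0.35]], [0.5, 0.3])
```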