
Title: Large-scale seismic signal analysis with Hadoop

Journal Article · Computers and Geosciences

In seismology, waveform cross correlation has been used for years to produce high-precision hypocenter locations and for sensitive detectors. Because correlated seismograms generally are found only at small hypocenter separation distances, correlation detectors have historically been reserved for spotlight purposes. However, many regions have been found to produce large numbers of correlated seismograms, and there is growing interest in building next-generation pipelines that employ correlation as a core part of their operation. In an effort to better understand the distribution and behavior of correlated seismic events, we have cross correlated a global dataset consisting of over 300 million seismograms. This was done using a conventional distributed cluster, and required 42 days. In anticipation of processing much larger datasets, we have re-architected the system to run as a series of MapReduce jobs on a Hadoop cluster. In doing so we achieved a factor of 19 performance increase on a test dataset. We found that fundamental algorithmic transformations were required to achieve the maximum performance increase. Whereas in the original IO-bound implementation, we went to great lengths to minimize IO, in the Hadoop implementation where IO is cheap, we were able to greatly increase the parallelism of our algorithms by performing a tiered series of very fine-grained (highly parallelizable) transformations on the data. Each of these MapReduce jobs required reading and writing large amounts of data.
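The idea of recasting pairwise waveform correlation as a map step followed by a reduce step can be illustrated with a minimal Python sketch. This is not the authors' implementation, and it shows a single map/reduce tier rather than the tiered series of jobs the abstract describes. Every name here (`map_by_station`, `reduce_station`, `run_job`, the record layout, the lag window, and the 0.6 threshold) is a hypothetical chosen for illustration. Grouping candidates by recording station is likewise an assumption, and the in-memory `run_job` driver only stands in for the shuffle that a Hadoop cluster would perform.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def max_normalized_correlation(a, b, max_lag):
    """Return (peak correlation, lag) for two equal-length traces.

    Lags in [-max_lag, max_lag] are searched; a positive lag means
    the signal in b arrives later than the signal in a.
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    a0 = [x - mean_a for x in a]
    b0 = [x - mean_b for x in b]
    norm = sqrt(sum(x * x for x in a0) * sum(x * x for x in b0))
    if norm == 0.0:
        return 0.0, 0  # a flat trace correlates with nothing
    best, best_lag = -1.0, 0
    for lag in range(-max_lag, max_lag + 1):
        # Pair a0[i] with b0[i + lag] wherever both indices are valid.
        start = max(0, -lag)
        stop = min(n, n - lag)
        cc = sum(a0[i] * b0[i + lag] for i in range(start, stop)) / norm
        if cc > best:
            best, best_lag = cc, lag
    return best, best_lag


def map_by_station(record):
    """Map step (hypothetical): key each seismogram by its station."""
    event_id, station, trace = record
    yield station, (event_id, trace)


def reduce_station(station, values, max_lag=4, threshold=0.6):
    """Reduce step (hypothetical): correlate every pair at one station."""
    ordered = sorted(values, key=lambda v: v[0])
    for (id_a, tr_a), (id_b, tr_b) in combinations(ordered, 2):
        cc, lag = max_normalized_correlation(tr_a, tr_b, max_lag)
        if cc >= threshold:
            yield station, id_a, id_b, cc, lag


def run_job(records, mapper, reducer):
    """In-memory stand-in for one MapReduce job: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    results = []
    for key in sorted(groups):
        results.extend(reducer(key, groups[key]))
    return results


if __name__ == "__main__":
    # Toy records: (event id, station, samples). Values are made up.
    demo = [
        ("ev1", "STA1", [0, 0, 1, 3, 1, 0, 0, 0]),
        ("ev2", "STA1", [0, 0, 0, 0, 1, 3, 1, 0]),
        ("ev3", "STA2", [0, 1, 3, 1, 0, 0, 0, 0]),
    ]
    for row in run_job(demo, map_by_station, reduce_station):
        print(row)
```

Because each reduce call touches only the seismograms sharing one key, the per-station work is independent and could in principle be spread across many workers. The cost is that every record passes through the grouping step, which fits the abstract's point that cheap IO makes such fine-grained decomposition affordable.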

Research Organization:
Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)
Sponsoring Organization:
USDOE
Grant/Contract Number:
AC52-07NA27344; LLNL-JRNL-644626
OSTI ID:
1209709
Alternate ID(s):
OSTI ID: 1201566
Journal Information:
Computers and Geosciences, Vol. 66, Issue C; ISSN 0098-3004
Publisher:
Elsevier
Country of Publication:
United Kingdom
Language:
English
Citation Metrics:
Cited by: 41 works
Citation information provided by
Web of Science
