U.S. Department of Energy
Office of Scientific and Technical Information

Scalable pattern recognition for large-scale scientific data mining

Technical Report · DOI: https://doi.org/10.2172/310913 · OSTI ID: 310913

Our ability to generate data far outstrips our ability to explore and understand it. The true value of this data lies not in its final size or complexity, but rather in our ability to exploit the data to achieve scientific goals. The data generated by programs such as ASCI have such a large scale that it is impractical to manually analyze, explore, and understand it. As a result, useful information is overlooked, and the potential benefits of increased computational and data gathering capabilities are only partially realized. The difficulties that will be faced by ASCI applications in the near future are foreshadowed by the challenges currently facing astrophysicists in making full use of the data they have collected over the years. For example, among other difficulties, astrophysicists have expressed concern that the sheer size of their data restricts them to looking at very small, narrow portions at any one time. This narrow focus has resulted in the loss of "serendipitous" discoveries which have been so vital to progress in the area in the past. To solve this problem, a new generation of computational tools and techniques is needed to help automate the exploration and management of large scientific data. This whitepaper proposes applying and extending ideas from the area of data mining, in particular pattern recognition, to improve the way in which scientists interact with large, multi-dimensional, time-varying data.

Research Organization:
Lawrence Livermore National Lab., CA (United States)
Sponsoring Organization:
USDOE, Washington, DC (United States)
DOE Contract Number:
W-7405-ENG-48
OSTI ID:
310913
Report Number(s):
UCRL-ID--130245; ON: DE98058345; BR: YN0100000
Country of Publication:
United States
Language:
English
