Scalable pattern recognition for large-scale scientific data mining
Our ability to generate data far outstrips our ability to explore and understand it. The true value of this data lies not in its final size or complexity, but rather in our ability to exploit the data to achieve scientific goals. The data generated by programs such as ASCI have such a large scale that it is impractical to manually analyze, explore, and understand it. As a result, useful information is overlooked, and the potential benefits of increased computational and data gathering capabilities are only partially realized. The difficulties that will be faced by ASCI applications in the near future are foreshadowed by the challenges currently facing astrophysicists in making full use of the data they have collected over the years. For example, among other difficulties, astrophysicists have expressed concern that the sheer size of their data restricts them to looking at very small, narrow portions at any one time. This narrow focus has resulted in the loss of ``serendipitous`` discoveries which have been so vital to progress in the area in the past. To solve this problem, a new generation of computational tools and techniques is needed to help automate the exploration and management of large scientific data. This whitepaper proposes applying and extending ideas from the area of data mining, in particular pattern recognition, to improve the way in which scientists interact with large, multi-dimensional, time-varying data.
- Research Organization:
- Lawrence Livermore National Lab., CA (United States)
- Sponsoring Organization:
- USDOE, Washington, DC (United States)
- DOE Contract Number:
- W-7405-ENG-48
- OSTI ID:
- 310913
- Report Number(s):
- UCRL-ID--130245; ON: DE98058345; BR: YN0100000
- Country of Publication:
- United States
- Language:
- English
Similar Records
The Scientific Data Management Center: Available Technologies and Highlights
Data and Visualization Corridors: Report on the 1998 DVC Workshop Series