Data mining
This is the final report of a one-year Laboratory Directed Research and Development (LDRD) project at the Los Alamos National Laboratory (LANL). The objective of this project was to develop and implement data mining technology suited to the analysis of large collections of unstructured data. The result is a software tool, PADMA (Parallel Data Mining Agents), which incorporates parallel data access, parallel scalable hierarchical clustering algorithms, and a web-based user interface for submitting Structured Query Language (SQL) queries and for interactive data visualization. The authors have demonstrated the viability and scalability of PADMA by running it on an IBM SP2 at Argonne National Laboratory against an unstructured text database of 25,000 documents. The utility of PADMA for discovering patterns in data has also been demonstrated by applying it to laboratory test data for hepatitis C patients and to autopsy reports, in collaboration with the University of New Mexico School of Medicine.
- Research Organization: Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
- Sponsoring Organization: USDOE Assistant Secretary for Human Resources and Administration, Washington, DC (United States)
- DOE Contract Number: W-7405-ENG-36
- OSTI ID: 334314
- Report Number(s): LA-UR-98-3261; ON: DE99002233; TRN: AHC29914%%118
- Resource Relation: Other Information: PBD: [1998]
- Country of Publication: United States
- Language: English