U.S. Department of Energy
Office of Scientific and Technical Information

Clustering-Based Predictive Analytics to Improve Scientific Data Discovery

Conference

Given the sheer volume of scientific data archived within the data-intensive projects at the US Department of Energy's Oak Ridge National Laboratory, finding precisely what data we are looking for may not be a trivial task; conversely, we may also miss a more prominent data product. To address such issues, we propose improving the data discovery system and using data analytics methods to comprehend what specific users might be interested in based on their physiological state, search patterns, and past data usage history. This work's primary goal is to prune the complexity, increase the visibility of popular data products, and direct users toward the data that best meet their needs. The proposed algorithm constructs a user profile based on the user's explicit or implicit interactions with the system, such as items they are currently looking at on-site and the key metadata mappings related to the data set. The pattern is then used to build a training data set, which will help find relevant data to recommend to the user.
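The abstract's profile-then-recommend idea can be illustrated with a minimal sketch. This is not the authors' algorithm: the record does not describe the clustering step, so the sketch substitutes a plain content-based ranking (cosine similarity between a keyword profile and each data set's metadata). The catalog entries, the keyword lists, the interaction weights, and the function names `build_profile`, `cosine`, and `recommend` are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt

# Hypothetical catalog: data set id -> metadata keywords (illustrative only).
CATALOG = {
    "ds_aerosol": ["atmosphere", "aerosol", "lidar"],
    "ds_cloud": ["atmosphere", "cloud", "radar"],
    "ds_soil": ["soil", "carbon", "moisture"],
    "ds_precip": ["atmosphere", "precipitation", "radar"],
}

# Assumed weights: an explicit action (download) counts more than an
# implicit one (view). The values are arbitrary placeholders.
WEIGHTS = {"download": 3.0, "view": 1.0}


def build_profile(interactions):
    """Sum the weighted metadata keywords of every data set the user touched."""
    profile = Counter()
    for dataset_id, action in interactions:
        for keyword in CATALOG[dataset_id]:
            profile[keyword] += WEIGHTS[action]
    return profile


def cosine(profile, keywords):
    """Cosine similarity between a keyword profile and one data set's keywords."""
    item = Counter(keywords)
    dot = sum(profile[k] * item[k] for k in item)
    norm = sqrt(sum(v * v for v in profile.values())) * sqrt(
        sum(v * v for v in item.values())
    )
    return dot / norm if norm else 0.0


def recommend(interactions, top_n=2):
    """Rank unseen data sets by similarity to the profile; drop zero scores."""
    profile = build_profile(interactions)
    seen = {dataset_id for dataset_id, _ in interactions}
    scored = [
        (cosine(profile, keywords), dataset_id)
        for dataset_id, keywords in CATALOG.items()
        if dataset_id not in seen
    ]
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [dataset_id for score, dataset_id in scored[:top_n] if score > 0]


if __name__ == "__main__":
    history = [("ds_cloud", "download"), ("ds_aerosol", "view")]
    print(recommend(history))
```

In this toy setup, a user who downloaded the cloud data set and viewed the aerosol data set accumulates weight on atmosphere-related keywords, so only the precipitation data set shares any keywords with the profile and is recommended. A clustering-based variant would presumably group users or data sets first and recommend within a group, but those details are not given in this record.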

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1777750
Country of Publication:
United States
Language:
English

Similar Records

Metadata's Role in a Scientific Archive
Journal Article · 2003 · Computer, 36(12):27-34 · OSTI ID:15010333

Enabling modern data discovery for atmospheric measurements
Journal Article · 2021 · Earth Science Informatics · OSTI ID:1807242

Constellation: A science graph network for scalable data and knowledge discovery in extreme-scale scientific collaborations
Conference · 2016 · 2016 IEEE International Conference on Big Data (Big Data) · OSTI ID:1567564
