OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: CLaSPS: A NEW METHODOLOGY FOR KNOWLEDGE EXTRACTION FROM COMPLEX ASTRONOMICAL DATA SETS

Abstract

In this paper, we present the Clustering-Labels-Score Patterns Spotter (CLaSPS), a new methodology for the determination of correlations among astronomical observables in complex data sets, based on the application of distinct unsupervised clustering techniques. The novelty in CLaSPS is the criterion used for the selection of the optimal clusterings, based on a quantitative measure of the degree of correlation between the cluster memberships and the distribution of a set of observables, the labels, not employed for the clustering. CLaSPS has been primarily developed as a tool to tackle the challenging complexity of the complex and massive multi-wavelength astronomical data sets produced by the federation of the data from modern automated astronomical facilities. In this paper, we discuss the applications of CLaSPS to two simple astronomical data sets, both composed of extragalactic sources with photometric observations at different wavelengths from large area surveys. The first data set, CSC+, is composed of optical quasars spectroscopically selected in the Sloan Digital Sky Survey data, observed in the X-rays by Chandra and with multi-wavelength observations in the near-infrared, optical, and ultraviolet spectral intervals. One of the results of the application of CLaSPS to the CSC+ is the re-identification of a well-known correlation between the α_OX parameter and the near-ultraviolet color, in a subset of CSC+ sources with relatively small values of the near-ultraviolet colors. The other data set consists of a sample of blazars for which photometric observations in the optical, mid-, and near-infrared are available, complemented, for a subset of the sources, by Fermi γ-ray data. The main results of the application of CLaSPS to such data sets have been the discovery of a strong correlation between the multi-wavelength color distribution of blazars and their optical spectral classification in BL Lac objects and flat-spectrum radio quasars, and a peculiar pattern followed by blazars in the WISE mid-infrared color space. This pattern and its physical interpretation have been discussed in detail in other papers by one of the authors.

Authors:
D'Abrusco, R; Fabbiano, G; Laurino, O [1]; Djorgovski, G; Donalek, C; Longo, G [2]
  1. Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States)
  2. Department of Astronomy, California Institute of Technology, MC 249-17 1200 East California Blvd, Pasadena, CA 91125 (United States)
Publication Date:
2012-08-20
OSTI Identifier:
22039105
Resource Type:
Journal Article
Journal Name:
Astrophysical Journal
Additional Journal Information:
Journal Volume: 755; Journal Issue: 2; Other Information: Country of input: International Atomic Energy Agency (IAEA); Journal ID: ISSN 0004-637X
Country of Publication:
United States
Language:
English
Subject:
79 ASTROPHYSICS, COSMOLOGY AND ASTRONOMY; ASTRONOMY; ASTROPHYSICS; CATALOGS; CLASSIFICATION; COLOR; CORRELATIONS; DATASETS; INFRARED SPECTRA; QUASARS; SPACE; ULTRAVIOLET RADIATION; WAVELENGTHS; X RADIATION

Citation Formats

D'Abrusco, R, Fabbiano, G, Laurino, O, Djorgovski, G, Donalek, C, and Longo, G. CLaSPS: A NEW METHODOLOGY FOR KNOWLEDGE EXTRACTION FROM COMPLEX ASTRONOMICAL DATA SETS. United States: N. p., 2012. Web. doi:10.1088/0004-637X/755/2/92.
D'Abrusco, R, Fabbiano, G, Laurino, O, Djorgovski, G, Donalek, C, & Longo, G. CLaSPS: A NEW METHODOLOGY FOR KNOWLEDGE EXTRACTION FROM COMPLEX ASTRONOMICAL DATA SETS. United States. https://doi.org/10.1088/0004-637X/755/2/92
D'Abrusco, R, Fabbiano, G, Laurino, O, Djorgovski, G, Donalek, C, and Longo, G. 2012. "CLaSPS: A NEW METHODOLOGY FOR KNOWLEDGE EXTRACTION FROM COMPLEX ASTRONOMICAL DATA SETS". United States. https://doi.org/10.1088/0004-637X/755/2/92.
@article{osti_22039105,
title = {CLaSPS: A NEW METHODOLOGY FOR KNOWLEDGE EXTRACTION FROM COMPLEX ASTRONOMICAL DATA SETS},
author = {D'Abrusco, R and Fabbiano, G and Laurino, O and Djorgovski, G and Donalek, C and Longo, G},
abstractNote = {In this paper, we present the Clustering-Labels-Score Patterns Spotter (CLaSPS), a new methodology for the determination of correlations among astronomical observables in complex data sets, based on the application of distinct unsupervised clustering techniques. The novelty in CLaSPS is the criterion used for the selection of the optimal clusterings, based on a quantitative measure of the degree of correlation between the cluster memberships and the distribution of a set of observables, the labels, not employed for the clustering. CLaSPS has been primarily developed as a tool to tackle the challenging complexity of the complex and massive multi-wavelength astronomical data sets produced by the federation of the data from modern automated astronomical facilities. In this paper, we discuss the applications of CLaSPS to two simple astronomical data sets, both composed of extragalactic sources with photometric observations at different wavelengths from large area surveys. The first data set, CSC+, is composed of optical quasars spectroscopically selected in the Sloan Digital Sky Survey data, observed in the X-rays by Chandra and with multi-wavelength observations in the near-infrared, optical, and ultraviolet spectral intervals. One of the results of the application of CLaSPS to the CSC+ is the re-identification of a well-known correlation between the $\alpha_{\rm OX}$ parameter and the near-ultraviolet color, in a subset of CSC+ sources with relatively small values of the near-ultraviolet colors. The other data set consists of a sample of blazars for which photometric observations in the optical, mid-, and near-infrared are available, complemented, for a subset of the sources, by Fermi $\gamma$-ray data. The main results of the application of CLaSPS to such data sets have been the discovery of a strong correlation between the multi-wavelength color distribution of blazars and their optical spectral classification in BL Lac objects and flat-spectrum radio quasars, and a peculiar pattern followed by blazars in the WISE mid-infrared color space. This pattern and its physical interpretation have been discussed in detail in other papers by one of the authors.},
doi = {10.1088/0004-637X/755/2/92},
url = {https://www.osti.gov/biblio/22039105},
journal = {Astrophysical Journal},
issn = {0004-637X},
number = 2,
volume = 755,
place = {United States},
year = {2012},
month = aug
}