DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Reverse image search for scientific data within and beyond the visible spectrum

Abstract

The explosion in the rate, quality and diversity of image acquisition instruments has propelled the development of expert systems to organize and query image collections more efficiently. Recommendation systems that handle scientific images are rare, particularly if records lack metadata. This paper introduces new strategies to enable fast searches and image ranking from large pictorial datasets with or without labels. The main contribution is the development of pyCBIR, deep neural network software to search scientific images by content. This tool exploits convolutional layers with locality-sensitive hashing for querying images across domains through a user-friendly interface. Our results report image searches over databases ranging from thousands to millions of samples. We test pyCBIR search capabilities using three convNets against four scientific datasets, including samples from cell microscopy, microtomography, atomic diffraction patterns, and materials photographs to demonstrate 95% accurate recommendations in most cases. Furthermore, all scientific data collections are released.
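
The abstract pairs convolutional features with hashing for fast lookup. As a rough illustration only, the sketch below shows one common hashing scheme of that family, random-hyperplane hashing over feature vectors. It is not taken from pyCBIR: the class name `LSHIndex`, the helper functions, the bit count, and the toy three-dimensional vectors are all assumptions, and in practice the vectors would come from a convolutional network rather than being typed by hand.

```python
import random
from collections import defaultdict


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplane normals of length dim (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_vector(vec, planes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def cosine(a, b):
    """Cosine similarity, used to rank candidates inside a bucket."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class LSHIndex:
    """Toy single-table index; an assumption for illustration, not pyCBIR's API."""

    def __init__(self, dim, n_bits=8, seed=0):
        self.planes = make_hyperplanes(dim, n_bits, seed)
        self.buckets = defaultdict(list)
        self.vectors = {}

    def add(self, key, vec):
        self.vectors[key] = vec
        self.buckets[hash_vector(vec, self.planes)].append(key)

    def query(self, vec, top_k=5):
        # Only items sharing the query's bucket are compared exactly.
        candidates = self.buckets.get(hash_vector(vec, self.planes), [])
        ranked = sorted(
            candidates,
            key=lambda k: cosine(vec, self.vectors[k]),
            reverse=True,
        )
        return ranked[:top_k]


index = LSHIndex(dim=3)
index.add("a", [1.0, 0.0, 0.5])     # stand-in for an image feature vector
index.add("b", [-1.0, 0.0, -0.5])   # points the opposite way
results = index.query([1.0, 0.0, 0.5])
```

A single hash table like this can miss near neighbors that fall just across a hyperplane; practical systems usually query several independent tables and merge the candidates.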

Authors:
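Araujo, Flavio H. D. [1]; Silva, Romuere R. V. [1]; Medeiros, Fatima N. S. [2]; Parkinson, Dilworth D. [3]; Hexemer, Alexander [3]; Carneiro, Claudia M. [4]; Ushizima, Daniela M. [5]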
  1. Univ. of California, Berkeley, CA (United States); Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Federal Univ. of Ceara, Fortaleza (Brazil); Federal Univ. of Piaui, Picos (Brazil)
  2. Federal Univ. of Ceara, Fortaleza (Brazil)
  3. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
  4. Federal Univ. of Ouro Preto, Minas Gerais (Brazil)
  5. Univ. of California, Berkeley, CA (United States); Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Publication Date:
Research Org.:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21); Moore-Sloan Foundation; Fapemig
OSTI Identifier:
1526548
Alternate Identifier(s):
OSTI ID: 1548250
Grant/Contract Number:  
AC02-05CH11231
Resource Type:
Accepted Manuscript
Journal Name:
Expert Systems with Applications
Additional Journal Information:
Journal Volume: 109; Journal Issue: C; Journal ID: ISSN 0957-4174
Publisher:
Elsevier
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; reverse image search; content-based image retrieval; scientific image recommendation; convolutional neural network

Citation Formats

Araujo, Flavio H. D., Silva, Romuere R. V., Medeiros, Fatima N. S., Parkinson, Dilworth D., Hexemer, Alexander, Carneiro, Claudia M., and Ushizima, Daniela M. Reverse image search for scientific data within and beyond the visible spectrum. United States: N. p., 2018. Web. doi:10.1016/j.eswa.2018.05.015.
Araujo, Flavio H. D., Silva, Romuere R. V., Medeiros, Fatima N. S., Parkinson, Dilworth D., Hexemer, Alexander, Carneiro, Claudia M., & Ushizima, Daniela M. Reverse image search for scientific data within and beyond the visible spectrum. United States. doi:10.1016/j.eswa.2018.05.015.
Araujo, Flavio H. D., Silva, Romuere R. V., Medeiros, Fatima N. S., Parkinson, Dilworth D., Hexemer, Alexander, Carneiro, Claudia M., and Ushizima, Daniela M. 2018. "Reverse image search for scientific data within and beyond the visible spectrum". United States. doi:10.1016/j.eswa.2018.05.015. https://www.osti.gov/servlets/purl/1526548.
@article{osti_1526548,
title = {Reverse image search for scientific data within and beyond the visible spectrum},
author = {Araujo, Flavio H. D. and Silva, Romuere R. V. and Medeiros, Fatima N. S. and Parkinson, Dilworth D. and Hexemer, Alexander and Carneiro, Claudia M. and Ushizima, Daniela M.},
abstractNote = {The explosion in the rate, quality and diversity of image acquisition instruments has propelled the development of expert systems to organize and query image collections more efficiently. Recommendation systems that handle scientific images are rare, particularly if records lack metadata. This paper introduces new strategies to enable fast searches and image ranking from large pictorial datasets with or without labels. The main contribution is the development of pyCBIR, deep neural network software to search scientific images by content. This tool exploits convolutional layers with locality-sensitive hashing for querying images across domains through a user-friendly interface. Our results report image searches over databases ranging from thousands to millions of samples. We test pyCBIR search capabilities using three convNets against four scientific datasets, including samples from cell microscopy, microtomography, atomic diffraction patterns, and materials photographs to demonstrate 95% accurate recommendations in most cases. Furthermore, all scientific data collections are released.},
doi = {10.1016/j.eswa.2018.05.015},
journal = {Expert Systems with Applications},
number = {C},
volume = 109,
place = {United States},
year = {2018},
month = {5}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 5 works
Citation information provided by
Web of Science