OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Less is More: Bigger Data from Compressive Measurements

Abstract

Compressive sensing approaches are beginning to take hold in (scanning) transmission electron microscopy (S/TEM) [1,2,3]. Compressive sensing is a mathematical theory about acquiring signals in a compressed form (measurements) and the probability of recovering the original signal by solving an inverse problem [4]. The inverse problem is underdetermined (more unknowns than measurements), so it is not obvious that recovery is possible. Compression is achieved by taking inner products of the signal with measurement weight vectors. Both Gaussian random weights and Bernoulli (0,1) random weights form a large class of measurement vectors for which recovery is possible. The measurements can also be designed through an optimization process.

The key insight for electron microscopists is that compressive sensing can be used to increase acquisition speed and reduce dose. Building on work initially developed for optical cameras, this new paradigm will allow electron microscopists to solve more problems in the engineering and life sciences. We will be collecting orders of magnitude more data than previously possible. We will have more data because we will have increased temporal/spatial/spectral sampling rates and the ability to interrogate larger classes of samples that were previously too beam sensitive to survive the experiment. For example, consider an in-situ experiment that takes 1 minute. With traditional sensing, we might collect 5 images per second for a total of 300 images. With compressive sensing, each of those 300 images can be expanded into 10 images, making the effective collection rate 50 images per second and the decompressed data a total of 3000 images [3].

But what are the implications, in terms of data, of this new methodology? Acquisition of compressed data will require downstream reconstruction to be useful. The reconstructed data will be much larger than traditional data, we will need space to store the reconstructions during analysis, and the computational demands for analysis will be higher. Moreover, there will be time costs associated with reconstruction. Deep learning [5] is one approach to addressing these problems. Deep learning is a hierarchical approach to finding representations of data that are useful for a particular task. Each layer of the hierarchy is intended to represent higher levels of abstraction. For example, a deep model of faces might have sinusoids, edges, and gradients in the first layer; eyes, noses, and mouths in the second layer; and faces in the third layer. There has been significant recent effort on deep learning algorithms for tasks beyond image classification, such as compressive reconstruction [6] and image segmentation [7]. A drawback of deep learning, however, is that training the model requires large datasets and dedicated computational resources (to reduce training time to a few days). A second issue is that deep learning is not user-friendly, and the meaning behind the results is usually not interpretable. We have shown that it is possible to reduce the dataset size while maintaining model quality [8] and have developed interpretable models for image classification [9], but the demands are still significant.

The key to addressing these problems is to NOT reconstruct the data. Instead, we should design computational sensors that give answers to specific problems. A simple version of this idea is compressive classification [10], where the goal is to classify the signal type from a small number of compressed measurements. Classification is a much simpler problem than reconstruction, so 1) far fewer measurements will be necessary, and 2) these measurements will probably not be useful for reconstruction. Other simple examples of computational sensing include determining object volume or the number of objects present in the field of view [11].
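The measurement model described in the abstract is linear: each compressed measurement is an inner product of the signal with a random weight vector, and recovery means solving the underdetermined system. The sketch below illustrates this with NumPy under purely hypothetical assumptions (a 256-sample signal that is 8-sparse, 64 Gaussian measurements, and orthogonal matching pursuit as a stand-in recovery algorithm); it is not the acquisition scheme or solver used in references [1-4].

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a length-256 signal with only 8 nonzero entries,
# sensed with 64 random inner products (4x fewer measurements than unknowns).
n, m, k = 256, 64, 8

x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)

Phi = rng.standard_normal((m, n)) / np.sqrt(m)   # Gaussian measurement weights
y = Phi @ x                                      # compressed measurements (m << n)

# Orthogonal matching pursuit: greedily pick the weight vector most correlated
# with the residual, then re-fit the selected coefficients by least squares.
support, residual = [], y.copy()
for _ in range(k):
    support.append(int(np.argmax(np.abs(Phi.T @ residual))))
    coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
    residual = y - Phi[:, support] @ coef

x_hat = np.zeros(n)
x_hat[support] = coef
print("relative recovery error:", np.linalg.norm(x_hat - x) / np.linalg.norm(x))
```

With these sizes the recovery is essentially exact; as the number of measurements drops toward the sparsity level, recovery starts to fail, which is the sense in which compressive sensing only guarantees recovery with high probability.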
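To make the layer-by-layer picture of deep learning concrete, here is a minimal PyTorch sketch of a three-stage convolutional hierarchy in the spirit of the face example above. The channel counts, input size, and two-class output are illustrative assumptions only, not a model from references [5-9].

```python
import torch
import torch.nn as nn

# Three convolutional stages, mirroring the hierarchy in the abstract:
# stage 1 tends to learn edge/gradient-like filters, stage 2 part-like
# features, stage 3 whole-object features. All sizes are illustrative.
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),   # stage 1: edges, gradients
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),  # stage 2: parts
    nn.MaxPool2d(2),
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),  # stage 3: whole objects
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(64, 2),                                        # e.g. face / not-face
)

# One forward pass on a dummy batch of 64x64 grayscale images.
dummy = torch.randn(8, 1, 64, 64)
print(model(dummy).shape)   # torch.Size([8, 2])
```

Training such a model to the point where the stages actually learn these abstractions is where the large datasets and dedicated computational resources mentioned above come in.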
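Compressive classification can likewise be sketched with a toy two-class problem that is decided entirely in the measurement domain, with no reconstruction step. The templates, noise level, number of measurements, and the nearest-centroid rule below are all illustrative assumptions, not the method of reference [10].

```python
import numpy as np

rng = np.random.default_rng(1)

n = 256   # ambient signal length
m = 8     # a handful of compressed measurements is enough to decide the class

# Two hypothetical signal classes ("A" vs "B"), modeled as noisy copies of two
# fixed templates (e.g., particle present vs background).
template_a = rng.standard_normal(n)
template_b = rng.standard_normal(n)

Phi = rng.standard_normal((m, n)) / np.sqrt(m)   # random measurement weights

def measure(signal):
    """Compress a signal to m inner products with the random weight vectors."""
    return Phi @ signal

# "Train" a nearest-centroid classifier directly on compressed measurements.
train_a = np.stack([measure(template_a + 0.3 * rng.standard_normal(n)) for _ in range(50)])
train_b = np.stack([measure(template_b + 0.3 * rng.standard_normal(n)) for _ in range(50)])
centroid_a, centroid_b = train_a.mean(axis=0), train_b.mean(axis=0)

def classify(y):
    """Assign a compressed measurement vector to the nearest class centroid."""
    return "A" if np.linalg.norm(y - centroid_a) <= np.linalg.norm(y - centroid_b) else "B"

# The decision uses only 8 numbers per image; reconstructing the full
# 256-sample signal from 8 measurements is never attempted (and would fail).
test = measure(template_b + 0.3 * rng.standard_normal(n))
print(classify(test))   # expected: B
```

Eight inner products are far too few to reconstruct these dense 256-sample signals, which is the point made at the end of the abstract: measurements designed to answer a specific question can be much scarcer than measurements designed for reconstruction.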

Authors:
Stevens, Andrew; Browning, Nigel D.
Publication Date:
2017-07-01
Research Org.:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1379443
Report Number(s):
PNNL-SA-124102
Journal ID: ISSN 1431-9276
DOE Contract Number:
AC05-76RL01830
Resource Type:
Journal Article
Resource Relation:
Journal Name: Microscopy and Microanalysis; Journal Volume: 23; Journal Issue: S1
Country of Publication:
United States
Language:
English

Citation Formats

Stevens, Andrew, and Browning, Nigel D. Less is More: Bigger Data from Compressive Measurements. United States: N. p., 2017. Web. doi:10.1017/S1431927617001519.
Stevens, Andrew, & Browning, Nigel D. Less is More: Bigger Data from Compressive Measurements. United States. doi:10.1017/S1431927617001519.
Stevens, Andrew, and Browning, Nigel D. 2017. "Less is More: Bigger Data from Compressive Measurements". United States. doi:10.1017/S1431927617001519.
@article{osti_1379443,
title = {Less is More: Bigger Data from Compressive Measurements},
author = {Stevens, Andrew and Browning, Nigel D.},
abstractNote = {Compressive sensing approaches are beginning to take hold in (scanning) transmission electron microscopy (S/TEM) [1,2,3]. Compressive sensing is a mathematical theory about acquiring signals in a compressed form (measurements) and the probability of recovering the original signal by solving an inverse problem [4]. The inverse problem is underdetermined (more unknowns than measurements), so it is not obvious that recovery is possible. Compression is achieved by taking inner products of the signal with measurement weight vectors. Both Gaussian random weights and Bernoulli (0,1) random weights form a large class of measurement vectors for which recovery is possible. The measurements can also be designed through an optimization process. The key insight for electron microscopists is that compressive sensing can be used to increase acquisition speed and reduce dose. Building on work initially developed for optical cameras, this new paradigm will allow electron microscopists to solve more problems in the engineering and life sciences. We will be collecting orders of magnitude more data than previously possible. We will have more data because we will have increased temporal/spatial/spectral sampling rates and the ability to interrogate larger classes of samples that were previously too beam sensitive to survive the experiment. For example, consider an in-situ experiment that takes 1 minute. With traditional sensing, we might collect 5 images per second for a total of 300 images. With compressive sensing, each of those 300 images can be expanded into 10 images, making the effective collection rate 50 images per second and the decompressed data a total of 3000 images [3]. But what are the implications, in terms of data, of this new methodology? Acquisition of compressed data will require downstream reconstruction to be useful. The reconstructed data will be much larger than traditional data, we will need space to store the reconstructions during analysis, and the computational demands for analysis will be higher. Moreover, there will be time costs associated with reconstruction. Deep learning [5] is one approach to addressing these problems. Deep learning is a hierarchical approach to finding representations of data that are useful for a particular task. Each layer of the hierarchy is intended to represent higher levels of abstraction. For example, a deep model of faces might have sinusoids, edges, and gradients in the first layer; eyes, noses, and mouths in the second layer; and faces in the third layer. There has been significant recent effort on deep learning algorithms for tasks beyond image classification, such as compressive reconstruction [6] and image segmentation [7]. A drawback of deep learning, however, is that training the model requires large datasets and dedicated computational resources (to reduce training time to a few days). A second issue is that deep learning is not user-friendly, and the meaning behind the results is usually not interpretable. We have shown that it is possible to reduce the dataset size while maintaining model quality [8] and have developed interpretable models for image classification [9], but the demands are still significant. The key to addressing these problems is to NOT reconstruct the data. Instead, we should design computational sensors that give answers to specific problems. A simple version of this idea is compressive classification [10], where the goal is to classify the signal type from a small number of compressed measurements. Classification is a much simpler problem than reconstruction, so 1) far fewer measurements will be necessary, and 2) these measurements will probably not be useful for reconstruction. Other simple examples of computational sensing include determining object volume or the number of objects present in the field of view [11].},
doi = {10.1017/S1431927617001519},
journal = {Microscopy and Microanalysis},
number = {S1},
volume = {23},
place = {United States},
year = {2017},
month = {jul}
}