OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Experiences with making diffraction image data available: what metadata do we need to archive?

Journal Article · Acta Crystallographica. Section D: Biological Crystallography
Kroon-Batenburg [1]; Helliwell [2]
  1. Utrecht University, Padualaan 8, 3584 CH Utrecht (Netherlands)
  2. University of Manchester, Brunswick Street, Manchester M14 9PL (United Kingdom)

A local archive of raw ‘diffraction data images’ was made available, and some data sets were retrieved and reprocessed. This led to an analysis of the anomalous difference densities of two partially occupied Cl atoms in cisplatin and to a re-evaluation of the resolution cutoff in these diffraction data. General questions on storing raw data are discussed. It is also demonstrated that unambiguous prior knowledge is often needed to read the (binary) detector format and to interpret the setup of the goniometer geometry.

Recently, the IUCr (International Union of Crystallography) initiated the formation of a Diffraction Data Deposition Working Group with the aim of developing standards for the representation of raw diffraction data associated with the publication of structural papers. Archiving of raw data serves several goals: to improve the record of science, to verify reproducibility and allow detailed checks of scientific data, thereby safeguarding against fraud, and to allow reanalysis with future improved techniques. One means of studying this issue is to submit exemplar publications with associated raw data and metadata. In a recent study of the binding of cisplatin and carboplatin to histidine in lysozyme crystals under several conditions, the possible effects of the equipment and of the X-ray diffraction data-processing software on the occupancies and B factors of the bound Pt compounds were compared. Initially, 35.3 GB of data were transferred from Manchester to Utrecht to be processed with EVAL. A detailed description and discussion of the availability of metadata was published in a paper that was linked to a local raw data archive at Utrecht University and also mirrored at the TARDIS raw diffraction data archive in Australia. Because these raw diffraction data sets were made available with the article, the diffraction community can make its own evaluation. This led one of the authors of XDS (K. Diederichs) to re-integrate the data from crystals that supposedly contained solely bound carboplatin, resulting in an analysis of the anomalous electron densities of partially occupied chlorine atoms near the Pt-binding sites and in the use of several criteria to assess the diffraction resolution limit more carefully. General arguments for archiving raw data, the possibilities of doing so and the resources required are discussed. The problems associated with a partially unknown experimental setup, which should preferably be available as metadata, are discussed. Current thoughts on data compression are summarized; compression could be a solution, especially for pixel-device data sets with fine slicing, which may otherwise present an unmanageable amount of data.

OSTI ID: 22347752
Journal Information: Acta Crystallographica. Section D: Biological Crystallography, Vol. 70, Issue Pt 10; Other Information: PMCID: PMC4187998; PMID: 25286836; PUBLISHER-ID: dz5309; OAI: oai:pubmedcentral.nih.gov:4187998; Copyright (c) Kroon-Batenburg & Helliwell 2014; This is an open-access article distributed under the terms of the Creative Commons Attribution Licence, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are cited; Country of input: International Atomic Energy Agency (IAEA); ISSN 0907-4449
Country of Publication: Denmark
Language: English