skip to main content
OSTI.GOV title logo U.S. Department of Energy
Office of Scientific and Technical Information

Title: Integrated analysis of transcriptomic and proteomic data of Desulfovibrio vulgaris: Zero-Inflated Poisson regression models to predict abundance of undetected proteins

Journal Article · · Bioinformatics, 22(13):1641-1647

Abstract Advances in DNA microarray and proteomics technologies have enabled high-throughput measurement of mRNA expression and protein abundance. Parallel profiling of mRNA and protein on a global scale and integrative analysis of these two data types could provide additional insight into the metabolic mechanisms underlying complex biological systems. However, because protein abundance and mRNA expression are affected by many cellular and physical processes, there have been conflicting results on the correlation of these two measurements. In addition, as current proteomic methods can detect only a small fraction of proteins present in cells, no correlation study of these two data types has been done thus far at the whole-genome level. In this study, we describe a novel data-driven statistical model to integrate whole-genome microarray and proteomic data collected from Desulfovibrio vulgaris grown under three different conditions. Based on the Poisson distribution pattern of proteomic data and the fact that a large number of proteins were undetected (excess zeros), Zero-inflated Poisson models were used to define the correlation pattern of mRNA and protein abundance. The models assumed that there is a probability mass at zero representing some of the undetected proteins because of technical limitations. The models thus use abundance measurements of transcripts and proteins experimentally detected as input to generate predictions of protein abundances as output for all genes in the genome. We demonstrated the statistical models by comparatively analyzing D. vulgaris grown on lactate-based versus formate-based media. The increased expressions of Ech hydrogenase and alcohol dehydrogenase (Adh)-periplasmic Fe-only hydrogenase (Hyd) pathway for ATP synthesis were predicted for D. vulgaris grown on formate.

Research Organization:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States). Environmental Molecular Sciences Lab. (EMSL)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
889037
Report Number(s):
PNNL-SA-45108; 15891; TRN: US200619%%334
Journal Information:
Bioinformatics, 22(13):1641-1647, Vol. 22, Issue 13
Country of Publication:
United States
Language:
English