Large-scale atlas of microarray data reveals the distinct expression landscape of different tissues in Arabidopsis
Abstract
Here, transcriptome datasets from thousands of samples of the model plant Arabidopsis thaliana have been collectively generated by multiple individual labs. Although integration and meta-analysis of these samples has become routine in the plant research community, it is often hampered by the lack of metadata or differences in annotation styles by different labs. In this study, we carefully selected and integrated 6,057 Arabidopsis microarray expression samples from 304 experiments deposited to NCBI GEO. Metadata such as tissue type, growth condition, and developmental stage were manually curated for each sample. We then studied global expression landscape of the integrated dataset and found that samples of the same tissue tend to be more similar to each other than to samples of other tissues, even in different growth conditions or developmental stages. Root has the most distinct transcriptome compared to aerial tissues, but the transcriptome of cultured root is more similar to those of aerial tissues as the former samples lost their cellular identity. Using a simple computational classification method, we showed that the tissue type of a sample can be successfully predicted based on its expression profile, opening the door for automatic metadata extraction and facilitating re-use of plant transcriptome data. As a proof of principle we applied our automated annotation pipeline to 708 RNA-seq samples from public repositories and verified accuracy of our predictions with samples’ metadata provided by authors.
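The abstract does not say which classification method was used. As a rough illustration of the general idea only, the sketch below assumes a nearest-centroid classifier based on Pearson correlation: each tissue is represented by the mean profile of its labeled samples, and a new sample is assigned to the tissue whose mean profile it correlates with best. The function names, the four-gene profiles, and the expression values are all hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors.

    Assumes neither vector is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def tissue_centroids(profiles, labels):
    """Average the expression profiles of each tissue, gene by gene."""
    grouped = {}
    for profile, tissue in zip(profiles, labels):
        grouped.setdefault(tissue, []).append(profile)
    return {
        tissue: [sum(col) / len(col) for col in zip(*rows)]
        for tissue, rows in grouped.items()
    }


def predict_tissue(profile, centroids):
    """Return the tissue whose centroid correlates best with the profile."""
    return max(centroids, key=lambda t: pearson(profile, centroids[t]))


# Made-up toy data: 4 hypothetical genes, values are not real measurements.
train_profiles = [
    [9.0, 2.0, 1.0, 5.0],  # labeled root
    [8.5, 2.5, 1.5, 5.0],  # labeled root
    [2.0, 9.0, 7.0, 5.0],  # labeled leaf
    [1.5, 8.0, 7.5, 5.5],  # labeled leaf
]
train_labels = ["root", "root", "leaf", "leaf"]

centroids = tissue_centroids(train_profiles, train_labels)
predicted = predict_tissue([8.0, 3.0, 2.0, 5.0], centroids)
print(predicted)
```

Correlation is used here instead of Euclidean distance because it compares the shape of a profile and is less sensitive to overall intensity differences between experiments. That is a design assumption of this sketch, not a claim about the authors' pipeline.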
- Authors:
- He, Fei; Maslov, Sergei; Yoo, Shinjae; Wang, Daifeng; Kumari, Sunita; Gerstein, Mark; Ware, Doreen
- Brookhaven National Lab. (BNL), Upton, NY (United States)
- Brookhaven National Lab. (BNL), Upton, NY (United States); Univ. of Illinois at Urbana-Champaign, Champaign, IL (United States)
- Brookhaven National Lab. (BNL), Upton, NY (United States); Stony Brook Univ., Stony Brook, NY (United States)
- Yale Univ., New Haven, CT (United States)
- Cold Spring Harbor Lab., Cold Spring Harbor, NY (United States)
- Cold Spring Harbor Lab., Cold Spring Harbor, NY (United States); USDA ARS NEA Plant, Ithaca, NY (United States)
- Publication Date:
- 2016-05-25
- Research Org.:
- Brookhaven National Laboratory (BNL), Upton, NY (United States)
- Sponsoring Org.:
- USDOE Office of Science (SC), Basic Energy Sciences (BES)
- OSTI Identifier:
- 1257945
- Report Number(s):
- BNL-112108-2016-JA; Journal ID: ISSN 0960-7412
- Grant/Contract Number:
- SC00112704
- Resource Type:
- Accepted Manuscript
- Journal Name:
- The Plant Journal
- Additional Journal Information:
- Journal Name: The Plant Journal; Journal ID: ISSN 0960-7412
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 59 BASIC BIOLOGICAL SCIENCES
Citation Formats
He, Fei, Maslov, Sergei, Yoo, Shinjae, Wang, Daifeng, Kumari, Sunita, Gerstein, Mark, and Ware, Doreen. Large-scale atlas of microarray data reveals the distinct expression landscape of different tissues in Arabidopsis. United States: N. p., 2016. Web. doi:10.1111/tpj.13175.
He, Fei, Maslov, Sergei, Yoo, Shinjae, Wang, Daifeng, Kumari, Sunita, Gerstein, Mark, & Ware, Doreen. Large-scale atlas of microarray data reveals the distinct expression landscape of different tissues in Arabidopsis. United States. https://doi.org/10.1111/tpj.13175
He, Fei, Maslov, Sergei, Yoo, Shinjae, Wang, Daifeng, Kumari, Sunita, Gerstein, Mark, and Ware, Doreen. 2016. "Large-scale atlas of microarray data reveals the distinct expression landscape of different tissues in Arabidopsis". United States. https://doi.org/10.1111/tpj.13175. https://www.osti.gov/servlets/purl/1257945.
@article{osti_1257945,
title = {Large-scale atlas of microarray data reveals the distinct expression landscape of different tissues in Arabidopsis},
author = {He, Fei and Maslov, Sergei and Yoo, Shinjae and Wang, Daifeng and Kumari, Sunita and Gerstein, Mark and Ware, Doreen},
abstractNote = {Here, transcriptome datasets from thousands of samples of the model plant Arabidopsis thaliana have been collectively generated by multiple individual labs. Although integration and meta-analysis of these samples has become routine in the plant research community, it is often hampered by the lack of metadata or differences in annotation styles by different labs. In this study, we carefully selected and integrated 6,057 Arabidopsis microarray expression samples from 304 experiments deposited to NCBI GEO. Metadata such as tissue type, growth condition, and developmental stage were manually curated for each sample. We then studied global expression landscape of the integrated dataset and found that samples of the same tissue tend to be more similar to each other than to samples of other tissues, even in different growth conditions or developmental stages. Root has the most distinct transcriptome compared to aerial tissues, but the transcriptome of cultured root is more similar to those of aerial tissues as the former samples lost their cellular identity. Using a simple computational classification method, we showed that the tissue type of a sample can be successfully predicted based on its expression profile, opening the door for automatic metadata extraction and facilitating re-use of plant transcriptome data. As a proof of principle we applied our automated annotation pipeline to 708 RNA-seq samples from public repositories and verified accuracy of our predictions with samples’ metadata provided by authors.},
doi = {10.1111/tpj.13175},
journal = {The Plant Journal},
place = {United States},
year = {2016},
month = {5}
}