DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Large-scale atlas of microarray data reveals the distinct expression landscape of different tissues in Arabidopsis

Abstract

Transcriptome datasets from thousands of samples of the model plant Arabidopsis thaliana have been collectively generated by multiple individual labs. Although integration and meta-analysis of these samples have become routine in the plant research community, they are often hampered by missing metadata or by differences in annotation style between labs. In this study, we carefully selected and integrated 6,057 Arabidopsis microarray expression samples from 304 experiments deposited in NCBI GEO. Metadata such as tissue type, growth condition, and developmental stage were manually curated for each sample. We then studied the global expression landscape of the integrated dataset and found that samples of the same tissue tend to be more similar to each other than to samples of other tissues, even under different growth conditions or at different developmental stages. Root has the most distinct transcriptome compared to aerial tissues, but the transcriptome of cultured root is more similar to those of aerial tissues because cultured root samples have lost their cellular identity. Using a simple computational classification method, we showed that the tissue type of a sample can be successfully predicted from its expression profile, opening the door to automatic metadata extraction and facilitating re-use of plant transcriptome data. As a proof of principle, we applied our automated annotation pipeline to 708 RNA-seq samples from public repositories and verified the accuracy of our predictions against the sample metadata provided by the authors.

Authors:
He, Fei [1]; Maslov, Sergei [2]; Yoo, Shinjae [3]; Wang, Daifeng [4]; Kumari, Sunita [5]; Gerstein, Mark [4]; Ware, Doreen [6]
  1. Brookhaven National Lab. (BNL), Upton, NY (United States)
  2. Brookhaven National Lab. (BNL), Upton, NY (United States); Univ. of Illinois at Urbana-Champaign, Champaign, IL (United States)
  3. Brookhaven National Lab. (BNL), Upton, NY (United States); Stony Brook Univ., Stony Brook, NY (United States)
  4. Yale Univ., New Haven, CT (United States)
  5. Cold Spring Harbor Lab., Cold Spring Harbor, NY (United States)
  6. Cold Spring Harbor Lab., Cold Spring Harbor, NY (United States); USDA ARS NEA Plant, Ithaca, NY (United States)
Publication Date:
2016-05-25
Research Org.:
Brookhaven National Laboratory (BNL), Upton, NY (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Basic Energy Sciences (BES)
OSTI Identifier:
1257945
Report Number(s):
BNL-112108-2016-JA
Journal ID: ISSN 0960-7412
Grant/Contract Number:  
SC00112704
Resource Type:
Accepted Manuscript
Journal Name:
The Plant Journal
Additional Journal Information:
Journal Name: The Plant Journal; Journal ID: ISSN 0960-7412
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES

Citation Formats

He, Fei, Maslov, Sergei, Yoo, Shinjae, Wang, Daifeng, Kumari, Sunita, Gerstein, Mark, and Ware, Doreen. Large-scale atlas of microarray data reveals the distinct expression landscape of different tissues in Arabidopsis. United States: N. p., 2016. Web. doi:10.1111/tpj.13175.
He, Fei, Maslov, Sergei, Yoo, Shinjae, Wang, Daifeng, Kumari, Sunita, Gerstein, Mark, & Ware, Doreen. Large-scale atlas of microarray data reveals the distinct expression landscape of different tissues in Arabidopsis. United States. https://doi.org/10.1111/tpj.13175
He, Fei, Maslov, Sergei, Yoo, Shinjae, Wang, Daifeng, Kumari, Sunita, Gerstein, Mark, and Ware, Doreen. 2016. "Large-scale atlas of microarray data reveals the distinct expression landscape of different tissues in Arabidopsis". United States. https://doi.org/10.1111/tpj.13175. https://www.osti.gov/servlets/purl/1257945.
@article{osti_1257945,
title = {Large-scale atlas of microarray data reveals the distinct expression landscape of different tissues in Arabidopsis},
author = {He, Fei and Maslov, Sergei and Yoo, Shinjae and Wang, Daifeng and Kumari, Sunita and Gerstein, Mark and Ware, Doreen},
doi = {10.1111/tpj.13175},
journal = {The Plant Journal},
place = {United States},
year = {2016},
month = {5}
}

Citation Metrics:
Cited by: 24 works
Citation information provided by
Web of Science
