SciTech Connect

This content will become publicly available on March 25, 2017

Title: Large-scale atlas of microarray data reveals the distinct expression landscape of different tissues in Arabidopsis

Transcriptome datasets from thousands of samples of the model plant Arabidopsis thaliana have been collectively generated by multiple individual labs. Although integration and meta-analysis of these samples have become routine in the plant research community, such work is often hampered by missing metadata or by differences in annotation style among labs. In this study, we carefully selected and integrated 6,057 Arabidopsis microarray expression samples from 304 experiments deposited in NCBI GEO. Metadata such as tissue type, growth condition, and developmental stage were manually curated for each sample. We then studied the global expression landscape of the integrated dataset and found that samples of the same tissue tend to be more similar to each other than to samples of other tissues, even under different growth conditions or at different developmental stages. Root has the most distinct transcriptome compared to aerial tissues, but the transcriptome of cultured root is more similar to those of aerial tissues because the cultured samples have lost their cellular identity. Using a simple computational classification method, we showed that the tissue type of a sample can be successfully predicted from its expression profile, opening the door to automatic metadata extraction and facilitating re-use of plant transcriptome data. As a proof of principle, we applied our automated annotation pipeline to 708 RNA-seq samples from public repositories and verified the accuracy of our predictions against the sample metadata provided by the authors.
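The abstract does not say which classification method was used. As an illustration only, the idea of predicting tissue type from an expression profile can be sketched with a nearest-centroid classifier that assigns a sample to the tissue whose mean profile it correlates with best. The choice of Pearson correlation, the function names, and the four-gene toy profiles below are all assumptions made for this sketch and are not taken from the study.

```python
import math
from collections import defaultdict


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / (sd_x * sd_y)


def tissue_centroids(profiles, labels):
    """Average the training profiles of each tissue, gene by gene."""
    grouped = defaultdict(list)
    for profile, label in zip(profiles, labels):
        grouped[label].append(profile)
    return {
        tissue: [sum(gene_values) / len(gene_values) for gene_values in zip(*rows)]
        for tissue, rows in grouped.items()
    }


def predict_tissue(profile, centroids):
    """Return the tissue whose centroid correlates best with the profile."""
    return max(centroids, key=lambda tissue: pearson(profile, centroids[tissue]))


# Hypothetical toy data: four genes per sample, two curated tissue labels.
train_profiles = [
    [9.0, 1.0, 2.0, 8.0],
    [8.5, 1.5, 2.5, 7.5],
    [1.0, 9.0, 8.0, 2.0],
    [1.5, 8.5, 7.0, 2.5],
]
train_labels = ["root", "root", "leaf", "leaf"]

centroids = tissue_centroids(train_profiles, train_labels)
predicted = predict_tissue([8.0, 2.0, 3.0, 7.0], centroids)
print(predicted)
```

In a real setting the training profiles would be the manually curated samples and the query would be a sample with missing tissue metadata. A correlation-based score is used here because it is insensitive to per-sample shifts and scaling in intensity, which matters when samples come from many labs.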
 [1] ;  [2] ;  [3] ;  [4] ;  [5] ;  [4] ;  [6]
  1. Brookhaven National Lab. (BNL), Upton, NY (United States)
  2. Brookhaven National Lab. (BNL), Upton, NY (United States); Univ. of Illinois at Urbana-Champaign, Champaign, IL (United States)
  3. Brookhaven National Lab. (BNL), Upton, NY (United States); Stony Brook Univ., Stony Brook, NY (United States)
  4. Yale Univ., New Haven, CT (United States)
  5. Cold Spring Harbor Lab., Cold Spring Harbor, NY (United States)
  6. Cold Spring Harbor Lab., Cold Spring Harbor, NY (United States); USDA ARS NEA Plant, Ithaca, NY (United States)
Publication Date:
OSTI Identifier:
Report Number(s):
Journal ID: ISSN 0960-7412
Grant/Contract Number:
Accepted Manuscript
Journal Name:
The Plant Journal
Additional Journal Information:
Journal Name: The Plant Journal; Journal ID: ISSN 0960-7412
Society for Experimental Biology
Research Org:
Brookhaven National Laboratory (BNL), Upton, NY (United States)
Sponsoring Org:
USDOE Office of Science (SC), Basic Energy Sciences (BES) (SC-22)
Country of Publication:
United States
59 BASIC BIOLOGICAL SCIENCES; Arabidopsis; expression data integration; metadata annotation; global transcriptional landscape; automatic reconstruction of missing metadata; re-use of public expression data