Title: Identification of functional elements and regulatory circuits by Drosophila modENCODE

Abstract

To gain insight into how genomic information is translated into cellular and developmental programs, the Drosophila model organism Encyclopedia of DNA Elements (modENCODE) project is comprehensively mapping transcripts, histone modifications, chromosomal proteins, transcription factors, replication proteins and intermediates, and nucleosome properties across a developmental time course and in multiple cell lines. We have generated more than 700 data sets and discovered protein-coding, noncoding, RNA regulatory, replication, and chromatin elements, more than tripling the annotated portion of the Drosophila genome. Correlated activity patterns of these elements reveal a functional regulatory network, which predicts putative new functions for genes, reveals stage- and tissue-specific regulators, and enables gene-expression prediction. Our results provide a foundation for directed experimental and computational studies in Drosophila and related species and also a model for systematic data integration toward comprehensive genomic and functional annotation.
Several years after the complete genetic sequencing of many species, it is still unclear how to translate genomic information into a functional map of cellular and developmental programs. The Encyclopedia of DNA Elements (ENCODE) (1) and model organism ENCODE (modENCODE) (2) projects use diverse genomic assays to comprehensively annotate the Homo sapiens (human), Drosophila melanogaster (fruit fly), and Caenorhabditis elegans (worm) genomes, through systematic generation and computational integration of functional genomic data sets.
Previous genomic studies in flies have made seminal contributions to our understanding of basic biological mechanisms and genome functions, facilitated by genetic, experimental, computational, and manual annotation of the euchromatic and heterochromatic genome (3), small genome size, short life cycle, and a deep knowledge of development, gene function, and chromosome biology.
The functions of approximately 40% of the protein and nonprotein-coding genes [FlyBase 5.12 (4)] have been determined from cDNA collections (5, 6), manual curation of gene models (7), gene mutations and comprehensive genome-wide RNA interference screens (8-10), and comparative genomic analyses (11, 12). The Drosophila modENCODE project has generated more than 700 data sets that profile transcripts, histone modifications and physical nucleosome properties, general and specific transcription factors (TFs), and replication programs in cell lines, isolated tissues, and whole organisms across several developmental stages (Fig. 1).
Here, we computationally integrate these data sets and report (i) improved and additional genome annotations, including full-length protein-coding genes and peptides as short as 21 amino acids; (ii) noncoding transcripts, including 132 candidate structural RNAs and 1608 nonstructural transcripts; (iii) additional Argonaute (Ago)-associated small RNA genes and pathways, including new microRNAs (miRNAs) encoded within protein-coding exons and endogenous small interfering RNAs (siRNAs) from 3′ untranslated regions; (iv) chromatin 'states' defined by combinatorial patterns of 18 chromatin marks that are associated with distinct functions and properties; (v) regions of high TF occupancy and replication activity with likely epigenetic regulation; (vi) mixed TF and miRNA regulatory networks with hierarchical structure and enriched feed-forward loops; (vii) coexpression- and co-regulation-based functional annotations for nearly 3000 genes; (viii) stage- and tissue-specific regulators; and (ix) predictive models of gene expression levels and regulator function.

Publication Date:
December 2010
Research Org.:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Org.:
Life Sciences Division
OSTI Identifier:
1010935
Report Number(s):
LBNL-4361E
Journal ID: 0036-8075; TRN: US201109%%186
DOE Contract Number:  
DE-AC02-05CH11231; U01HG004271 (SEC)
Resource Type:
Journal Article
Journal Name:
Science
Country of Publication:
United States
Language:
English
Subject:
60; AMINO ACIDS; BIOLOGY; CHROMATIN; CHROMOSOMES; DNA; DROSOPHILA; EXONS; FLIES; FUNCTIONALS; GENE MUTATIONS; GENES; GENETICS; HISTONES; LIFE CYCLE; MODIFICATIONS; NUCLEOSOMES; PEPTIDES; PROTEINS; RNA; TRANSCRIPTION FACTORS

Citation Formats

Roy, Sushmita, Ernst, Jason, Kharchenko, Peter V, Kheradpour, Pouya, Negre, Nicolas, Eaton, Matthew L, Landolin, Jane M, Bristow, Christopher A, Ma, Lijia, Lin, Michael F, Washietl, Stefan, Arshinoff, Bradley I, Ay, Ferhat, Meyer, Patrick E, Robine, Nicolas, Washington, Nicole L, Stefano, Luisa Di, Berezikov, Eugene, Brown, Christopher D, Candeias, Rogerio, Carlson, Joseph W, Carr, Adrian, Jungreis, Irwin, Marbach, Daniel, Sealfon, Rachel, Tolstorukov, Michael Y, Will, Sebastian, Alekseyenko, Artyom A, Artieri, Carlo, Booth, Benjamin W, Brooks, Angela N, Dai, Qi, Davis, Carrie A, Duff, Michael O, Feng, Xin, Gorchakov, Andrey A, Gu, Tingting, Henikoff, Jorja G, Kapranov, Philipp, Li, Renhua, MacAlpine, Heather K, Malone, John, Minoda, Aki, Nordman, Jared, Okamura, Katsutomo, Perry, Marc, Powell, Sara K, Riddle, Nicole C, Sakai, Akiko, Samsonova, Anastasia, Sandler, Jeremy E, Schwartz, Yuri B, Sher, Noa, Spokony, Rebecca, Sturgill, David, van Baren, Marijke, Wan, Kenneth H, Yang, Li, Yu, Charles, Feingold, Elise, Good, Peter, Guyer, Mark, Lowdon, Rebecca, Ahmad, Kami, Andrews, Justen, Berger, Bonnie, Brenner, Steven E, Brent, Michael R, Cherbas, Lucy, Elgin, Sarah C. R., Gingeras, Thomas R, Grossman, Robert, Hoskins, Roger A, Kaufman, Thomas C, Kent, William, Kuroda, Mitzi I, Orr-Weaver, Terry, Perrimon, Norbert, Pirrotta, Vincenzo, Posakony, James W, Ren, Bing, Russell, Steven, Cherbas, Peter, Graveley, Brenton R, Lewis, Suzanna, Micklem, Gos, Oliver, Brian, Park, Peter J, Celniker, Susan E, Henikoff, Steven, Karpen, Gary H, Lai, Eric C, MacAlpine, David M, Stein, Lincoln D, White, Kevin P, and Kellis, Manolis. Identification of functional elements and regulatory circuits by Drosophila modENCODE. United States: N. p., 2010. Web.
Roy, Sushmita, Ernst, Jason, Kharchenko, Peter V, Kheradpour, Pouya, Negre, Nicolas, Eaton, Matthew L, Landolin, Jane M, Bristow, Christopher A, Ma, Lijia, Lin, Michael F, Washietl, Stefan, Arshinoff, Bradley I, Ay, Ferhat, Meyer, Patrick E, Robine, Nicolas, Washington, Nicole L, Stefano, Luisa Di, Berezikov, Eugene, Brown, Christopher D, Candeias, Rogerio, Carlson, Joseph W, Carr, Adrian, Jungreis, Irwin, Marbach, Daniel, Sealfon, Rachel, Tolstorukov, Michael Y, Will, Sebastian, Alekseyenko, Artyom A, Artieri, Carlo, Booth, Benjamin W, Brooks, Angela N, Dai, Qi, Davis, Carrie A, Duff, Michael O, Feng, Xin, Gorchakov, Andrey A, Gu, Tingting, Henikoff, Jorja G, Kapranov, Philipp, Li, Renhua, MacAlpine, Heather K, Malone, John, Minoda, Aki, Nordman, Jared, Okamura, Katsutomo, Perry, Marc, Powell, Sara K, Riddle, Nicole C, Sakai, Akiko, Samsonova, Anastasia, Sandler, Jeremy E, Schwartz, Yuri B, Sher, Noa, Spokony, Rebecca, Sturgill, David, van Baren, Marijke, Wan, Kenneth H, Yang, Li, Yu, Charles, Feingold, Elise, Good, Peter, Guyer, Mark, Lowdon, Rebecca, Ahmad, Kami, Andrews, Justen, Berger, Bonnie, Brenner, Steven E, Brent, Michael R, Cherbas, Lucy, Elgin, Sarah C. R., Gingeras, Thomas R, Grossman, Robert, Hoskins, Roger A, Kaufman, Thomas C, Kent, William, Kuroda, Mitzi I, Orr-Weaver, Terry, Perrimon, Norbert, Pirrotta, Vincenzo, Posakony, James W, Ren, Bing, Russell, Steven, Cherbas, Peter, Graveley, Brenton R, Lewis, Suzanna, Micklem, Gos, Oliver, Brian, Park, Peter J, Celniker, Susan E, Henikoff, Steven, Karpen, Gary H, Lai, Eric C, MacAlpine, David M, Stein, Lincoln D, White, Kevin P, & Kellis, Manolis. Identification of functional elements and regulatory circuits by Drosophila modENCODE. United States.
Roy, Sushmita, Ernst, Jason, Kharchenko, Peter V, Kheradpour, Pouya, Negre, Nicolas, Eaton, Matthew L, Landolin, Jane M, Bristow, Christopher A, Ma, Lijia, Lin, Michael F, Washietl, Stefan, Arshinoff, Bradley I, Ay, Ferhat, Meyer, Patrick E, Robine, Nicolas, Washington, Nicole L, Stefano, Luisa Di, Berezikov, Eugene, Brown, Christopher D, Candeias, Rogerio, Carlson, Joseph W, Carr, Adrian, Jungreis, Irwin, Marbach, Daniel, Sealfon, Rachel, Tolstorukov, Michael Y, Will, Sebastian, Alekseyenko, Artyom A, Artieri, Carlo, Booth, Benjamin W, Brooks, Angela N, Dai, Qi, Davis, Carrie A, Duff, Michael O, Feng, Xin, Gorchakov, Andrey A, Gu, Tingting, Henikoff, Jorja G, Kapranov, Philipp, Li, Renhua, MacAlpine, Heather K, Malone, John, Minoda, Aki, Nordman, Jared, Okamura, Katsutomo, Perry, Marc, Powell, Sara K, Riddle, Nicole C, Sakai, Akiko, Samsonova, Anastasia, Sandler, Jeremy E, Schwartz, Yuri B, Sher, Noa, Spokony, Rebecca, Sturgill, David, van Baren, Marijke, Wan, Kenneth H, Yang, Li, Yu, Charles, Feingold, Elise, Good, Peter, Guyer, Mark, Lowdon, Rebecca, Ahmad, Kami, Andrews, Justen, Berger, Bonnie, Brenner, Steven E, Brent, Michael R, Cherbas, Lucy, Elgin, Sarah C. R., Gingeras, Thomas R, Grossman, Robert, Hoskins, Roger A, Kaufman, Thomas C, Kent, William, Kuroda, Mitzi I, Orr-Weaver, Terry, Perrimon, Norbert, Pirrotta, Vincenzo, Posakony, James W, Ren, Bing, Russell, Steven, Cherbas, Peter, Graveley, Brenton R, Lewis, Suzanna, Micklem, Gos, Oliver, Brian, Park, Peter J, Celniker, Susan E, Henikoff, Steven, Karpen, Gary H, Lai, Eric C, MacAlpine, David M, Stein, Lincoln D, White, Kevin P, and Kellis, Manolis. 2010. "Identification of functional elements and regulatory circuits by Drosophila modENCODE". United States. https://www.osti.gov/servlets/purl/1010935.
@article{osti_1010935,
title = {Identification of functional elements and regulatory circuits by Drosophila modENCODE},
author = {Roy, Sushmita and Ernst, Jason and Kharchenko, Peter V and Kheradpour, Pouya and Negre, Nicolas and Eaton, Matthew L and Landolin, Jane M and Bristow, Christopher A and Ma, Lijia and Lin, Michael F and Washietl, Stefan and Arshinoff, Bradley I and Ay, Ferhat and Meyer, Patrick E and Robine, Nicolas and Washington, Nicole L and Stefano, Luisa Di and Berezikov, Eugene and Brown, Christopher D and Candeias, Rogerio and Carlson, Joseph W and Carr, Adrian and Jungreis, Irwin and Marbach, Daniel and Sealfon, Rachel and Tolstorukov, Michael Y and Will, Sebastian and Alekseyenko, Artyom A and Artieri, Carlo and Booth, Benjamin W and Brooks, Angela N and Dai, Qi and Davis, Carrie A and Duff, Michael O and Feng, Xin and Gorchakov, Andrey A and Gu, Tingting and Henikoff, Jorja G and Kapranov, Philipp and Li, Renhua and MacAlpine, Heather K and Malone, John and Minoda, Aki and Nordman, Jared and Okamura, Katsutomo and Perry, Marc and Powell, Sara K and Riddle, Nicole C and Sakai, Akiko and Samsonova, Anastasia and Sandler, Jeremy E and Schwartz, Yuri B and Sher, Noa and Spokony, Rebecca and Sturgill, David and van Baren, Marijke and Wan, Kenneth H and Yang, Li and Yu, Charles and Feingold, Elise and Good, Peter and Guyer, Mark and Lowdon, Rebecca and Ahmad, Kami and Andrews, Justen and Berger, Bonnie and Brenner, Steven E and Brent, Michael R and Cherbas, Lucy and Elgin, Sarah C. R. and Gingeras, Thomas R and Grossman, Robert and Hoskins, Roger A and Kaufman, Thomas C and Kent, William and Kuroda, Mitzi I and Orr-Weaver, Terry and Perrimon, Norbert and Pirrotta, Vincenzo and Posakony, James W and Ren, Bing and Russell, Steven and Cherbas, Peter and Graveley, Brenton R and Lewis, Suzanna and Micklem, Gos and Oliver, Brian and Park, Peter J and Celniker, Susan E and Henikoff, Steven and Karpen, Gary H and Lai, Eric C and MacAlpine, David M and Stein, Lincoln D and White, Kevin P and Kellis, Manolis},
abstractNote = {To gain insight into how genomic information is translated into cellular and developmental programs, the Drosophila model organism Encyclopedia of DNA Elements (modENCODE) project is comprehensively mapping transcripts, histone modifications, chromosomal proteins, transcription factors, replication proteins and intermediates, and nucleosome properties across a developmental time course and in multiple cell lines. We have generated more than 700 data sets and discovered protein-coding, noncoding, RNA regulatory, replication, and chromatin elements, more than tripling the annotated portion of the Drosophila genome. Correlated activity patterns of these elements reveal a functional regulatory network, which predicts putative new functions for genes, reveals stage- and tissue-specific regulators, and enables gene-expression prediction. Our results provide a foundation for directed experimental and computational studies in Drosophila and related species and also a model for systematic data integration toward comprehensive genomic and functional annotation. Several years after the complete genetic sequencing of many species, it is still unclear how to translate genomic information into a functional map of cellular and developmental programs. The Encyclopedia of DNA Elements (ENCODE) (1) and model organism ENCODE (modENCODE) (2) projects use diverse genomic assays to comprehensively annotate the Homo sapiens (human), Drosophila melanogaster (fruit fly), and Caenorhabditis elegans (worm) genomes, through systematic generation and computational integration of functional genomic data sets. 
Previous genomic studies in flies have made seminal contributions to our understanding of basic biological mechanisms and genome functions, facilitated by genetic, experimental, computational, and manual annotation of the euchromatic and heterochromatic genome (3), small genome size, short life cycle, and a deep knowledge of development, gene function, and chromosome biology. The functions of approximately 40% of the protein and nonprotein-coding genes [FlyBase 5.12 (4)] have been determined from cDNA collections (5, 6), manual curation of gene models (7), gene mutations and comprehensive genome-wide RNA interference screens (8-10), and comparative genomic analyses (11, 12). The Drosophila modENCODE project has generated more than 700 data sets that profile transcripts, histone modifications and physical nucleosome properties, general and specific transcription factors (TFs), and replication programs in cell lines, isolated tissues, and whole organisms across several developmental stages (Fig. 1).
Here, we computationally integrate these data sets and report (i) improved and additional genome annotations, including full-length protein-coding genes and peptides as short as 21 amino acids; (ii) noncoding transcripts, including 132 candidate structural RNAs and 1608 nonstructural transcripts; (iii) additional Argonaute (Ago)-associated small RNA genes and pathways, including new microRNAs (miRNAs) encoded within protein-coding exons and endogenous small interfering RNAs (siRNAs) from 3' untranslated regions; (iv) chromatin 'states' defined by combinatorial patterns of 18 chromatin marks that are associated with distinct functions and properties; (v) regions of high TF occupancy and replication activity with likely epigenetic regulation; (vi) mixed TF and miRNA regulatory networks with hierarchical structure and enriched feed-forward loops; (vii) coexpression- and co-regulation-based functional annotations for nearly 3000 genes; (viii) stage- and tissue-specific regulators; and (ix) predictive models of gene expression levels and regulator function.},
url = {https://www.osti.gov/biblio/1010935},
journal = {Science},
place = {United States},
year = {2010},
month = {12}
}