OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Short sequence motifs, overrepresented in mammalian conserved non-coding sequences

Abstract

Background: A substantial fraction of non-coding DNA sequences of multicellular eukaryotes is under selective constraint. In particular, ~5 percent of the human genome consists of conserved non-coding sequences (CNSs). CNSs differ from other genomic sequences in their nucleotide composition and must play important functional roles, which mostly remain obscure.

Results: We investigated relative abundances of short sequence motifs in all human CNSs present in the human/mouse whole-genome alignments vs. three background sets of sequences: (i) weakly conserved or unconserved non-coding sequences (non-CNSs); (ii) near-promoter sequences (located between nucleotides -500 and -1500, relative to a start of transcription); and (iii) random sequences with the same nucleotide composition as that of CNSs. When compared to non-CNSs and near-promoter sequences, CNSs possess an excess of AT-rich motifs, often containing runs of identical nucleotides. In contrast, when compared to random sequences, CNSs contain an excess of GC-rich motifs which, however, lack CpG dinucleotides. Thus, abundance of short sequence motifs in human CNSs, taken as a whole, is mostly determined by their overall compositional properties and not by overrepresentation of any specific short motifs. These properties are: (i) high AT-content of CNSs, (ii) a tendency, probably due to context-dependent mutation, of A's and T's to clump, (iii) presence of short GC-rich regions, and (iv) avoidance of CpG contexts, due to their hypermutability. Only a small number of short motifs, overrepresented in all human CNSs are similar to binding sites of transcription factors from the FOX family.

Conclusion: Human CNSs as a whole appear to be too broad a class of sequences to possess strong footprints of any short sequence-specific functions. Such footprints should be studied at the level of functional subclasses of CNSs, such as those which flank genes with a particular pattern of expression. Overall properties of CNSs are affected by patterns in mutation, suggesting that selection which causes their conservation is not always very strong.
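
The comparison described in the abstract (motif abundance in CNSs relative to a background set, and relative to what nucleotide composition alone would predict) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the pseudocount, and the toy inputs are assumptions made for illustration, and the composition-matched random background is replaced here by an analytic expectation under independent bases rather than by simulated random sequences.

```python
from collections import Counter
from itertools import product
import math


def count_kmers(seqs, k):
    """Count overlapping k-mers made only of A/C/G/T in a list of sequences."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):  # skip windows containing N or other symbols
                counts[kmer] += 1
    return counts


def base_frequencies(seqs):
    """Relative frequency of each base, pooled over all sequences."""
    counts = Counter(b for seq in seqs for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    return {b: counts[b] / total for b in "ACGT"}


def log2_enrichment_vs_composition(seqs, k):
    """log2(observed / expected) for each observed k-mer.

    Expected counts assume independent bases drawn with the foreground's own
    composition (an analytic stand-in for composition-matched random sequences).
    """
    observed = count_kmers(seqs, k)
    total = sum(observed.values())
    freqs = base_frequencies(seqs)
    scores = {}
    for kmer, n in observed.items():
        expected = total * math.prod(freqs[b] for b in kmer)
        scores[kmer] = math.log2(n / expected)
    return scores


def log2_enrichment_vs_background(fg_seqs, bg_seqs, k, pseudocount=1.0):
    """log2 ratio of k-mer frequencies in a foreground set vs. a background set.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for k-mers absent from one of the sets.
    """
    fg = count_kmers(fg_seqs, k)
    bg = count_kmers(bg_seqs, k)
    n_kmers = 4 ** k
    fg_total = sum(fg.values()) + pseudocount * n_kmers
    bg_total = sum(bg.values()) + pseudocount * n_kmers
    scores = {}
    for letters in product("ACGT", repeat=k):
        kmer = "".join(letters)
        fg_freq = (fg[kmer] + pseudocount) / fg_total
        bg_freq = (bg[kmer] + pseudocount) / bg_total
        scores[kmer] = math.log2(fg_freq / bg_freq)
    return scores
```

With real data, `fg_seqs` would hold CNS sequences and `bg_seqs` one of the background sets (for example non-CNSs or near-promoter sequences). Positive scores mark motifs that are more frequent in the foreground, and a CpG-containing motif scoring below zero in `log2_enrichment_vs_composition` would be consistent with the CpG avoidance noted in the abstract.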

Authors:
Minovitsky, Simon; Stegmaier, Philip; Kel, Alexander; Kondrashov, Alexey S.; Dubchak, Inna
Publication Date:
2007-02-21
Research Org.:
Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley, CA (US)
Sponsoring Org.:
USDOE Director, Office of Science; National Institutes of Health
OSTI Identifier:
923415
Report Number(s):
LBNL-62510
R&D Project: GHPG6C; BnR: 400412000; TRN: US200804%%1117
DOE Contract Number:
DE-AC02-05CH11231; NIHU1HL66681B
Resource Type:
Journal Article
Resource Relation:
Journal Name: BMC Genomics; Journal Volume: 8; Journal Issue: 378; Related Information: Journal Publication Date: 10/18/2007
Country of Publication:
United States
Language:
English
Subject:
60; ABUNDANCE; AVOIDANCE; DNA; FUNCTIONALS; GENES; NUCLEOTIDES; TRANSCRIPTION; TRANSCRIPTION FACTORS

Citation Formats

Minovitsky, Simon, Stegmaier, Philip, Kel, Alexander, Kondrashov, Alexey S., and Dubchak, Inna. Short sequence motifs, overrepresented in mammalian conserved non-coding sequences. United States: N. p., 2007. Web.
Minovitsky, Simon, Stegmaier, Philip, Kel, Alexander, Kondrashov, Alexey S., & Dubchak, Inna. Short sequence motifs, overrepresented in mammalian conserved non-coding sequences. United States.
Minovitsky, Simon, Stegmaier, Philip, Kel, Alexander, Kondrashov, Alexey S., and Dubchak, Inna. 2007. "Short sequence motifs, overrepresented in mammalian conserved non-coding sequences". United States. https://www.osti.gov/servlets/purl/923415.
@article{osti_923415,
title = {Short sequence motifs, overrepresented in mammalian conserved non-coding sequences},
author = {Minovitsky, Simon and Stegmaier, Philip and Kel, Alexander and Kondrashov, Alexey S. and Dubchak, Inna},
abstractNote = {Background: A substantial fraction of non-coding DNA sequences of multicellular eukaryotes is under selective constraint. In particular, ~5 percent of the human genome consists of conserved non-coding sequences (CNSs). CNSs differ from other genomic sequences in their nucleotide composition and must play important functional roles, which mostly remain obscure. Results: We investigated relative abundances of short sequence motifs in all human CNSs present in the human/mouse whole-genome alignments vs. three background sets of sequences: (i) weakly conserved or unconserved non-coding sequences (non-CNSs); (ii) near-promoter sequences (located between nucleotides -500 and -1500, relative to a start of transcription); and (iii) random sequences with the same nucleotide composition as that of CNSs. When compared to non-CNSs and near-promoter sequences, CNSs possess an excess of AT-rich motifs, often containing runs of identical nucleotides. In contrast, when compared to random sequences, CNSs contain an excess of GC-rich motifs which, however, lack CpG dinucleotides. Thus, abundance of short sequence motifs in human CNSs, taken as a whole, is mostly determined by their overall compositional properties and not by overrepresentation of any specific short motifs. These properties are: (i) high AT-content of CNSs, (ii) a tendency, probably due to context-dependent mutation, of A's and T's to clump, (iii) presence of short GC-rich regions, and (iv) avoidance of CpG contexts, due to their hypermutability. Only a small number of short motifs, overrepresented in all human CNSs are similar to binding sites of transcription factors from the FOX family. Conclusion: Human CNSs as a whole appear to be too broad a class of sequences to possess strong footprints of any short sequence-specific functions. Such footprints should be studied at the level of functional subclasses of CNSs, such as those which flank genes with a particular pattern of expression. Overall properties of CNSs are affected by patterns in mutation, suggesting that selection which causes their conservation is not always very strong.},
doi = {},
journal = {BMC Genomics},
number = 378,
volume = 8,
place = {United States},
year = {2007},
month = {2}
}
  • The context-dependent expression of genes is the core for biological activities, and significant attention has been given to identification of various factors contributing to gene expression at genomic scale. However, so far this type of analysis has been focused either on the relation between mRNA expression and non-coding sequence features such as upstream regulatory motifs or on the correlation between mRNA abundance and non-random features in coding sequences (e.g. codon usage and amino acid usage). In this study multiple regression analyses of the mRNA abundance and all sequence information in Desulfovibrio vulgaris were performed, with the goal to investigate how much coding and non-coding sequence features contribute to the variations in mRNA expression, and in what manner they act together...
  • RNA interference (RNAi) is an extremely powerful and widely used gene silencing approach for reverse functional genomics and molecular therapeutics. In mammals, the conserved poly(ADP-ribose) polymerase 2 (PARP-2)/RNase P bidirectional control promoter simultaneously expresses both the PARP-2 protein and RNase P RNA by RNA polymerase II- and III-dependent mechanisms, respectively. To explore this unique bidirectional control system in RNAi-mediated gene silencing strategy, we have constructed two novel bidirectional expression vectors, pbiHsH1 and pbiMmH1, which contained the PARP-2/RNase P bidirectional control promoters from human and mouse, for simultaneous expression of both the protein-coding genes and short hairpin RNAs. Analyses of the dual transcriptional activities indicated that these two bidirectional expression vectors could not only express enhanced green fluorescent protein as a functional reporter but also simultaneously transcribe shLuc for inhibiting the firefly luciferase expression. In addition, to extend its utility for the establishment of inherited stable clones, we have also reconstructed this bidirectional expression system with the blasticidin S deaminase gene, an effective dominant drug resistance selectable marker, and examined both the selection and inhibition efficiencies in drug resistance and gene expression. Moreover, we have further demonstrated that this bidirectional expression system could efficiently co-regulate the functionally important genes, such as overexpression of tumor suppressor protein p53 and inhibition of anti-apoptotic protein Bcl-2 at the same time. In summary, the bidirectional expression vectors, pbiHsH1 and pbiMmH1, should provide a simple, convenient, and efficient novel tool for manipulating the gene function in mammalian cells.
  • The hst gene was originally identified as a transforming gene in DNAs from human stomach cancers and from a noncancerous portion of stomach mucosa by DNA-mediated transfection assay using NIH3T3 cells. cDNA clones of hst were isolated from the cDNA library constructed from poly(A)+ RNA of a secondary transformant induced by the DNA from a stomach cancer. The sequence analysis of the hst cDNA revealed the presence of two open reading frames. When this cDNA was inserted into an expression vector containing the simian virus 40 promoter, it efficiently induced the transformation of NIH3T3 cells upon transfection. It was found that one of the reading frames, which coded for 206 amino acids, was responsible for the transforming activity.