skip to main content
OSTI.GOV title logo U.S. Department of Energy
Office of Scientific and Technical Information

Title: Short sequence motifs, overrepresented in mammalian conservednon-coding sequences

Journal Article · · BMC Genomics
OSTI ID:923415

Background: A substantial fraction of non-coding DNAsequences of multicellular eukaryotes is under selective constraint. Inparticular, ~;5 percent of the human genome consists of conservednon-coding sequences (CNSs). CNSs differ from other genomic sequences intheir nucleotide composition and must play important functional roles,which mostly remain obscure.Results: We investigated relative abundancesof short sequence motifs in all human CNSs present in the human/mousewhole-genome alignments vs. three background sets of sequences: (i)weakly conserved or unconserved non-coding sequences (non-CNSs); (ii)near-promoter sequences (located between nucleotides -500 and -1500,relative to a start of transcription); and (iii) random sequences withthe same nucleotide composition as that of CNSs. When compared tonon-CNSs and near-promoter sequences, CNSs possess an excess of AT-richmotifs, often containing runs of identical nucleotides. In contrast, whencompared to random sequences, CNSs contain an excess of GC-rich motifswhich, however, lack CpG dinucleotides. Thus, abundance of short sequencemotifs in human CNSs, taken as a whole, is mostly determined by theiroverall compositional properties and not by overrepresentation of anyspecific short motifs. These properties are: (i) high AT-content of CNSs,(ii) a tendency, probably due to context-dependent mutation, of A's andT's to clump, (iii) presence of short GC-rich regions, and (iv) avoidanceof CpG contexts, due to their hypermutability. Only a small number ofshort motifs, overrepresented in all human CNSs are similar to bindingsites of transcription factors from the FOX family.Conclusion: Human CNSsas a whole appear to be too broad a class of sequences to possess strongfootprints of any short sequence-specific functions. Such footprintsshould be studied at the level of functional subclasses of CNSs, such asthose which flank genes with a particular pattern of expression. Overallproperties of CNSs are affected by patterns in mutation, suggesting thatselection which causes their conservation is not always verystrong.

Research Organization:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE Director, Office of Science; National Institutes ofHealth
DOE Contract Number:
DE-AC02-05CH11231; NIHU1HL66681B
OSTI ID:
923415
Report Number(s):
LBNL-62510; R&D Project: GHPG6C; BnR: 400412000; TRN: US200804%%1117
Journal Information:
BMC Genomics, Vol. 8, Issue 378; Related Information: Journal Publication Date: 10/18/2007
Country of Publication:
United States
Language:
English