OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: SeqTU: A web server for identification of bacterial transcription units

Abstract

A transcription unit (TU) consists of K ≥ 1 consecutive genes on the same strand of a bacterial genome that are transcribed into a single mRNA molecule under certain conditions. Their identification is an essential step in the elucidation of transcriptional regulatory networks. We have recently developed a machine-learning method to accurately identify TUs from RNA-seq data, based on two features of the assembled RNA reads: the continuity and stability of RNA-seq coverage across a genomic region. While good performance was achieved by the method on Escherichia coli and Clostridium thermocellum, substantial work is needed to make the program generally applicable to all bacteria, given that the program requires organism-specific information. A web server, named SeqTU, was developed to automatically identify TUs with given RNA-seq data of any bacterium using a machine-learning approach. The server consists of a number of utility tools, in addition to TU identification, such as data preparation, data quality check and RNA-read mapping. SeqTU provides a user-friendly interface and automated prediction of TUs from given RNA-seq data. Furthermore, the predicted TUs are displayed intuitively using HTML format along with a graphic visualization of the prediction.
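The two coverage features named in the abstract can be illustrated with a small sketch. This is not the SeqTU implementation: the abstract describes a machine-learning classifier, whereas the code below substitutes fixed thresholds. The feature definitions (fraction of covered intergenic positions for continuity, an inverse coefficient of variation for stability), the thresholds, and the names `Gene`, `gap_features`, and `group_into_tus` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Gene:
    name: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"


def gap_features(coverage, left, right):
    """Return (continuity, stability) for two adjacent genes.

    Both definitions are illustrative assumptions, not SeqTU's features.
    """
    gap = coverage[left.end:right.start]
    region = coverage[left.start:right.end]
    # Continuity: fraction of intergenic positions covered by at least one read.
    continuity = sum(1 for c in gap if c > 0) / len(gap) if gap else 1.0
    # Stability: 1 / (1 + coefficient of variation) over the whole span.
    mu = mean(region)
    stability = 1.0 / (1.0 + pstdev(region) / mu) if mu > 0 else 0.0
    return continuity, stability


def group_into_tus(genes, coverage, min_continuity=1.0, min_stability=0.5):
    """Greedily merge neighbouring same-strand genes into candidate TUs.

    Fixed thresholds stand in for the trained classifier (an assumption).
    """
    tus = []
    for gene in sorted(genes, key=lambda g: g.start):
        if tus:
            prev = tus[-1][-1]
            if prev.strand == gene.strand:
                cont, stab = gap_features(coverage, prev, gene)
                if cont >= min_continuity and stab >= min_stability:
                    tus[-1].append(gene)
                    continue
        tus.append([gene])  # K = 1 is allowed: a single gene is a valid TU
    return tus


# Toy data: per-base read depth with a coverage hole between genes b and c.
coverage = [10] * 25 + [0] * 5 + [10] * 30
genes = [
    Gene("a", 0, 10, "+"),
    Gene("b", 15, 25, "+"),
    Gene("c", 30, 40, "+"),
    Gene("d", 45, 55, "-"),
]
tu_names = [[g.name for g in tu] for tu in group_into_tus(genes, coverage)]
print(tu_names)
```

On this toy input, genes a and b are merged because coverage is unbroken between them, c starts a new unit after the coverage hole, and d is kept separate because it lies on the opposite strand.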

Authors:
 Chen, Xin [1]; Chou, Wen-Chi [2]; Ma, Qin [3]; Xu, Ying [4]
  1. Jilin Univ. Jilin (China); Univ. of Georgia, Athens, GA (United States); BioEnergy Science Center, Washington, D.C. (United States); Tianjin Univ., Tianjin (China)
  2. Broad Institute of MIT and Harvard Univ., Cambridge, MA (United States)
  3. South Dakota State Univ., Brookings, SD (United States)
  4. Jilin Univ., Jilin (China); Univ. of Georgia, Athens, GA (United States); BioEnergy Science Center, Washington, D.C. (United States)
Publication Date:
2017-03-07
Research Org.:
South Dakota State Univ., Brookings, SD (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Biological and Environmental Research (BER) (SC-23)
OSTI Identifier:
1355909
Grant/Contract Number:
SC0013632
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
Scientific Reports
Additional Journal Information:
Journal Volume: 7; Journal ID: ISSN 2045-2322
Publisher:
Nature Publishing Group
Country of Publication:
United States
Language:
English
Subject:
59 BASIC BIOLOGICAL SCIENCES; bacillus-subtilis; Escherichia coli; database; operons; reveals; door; bioinformatics; computational platforms and environments; software

Citation Formats

Chen, Xin, Chou, Wen-Chi, Ma, Qin, and Xu, Ying. SeqTU: A web server for identification of bacterial transcription units. United States: N. p., 2017. Web. doi:10.1038/srep43925.
Chen, Xin, Chou, Wen-Chi, Ma, Qin, & Xu, Ying. SeqTU: A web server for identification of bacterial transcription units. United States. doi:10.1038/srep43925.
Chen, Xin, Chou, Wen-Chi, Ma, Qin, and Xu, Ying. 2017. "SeqTU: A web server for identification of bacterial transcription units". United States. doi:10.1038/srep43925. https://www.osti.gov/servlets/purl/1355909.
@article{osti_1355909,
title = {SeqTU: A web server for identification of bacterial transcription units},
author = {Chen, Xin and Chou, Wen-Chi and Ma, Qin and Xu, Ying},
abstractNote = {A transcription unit (TU) consists of K ≥ 1 consecutive genes on the same strand of a bacterial genome that are transcribed into a single mRNA molecule under certain conditions. Their identification is an essential step in the elucidation of transcriptional regulatory networks. We have recently developed a machine-learning method to accurately identify TUs from RNA-seq data, based on two features of the assembled RNA reads: the continuity and stability of RNA-seq coverage across a genomic region. While good performance was achieved by the method on Escherichia coli and Clostridium thermocellum, substantial work is needed to make the program generally applicable to all bacteria, given that the program requires organism-specific information. A web server, named SeqTU, was developed to automatically identify TUs with given RNA-seq data of any bacterium using a machine-learning approach. The server consists of a number of utility tools, in addition to TU identification, such as data preparation, data quality check and RNA-read mapping. SeqTU provides a user-friendly interface and automated prediction of TUs from given RNA-seq data. Furthermore, the predicted TUs are displayed intuitively using HTML format along with a graphic visualization of the prediction.},
doi = {10.1038/srep43925},
journal = {Scientific Reports},
volume = 7,
place = {United States},
year = {2017},
month = mar
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Similar Records:
  • Identification of metabolites in complex mixtures represents a key step in metabolomics. A new strategy is introduced, which is implemented in a new public web server, COLMARm, that permits the co-analysis of up to three 2D NMR spectra, namely 13C-1H HSQC, 1H-1H TOCSY, and 13C-1H HSQC-TOCSY for the comprehensive, accurate, and efficient performance of this task. The highly versatile and interactive nature of COLMARm permits its application to a wide range of metabolomics samples independent of the magnetic field. Database query is performed using the HSQC spectrum and the top metabolite hits are then validated against the TOCSY-type experiment(s) by superimposing the expected cross-peaks on the mixture spectrum. In this way the user can directly accept or reject candidate metabolites by taking advantage of the complementary spectral information offered by these experiments and their different sensitivities. The power of COLMARm is demonstrated for a human serum sample uncovering the existence of 14 metabolites that hitherto were not identified by NMR.
  • The Gas and Oil Technology Exchange and Communication Highway (GO-TECH) provides an electronic information system for the petroleum community for exchanging ideas, data, and technology. The PC-based system fosters communication and discussion by linking the oil and gas producers with resource centers, government agencies, consulting firms, service companies, national laboratories, academic research groups, and universities throughout the world. The oil and gas producers can access the GO-TECH World Wide Web (WWW) home page through modem links, as well as through the Internet. Future GO-TECH applications will include the establishment of virtual corporations consisting of consortia of small companies, consultants, and service companies linked by electronic information systems. These virtual corporations will have the resources and expertise previously found only in major corporations.
  • microRNAs (miRNA) are a class of non-protein coding functional RNAs that are thought to regulate expression of target genes by direct interaction with mRNAs. miRNAs have been identified through both experimental and computational methods in a variety of eukaryotic organisms. Though these approaches have been partially successful, there is a need to develop more tools for detection of these RNAs as they are also thought to be present in abundance in many genomes. In this report we describe a tool and a web server, named CID-miRNA, for identification of miRNA precursors in a given DNA sequence, utilising secondary structure-based filtering systems and an algorithm based on stochastic context free grammar trained on human miRNAs. CID-miRNA analyses a given sequence using a web interface, for presence of putative miRNA precursors and the generated output lists all the potential regions that can form miRNA-like structures. It can also scan large genomic sequences for the presence of potential miRNA precursors in its stand-alone form. The web server can be accessed at http://mirna.jnu.ac.in/cidmirna/.
  • LigSearch is a web server for identifying ligands likely to bind to a given protein. Identifying which ligands might bind to a protein before crystallization trials could provide a significant saving in time and resources. LigSearch, a web server aimed at predicting ligands that might bind to and stabilize a given protein, has been developed. Using a protein sequence and/or structure, the system searches against a variety of databases, combining available knowledge, and provides a clustered and ranked output of possible ligands. LigSearch can be accessed at http://www.ebi.ac.uk/thornton-srv/databases/LigSearch.
  • In this study, we introduce the Protein Sequence Annotation Tool (PSAT), a web-based, sequence annotation meta-server for performing integrated, high-throughput, genome-wide sequence analyses. Our goals in building PSAT were to (1) create an extensible platform for integration of multiple sequence-based bioinformatics tools, (2) enable functional annotations and enzyme predictions over large input protein fasta data sets, and (3) provide a web interface for convenient execution of the tools. In this paper, we demonstrate the utility of PSAT by annotating the predicted peptide gene products of Herbaspirillum sp. strain RV1423, importing the results of PSAT into EC2KEGG, and using the resulting functional comparisons to identify a putative catabolic pathway, thereby distinguishing RV1423 from a well annotated Herbaspirillum species. This analysis demonstrates that high-throughput enzyme predictions, provided by PSAT processing, can be used to identify metabolic potential in an otherwise poorly annotated genome. Lastly, PSAT is a meta server that combines the results from several sequence-based annotation and function prediction codes, and is available at http://psat.llnl.gov/psat/. PSAT stands apart from other sequence-based genome annotation systems in providing a high-throughput platform for rapid de novo enzyme predictions and sequence annotations over large input protein sequence data sets in FASTA. PSAT is most appropriately applied in annotation of large protein FASTA sets that may or may not be associated with a single genome.