DOE PAGES

4 results for: All records
Author ORCID ID is 0000-0002-4251-0362
  1. For many bacteria with sequenced genomes, we do not understand how they synthesize some amino acids. This makes it challenging to reconstruct their metabolism, and has led to speculation that bacteria might be cross-feeding amino acids. Here, we studied heterotrophic bacteria from 10 different genera that grow without added amino acids even though an automated tool predicts that the bacteria have gaps in their amino acid synthesis pathways. Across these bacteria, there were 11 gaps in their amino acid biosynthesis pathways that we could not fill using current knowledge. Using genome-wide mutant fitness data, we identified novel enzymes that fill 9 of the 11 gaps and hence explain the biosynthesis of methionine, threonine, serine, or histidine by bacteria from six genera. We also found that the sulfate-reducing bacterium Desulfovibrio vulgaris synthesizes homocysteine (which is a precursor to methionine) by using DUF39, NIL/ferredoxin, and COG2122 proteins, and that homoserine is not an intermediate in this pathway. Our results suggest that most free-living bacteria can likely make all 20 amino acids and illustrate how high-throughput genetics can uncover previously unknown amino acid biosynthesis genes.
  2. Although transcriptional regulation is fundamental to understanding bacterial physiology, the targets of most bacterial transcription factors are not known. Comparative genomics has been used to identify likely targets of some of these transcription factors, but these predictions typically lack experimental support. Here, we used mutant fitness data, which measures the importance of each gene for a bacterium's growth across many conditions, to test regulatory predictions from RegPrecise, a curated collection of comparative genomics predictions. Because characterized transcription factors often have correlated fitness with one of their targets (either positively or negatively), correlated fitness patterns provide support for the comparative genomics predictions. At a false discovery rate of 3%, we identified significant cofitness for at least one target of 158 TFs in 107 ortholog groups and from 24 bacteria. Thus, high-throughput genetics can be used to identify a high-confidence subset of the sequence-based regulatory predictions.
  3. In order to study how a bacterium allocates its resources, we compared the costs and benefits of most (86%) of the proteins in Escherichia coli K-12 during growth in minimal glucose medium. The cost or investment in each protein was estimated from ribosomal profiling data, and the benefit of each protein was measured by assaying a library of transposon mutants. We found that proteins that are important for fitness are usually highly expressed, and 95% of these proteins are expressed at above 13 parts per million (ppm). Conversely, proteins that do not measurably benefit the host (with a benefit of less than 5% per generation) tend to be weakly expressed, with a median expression of 13 ppm. In aggregate, genes with no detectable benefit account for 31% of protein production, or about 22% if we correct for genetic redundancy. Though some of the apparently unnecessary expression could have subtle benefits in minimal glucose medium, the majority of the burden is due to genes that are important in other conditions. We propose that at least 13% of the cell's protein is on standby in case conditions change.
  4. Large-scale genome sequencing has identified millions of protein-coding genes whose function is unknown. Many of these proteins are similar to characterized proteins from other organisms, but much of this information is missing from annotation databases and is hidden in the scientific literature. To make this information accessible, PaperBLAST uses EuropePMC to search the full text of scientific articles for references to genes. PaperBLAST also takes advantage of curated resources (Swiss-Prot, GeneRIF, and EcoCyc) that link protein sequences to scientific articles. PaperBLAST’s database includes over 700,000 scientific articles that mention over 400,000 different proteins. Given a protein of interest, PaperBLAST quickly finds similar proteins that are discussed in the literature and presents snippets of text from relevant articles or from the curators. With the recent explosion of genome sequencing data, there are now millions of uncharacterized proteins. If a scientist becomes interested in one of these proteins, it can be very difficult to find information as to its likely function. Often a protein whose sequence is similar, and which is likely to have a similar function, has been studied already, but this information is not available in any database. To help find articles about similar proteins, PaperBLAST searches the full text of scientific articles for protein identifiers or gene identifiers, and it links these articles to protein sequences. Then, given a protein of interest, it can quickly find similar proteins in its database by using standard software (BLAST), and it can show snippets of text from relevant papers. We hope that PaperBLAST will make it easier for biologists to predict proteins’ functions.
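
The second abstract above relies on correlated fitness ("cofitness") between a transcription factor and its predicted targets. As a rough illustration only, the sketch below assumes cofitness is the Pearson correlation between two genes' fitness profiles across the same conditions, and ranks targets by absolute correlation so that both positive and negative relationships count. The function names, gene names, and fitness values are hypothetical, and the abstract does not specify the authors' actual procedure or their false discovery rate calculation, which is not reproduced here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length fitness profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    sxx = sum(d * d for d in dx)
    syy = sum(d * d for d in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(sxx * syy)


def best_cofit_target(tf_profile, target_profiles):
    """Return (gene, r) for the predicted target with the largest |r|.

    Assumption: the absolute value is used because a regulator may be
    positively or negatively correlated with its target.
    """
    scores = {gene: pearson(tf_profile, profile)
              for gene, profile in target_profiles.items()}
    gene = max(scores, key=lambda g: abs(scores[g]))
    return gene, scores[gene]


# Hypothetical fitness values (one value per growth condition).
tf = [0.0, -1.0, -2.0, 0.5]
targets = {
    "geneA": [0.1, -0.9, -2.1, 0.4],   # roughly tracks the regulator
    "geneB": [0.0, 2.0, 4.0, -1.0],    # exactly anti-correlated
}
gene, r = best_cofit_target(tf, targets)
```

In a real analysis the profiles would span many more conditions, and a significance threshold would be needed before treating any correlation as support for a regulatory prediction.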
