evSeq: Cost-Effective Amplicon Sequencing of Every Variant in a Protein Library
Journal Article · ACS Synthetic Biology
- California Institute of Technology (CalTech), Pasadena, CA (United States)
Widespread availability of protein sequence-fitness data would revolutionize both our biochemical understanding of proteins and our ability to engineer them. Unfortunately, even though thousands of protein variants are generated and evaluated for fitness during a typical protein engineering campaign, most are never sequenced, leaving a wealth of potential sequence-fitness information untapped. Primarily, this is because sequencing is unnecessary for many protein engineering strategies; the added cost and effort of sequencing are thus unjustified. It also results from the fact that, even though many lower-cost sequencing strategies have been developed, they often require at least some sequencing or computational resources, both of which can be barriers to access. In this work, we present every variant sequencing (evSeq), a method and collection of tools and standardized components for sequencing a variable region within every variant gene produced during a protein engineering campaign at a cost of cents per variant. evSeq was designed to democratize low-cost sequencing for protein engineers and, indeed, anyone interested in engineering biological systems. Execution of its wet-lab component is simple, requires no sequencing experience to perform, relies only on resources and services typically available to biology labs, and slots neatly into existing protein engineering workflows. Analysis of evSeq data is likewise made simple by its accompanying software (found at github.com/fhalab/evSeq, documentation at fhalab.github.io/evSeq), which can be run on a personal laptop and was designed to be accessible to users with no computational experience. Low-cost and easy to use, evSeq makes collection of extensive protein variant sequence-fitness data practical.
- Research Organization:
- California Institute of Technology (CalTech), Pasadena, CA (United States)
- Sponsoring Organization:
- National Science Foundation (NSF); USDOE Office of Science (SC), Basic Energy Sciences (BES)
- Grant/Contract Number:
- SC0022218
- OSTI ID:
- 1853986
- Alternate ID(s):
- OSTI ID: 1855942
- Journal Information:
- Journal Name: ACS Synthetic Biology; Journal Volume: 11; Journal Issue: 3; ISSN 2161-5063
- Publisher:
- American Chemical Society (ACS)
- Country of Publication:
- United States
- Language:
- English
Similar Records
LevSeq: Rapid Generation of Sequence-Function Data for Directed Evolution and Machine Learning
Journal Article · Mon Dec 23 19:00:00 EST 2024 · ACS Synthetic Biology · OSTI ID:2567050
Enzyme Engineering Database (EnzEngDB): a platform for sharing and interpreting sequence–function relationships across protein engineering campaigns
Journal Article · Sun Dec 07 19:00:00 EST 2025 · Nucleic Acids Research · OSTI ID:3014245