evSeq: Cost-Effective Amplicon Sequencing of Every Variant in a Protein Library
- California Institute of Technology (CalTech), Pasadena, CA (United States); California Institute of Technology
- California Institute of Technology (CalTech), Pasadena, CA (United States)
Widespread availability of protein sequence-fitness data would revolutionize both our biochemical understanding of proteins and our ability to engineer them. Unfortunately, even though thousands of protein variants are generated and evaluated for fitness during a typical protein engineering campaign, most are never sequenced, leaving a wealth of potential sequence-fitness information untapped. Primarily, this is because sequencing is unnecessary for many protein engineering strategies; the added cost and effort of sequencing are thus unjustified. It also results from the fact that, even though many lower-cost sequencing strategies have been developed, they often require at least some sequencing or computational resources, both of which can be barriers to access. In this work, we present every variant sequencing (evSeq), a method and collection of tools/standardized components for sequencing a variable region within every variant gene produced during a protein engineering campaign at a cost of cents per variant. evSeq was designed to democratize low-cost sequencing for protein engineers and, indeed, anyone interested in engineering biological systems. Execution of its wet-lab component is simple, requires no sequencing experience to perform, relies only on resources and services typically available to biology labs, and slots neatly into existing protein engineering workflows. Analysis of evSeq data is likewise made simple by its accompanying software (found at github.com/fhalab/evSeq, documentation at fhalab.github.io/evSeq), which can be run on a personal laptop and was designed to be accessible to users with no computational experience. Low-cost and easy to use, evSeq makes collection of extensive protein variant sequence-fitness data practical.
- Research Organization:
- California Institute of Technology (CalTech), Pasadena, CA (United States)
- Sponsoring Organization:
- National Science Foundation (NSF); USDOE Office of Science (SC), Basic Energy Sciences (BES)
- Grant/Contract Number:
- SC0022218
- OSTI ID:
- 1853986
- Journal Information:
- ACS Synthetic Biology, Vol. 11, Issue 3; ISSN 2161-5063
- Publisher:
- American Chemical Society (ACS)
- Country of Publication:
- United States
- Language:
- English