U.S. Department of Energy
Office of Scientific and Technical Information

High-Performance Deep Learning Toolbox for Genome-Scale Prediction of Protein Structure and Function

Conference

Computational biology is one of many scientific disciplines ripe for innovation and acceleration with the advent of high-performance computing (HPC). In recent years, the field of machine learning has also seen significant benefits from adopting HPC practices. In this work, we present a novel HPC pipeline that incorporates various machine-learning approaches for structure-based functional annotation of proteins on the scale of whole genomes. Our pipeline makes extensive use of deep learning and provides computational insights into best practices for training advanced deep-learning models on high-throughput data such as proteomics data. We showcase the methodologies our pipeline currently supports and detail future tasks for it to encompass, including large-scale sequence comparison using SAdLSA and prediction of protein tertiary structures using AlphaFold2.

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE; USDOE Office of Science (SC), Biological and Environmental Research (BER) (SC-23)
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1840182
Country of Publication:
United States
Language:
English

Similar Records

A General Framework to Learn Tertiary Structure for Protein Sequence Characterization
Journal Article · May 21, 2021 · Frontiers in Bioinformatics · OSTI ID:1831097

Proteome-scale Deployment of Protein Structure Prediction Workflows on the Summit Supercomputer
Conference · May 1, 2022 · OSTI ID:1881138

Towards Native Execution of Deep Learning on a Leadership-Class HPC System
Conference · May 1, 2019 · OSTI ID:1550753
