# Solving Globally-Optimal Threading Problems in "Polynomial-Time"

## Abstract

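Computational protein threading is a powerful technique for recognizing native-like folds of a protein sequence from a protein fold database. In this paper, we present an improved algorithm (over our previous work) for solving the globally-optimal threading problem, and illustrate how the computational complexity and the fold recognition accuracy of the algorithm change as the cutoff distance for pairwise interactions changes. For a given fold of m residues and M core secondary structures (or simply cores) and a protein sequence of n residues, the algorithm is guaranteed to find a sequence-fold alignment (threading) that is globally optimal, measured collectively by (1) the singleton match fitness, (2) the pairwise interaction preference, and (3) the alignment gap penalties, in $O(mn + MnN^{1.5C-1})$ time and $O(mn + nN^{C-1})$ space. C, which we term the topological complexity of a fold, is a value that characterizes the overall structure of the considered pairwise interactions in the fold, which are typically determined by a specified cutoff distance between the beta carbon atoms of a pair of amino acids in the fold. C is typically a small positive integer. N represents the maximum number of possible alignments between an individual core of the fold and the protein sequence when its neighboring cores are already aligned, and its value is significantly less than n. When interacting amino acids are required to see each other, C is bounded from above by a small integer no matter how large the cutoff distance is. This indicates that the protein threading problem is polynomial-time solvable if the condition of seeing each other between interacting amino acids is sufficient for accurate fold recognition.

A number of extensions have been made to our basic threading algorithm to allow finding a globally-optimal threading under various constraints, which include consistency with (1) specified secondary structures (both cores and loops), (2) disulfide bonds, (3) active sites, etc.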
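The three score components named in the abstract can be illustrated with a toy scorer. The sketch below is not the authors' algorithm: it only evaluates one given alignment and does not search for the global optimum. The function names, the dictionary-based potentials, the linear gap cost, and the lower-is-better convention are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def contacts_within_cutoff(cb_coords, cutoff):
    """Return fold-position pairs (i, j), i < j, whose C-beta atoms lie within `cutoff`."""
    return [
        (i, j)
        for i, j in combinations(range(len(cb_coords)), 2)
        if dist(cb_coords[i], cb_coords[j]) <= cutoff
    ]


def threading_score(alignment, sequence, singleton, pairwise, contacts, gap_cost):
    """Score one alignment (hypothetical scheme, lower is better).

    alignment: dict mapping fold position -> sequence index
    singleton: dict keyed by (fold position, amino acid); missing keys count as 0
    pairwise:  dict keyed by an alphabetically sorted amino-acid pair; missing keys count as 0
    """
    # Amino acid placed at each aligned fold position.
    residue = {pos: sequence[idx] for pos, idx in alignment.items()}

    # (1) Singleton match fitness.
    singleton_term = sum(singleton.get((pos, aa), 0.0) for pos, aa in residue.items())

    # (2) Pairwise interaction preference, only over contacts with both ends aligned.
    pairwise_term = sum(
        pairwise.get(tuple(sorted((residue[i], residue[j]))), 0.0)
        for i, j in contacts
        if i in residue and j in residue
    )

    # (3) Assumed linear gap penalty: charge for every residue by which the
    # sequence spacing differs from the fold spacing between neighbours.
    positions = sorted(alignment)
    gap_term = sum(
        gap_cost * abs((alignment[q] - alignment[p]) - (q - p))
        for p, q in zip(positions, positions[1:])
    )
    return singleton_term + pairwise_term + gap_term


# Made-up C-beta coordinates (in angstroms) and potentials, for illustration only.
fold_cb = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
contacts = contacts_within_cutoff(fold_cb, cutoff=7.0)
score = threading_score(
    alignment={0: 0, 1: 1, 2: 3},
    sequence="ACDEF",
    singleton={(0, "A"): -1.0, (1, "C"): -0.5},
    pairwise={("A", "C"): -2.0},
    contacts=contacts,
    gap_cost=1.0,
)
print(contacts, score)
```

Raising `cutoff` adds more pairs to `contacts`, which is the knob the abstract ties to both recognition accuracy and the topological complexity C.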

- Authors:
- Uberbacher, E C; Xu, D; Xu, Y

- Publication Date:
- 12 Apr 1999

- Research Org.:
- Oak Ridge National Lab., TN (US)

- Sponsoring Org.:
- USDOE Office of Energy Research (ER) (US)

- OSTI Identifier:
- 1749

- Report Number(s):
- ORNL/CP-100039; KP 11 01 01 0; TRN: AH200112%%74

- DOE Contract Number:
- AC05-96OR22464

- Resource Type:
- Conference

- Resource Relation:
- Conference: 3rd International Annual Conference on Computational Molecular Biology, Lyon (FR), 04/12/1999; Other Information: PBD: 12 Apr 1999

- Country of Publication:
- United States

- Language:
- English

- Subject:
- 59 BASIC BIOLOGICAL SCIENCES; ALGORITHMS; DISULFIDES; MOLECULAR BIOLOGY; PROTEINS; PROTEIN STRUCTURE; AMINO ACID SEQUENCE; CALCULATION METHODS

### Citation Formats

Uberbacher, E C, Xu, D, and Xu, Y. *Solving Globally-Optimal Threading Problems in "Polynomial-Time"*. United States: N. p., 1999. Web.

Uberbacher, E C, Xu, D, & Xu, Y. *Solving Globally-Optimal Threading Problems in "Polynomial-Time"*. United States.

Uberbacher, E C, Xu, D, and Xu, Y. 1999. "Solving Globally-Optimal Threading Problems in 'Polynomial-Time'". United States. https://www.osti.gov/servlets/purl/1749.

```
@article{osti_1749,
  title = {Solving Globally-Optimal Threading Problems in "Polynomial-Time"},
  author = {Uberbacher, E C and Xu, D and Xu, Y},
  abstractNote = {Computational protein threading is a powerful technique for recognizing native-like folds of a protein sequence from a protein fold database. In this paper, we present an improved algorithm (over our previous work) for solving the globally-optimal threading problem, and illustrate how the computational complexity and the fold recognition accuracy of the algorithm change as the cutoff distance for pairwise interactions changes. For a given fold of m residues and M core secondary structures (or simply cores) and a protein sequence of n residues, the algorithm is guaranteed to find a sequence-fold alignment (threading) that is globally optimal, measured collectively by (1) the singleton match fitness, (2) the pairwise interaction preference, and (3) the alignment gap penalties, in O(mn + MnN$^{1.5C-1}$) time and O(mn + nN$^{C-1}$) space. C, which we term the topological complexity of a fold, is a value that characterizes the overall structure of the considered pairwise interactions in the fold, which are typically determined by a specified cutoff distance between the beta carbon atoms of a pair of amino acids in the fold. C is typically a small positive integer. N represents the maximum number of possible alignments between an individual core of the fold and the protein sequence when its neighboring cores are already aligned, and its value is significantly less than n. When interacting amino acids are required to see each other, C is bounded from above by a small integer no matter how large the cutoff distance is. This indicates that the protein threading problem is polynomial-time solvable if the condition of seeing each other between interacting amino acids is sufficient for accurate fold recognition. A number of extensions have been made to our basic threading algorithm to allow finding a globally-optimal threading under various constraints, which include consistency with (1) specified secondary structures (both cores and loops), (2) disulfide bonds, (3) active sites, etc.},
  doi = {},
  url = {https://www.osti.gov/biblio/1749},
  journal = {},
  number = {},
  volume = {},
  place = {United States},
  year = {1999},
  month = {4}
}
```