Summary: A Decision Graph Explanation of Protein
Secondary Structure Prediction.
David L. Dowe * , Jonathan Oliver, Lloyd Allison,
Christopher S. Wallace & Trevor I. Dix,
Department of Computer Science,
* email: firstname.lastname@example.org
Partly supported by Australian Research Council grant A49030439.
Technical Report 92/163
Abstract. Oliver and Wallace recently introduced the machine-learning technique of decision graphs, a
generalisation of decision trees. Here it is applied to the prediction of protein secondary structure to infer a
theory for this problem. The resulting decision graph provides both a prediction method and, perhaps more
interestingly, an explanation for the problem. Many decision graphs are possible for the problem; a
particular graph is just one theory or hypothesis of secondary structure formation. Minimum message
length (MML) encoding is used to judge the quality of competing theories; among other things, it prevents
a theory from fitting the noise in the training data. MML encoding is a general technique in inductive inference.
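As a rough illustration of how a message length can rank competing theories, the sketch below scores a toy one-split tree against a single-leaf theory. It is not the Oliver and Wallace coding scheme for decision graphs: the structure cost (one bit per node plus log2 of the number of attributes to name a split), the adaptive code for the class labels, the attribute count of 4, and the toy data are all assumptions made for this example.

```python
import math
from collections import Counter

def data_cost_bits(labels, num_classes=3):
    """Bits needed to send the labels in one leaf with an adaptive code.

    Each label is coded with probability (count so far + 1) / (seen + num_classes).
    """
    counts = Counter()
    bits = 0.0
    for seen, label in enumerate(labels):
        p = (counts[label] + 1) / (seen + num_classes)
        bits -= math.log2(p)
        counts[label] += 1
    return bits

def message_length(leaves, num_splits, num_attributes=4, num_classes=3):
    """Two-part length: (hypothetical) structure cost plus cost of the data."""
    num_nodes = num_splits + len(leaves)
    # Assumed structure code: 1 bit per node (leaf or split),
    # plus log2(num_attributes) bits to name the attribute at each split.
    model_bits = num_nodes + num_splits * math.log2(num_attributes)
    return model_bits + sum(data_cost_bits(leaf, num_classes) for leaf in leaves)

# Toy data: 20 Helix ('H') and 20 Extended ('E') residues.
single = message_length([["H"] * 20 + ["E"] * 20], num_splits=0)

# A split on an informative attribute separates the classes cleanly.
informative = message_length([["H"] * 20, ["E"] * 20], num_splits=1)

# A split on an irrelevant attribute leaves both leaves evenly mixed.
mixed = ["H"] * 10 + ["E"] * 10
noisy = message_length([mixed, mixed], num_splits=1)

print(f"single leaf:       {single:.1f} bits")
print(f"informative split: {informative:.1f} bits")
print(f"irrelevant split:  {noisy:.1f} bits")
```

Under these assumptions the informative split gives the shortest message, while the irrelevant split costs more than making no split at all, so the extra structure is rejected. This is the sense in which the message length discourages fitting noise.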
The predictive accuracy for 3 states (Extended, Helix, Other) is in the range achieved by current
methods. Many believe this is close to the limit for methods based on only local information. In addition, a