Understanding the function of proteins remains a fundamental problem in biology. Functional annotation of proteins through biological experimentation, however, is time-consuming and expensive. Many computational methods for predicting protein function have been developed, including techniques that detect geometric similarities between protein structures. Similarities in the geometry of key protein components may indicate functional similarity. Although these techniques determine geometric similarity accurately and efficiently, their usefulness also depends on which protein components are chosen for comparison. The components must be functionally significant and must also represent the protein in a way that is geometrically distinct from other protein structures, so that matches with proteins that are not functionally similar are avoided.
We define a motif as a set of protein components used as a basis for comparison. Each component in the motif is represented as a point in three-dimensional space, usually chosen from the region surrounding the active site of the protein. Our current methods take a motif based on a single protein structure and refine it into an optimal motif that represents that structure in a geometrically unique way. However, many proteins have multiple entries in the Protein Data Bank (PDB). Conformational changes may occur during ligand binding or catalysis, resulting in many different structures of the same protein. We are developing a method to optimize motifs for a protein with multiple PDB structures.
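
As an illustration of what comparing motifs geometrically can look like, the following minimal sketch represents a motif as a list of three-dimensional points and scores two motifs by the root-mean-square difference of their pairwise distances (distance RMSD). This measure is a stand-in chosen for simplicity; the text above does not specify the matching technique actually used. The function names, the assumption that points are listed in corresponding order, and the coordinates are all hypothetical.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def pairwise_distances(motif: Sequence[Point]) -> List[float]:
    """Return the distances between every pair of motif points, in a fixed order."""
    return [math.dist(a, b) for a, b in combinations(motif, 2)]


def distance_rmsd(motif_a: Sequence[Point], motif_b: Sequence[Point]) -> float:
    """Root-mean-square difference between corresponding pairwise distances.

    Assumes point i of motif_a corresponds to point i of motif_b
    (an assumption of this sketch, not something stated in the text).
    """
    if len(motif_a) != len(motif_b):
        raise ValueError("motifs must contain the same number of points")
    if len(motif_a) < 2:
        raise ValueError("a motif needs at least two points")
    dists_a = pairwise_distances(motif_a)
    dists_b = pairwise_distances(motif_b)
    squared = sum((x - y) ** 2 for x, y in zip(dists_a, dists_b))
    return math.sqrt(squared / len(dists_a))


# Hypothetical three-point motif (coordinates in angstroms, made up for illustration).
motif = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (0.0, 3.8, 0.0)]

# The same motif translated in space: the pairwise distances are unchanged.
shifted = [(x + 10.0, y - 2.0, z + 5.0) for x, y, z in motif]

# A motif with one point displaced, as a conformational change might cause.
stretched = [(0.0, 0.0, 0.0), (4.8, 0.0, 0.0), (0.0, 3.8, 0.0)]

score_same = distance_rmsd(motif, shifted)
score_changed = distance_rmsd(motif, stretched)
```

Because the score depends only on distances within each motif, it is unaffected by where the structure sits in space or how it is rotated, so the translated copy scores essentially zero while the displaced motif scores higher. It also cannot distinguish a motif from its mirror image, which is one reason a superposition-based measure might be preferred in practice.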