Contact-based sequence alignment.

J. Kleinjung, J.W. Romein, K. Lin, J. Heringa

Research output: Contribution to Journal › Article › Academic › peer-review



This paper introduces the novel method of contact-based protein sequence alignment, where structural information in the form of contact mutation probabilities is incorporated into an alignment routine using contact-mutation matrices (CAO: Contact Accepted mutatiOn). The contact-based alignment routine optimizes the score of matched contacts, which involves four (two per contact) instead of two residues per match in pairwise alignments. The first contact refers to a real side-chain contact in a template sequence with known structure, and the second contact is the equivalent putative contact of a homologous query sequence with unknown structure. An algorithm has been devised to perform a pairwise sequence alignment based on contact information. The contact scores were combined with PAM-type (Point Accepted Mutation) substitution scores after parameterization of gap penalties and score weights by means of a genetic algorithm. We show that owing to the structural information contained in the CAO matrices, significantly improved alignments of distantly related sequences can be obtained. This has allowed us to annotate eight putative Drosophila IGF sequences. Contact-based sequence alignment should therefore prove useful in comparative modelling and fold recognition. © Oxford University Press 2004; all rights reserved.
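The combination of contact scores with substitution scores described in the abstract can be illustrated with a minimal sketch. It only scores an alignment that is already given; it is not the authors' alignment algorithm, and the genetic-algorithm parameterization is not shown. The function name `combined_score`, the weights `w_sub` and `w_contact`, and the toy scoring functions are all hypothetical stand-ins for the real PAM-type and CAO matrices, which are not reproduced here.

```python
from typing import Callable, List, Optional, Tuple

Contact = Tuple[int, int]          # pair of template positions in side-chain contact
ResiduePair = Tuple[str, str]


def combined_score(
    template: str,
    query: str,
    mapping: List[Optional[int]],   # mapping[i] = query index aligned to template i, None = gap
    contacts: List[Contact],
    sub_score: Callable[[str, str], float],
    contact_score: Callable[[ResiduePair, ResiduePair], float],
    w_sub: float = 1.0,             # hypothetical weight for the substitution term
    w_contact: float = 1.0,         # hypothetical weight for the contact term
) -> float:
    """Score a fixed alignment as a weighted sum of residue and contact terms."""
    # Residue term: two residues per match, as in ordinary pairwise alignment.
    sub_total = 0.0
    for i, j in enumerate(mapping):
        if j is not None:
            sub_total += sub_score(template[i], query[j])

    # Contact term: four residues per match (a real template contact and
    # the equivalent putative contact in the query).
    contact_total = 0.0
    for i, k in contacts:
        qi, qk = mapping[i], mapping[k]
        if qi is None or qk is None:
            continue  # a contact partner is aligned to a gap; skip it
        contact_total += contact_score(
            (template[i], template[k]), (query[qi], query[qk])
        )

    return w_sub * sub_total + w_contact * contact_total


# Toy scoring functions (assumptions for illustration, not PAM or CAO values).
def toy_sub(a: str, b: str) -> float:
    return 1.0 if a == b else -1.0


def toy_contact(t: ResiduePair, q: ResiduePair) -> float:
    return float(t[0] == q[0]) + float(t[1] == q[1])


score = combined_score(
    "ACDE", "ACDF", [0, 1, 2, 3], [(0, 2), (1, 3)],
    toy_sub, toy_contact, w_sub=1.0, w_contact=0.5,
)
print(score)
```

In this sketch a contact contributes only when both of its template positions are aligned to query residues, which is one simple way to handle gaps; the paper's actual treatment of gaps may differ.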
Original language: English
Pages (from-to): 2464-2473
Number of pages: 10
Journal: Nucleic Acids Research
Issue number: 8
Publication status: Published - 2004

