Improving pairwise comparison of protein sequences with domain co-occurrence

Author summary: Deciphering the functions of the different proteins of an organism is a first step toward understanding its biology. Because they provide strong clues about protein function, domains occupy a key position among the annotations that can be assigned to a protein. Protein domains are sequence motifs that are conserved through evolution and are found in different proteins and in different combinations. One common approach for identifying the domains of a protein is to run sequence-sequence comparisons with local alignment tools such as BLAST. However, these approaches sometimes miss hits, especially for species that are phylogenetically distant from reference organisms. We propose here an approach to increase the sensitivity of pairwise sequence comparisons. This approach exploits the fact that protein domains tend to appear with a limited number of other domains on the same protein (the domain co-occurrence property). On P. falciparum, our approach identifies 2240 new domains for which, in most cases, no domain of the Pfam database could be linked.
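The domain co-occurrence property described above can be illustrated with a small Python sketch. This is a toy illustration only, not the authors' method: the function names, the domain labels (DomA, DomB, ...) and the rule "keep a weak candidate if it has been seen alongside a domain already present on the protein" are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence(annotated_proteins):
    """Map each domain to the set of domains seen with it on the same protein.

    `annotated_proteins` is a hypothetical dict: protein id -> list of domain names.
    """
    partners = defaultdict(set)
    for domains in annotated_proteins.values():
        # Every unordered pair of distinct domains on one protein co-occurs.
        for a, b in combinations(sorted(set(domains)), 2):
            partners[a].add(b)
            partners[b].add(a)
    return partners


def supported_candidates(known_domains, candidate_domains, partners):
    """Keep weak candidate domains that co-occur with a domain already on the protein."""
    known = set(known_domains)
    return [c for c in candidate_domains if partners.get(c, set()) & known]


# Hypothetical reference annotations (placeholder domain names).
reference = {
    "prot1": ["DomA", "DomB"],
    "prot2": ["DomA", "DomC"],
    "prot3": ["DomD"],
}
partners = build_cooccurrence(reference)

# A query protein already carries DomA; DomB and DomD are weak, below-threshold hits.
print(supported_candidates(["DomA"], ["DomB", "DomD"], partners))
```

In this toy setting, a weak hit gains support only when the reference annotations show it sharing a protein with a domain already assigned to the query; how such support is scored or combined with alignment statistics is not specified in the summary.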
Source: Viral Bioinformatics
