HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment
HHblits uses a profile hidden Markov model (HMM) of a sequence (.hhm file) to find sequences in existing databases that match the profile. HHblits is part of the HH-suite software package. On COSMIC2, HHblits is available so that users can take outputs from ModelAngelo and find which sequences match the HMM profile.
- Input: .hhm file
- Output: .hhr file, listing the proteins that match the input .hhm profile
- Output: aligned FASTA sequence file (.a3m), containing the sequence alignment for the hits in the .hhr file
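For orientation, the inputs and outputs listed above map onto a standalone HHblits run roughly as in the Python sketch below. The `-i`, `-d`, `-o`, `-oa3m`, `-n`, and `-cpu` options are standard hhblits flags. The file names, the database prefix, and the helper function name are hypothetical, and the parameters COSMIC2 actually uses may differ.

```python
import shlex
from typing import List


def build_hhblits_command(hhm_path: str, db_prefix: str, out_prefix: str,
                          iterations: int = 1, cpus: int = 2) -> List[str]:
    """Assemble an hhblits argument list (hypothetical helper, not part of HH-suite)."""
    return [
        "hhblits",
        "-i", hhm_path,                # input profile HMM (.hhm)
        "-d", db_prefix,               # database prefix, e.g. a Uniclust30 build
        "-o", f"{out_prefix}.hhr",     # hit list output
        "-oa3m", f"{out_prefix}.a3m",  # alignment of the hits
        "-n", str(iterations),         # number of search iterations
        "-cpu", str(cpus),             # number of CPU threads
    ]


# Placeholder paths: replace with your own files and database location.
cmd = build_hhblits_command("chain_A.hhm", "/path/to/uniclust30_db", "chain_A")
print(shlex.join(cmd))
```

The command is only printed here, not executed; on a machine with HH-suite installed, the list could be passed to `subprocess.run(cmd, check=True)`.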
Example: Using output from ModelAngelo to find sequence matches with HHblits
After running ModelAngelo, you will have one .hhm file per chain. An example from a KIFBP run in ModelAngelo shows this result for one of the chains:
After running HHblits against Uniclust30, the .hhr file has the following list of proteins:
It is hard to see in this file, but the top hits are all KIFBP from different species, indicating that the approach correctly identified the type of protein.
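Because the hit table in an .hhr file can be hard to read by eye, a short script can pull out the top-ranked hits. The sketch below is a minimal Python example that assumes the usual fixed-width layout of the .hhr summary table (a 3-character rank, a 30-character hit name, then Prob and E-value columns). The parser and helper names are hypothetical, and the sample text is synthetic, not real HHblits output.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    rank: int
    name: str
    prob: float
    evalue: float


def parse_hhr_hits(text: str, max_hits: int = 10) -> List[Hit]:
    """Read the summary table of an .hhr file (assumes the fixed-width layout)."""
    hits: List[Hit] = []
    in_table = False
    for line in text.splitlines():
        if line.lstrip().startswith("No Hit"):
            in_table = True   # column header marks the start of the table
            continue
        if not in_table:
            continue
        if not line.strip():
            break             # a blank line ends the summary table
        fields = line[34:].split()  # Prob, E-value, P-value, ...
        hits.append(Hit(int(line[:3]), line[4:34].strip(),
                        float(fields[0]), float(fields[1])))
        if len(hits) >= max_hits:
            break
    return hits


def _fake_row(rank: int, name: str, prob: float, evalue: float) -> str:
    # Builds a synthetic table row with the assumed column widths.
    return f"{rank:>3} {name:<30.30} {prob:5.1f} {evalue:7.1E}"


# Synthetic example text; real files contain more columns and header lines.
sample = "\n".join([
    "Query         chain_A (hypothetical)",
    "",
    " No Hit                             Prob E-value",
    _fake_row(1, "HIT_A hypothetical protein", 99.9, 1.2e-30),
    _fake_row(2, "HIT_B hypothetical protein", 98.5, 3.4e-12),
    "",
    "No 1",
])

hits = parse_hhr_hits(sample)
for hit in hits:
    print(hit.rank, hit.name, hit.prob, hit.evalue)
```

For a real run, the text would come from the .hhr file itself, for example `parse_hhr_hits(open("chain_A.hhr").read())` with a placeholder file name.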
Remmert, M., Biegert, A., Hauser, A. et al. HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat Methods 9, 173–175 (2012). https://doi.org/10.1038/nmeth.1818