ESMFold – Evolutionary Scale Modeling

ESMFold is a protein structure prediction method from Meta AI that is built on a large language model. Unlike AlphaFold2, ESMFold does not use sequence alignments during prediction. Instead, ESMFold learned to predict protein structures directly from sequence. As a result, ESMFold is 10x faster than AlphaFold-based approaches and is sensitive to amino acid changes.

Current size limit: ~2000 amino acids

Running ESMFold on COSMIC²

Input: a FASTA protein sequence file containing one or more sequences of interest. If you provide multiple sequences, list each chain as a separate FASTA entry.

  • Upload data via browser upload (not Globus!)
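Before uploading, it can help to check the FASTA file locally. The following is a minimal sketch, not part of COSMIC² or ESMFold: the function names `read_fasta` and `check_input` are hypothetical, and it assumes the ~2000 amino acid limit applies to the combined length of all chains.

```python
from typing import Dict

# Approximate limit quoted on this page. Assumption: it applies to the
# total number of residues across all chains in the file.
MAX_RESIDUES = 2000


def read_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {header: sequence}, one entry per chain."""
    chains: Dict[str, str] = {}
    header = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            if header in chains:
                raise ValueError(f"duplicate header: {header!r}")
            chains[header] = ""
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            # Sequences may be wrapped over several lines; join them.
            chains[header] += line.upper()
    return chains


def check_input(text: str, max_residues: int = MAX_RESIDUES) -> int:
    """Return the total residue count, or raise ValueError if the input looks unusable."""
    chains = read_fasta(text)
    if not chains:
        raise ValueError("no sequences found")
    empty = [name for name, seq in chains.items() if not seq]
    if empty:
        raise ValueError(f"entries without sequence: {empty}")
    total = sum(len(seq) for seq in chains.values())
    if total > max_residues:
        raise ValueError(f"{total} residues exceeds the limit of about {max_residues}")
    return total
```

Calling `check_input(open("my_sequences.fasta").read())` (the file name is a placeholder) returns the total residue count when the file passes these checks.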

Number of Recycles. Increasing this value can help generate higher-confidence models.

Chunk size for axial attention. If a job crashes due to memory limitations, reduce the chunk size: smaller chunks have a smaller memory footprint.
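The retry strategy for memory failures can be sketched as a simple loop. This is an illustration only: COSMIC² jobs are submitted through the web form, so `run_job` is a hypothetical stand-in for "submit the job with this chunk size", and the chunk sizes shown are example values, not COSMIC² defaults.

```python
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")


def run_with_smaller_chunks(
    run_job: Callable[[int], T],
    chunk_sizes: Sequence[int] = (128, 64, 32, 16),  # illustrative values
) -> Tuple[int, T]:
    """Try each chunk size from largest to smallest until one fits in memory.

    `run_job` is a hypothetical callable that runs a prediction with the
    given chunk size and raises MemoryError when it runs out of memory.
    Returns the chunk size that worked together with the job's result.
    """
    for size in chunk_sizes:
        try:
            return size, run_job(size)
        except MemoryError:
            continue  # fall through to the next, smaller chunk size
    raise MemoryError("job did not fit in memory at any of the chunk sizes tried")
```

In practice this corresponds to resubmitting the same job with the chunk size reduced each time it fails for memory reasons.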

Example ESMFold run

Example target sequences for a leucine zipper homodimer:

> chain a
XRMKQLEDKVEELLSKNYHLENEVARLKKLVGER
> chain b
XRMKQLEDKVEELLSKNYHLENEVARLKKLVGER

Runtime: 2 minutes. (ColabFold = 2 mins.; AlphaFold2 = )

 

Citation

Evolutionary-scale prediction of atomic-level protein structure with a language model. Lin et al. 2023