This tool calculates the Pseudo Log-Likelihood (PLL) and Pseudo-Perplexity (PPPL) of protein sequences using Meta AI's ESM2 model. These scores quantify how "natural" a protein sequence looks to a protein language model, which makes them useful for evaluating designed sequences such as antibodies or binders.
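As a rough illustration of what a pseudo log-likelihood is, the sketch below masks one position at a time and sums the log-probability the model assigns to the true residue at each masked position. It is not the code in pll_score.py, which is not shown here. The names `pseudo_log_likelihood`, `log_probs_fn`, and `uniform_log_probs` are hypothetical, and a toy uniform model stands in for ESM2 so the example runs without downloading any weights.

```python
import math
from typing import Callable, List, Sequence


def pseudo_log_likelihood(
    tokens: Sequence[int],
    mask_id: int,
    log_probs_fn: Callable[[List[int]], List[List[float]]],
) -> float:
    """Sum of log P(token_i | all other tokens), masking one position at a time."""
    total = 0.0
    for i, true_token in enumerate(tokens):
        masked = list(tokens)
        masked[i] = mask_id                # hide position i from the model
        log_probs = log_probs_fn(masked)   # indexed as [position][vocabulary index]
        total += log_probs[i][true_token]  # score the residue that was hidden
    return total


def uniform_log_probs(masked_tokens: List[int], vocab_size: int = 20) -> List[List[float]]:
    """Toy stand-in for a language model (assumption): all residues equally likely."""
    row = [math.log(1.0 / vocab_size)] * vocab_size
    return [row for _ in masked_tokens]


# Hypothetical residue indices; index 20 is used as the mask token in this toy setup.
toy_sequence = [0, 5, 7, 19, 3]
pll = pseudo_log_likelihood(toy_sequence, mask_id=20, log_probs_fn=uniform_log_probs)
print(f"PLL: {pll:.4f}")
```

With a uniform model every position contributes log(1/20), so the PLL is just the sequence length times that value. A real protein language model assigns higher probability to residues that fit their context, which raises the PLL for natural-looking sequences.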
Score a FASTA file containing a single protein sequence (binder only):
python pll_score.py --fasta example_binder.fasta
Reading sequences from example_binder.fasta
Scoring: Cradle_EGFR_241aa (length: 241)
PLL: -301.2522
Avg Log-Likelihood: -1.2500
PPPL: 3.4904
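The three numbers in this output are related by simple arithmetic. The short check below assumes the usual definitions: the average log-likelihood is the PLL divided by the sequence length, and PPPL is the exponential of the negative average. Those definitions are consistent with the values shown, but the script's own code is not reproduced here.

```python
import math

pll = -301.2522   # PLL from the example output
length = 241      # sequence length from the example output

avg_ll = pll / length      # average log-likelihood per residue
pppl = math.exp(-avg_ll)   # pseudo-perplexity (assumed definition)

print(f"Avg Log-Likelihood: {avg_ll:.4f}")
print(f"PPPL: {pppl:.4f}")
```

A lower PPPL means the model finds the sequence more predictable. Under this definition, a PPPL of 1 would correspond to every residue being predicted with certainty.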
conda create -n pll_env python=3.8
conda activate pll_env
pip install fair-esm torch biopython
Requires Python 3.8+. A GPU is recommended, but CPU works for short sequences.
pll_score.py: script to score PLL and PPPL using a sequence from a FASTA file
example_binder.fasta: example input binder sequence from the Adaptyv Bio challenge
If you use this tool, please cite Meta AI's ESM2 models.