Accelerating Virtual Screening: BALM is Now Available on Vecura
This update enables medicinal chemists and computational researchers to rapidly estimate protein-ligand binding affinities through a streamlined workflow on Vecura, eliminating the need to manage complex, structure-dependent simulation pipelines.
What is BALM?
BALM (Binding Affinity Language Models) is a structure-free deep-learning framework that predicts protein-ligand binding affinity (pKd) directly from protein amino acid sequences and drug SMILES strings. It fine-tunes two pretrained language models, ESM-2 for proteins and ChemBERTa for ligands, and projects their representations into a shared 256-dimensional space, where the cosine similarity between the protein and ligand vectors determines the predicted affinity. Because it needs neither 3D structural data nor time-consuming docking simulations, it is an efficient tool for high-throughput computational drug discovery.
BALM helps users rapidly estimate binding affinities and generate ligand and protein embeddings for downstream analysis. It is especially useful for early-stage virtual screening of large chemical libraries when 3D structures are unavailable.
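The shared-space scoring described above can be illustrated with a minimal NumPy sketch. It is not BALM's implementation: the encoder output sizes, the random feature vectors, and the plain linear projection matrices (`W_protein`, `W_ligand`) are all hypothetical stand-ins, chosen only to show how two representations of different sizes end up comparable by cosine similarity in a 256-dimensional space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical encoder output sizes; real sizes depend on the checkpoints used.
PROTEIN_DIM, LIGAND_DIM, SHARED_DIM = 640, 384, 256

# Random stand-ins for pooled language-model outputs (not real embeddings).
protein_features = rng.normal(size=PROTEIN_DIM)
ligand_features = rng.normal(size=LIGAND_DIM)

# Random stand-ins for learned projection weights (assumed linear here).
W_protein = rng.normal(size=(SHARED_DIM, PROTEIN_DIM))
W_ligand = rng.normal(size=(SHARED_DIM, LIGAND_DIM))


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors, always in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Project both inputs into the shared 256-dimensional space.
protein_vec = W_protein @ protein_features
ligand_vec = W_ligand @ ligand_features

# The similarity of the two projected vectors is the raw affinity signal.
score = cosine_similarity(protein_vec, ligand_vec)
print(f"cosine similarity: {score:.4f}")
```

With random inputs the score carries no meaning; in a trained model, the projections are what make a higher similarity correspond to stronger predicted binding.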
What can users do with BALM on Vecura?
With BALM on Vecura, users can:
- Predict the binding affinity (pKd) of a drug candidate for a specific protein target.
- Efficiently rank large libraries of compounds against fixed targets without requiring docking poses.
- Generate 256-dimensional embeddings for proteins and drugs to be used as features in other predictive models, such as Gaussian-process regressors.
- Compare binding strengths across different compound sets using normalized cosine similarity scores.
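As a rough illustration of the library-ranking use case in the list above, the sketch below orders compounds by cosine similarity to a fixed target embedding. It assumes the 256-dimensional embeddings have already been exported as NumPy arrays; the function name `rank_by_cosine`, the compound IDs, and the random data are hypothetical and do not reflect a Vecura or BALM API.

```python
import numpy as np


def rank_by_cosine(target_vec, ligand_matrix, ligand_ids):
    """Return (ligand_id, score) pairs sorted from highest to lowest similarity."""
    # Normalize once so that a single matrix-vector product yields all cosines.
    target_unit = target_vec / np.linalg.norm(target_vec)
    ligand_units = ligand_matrix / np.linalg.norm(ligand_matrix, axis=1, keepdims=True)
    scores = ligand_units @ target_unit
    order = np.argsort(-scores)  # descending order of similarity
    return [(ligand_ids[i], float(scores[i])) for i in order]


rng = np.random.default_rng(42)

# Random stand-ins for one target embedding and a five-compound library.
target = rng.normal(size=256)
library = rng.normal(size=(5, 256))
library[3] = target * 2.0  # one compound constructed to align with the target
ids = [f"cmpd_{i}" for i in range(5)]

ranking = rank_by_cosine(target, library, ids)
for compound_id, similarity in ranking:
    print(f"{compound_id}: {similarity:+.3f}")
```

Because the target embedding is computed once and reused, scoring a library reduces to one matrix-vector product, which is why this style of screening scales to large compound sets.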
What the output means
The output provides a predicted pKd value (the negative logarithm of the dissociation constant, Kd), raw cosine similarity metrics, and learned 256-dimensional embeddings for the input protein and ligand.
This output is intended to support scientific decision-making. It does not replace experimental validation.
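For readers who prefer concentrations to logarithmic values, the definition pKd = -log10(Kd), with Kd in molar units, can be inverted directly. The helper names below are illustrative only and are not part of BALM or Vecura.

```python
import math


def pkd_to_kd_molar(pkd: float) -> float:
    """Convert pKd to a dissociation constant in mol/L (Kd = 10 ** -pKd)."""
    return 10.0 ** (-pkd)


def kd_molar_to_pkd(kd_molar: float) -> float:
    """Convert a dissociation constant in mol/L back to pKd."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration")
    return -math.log10(kd_molar)


# A pKd of 9 corresponds to a Kd of 1 nM; each unit of pKd is a tenfold change.
kd = pkd_to_kd_molar(9.0)
print(f"Kd = {kd * 1e9:.2f} nM")
```

A higher pKd means a smaller Kd and therefore tighter binding, so a one-unit difference between two predictions corresponds to a tenfold difference in the estimated dissociation constant.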
Why this matters
A common bottleneck in early drug discovery is the computational cost and technical complexity of generating and screening 3D protein-ligand docking poses. By building on language models pretrained on protein sequences and chemical structures, BALM makes sequence-based affinity estimation accessible without structural data, which can substantially speed up the triage phase of drug development.
This approach offers a scalable, structure-agnostic way to screen massive chemical libraries against protein targets. While it does not replace gold-standard experimental assays, it acts as a filter for prioritizing the most promising leads for further wet-lab investigation.
- Developed by: Rohan Gorantla, Aryo Gema, and Antonia Mey
- Source: Official GitHub Repository
- Reference: Learning Binding Affinities via Fine-Tuning of Protein and Ligand Language Models (J. Chem. Inf. Model. 2025)
Try BALM on Vecura.
Open the model workspace and start evaluating it with your own inputs.