Drug-target binding affinity prediction with docking pose physics

Marina Villacampa-fernandez; Raúl Fernández Díaz; Lam Thanh Hoang

ACS Fall 2025

Poster

17 Aug 2025

Abstract

Accurate prediction of drug–target binding affinity (DTA) is crucial for drug discovery and virtual screening, yet traditional docking scoring functions often estimate binding affinity poorly. In this work, we leverage docking poses to improve DTA prediction. We model protein–ligand binding interactions using MACE, an SO(3)-equivariant graph neural network that captures detailed atomic environments. Binding regions are defined by spatial proximity and include both ligand and protein atoms; MACE is then trained on the resulting atomic graphs to estimate affinity. Docked poses are generated using DiffDock, a diffusion-based blind docking algorithm. We demonstrate that incorporating the top-10 ranked docking poses during training as a form of data augmentation yields better performance than relying solely on the top-ranked pose. Furthermore, ensembling predictions across the top-10 poses improves robustness by mitigating the impact of misranked conformations. To further refine binding affinity predictions, we integrate additional physical and chemical descriptors: MACE-based binding predictions, neural potential energy estimates, molecular fingerprints, and DFT-based energy calculations of the binding poses. Ligand descriptors are combined with target representations derived from the pre-trained protein language model ESM. An ablation study using LightGBM (Light Gradient Boosting Machine) evaluates the contribution of each feature type and highlights the current limitations of machine-learned physical descriptors based on predicted docking poses. We refer to our integrated framework as DockBind. We evaluate DockBind on the Davis dataset, which focuses on kinase–inhibitor binding affinities. Our results underscore the value of combining physics-informed docking pose information with machine learning for improved DTA prediction.
Future directions include generalizing the approach to other protein families and further optimizing docking pose selection strategies.
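
As an illustration only, the two pose-handling ideas in the abstract (a binding region defined by spatial proximity, and ensembling predictions across the top-ranked poses) might be sketched as follows. This is not the DockBind implementation: the function names, the 5 Å cutoff, and the unweighted mean are assumptions made for the example, since the abstract does not specify the cutoff or the ensembling rule.

```python
import math
from statistics import mean


def binding_region(protein_atoms, ligand_atoms, cutoff=5.0):
    """Select the protein atoms that lie within `cutoff` angstroms of any ligand atom.

    Atoms are (x, y, z) tuples. The 5 A default is an assumed value,
    not one taken from the abstract.
    """
    pocket = [
        p for p in protein_atoms
        if any(math.dist(p, l) <= cutoff for l in ligand_atoms)
    ]
    # The binding region contains both the nearby protein atoms and the ligand.
    return pocket, list(ligand_atoms)


def ensemble_affinity(pose_predictions):
    """Combine per-pose affinity predictions with an unweighted mean.

    Averaging is an assumed ensembling rule; a misranked pose then
    contributes only 1/N of the final estimate.
    """
    return mean(pose_predictions)


# Toy coordinates: two protein atoms near the ligand, one far away.
protein = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
ligand = [(1.0, 0.0, 0.0)]
pocket, lig = binding_region(protein, ligand, cutoff=5.0)

# Hypothetical per-pose predictions for one protein-ligand pair.
affinity = ensemble_affinity([5.0, 6.0, 7.0])
```

In a full pipeline, the per-pose predictions would come from a model such as MACE applied to the graph built from each pose's binding region; here they are placeholder numbers.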
