Workshop paper

Pareto-Guided Reinforcement Learning for Multi-Objective ADMET Optimization in Generative Drug Design

Abstract

Designing effective drug molecules is a multi-objective problem that requires simultaneous optimization of ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity) properties. Existing generative frameworks face two major limitations: (1) reliance on molecular descriptors that fail to capture pharmacologically meaningful endpoints, and (2) linear scalarization of rewards, which collapses multiple objectives into a single score and obscures trade-offs. To address these limitations, we propose RL-Pareto, a Pareto-guided reinforcement learning framework for predictor-driven ADMET optimization. The framework optimizes multiple objectives simultaneously and scales flexibly to user-defined objective sets without retraining. Predictor models trained on ADMET datasets provide direct feedback on drug-relevant properties, while Pareto dominance defines reward signals that preserve trade-off diversity during chemical space exploration. In benchmarking, our framework achieved a 99% success rate, with 100% validity, 87% uniqueness, and 100% novelty, alongside improved hypervolume coverage compared to strong baselines. These results demonstrate the potential of Pareto-based reinforcement learning to generate molecules that balance competing properties while maintaining diversity and novelty.
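
To illustrate how Pareto dominance can define a reward signal without collapsing objectives into one score, the following is a minimal sketch and not the framework's actual implementation. It assumes every objective is a predictor score to be maximized, and it assigns a reward of 1.0 to each molecule in a batch that no other molecule dominates. The function names `dominates` and `pareto_rewards`, the binary reward, and the example scores are hypothetical choices made for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if score vector a Pareto-dominates b (maximization assumed).

    a dominates b when a is at least as good on every objective
    and strictly better on at least one.
    """
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_rewards(scores: List[Sequence[float]]) -> List[float]:
    """Assign 1.0 to non-dominated score vectors in the batch, 0.0 otherwise.

    Hypothetical binary reward; a real system might use graded ranks instead.
    """
    rewards = []
    for i, candidate in enumerate(scores):
        is_dominated = any(
            dominates(other, candidate)
            for j, other in enumerate(scores)
            if j != i
        )
        rewards.append(0.0 if is_dominated else 1.0)
    return rewards


# Illustrative two-objective predictor scores for four molecules (made-up values).
batch = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.2, 0.9)]
rewards = pareto_rewards(batch)
print(rewards)
```

In this example only the third molecule is dominated (by the second), so the other three, which represent different trade-offs between the two objectives, all receive the same reward. Because the functions operate on score vectors of any length, the set of objectives can change without altering the reward logic.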