Deep Learning in Protein-Ligand Binding Affinity Prediction for Antiviral Drug Design
Introduction
The computational prediction of protein-ligand binding affinity represents a central challenge in structure-based antiviral drug design. Accurate affinity estimates enable the prioritization of candidate compounds for experimental validation, reducing the time and cost associated with traditional high-throughput screening campaigns [1]. In veterinary virology, the need for rapid antiviral development against emerging zoonotic and livestock pathogens has driven the adoption of deep learning methodologies that can learn complex biophysical relationships directly from structural and chemical data [2].
Classical approaches to binding affinity prediction, including molecular docking with empirical scoring functions and molecular mechanics generalized Born surface area (MM-GBSA) calculations, rely on handcrafted features and simplified physical approximations [1]. These methods often fail to capture the full thermodynamic complexity of protein-ligand interactions, particularly for conformationally flexible viral targets such as envelope glycoproteins and proteases [1]. Deep learning models, by contrast, can automatically extract hierarchical representations from raw input data, enabling more accurate and generalizable predictions across diverse viral protein targets [2].
This article provides a comprehensive technical review of deep learning architectures applied to protein-ligand binding affinity prediction for antiviral drug design. The discussion covers training data sources, model architectures, validation strategies, and a case study on targeting conserved viral epitopes. The focus remains on veterinary applications, including pathogens such as influenza A virus, coronavirus species affecting livestock and companion animals, and African swine fever virus.
Biophysical Basis of Protein-Ligand Binding Affinity
Binding affinity is typically quantified as the equilibrium dissociation constant (Kd) or the inhibition constant (Ki); the half-maximal inhibitory concentration (IC50) is a widely reported but assay-dependent proxy. Kd is related to the standard Gibbs free energy change of complex formation by ΔG° = RT ln(Kd/c°), where R is the gas constant, T is the absolute temperature, and c° = 1 M is the standard-state concentration, so tighter binding (a smaller Kd) corresponds to a more negative ΔG° [1]. Accurate prediction of ΔG° requires modeling multiple contributing terms: van der Waals interactions, electrostatic complementarity, desolvation penalties, entropic costs from conformational restriction, and hydrogen bonding networks [1].
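As a small worked illustration of the thermodynamic relationship, the sketch below converts a dissociation constant into a standard binding free energy and a pKd value. It assumes the common convention ΔG° = RT ln(Kd / 1 M), under which tighter binding gives a more negative ΔG°; the function names and the default temperature of 298.15 K are illustrative choices, not taken from the cited studies.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g_from_kd(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Standard binding free energy in kcal/mol from Kd in mol/L (relative to 1 M)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


def pkd_from_kd(kd_molar: float) -> float:
    """Negative base-10 logarithm of Kd expressed in mol/L."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return -math.log10(kd_molar)


if __name__ == "__main__":
    kd = 1e-9  # a 1 nM binder
    print(f"pKd = {pkd_from_kd(kd):.2f}, dG = {delta_g_from_kd(kd):.2f} kcal/mol")
```

Under this convention, one pKd unit corresponds to RT ln(10), roughly 1.36 kcal/mol at 298.15 K, which gives a sense of what an error of one log unit means in energetic terms.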
Viral protein targets present unique challenges for affinity prediction. The binding interfaces of viral envelope glycoproteins, such as influenza hemagglutinin and coronavirus spike proteins, are often highly glycosylated and conformationally dynamic [2]. These features complicate the calculation of desolvation energies and entropic contributions using classical methods. Deep learning models can potentially learn these complex relationships from large datasets of experimentally determined binding affinities, provided that the training data adequately represent the chemical and structural diversity of viral targets [1].
Training Data Sources for Deep Learning Models
The performance of deep learning models for binding affinity prediction depends critically on the quality, size, and diversity of training data. Two primary public databases serve as the foundation for most models in this domain.
PDBbind Database
The PDBbind database provides a curated collection of experimentally determined binding affinities for protein-ligand complexes with known three-dimensional structures deposited in the Protein Data Bank [1]. Each entry includes the Kd, Ki, or IC50 value along with the corresponding atomic coordinates. The refined set, which excludes complexes with poor structural quality or unreliable affinity measurements, is commonly used for training and benchmarking deep learning models [1].
ChEMBL Database
ChEMBL is a large-scale database of bioactive molecules with drug-like properties, containing binding measurements against thousands of protein targets [2]. Unlike PDBbind, ChEMBL includes data from multiple assay formats and does not require a co-crystallized structure for every entry. This makes ChEMBL particularly valuable for training models that predict affinity from ligand chemical structure alone, or for transfer learning approaches that combine structural and ligand-based features [2].
Data Preprocessing and Curation
Raw data from these sources require extensive preprocessing before model training. Key steps include:
- Removal of entries with ambiguous or inconsistent affinity units
- Standardization of affinity values to a common scale (typically pKd = -log10(Kd), with Kd expressed in mol/L)
- Filtering of complexes with resolution worse than a defined threshold (e.g., 2.5 Å)
- Removal of ligands with reactive or non-drug-like functional groups
- Clustering of similar complexes to reduce redundancy and prevent data leakage
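The first three curation steps can be sketched in a few lines. The record layout (`id`, `value`, `unit`, `relation`, `resolution`) is a hypothetical schema chosen for illustration and does not correspond to the actual PDBbind or ChEMBL file formats; the 2.5 Å default simply mirrors the example threshold above.

```python
import math

# Conversion factors from common concentration units to mol/L.
UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def curate(records, max_resolution=2.5):
    """Keep records with an exact affinity, a known unit, and acceptable resolution."""
    curated = []
    for rec in records:
        factor = UNIT_TO_MOLAR.get(rec["unit"])
        if factor is None or rec.get("relation", "=") != "=":
            continue  # unknown unit, or a censored value such as ">" or "<"
        if rec["value"] <= 0 or rec["resolution"] > max_resolution:
            continue  # non-physical affinity or low-quality structure
        p_affinity = -math.log10(rec["value"] * factor)
        curated.append({"id": rec["id"], "p_affinity": p_affinity})
    return curated


if __name__ == "__main__":
    raw = [
        {"id": "a", "value": 10, "unit": "nM", "relation": "=", "resolution": 1.8},
        {"id": "b", "value": 5, "unit": "ug/mL", "relation": "=", "resolution": 2.0},
        {"id": "c", "value": 1, "unit": "uM", "relation": "=", "resolution": 3.1},
    ]
    print(curate(raw))  # only record "a" passes every filter
```

Ligand filtering and redundancy clustering usually require a cheminformatics toolkit and sequence or structure comparison, so they are left out of this sketch.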
The curated dataset is typically split into training, validation, and test sets using temporal or structural clustering criteria to ensure that the test set represents truly novel targets or chemotypes [1].
Deep Learning Architectures for Binding Affinity Prediction
Several deep learning architectures have been adapted for the protein-ligand binding affinity prediction task. The choice of architecture depends on the representation of the input data and the specific modeling objectives.
Graph Neural Networks
Graph neural networks (GNNs) have emerged as a dominant architecture for binding affinity prediction because they naturally accommodate the graph-structured representation of both proteins and small molecules [1]. In this framework, atoms are represented as nodes with feature vectors encoding atomic properties (element type, hybridization state, partial charge, hydrophobicity), and covalent bonds are represented as edges with features encoding bond type and distance [1].
Message passing operations propagate information between neighboring nodes, allowing the network to learn higher-order chemical and structural features. For protein-ligand complexes, the interaction graph can be constructed by including intermolecular edges between protein and ligand atoms that fall within a distance cutoff (typically 4-6 Å) [1]. The final graph-level representation is obtained through a readout function, such as global mean pooling or attention-based aggregation, and is passed through fully connected layers to produce the affinity prediction.
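To make the graph construction and readout concrete, here is a minimal sketch in plain Python. It builds intermolecular edges with a distance cutoff, performs one round of unweighted mean aggregation, and applies a mean readout. Real GNN layers apply learned weight matrices and nonlinearities at each step; the function names and the 5 Å default are assumptions for illustration only.

```python
import math


def intermolecular_edges(protein_xyz, ligand_xyz, cutoff=5.0):
    """Return (protein_index, ligand_index, distance) for atom pairs within cutoff (Å)."""
    edges = []
    for i, p in enumerate(protein_xyz):
        for j, q in enumerate(ligand_xyz):
            d = math.dist(p, q)
            if d <= cutoff:
                edges.append((i, j, d))
    return edges


def mean_message_pass(features, neighbors):
    """One aggregation round: each node averages its own features with its neighbors'."""
    updated = []
    for node, feat in enumerate(features):
        group = [feat] + [features[n] for n in neighbors.get(node, [])]
        updated.append([sum(col) / len(group) for col in zip(*group)])
    return updated


def mean_readout(features):
    """Global mean pooling over all node feature vectors."""
    return [sum(col) / len(features) for col in zip(*features)]


if __name__ == "__main__":
    protein = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    ligand = [(3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
    print(intermolecular_edges(protein, ligand))
    feats = mean_message_pass([[1.0, 0.0], [3.0, 2.0]], {0: [1], 1: [0]})
    print(mean_readout(feats))
```

The nested loop is quadratic in the number of atoms; production code typically restricts the protein to binding-site residues or uses a spatial index before applying the cutoff.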
A comparative study of deep learning and classical modeling approaches for coronavirus main proteases demonstrated that GNN-based models consistently outperformed traditional scoring functions in both binding pose and affinity prediction tasks [1]. The study evaluated multiple GNN variants, including graph convolutional networks and graph attention networks, on a dataset of protease-inhibitor complexes. The GNN models achieved lower root-mean-square errors (RMSE) in pKd prediction compared to AutoDock Vina and Glide SP scoring functions, particularly for complexes with high conformational flexibility [1].
Transformer Architectures
Transformer models, originally developed for natural language processing, have been adapted for protein-ligand binding affinity prediction by treating protein sequences and ligand SMILES strings as tokenized inputs [2]. The self-attention mechanism enables the model to capture long-range dependencies and interactions between distant residues or atoms, which is particularly relevant for viral proteins with allosteric binding sites [2].
A multi-modal customized gate control architecture was developed to integrate information from protein sequences, ligand graphs, and 3D structural features [2]. The gating mechanism dynamically weights the contribution of each modality based on the input context, allowing the model to adapt its prediction strategy for different types of protein-ligand complexes. This approach demonstrated robust performance across multiple benchmark datasets, including the PDBbind refined set and the DUD-E decoy set, with particular strength in predicting binding affinities for targets with limited structural data [2].
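The general idea of gated fusion can be illustrated with a softmax gate over modality embeddings. This is a generic sketch and not the architecture described in [2]: in a trained model the gate scores would be produced by a learned layer applied to the input, whereas here they are passed in directly, and all names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(embeddings, gate_scores):
    """Weighted sum of same-length modality embeddings using softmax gate weights."""
    weights = softmax(gate_scores)
    dim = len(embeddings[0])
    fused = [
        sum(w * emb[k] for w, emb in zip(weights, embeddings))
        for k in range(dim)
    ]
    return fused, weights


if __name__ == "__main__":
    sequence_emb = [1.0, 0.0]   # placeholder protein-sequence embedding
    graph_emb = [0.0, 1.0]      # placeholder ligand-graph embedding
    fused, weights = gated_fusion([sequence_emb, graph_emb], [2.0, 0.0])
    print(weights, fused)       # the first modality dominates
```

Because the weights sum to one, a modality with a low gate score is down-weighted rather than discarded, which is one way a model could fall back on sequence information when structural features are uninformative.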
3D Convolutional Neural Networks
3D convolutional neural networks (3D-CNNs) operate on voxelized representations of the protein-ligand complex, where the 3D space is discretized into a grid and each voxel is assigned features based on the atomic properties of nearby atoms [1]. This approach preserves the full 3D geometry of the binding site and can capture shape complementarity and electrostatic potential distributions.
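A minimal voxelization sketch follows. It assigns each atom to a single grid cell and counts occupancy per channel; many published pipelines instead spread each atom over neighboring voxels with a Gaussian or similar kernel. The box size, grid spacing, and the use of element symbols as channels are illustrative assumptions.

```python
import math
from collections import defaultdict


def voxelize(atoms, center, box_size=24.0, spacing=1.0):
    """Map (x, y, z, channel) atoms to occupancy counts keyed by (ix, iy, iz, channel)."""
    n = int(round(box_size / spacing))  # number of voxels along each axis
    half = box_size / 2.0
    grid = defaultdict(float)
    for x, y, z, channel in atoms:
        idx = tuple(
            math.floor((c - c0 + half) / spacing)
            for c, c0 in zip((x, y, z), center)
        )
        if all(0 <= i < n for i in idx):  # drop atoms outside the box
            grid[idx + (channel,)] += 1.0
    return dict(grid), n


if __name__ == "__main__":
    atoms = [(0.2, 0.2, 0.2, "C"), (0.7, 0.1, 0.3, "C"), (50.0, 0.0, 0.0, "N")]
    grid, n = voxelize(atoms, center=(0.0, 0.0, 0.0))
    print(n, grid)
```

A sparse dictionary is used here for readability; a 3D-CNN would consume the same information as a dense array with one channel per atom type or property.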
The primary limitation of 3D-CNNs is their sensitivity to the orientation and alignment of the input complex. Translation is usually handled by centering the grid on the ligand or binding site, while rotational invariance is either approximated through data augmentation with rotated copies of each complex or built in by using rotationally equivariant convolutional filters [1]. Despite these challenges, 3D-CNNs have shown competitive performance on benchmark datasets, particularly when combined with physics-based features such as Coulombic electrostatic potentials and Lennard-Jones interaction energies.
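Rotation-based augmentation can be sketched as follows. For brevity the example rotates coordinates about the z-axis only; a full augmentation scheme would sample rotations over all three axes. The function names and the seeded random generator are illustrative choices.

```python
import math
import random


def rotate_z(coords, angle_rad):
    """Rotate (x, y, z) coordinates about the z-axis by the given angle."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]


def augment(coords, n_copies=4, seed=0):
    """Return n_copies randomly rotated versions of a centered coordinate set."""
    rng = random.Random(seed)
    return [rotate_z(coords, rng.uniform(0.0, 2.0 * math.pi)) for _ in range(n_copies)]


if __name__ == "__main__":
    ligand = [(1.0, 0.0, 0.0), (0.0, 2.0, 1.0)]
    for copy in augment(ligand, n_copies=2):
        print(copy)
```

The coordinates should be centered on the binding site before rotation so that the rotated complex stays inside the voxel grid. Each rotated copy keeps the same affinity label, since the binding affinity does not depend on the frame of reference.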
Comparison of Architectures
| Architecture | Input Representation | Key Advantage | Key Limitation |
|---|---|---|---|
| Graph Neural Network | Atom-bond graphs | Natural handling of molecular topology | Sensitivity to graph construction parameters |
| Transformer | Sequences + graphs | Long-range dependency capture | High computational cost for large complexes |
| 3D Convolutional Neural Network | Voxelized grids | Full 3D geometry preservation | Orientation sensitivity |
Validation Strategies and Performance Metrics
Rigorous validation is essential for assessing the generalization capability of deep learning models for binding affinity prediction. Standard practices include cross-validation, temporal split validation, and holdout testing on structurally distinct targets [1].
Cross-Validation
k-fold cross-validation (typically k=5 or k=10) is used to evaluate model stability and variance. The dataset is partitioned into k subsets, and the model is trained on k-1 subsets and tested on the remaining subset. This process is repeated k times, and the average performance metrics are reported [1].
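The partitioning step can be written in a few lines of standard-library Python. This is a sketch of plain random k-fold splitting; the function name and seed handling are illustrative.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    for train, test in k_fold_indices(10, k=5):
        print(len(train), sorted(test))
```

Random folds place near-identical complexes in both training and test subsets when the dataset is redundant, so the reported averages can be optimistic unless the data were clustered beforehand.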
Temporal and Structural Splits
Temporal splits, where models are trained on data collected before a certain date and tested on data collected after that date, provide a realistic assessment of performance on novel targets [2]. Structural splits, where complexes are clustered by protein fold or ligand scaffold, prevent data leakage from similar complexes appearing in both training and test sets [1].
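Both split types reduce to assigning whole groups, rather than individual complexes, to one side of the split. The sketch below assumes that cluster or scaffold identifiers and release years have already been computed for each complex; the function names and inputs are hypothetical.

```python
def group_split(sample_groups, test_groups):
    """Split sample indices so that no group (cluster or scaffold ID) spans both sets."""
    held_out = set(test_groups)
    train = [i for i, g in enumerate(sample_groups) if g not in held_out]
    test = [i for i, g in enumerate(sample_groups) if g in held_out]
    return train, test


def temporal_split(release_years, cutoff_year):
    """Train on entries released before cutoff_year, test on the rest."""
    train = [i for i, y in enumerate(release_years) if y < cutoff_year]
    test = [i for i, y in enumerate(release_years) if y >= cutoff_year]
    return train, test


if __name__ == "__main__":
    clusters = ["protease", "protease", "polymerase", "glycoprotein"]
    print(group_split(clusters, test_groups=["glycoprotein"]))
    print(temporal_split([2015, 2018, 2020, 2021], cutoff_year=2020))
```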
Performance Metrics
Common metrics for binding affinity prediction include:
- Root-mean-square error (RMSE) in pKd units
- Pearson correlation coefficient (r) between predicted and experimental values
- Spearman rank correlation coefficient (ρ)
- Concordance index (CI), which estimates the probability that the model ranks two randomly selected complexes with different experimental affinities in the correct order
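As a rough working threshold, a model achieving an RMSE below 1.0 pKd units and a Pearson r above 0.8 on a held-out test set is often considered a reasonable candidate for virtual screening applications, although acceptable performance depends on the target class and on how strictly the test set was separated from the training data [1].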
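The four metrics listed above can be computed without external libraries. The sketch below uses average ranks for ties in the Spearman calculation and counts tied predictions as half-concordant in the concordance index, which is one common convention among several.

```python
import math


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def pearson_r(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return result


def spearman_rho(x, y):
    return pearson_r(ranks(x), ranks(y))


def concordance_index(y_true, y_pred):
    """Fraction of pairs with different true values that are ordered correctly."""
    concordant, comparable = 0.0, 0
    for i in range(len(y_true)):
        for j in range(i + 1, len(y_true)):
            if y_true[i] == y_true[j]:
                continue  # pairs with equal experimental values are not comparable
            comparable += 1
            prod = (y_true[i] - y_true[j]) * (y_pred[i] - y_pred[j])
            concordant += 1.0 if prod > 0 else 0.5 if prod == 0 else 0.0
    return concordant / comparable


if __name__ == "__main__":
    experimental = [5.0, 6.0, 7.0, 8.0]
    predicted = [5.5, 6.5, 6.0, 8.5]
    print(rmse(experimental, predicted), pearson_r(experimental, predicted))
    print(spearman_rho(experimental, predicted), concordance_index(experimental, predicted))
```

RMSE measures absolute accuracy, while the rank-based metrics reflect how well a model would order compounds in a screening campaign; a model can score well on one and poorly on the other, so reporting both kinds is informative.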
Case Study: Targeting Conserved Viral Epitopes
A practical application of deep learning-based binding affinity prediction in veterinary virology involves the identification of small molecule inhibitors targeting conserved epitopes on viral envelope glycoproteins. Conserved regions, such as the fusion peptide of influenza hemagglutinin or the receptor-binding domain of coronavirus spike proteins, represent attractive targets for broad-spectrum antiviral development because they are less prone to mutational escape [2].
The workflow for this approach proceeds as follows:
- Identification of conserved epitopes through multiple sequence alignment of viral glycoprotein sequences from diverse host species
- Structural modeling of the target epitope using homology modeling or AlphaFold-based methods
- Virtual screening of compound libraries against the target structure using deep learning models trained on PDBbind and ChEMBL data
- Experimental validation of top-ranked compounds in cell-based antiviral assays
- Iterative optimization of hit compounds guided by deep learning affinity predictions
A study employing a multi-modal gate control model for this purpose demonstrated that the model could accurately rank compounds by their predicted binding affinity to the conserved stem region of influenza hemagglutinin [2]. The top-ranked compounds showed nanomolar inhibitory activity against multiple influenza A subtypes in vitro, confirming the utility of the approach for identifying cross-reactive antiviral candidates [2].
flowchart TD
    A[Multiple Sequence Alignment of Viral Glycoproteins] --> B[Identification of Conserved Epitopes]
    B --> C[Structural Modeling of Target Epitope]
    C --> D[Virtual Screening with Deep Learning Models]
    D --> E[Ranking of Compounds by Predicted Affinity]
    E --> F[Experimental Validation in Cell-Based Assays]
    F --> G[Hit Optimization Guided by Deep Learning]
    G --> H[Broad-Spectrum Antiviral Candidate]
    D --> I[Training Data: PDBbind and ChEMBL]
    I --> D
Challenges and Limitations
Despite significant progress, several challenges remain in the application of deep learning to protein-ligand binding affinity prediction for antiviral drug design.
Data Scarcity for Viral Targets
The number of experimentally determined protein-ligand complexes for viral proteins is limited compared to human therapeutic targets [1]. This scarcity can lead to overfitting and poor generalization when models are applied to novel viral targets. Transfer learning and few-shot learning approaches, where models are pre-trained on large datasets of general protein-ligand complexes and fine-tuned on smaller viral-specific datasets, offer a potential solution [2].
Conformational Flexibility
Viral envelope glycoproteins undergo large conformational changes during receptor binding and membrane fusion. Most deep learning models predict binding affinity for a single static structure, which may not capture the energetic contributions of conformational selection or induced fit [1]. Integration with molecular dynamics simulations to generate ensembles of binding site conformations can improve prediction accuracy for flexible targets [1].
Ligand Protonation and Tautomerism
The protonation state and tautomeric form of a ligand can significantly affect its binding affinity. Most deep learning models assume a single protonation state, which may not correspond to the biologically relevant form under physiological conditions [2]. Incorporating pKa prediction and tautomer enumeration into the preprocessing pipeline can mitigate this issue.
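Once a pKa estimate is available, the dominant protonation state at a given pH follows from the Henderson-Hasselbalch relationship. The sketch below treats a single titratable site in isolation and assumes ideal behavior; the pKa value itself would have to come from a separate prediction tool, and the function name is illustrative.

```python
def fraction_protonated(pka: float, ph: float = 7.4) -> float:
    """Fraction of a single titratable site in its protonated form at the given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    # A hypothetical amine with pKa 9.4 is almost fully protonated at pH 7.4.
    print(f"{fraction_protonated(9.4):.3f}")
    # A hypothetical carboxylic acid with pKa 4.4 is almost fully deprotonated.
    print(f"{fraction_protonated(4.4):.3f}")
```

A preprocessing pipeline could use such a fraction to decide which protonation state to present to the model, or to enumerate both states when neither clearly dominates.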
Future Directions
Several emerging directions promise to advance the field of deep learning-based binding affinity prediction for antiviral drug design.
Integration with Protein Language Models
Protein language models, trained on large corpora of protein sequences, can generate embeddings that capture evolutionary and structural information without requiring a solved 3D structure [2]. These embeddings can be combined with ligand features in multi-modal architectures to predict binding affinity for viral proteins with unknown structures.
Uncertainty Quantification
Reliable uncertainty estimates are critical for prioritizing compounds for experimental validation. Bayesian neural networks, Monte Carlo dropout, and ensemble methods can provide prediction intervals that reflect model confidence [1]. Compounds with high predicted affinity and low uncertainty can be prioritized for further testing.
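An ensemble-based version of this prioritization can be sketched as follows. The spread of predictions across independently trained models serves as the uncertainty estimate, and compounds are ranked by a lower confidence bound (mean minus k standard deviations). The scoring rule and the value of k are illustrative assumptions rather than a prescribed method.

```python
import statistics


def ensemble_summary(predictions_per_model):
    """Per-compound mean and sample standard deviation across ensemble members."""
    per_compound = list(zip(*predictions_per_model))
    return [(statistics.mean(p), statistics.stdev(p)) for p in per_compound]


def rank_by_lower_bound(summary, k=1.0):
    """Order compound indices by mean - k * std, highest first."""
    scores = [mean - k * std for mean, std in summary]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    # Rows are models, columns are compounds (predicted pKd values).
    predictions = [
        [8.0, 8.5],
        [8.0, 9.5],
        [8.0, 7.5],
    ]
    summary = ensemble_summary(predictions)
    print(summary)
    print(rank_by_lower_bound(summary, k=1.0))
```

In this toy input the second compound has the higher mean prediction but also the larger spread, so the lower-bound ranking places the first compound ahead of it; setting k to zero recovers a ranking by mean prediction alone.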
Active Learning
Active learning strategies can reduce the experimental burden of generating training data by iteratively selecting the most informative protein-ligand complexes for experimental characterization [2]. This approach is particularly valuable for viral targets where structural and affinity data are scarce.
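One simple selection rule is uncertainty sampling, in which the next batch consists of the unmeasured candidates with the largest predictive uncertainty. The sketch below shows only this selection step; it is one of several possible acquisition strategies, and the function name and inputs are hypothetical.

```python
def select_batch(uncertainties, already_measured, batch_size=2):
    """Pick the unmeasured candidates with the largest predictive uncertainty."""
    candidates = [i for i in range(len(uncertainties)) if i not in already_measured]
    candidates.sort(key=lambda i: uncertainties[i], reverse=True)
    return candidates[:batch_size]


if __name__ == "__main__":
    per_compound_std = [0.1, 0.9, 0.5, 0.7]  # e.g. ensemble standard deviations
    print(select_batch(per_compound_std, already_measured={1}, batch_size=2))
```

After each round, the newly measured affinities are added to the training set, the model is retrained, and the uncertainties are recomputed before the next batch is chosen.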
Conclusion
Deep learning has transformed the field of protein-ligand binding affinity prediction, enabling more accurate and generalizable models for antiviral drug design. Graph neural networks, transformer architectures, and 3D convolutional neural networks each offer distinct advantages for representing and learning from protein-ligand complex data. The availability of curated training datasets such as PDBbind and ChEMBL, combined with rigorous validation strategies, has established a robust foundation for model development and benchmarking.
In veterinary virology, these methods hold particular promise for accelerating the discovery of broad-spectrum antiviral agents targeting conserved epitopes on viral envelope glycoproteins. Continued advances in multi-modal architectures, uncertainty quantification, and active learning will further enhance the utility of deep learning for antiviral drug design against emerging and established veterinary pathogens.
For further reading on related topics, see the articles "Spike Protein Mutational Landscapes and ACE2 Binding Affinity Prediction Using Machine Learning", "AlphaFold and Beyond: Deep Learning for Protein Structure Prediction in Veterinary Virology", and "Computational Docking and Binding Affinity Prediction for Emerging Zoonotic Coronaviruses".
References
[1] Liu Y, Tang H, Niu T, et al. A Comparative Study of Deep Learning and Classical Modeling Approaches for Protein-Ligand Binding Pose and Affinity Prediction in Coronavirus Main Proteases. J Chem Inf Model. 2026. URL: https://pubmed.ncbi.nlm.nih.gov/41429653/
[2] Xu B, Tang W, Muhammad D, et al. Robust Prediction of Protein-Ligand Binding Potency with Multi-modal Customized Gate Control. J Chem Inf Model. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/40994269/