Zubair Khalid

Virologist/Molecular Biologist | Veterinarian | Bioinformatician

Conventional & Molecular Virology • Vaccine Development • Computational Biology

Dr. Zubair Khalid is a veterinarian and virologist specializing in conventional and molecular virology, vaccine development, and computational biology. He is dedicated to advancing animal health through innovative research and multi-omics approaches.

Protein-Ligand Docking and Free Energy Perturbation in Antiviral Drug Design: A Computational Virology Approach to SARS-CoV-2 Spike Protein

1. Introduction

Structure-based computational methods have become integral to antiviral drug discovery, enabling the rational design of small-molecule inhibitors against viral proteins. In the context of emerging coronaviruses, the spike glycoprotein represents a critical target for therapeutic intervention due to its role in host cell attachment and entry [1, 2]. The SARS-CoV-2 spike protein, while primarily associated with human disease, serves as an instructive model for understanding coronavirus spike dynamics in a veterinary virology framework [3, 4]. This article provides a comprehensive overview of protein-ligand docking and free energy perturbation (FEP) calculations as applied to antiviral design, using the spike protein as a central example. The computational workflows described herein are directly transferable to veterinary coronaviruses such as those affecting livestock and companion animals [5, 6].

The integration of molecular docking, molecular dynamics (MD) simulations, and alchemical free energy methods has accelerated the identification and optimization of candidate inhibitors against viral targets [7, 8, 9]. Benchmarking studies have shown that docking programs can reproduce binding modes for viral proteases, providing a foundation for spike protein applications [1]. FEP calculations offer a rigorous thermodynamic approach to ranking binding affinities and can guide medicinal chemistry with errors that are typically much smaller than those of docking scores [10, 11]. This article covers the theoretical basis of these methods, case studies drawn from the cited literature, and the role of structural databases and computational pipelines in predicting resistance mutations. Related resources such as the Protein Data Bank (PDB), AlphaFold2, and genomic surveillance platforms like GISAID are referenced throughout.

2. Theoretical Basis of Docking Algorithms

Molecular docking algorithms predict the preferred orientation and conformation of a small molecule (ligand) within the binding site of a target protein. The process involves two core components: a search algorithm that samples ligand poses and a scoring function that estimates binding affinity [1]. Docking programs such as AutoDock Vina and Glide use empirical or knowledge-based scoring functions that approximate the energetic contributions of van der Waals interactions, hydrogen bonding, electrostatic forces, and desolvation penalties [12]. Rigid receptor docking treats the protein as static, while flexible docking allows limited conformational changes in side chains.
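The split between a search algorithm and a scoring function can be illustrated with a deliberately small toy model. The sketch below is not how AutoDock Vina or Glide work internally; the 2D coordinates, the Lennard-Jones-like score, and the function names are all assumptions chosen only to show the two components interacting.

```python
import math
import random

# Toy 2D "receptor": three fixed interaction sites (hypothetical coordinates).
RECEPTOR = [(0.0, 0.0), (1.5, 0.0), (0.0, 1.5)]
# Toy rigid "ligand": two atoms defined relative to the ligand origin.
LIGAND = [(0.0, 0.0), (1.0, 0.0)]


def place(ligand, dx, dy, theta):
    """Rotate the rigid ligand by theta (radians), then translate it by (dx, dy)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + dx, s * x + c * y + dy) for x, y in ligand]


def score(pose, receptor=RECEPTOR, r0=1.0):
    """Scoring component: sum of 12-6 Lennard-Jones-like terms (toy units, lower is better)."""
    total = 0.0
    for lx, ly in pose:
        for rx, ry in receptor:
            # Clamp very short distances so the repulsive term stays finite.
            r = max(math.hypot(lx - rx, ly - ry), 0.3)
            total += (r0 / r) ** 12 - 2.0 * (r0 / r) ** 6
    return total


def random_search(n_samples=5000, seed=0):
    """Search component: sample random rigid-body poses and keep the best-scoring one."""
    rng = random.Random(seed)
    best_pose, best_score = None, float("inf")
    for _ in range(n_samples):
        dx, dy = rng.uniform(-3.0, 3.0), rng.uniform(-3.0, 3.0)
        theta = rng.uniform(0.0, 2.0 * math.pi)
        pose = place(LIGAND, dx, dy, theta)
        s = score(pose)
        if s < best_score:
            best_pose, best_score = pose, s
    return best_pose, best_score


best_pose, best_score = random_search()
print(f"best toy score: {best_score:.2f}")
```

Real docking programs replace the random sampler with guided optimization and the toy score with a calibrated function, but the division of labor is the same: the search proposes poses and the scoring function ranks them.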

Table 1. Comparison of Docking Algorithm Features

| Feature | Rigid Receptor Docking | Flexible Docking | Induced Fit Docking |
| --- | --- | --- | --- |
| Protein flexibility | None | Selected side chains | Full side chain and backbone |
| Computational cost | Low | Moderate | High |
| Accuracy for spike | Suitable for initial screening | Improved for flexible loops | Best for cryptic binding sites |

Scoring functions are often calibrated against experimental binding data [1]. For the SARS-CoV-2 spike protein, the receptor-binding domain (RBD) presents a challenging target due to its conformational plasticity. Docking campaigns against the RBD have leveraged both rigid and flexible approaches, though validation against known inhibitors remains essential [11, 12]. The benchmarking study by Zev et al. [1] evaluated several docking programs against the main protease (Mpro) and found that while docking can recapitulate crystallographic poses, accurate ranking of affinities requires more sophisticated methods such as FEP.
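Pose reproduction in redocking benchmarks is commonly judged by the root-mean-square deviation (RMSD) between a docked pose and the crystallographic pose, with 2.0 Å as a conventional success cutoff. The sketch below assumes identical atom ordering in both poses and applies no symmetry correction; the coordinates and function names are hypothetical, and this is not the protocol of any specific cited study.

```python
import math


def pose_rmsd(pose_a, pose_b):
    """RMSD between two poses given as lists of (x, y, z) with the same atom order."""
    if not pose_a or len(pose_a) != len(pose_b):
        raise ValueError("poses must be non-empty and contain the same number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b)
    )
    return math.sqrt(squared / len(pose_a))


def redocking_report(crystal_pose, ranked_poses, cutoff=2.0):
    """Compare the top-ranked pose and the best sampled pose against the crystal pose."""
    rmsds = [pose_rmsd(crystal_pose, pose) for pose in ranked_poses]
    return {
        "top_rmsd": rmsds[0],
        "best_rmsd": min(rmsds),
        "top_pose_success": rmsds[0] <= cutoff,   # scoring ranked a near-native pose first
        "sampling_success": min(rmsds) <= cutoff,  # a near-native pose was at least sampled
    }


# Made-up coordinates (in angstroms) for a three-atom fragment.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
ranked = [
    [(x + 3.0, y, z) for x, y, z in crystal],  # rank 1: shifted by 3.0 angstroms
    [(x + 0.5, y, z) for x, y, z in crystal],  # rank 2: shifted by 0.5 angstroms
]
report = redocking_report(crystal, ranked)
print(report)
```

Separating "top pose" from "best sampled pose" distinguishes a sampling failure from a scoring failure: in this made-up case a near-native pose was found but not ranked first.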

3. Free Energy Perturbation Calculations

FEP is an alchemical simulation technique that computes the relative binding free energy between two ligands by gradually transforming one ligand into another through a series of nonphysical intermediate states [10, 2]. The thermodynamic cycle approach connects bound and unbound states, allowing calculation of \(\Delta\Delta G_{bind}\). FEP calculations require extensive sampling of protein-ligand conformations, typically using MD simulations with explicit solvent and reliable force fields [13, 4].
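Written out, the thermodynamic cycle and the perturbation estimator take the following form. The notation is an assumption introduced here for illustration: A and B are the two ligands, the superscripts "complex" and "solvent" label the bound and unbound alchemical legs, \(U_{\lambda_i}\) is the potential energy at intermediate state \(\lambda_i\), and the angle brackets denote an ensemble average sampled at \(\lambda_i\).

```latex
% Relative binding free energy from the thermodynamic cycle
\[
\Delta\Delta G_{bind}(A \to B)
  = \Delta G_{bind}(B) - \Delta G_{bind}(A)
  = \Delta G^{complex}_{A \to B} - \Delta G^{solvent}_{A \to B}
\]

% Each alchemical leg as a sum over neighbouring lambda windows (Zwanzig relation)
\[
\Delta G_{A \to B}
  = -k_B T \sum_{i=0}^{N-1}
    \ln \left\langle
      \exp\!\left( -\frac{U_{\lambda_{i+1}} - U_{\lambda_i}}{k_B T} \right)
    \right\rangle_{\lambda_i}
\]
```

The second expression is the exponential-averaging form from which the method takes its name. Production workflows often use related estimators such as BAR or MBAR on the same window data.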

Table 2. Comparison of Docking Scores and FEP Binding Free Energies

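| Property | Docking Score | FEP \(\Delta\Delta G_{bind}\) |
| --- | --- | --- |
| Physical rigor | Approximate | Rigorous in principle; limited in practice by force field accuracy and sampling |
| Computational cost | Minutes per ligand | Hours to days per ligand pair |
| Relative accuracy (RMSE) | 2-3 kcal/mol | 0.5-1.0 kcal/mol |
| Suitability for ranking | Moderate | High |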
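To put the error ranges in Table 2 in context, a free energy difference can be converted into the affinity ratio it implies through the standard relation \(K_d(B)/K_d(A) = \exp(\Delta\Delta G / RT)\). The short sketch below does this at an assumed temperature of 298.15 K; the function name is hypothetical.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)


def affinity_fold_change(ddg_kcal, temperature=298.15):
    """Ratio Kd(B)/Kd(A) implied by ddG = dG_bind(B) - dG_bind(A), in kcal/mol."""
    return math.exp(ddg_kcal / (R_KCAL * temperature))


for error in (0.5, 1.0, 2.0, 3.0):
    fold = affinity_fold_change(error)
    print(f"{error:.1f} kcal/mol -> about {fold:.0f}-fold in Kd")
```

An error of 0.5 to 1.0 kcal/mol corresponds to roughly a 2- to 5-fold uncertainty in affinity, whereas 2 to 3 kcal/mol corresponds to roughly 30- to 160-fold, which is why scores with the larger error are of limited use for ranking close analogs.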

The use of FEP in antiviral design has been demonstrated for several SARS-CoV-2 targets. Li et al. [11] applied accelerated FEP-based virtual screening to existing drugs against Mpro, successfully identifying potent inhibitors. Zhang et al. [2] used FEP to guide the molecular sculpting of perampanel into a noncovalent Mpro inhibitor. Patel et al. [10] designed a novel nucleotide antiviral agent through FEP calculations, optimizing binding to the RNA-dependent RNA polymerase. These studies underscore the value of FEP in rational ligand optimization.

Recent advances integrate physics-based simulations with deep learning to improve FEP throughput and accuracy [14]. Force field selection is critical; Roca et al. [13] compared QM/MM and classical force field treatments for the SARS-CoV-2 RdRp, highlighting the importance of accurate electrostatic descriptions. For the spike protein, FEP has been applied to predict the effects of mutations on ACE2 binding affinity and to guide the design of small-molecule inhibitors targeting the RBD.

4. Application to SARS-CoV-2 Spike Protein

The SARS-CoV-2 spike protein mediates host cell entry by binding to angiotensin-converting enzyme 2 (ACE2) through its RBD. The spike also contains host protease cleavage sites (e.g., furin) that are essential for membrane fusion. These features make the spike a high-value target for antiviral intervention [7, 15]. Computational approaches validated for other viral proteins are directly applicable to the spike. For example, the structure-guided identification of inhibitors against Mpro [7, 8, 3] and the prediction of resistance mutations in viral proteases [5, 16, 9] provide methodological templates for spike-focused studies.

The spike RBD undergoes conformational transitions between "up" and "down" states, which must be accounted for in docking simulations. Ensemble docking, in which multiple protein conformations from MD simulations are used as receptors, can improve the identification of binders that stabilize the closed state [2, 11]. By analogy, high-energy hydration sites at the RBD-ACE2 interface may be exploitable for inhibitor design, as Carney et al. [15] demonstrated for Mpro. The furin cleavage site could in principle be approached with peptide-based inhibitors as well, although the peptide aldehydes reported in [15] target Mpro rather than furin.
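One simple way to post-process an ensemble docking run is to keep, for each ligand, its best score over all receptor snapshots and to compare its best scores against "down" and "up" conformations. The sketch below assumes the docking scores have already been computed; the snapshot IDs, state labels, ligand names, and score values are made up for illustration.

```python
# Hypothetical receptor snapshots (e.g. MD frames), each labelled with its RBD state.
SNAPSHOT_STATE = {"md_000": "down", "md_050": "down", "md_100": "up"}

# Made-up docking scores per snapshot and ligand (kcal/mol-like units, lower is better).
SCORES = {
    "md_000": {"lig1": -7.2, "lig2": -6.0},
    "md_050": {"lig1": -8.1, "lig2": -5.5},
    "md_100": {"lig1": -6.0, "lig2": -7.9},
}


def ensemble_summary(scores, snapshot_state):
    """Per ligand: best score over all snapshots and best score within each state."""
    summary = {}
    for snapshot, ligand_scores in scores.items():
        state = snapshot_state[snapshot]
        for ligand, value in ligand_scores.items():
            entry = summary.setdefault(
                ligand, {"best": float("inf"), "best_snapshot": None, "by_state": {}}
            )
            if value < entry["best"]:
                entry["best"], entry["best_snapshot"] = value, snapshot
            previous = entry["by_state"].get(state, float("inf"))
            entry["by_state"][state] = min(value, previous)
    return summary


def down_state_preference(entry):
    """Positive value: the ligand scores better against 'down' than 'up' snapshots."""
    return entry["by_state"]["up"] - entry["by_state"]["down"]


summary = ensemble_summary(SCORES, SNAPSHOT_STATE)
for ligand, entry in sorted(summary.items()):
    print(ligand, entry["best"], entry["best_snapshot"],
          round(down_state_preference(entry), 2))
```

Taking the minimum over snapshots is only one possible aggregation rule, and docking scores are too approximate for small differences in the state preference to be meaningful. A ligand flagged this way would still need confirmation by more rigorous methods.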

See also the related article, Spike Protein Dynamics and Host Receptor Binding: A Computational Approach to Predicting Zoonotic Potential, for additional context on host-range parallels.

5. Case Studies from the Cited Literature

The following table summarizes key contributions from the cited literature, emphasizing their methodological relevance to spike protein docking and FEP.

Table 3. Selected Studies and Their Contributions to Computational Methods

| Study | Target | Methodological Contribution | Relevance to Spike |
| --- | --- | --- | --- |
| [7] Mik et al. | Mpro | Structure-guided ligand identification | Docking workflow transferable |
| [13] Roca et al. | RdRp | Force field comparison in QM/MM | MD protocol for binding free energy |
| [8] Okabe et al. | Mpro | Computer-aided noncovalent inhibitor discovery | FEP optimization strategy |
| [14] Yang et al. | Mpro | Integration of physics-based and deep learning | Hybrid approach for affinity prediction |
| [5] Tan et al. | PLpro | Identification of drug-resistant mutations | Resistance prediction pipeline |
| [6] Wang et al. | Mpro | MD and machine learning for mutant analysis | Mutational scanning methods |
| [16] Xia et al. | Mpro | FEP for mutation effects on inhibitor efficacy | Alchemical free energy for spike variants |
| [15] Carney et al. | Mpro | High-energy hydration site exploitation | Water mapping for spike RBD |
| [9] Havranek et al. | Mpro | Computational-experimental resistance discovery | Dual approach for spike inhibitors |
| [3] Kang et al. | Mpro | Structure-guided hit-to-lead optimization | Lead optimization workflow |
| [10] Patel et al. | RdRp | In silico nucleotide design via FEP | FEP for nucleoside analog design |
| [4] Yuan et al. | RdRp | Incorporation efficiency and inhibition mechanism | MD studies of inhibitor binding |
| [1] Zev et al. | Mpro | Docking benchmarking | Validation of docking protocols |
| [2] Zhang et al. | Mpro | Molecular sculpting driven by FEP | FEP-guided scaffold hopping |
| [11] Li et al. | Mpro | Accelerated FEP virtual screening | High-throughput FEP for spike |
| [12] Ngo et al. | Mpro | Computational determination of potential inhibitors | Virtual screening protocols |

6. Integration with Experimental Validation

Computational predictions require experimental confirmation. Surface plasmon resonance (SPR) and pseudovirus entry assays are commonly used to validate spike inhibitors. In the cited literature, several studies combined computation with experiment for Mpro inhibitors. Okabe et al. [8] discovered highly potent noncovalent inhibitors through computer-aided design and validated them with enzymatic assays. Kang et al. [3] performed structure-guided optimization of carmofur, followed by cellular antiviral testing. Havranek et al. [9] used a computational-experimental approach to identify nirmatrelvir resistance mutations in Mpro. The same strategy applies to spike: computational hits can be tested in pseudovirus neutralization assays to measure entry inhibition.
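When binding constants are available, for example from SPR, predicted relative free energies can be compared with experiment through \(\Delta\Delta G = RT \ln(K_d^{ligand}/K_d^{reference})\). The sketch below shows that comparison and the resulting RMSE. The ligand names, dissociation constants, and predicted values are invented placeholders, not data from any cited study.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)


def experimental_ddg(kd_reference, kd_ligand, temperature=298.15):
    """Relative binding free energy (kcal/mol) from two dissociation constants in the same units."""
    return R_KCAL * temperature * math.log(kd_ligand / kd_reference)


def rmse(predicted, observed):
    """Root-mean-square error over the keys shared by both dictionaries."""
    keys = sorted(set(predicted) & set(observed))
    if not keys:
        raise ValueError("no ligands in common")
    return math.sqrt(sum((predicted[k] - observed[k]) ** 2 for k in keys) / len(keys))


# Placeholder measurements (molar Kd) and placeholder FEP predictions (kcal/mol).
KD_REFERENCE = 1e-6
MEASURED_KD = {"lig_b": 1e-7, "lig_c": 2e-6}
PREDICTED_DDG = {"lig_b": -1.0, "lig_c": 0.5}

observed_ddg = {name: experimental_ddg(KD_REFERENCE, kd) for name, kd in MEASURED_KD.items()}
error = rmse(PREDICTED_DDG, observed_ddg)
print({name: round(value, 2) for name, value in observed_ddg.items()}, round(error, 2))
```

A negative value means the ligand binds more tightly than the reference. Functional readouts such as pseudovirus entry inhibition do not map onto binding free energy this directly, so this comparison is most appropriate for biophysical measurements.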

See also Structure-Based Drug Design in Bioinformatics: Computational Pipelines, Active Site Grid Mapping, and Virtual Screening Workflows for additional protocol details.

7. Role of Structural Databases and Computational Pipelines for Resistance Prediction

The rapid mutation of viral glycoproteins necessitates proactive prediction of drug resistance. The PDB provides high-resolution structures of spike protein variants, while AlphaFold2 models can fill gaps where experimental structures are unavailable (see also AlphaFold and Beyond: Deep Learning for Protein Structure Prediction in Veterinary Virology). Computational pipelines that combine MD simulations, FEP, and machine learning can forecast the impact of mutations on inhibitor binding [14, 6, 16].

For example, Wang et al. [6] used MD and machine learning to enhance understanding of Mpro mutants. Xia et al. [16] applied FEP to predict the effects of Mpro mutations on nirmatrelvir efficacy. Tan et al. [5] identified naturally occurring drug-resistant mutations in the papain-like protease. These methodologies can be repurposed to predict spike mutations that confer resistance to small-molecule inhibitors. The GISAID database enables genomic surveillance of spike variants, while the KEGG database facilitates pathway analysis of host-virus interactions. See also Deep Mutational Scanning and Machine Learning for Predicting SARS-CoV-2 Spike Protein Escape Mutations from Antibody Neutralization.
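A resistance-prediction pipeline ultimately has to turn per-mutation predictions into a short list for follow-up. The sketch below shows one possible triage rule, which is an assumption for illustration rather than the method of any cited study: flag mutations predicted to weaken inhibitor binding while leaving receptor binding largely intact, and that have been observed in surveillance data. The thresholds, field names, mutation labels, and values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MutationPrediction:
    label: str            # placeholder mutation name
    ddg_inhibitor: float  # predicted change in inhibitor binding, kcal/mol (positive = weaker)
    ddg_receptor: float   # predicted change in receptor binding, kcal/mol (positive = weaker)
    n_sequences: int      # number of surveillance sequences carrying the mutation


def flag_resistance(predictions, ddg_cut=1.0, receptor_cut=1.0, min_sequences=1):
    """Return candidate resistance mutations, strongest predicted effect first."""
    flagged = [
        p for p in predictions
        if p.ddg_inhibitor >= ddg_cut        # inhibitor binding is predicted to weaken
        and p.ddg_receptor <= receptor_cut   # receptor binding is predicted to be retained
        and p.n_sequences >= min_sequences   # mutation has been observed in circulation
    ]
    return sorted(flagged, key=lambda p: p.ddg_inhibitor, reverse=True)


# Made-up predictions for four placeholder mutations.
PREDICTIONS = [
    MutationPrediction("mut_A", 2.3, 0.2, 140),
    MutationPrediction("mut_B", 1.4, 2.5, 12),   # also disrupts receptor binding
    MutationPrediction("mut_C", 0.3, -0.1, 900), # little effect on the inhibitor
    MutationPrediction("mut_D", 1.1, 0.4, 0),    # predicted but not yet observed
]

observed = [p.label for p in flag_resistance(PREDICTIONS)]
prospective = [p.label for p in flag_resistance(PREDICTIONS, min_sequences=0)]
print(observed, prospective)
```

Setting `min_sequences=0` switches the filter from retrospective surveillance to prospective prediction, which matters when the goal is to anticipate mutations before they circulate.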

8. Workflow for Computational Antiviral Design

The following Mermaid diagram outlines an integrated workflow for docking and FEP-based antiviral design targeting the spike protein.

flowchart TD
    A[Target Identification: Spike Protein] --> B[Structure Determination: PDB or AlphaFold2]
    B --> C[Binding Site Characterization: RBD or Cleavage Site]
    C --> D[Molecular Docking: Virtual Screening of Libraries]
    D --> E[Initial Hit Selection: Docking Score Filter]
    E --> F[Free Energy Perturbation: Relative Binding Free Energy]
    F --> G[Hit Optimization: Ligand Modifications Guided by FEP]
    G --> H[Experimental Validation: SPR, Pseudovirus Assays]
    H --> I{Activity Confirmed?}
    I -->|Yes| J[Resistance Prediction: MD/ML Mutation Scans]
    I -->|No| D
    J --> K[Lead Candidate]
    K --> L[In Vivo Testing and Preclinical Development]

This pipeline emphasizes iterative cycling between computation and experiment: compounds that fail experimental validation return to the docking stage, and resistance prediction is applied to confirmed hits before a lead candidate is nominated.

9. Conclusion

Protein-ligand docking and FEP calculations provide a powerful combination for rational antiviral drug design. The application of these methods to the SARS-CoV-2 spike protein, as a model system, illustrates how small-molecule inhibitors of coronavirus entry machinery can be identified and optimized. The case studies drawn from the cited literature underscore the importance of rigorous validation and the transferability of computational protocols across viral targets. Veterinary virology stands to benefit from these approaches as they are adapted to animal coronaviruses and other viral pathogens. Continued integration of structural databases, machine learning, and experimental biology will further enhance the predictive power of computational drug discovery.

References

[1] Zev S, Raz K, Schwartz R, et al. Benchmarking the Ability of Common Docking Programs to Correctly Reproduce and Score Binding Modes in SARS-CoV-2 Protease Mpro. J Chem Inf Model. 2021. https://pubmed.ncbi.nlm.nih.gov/34047191/

[2] Zhang CH, Stone EA, Deshmukh M, et al. Potent Noncovalent Inhibitors of the Main Protease of SARS-CoV-2 from Molecular Sculpting of the Drug Perampanel Guided by Free Energy Perturbation Calculations. ACS Cent Sci. 2021. https://pubmed.ncbi.nlm.nih.gov/33786375/

[3] Kang KM, Jang Y, Lee SS, et al. Discovery of antiviral SARS-CoV-2 main protease inhibitors by structure-guided hit-to-lead optimization of carmofur. Eur J Med Chem. 2023. https://pubmed.ncbi.nlm.nih.gov/37633203/

[4] Yuan C, Goonetilleke EC, Unarta IC, et al. Incorporation efficiency and inhibition mechanism of 2'-substituted nucleotide analogs against SARS-CoV-2 RNA-dependent RNA polymerase. Phys Chem Chem Phys. 2021. https://pubmed.ncbi.nlm.nih.gov/34514487/

[5] Tan H, Zhang Q, Georgiou K, et al. Identification of naturally occurring drug-resistant mutations of SARS-CoV-2 papain-like protease. Nat Commun. 2025. https://pubmed.ncbi.nlm.nih.gov/40379662/

[6] Wang J, Xie J, Yu Y, et al. Enhancing the understandings on SARS-CoV-2 main protease (M(pro)) mutants from molecular dynamics and machine learning. Int J Biol Macromol. 2025. https://pubmed.ncbi.nlm.nih.gov/40220823/

[7] Mik V, Benz LS, Bělíček J, et al. Structure-guided identification and characterization of a novel pyrazolo[1,5-a]pyrimidine ligand of SARS-CoV-2 main protease. Int J Biol Macromol. 2026. https://pubmed.ncbi.nlm.nih.gov/42128112/

[8] Okabe A, Carney DW, Tawada M, et al. Discovery of Highly Potent Noncovalent Inhibitors of SARS-CoV-2 Main Protease through Computer-Aided Drug Design. J Med Chem. 2025. https://pubmed.ncbi.nlm.nih.gov/41076627/

[9] Havranek B, Demissie R, Lee H, et al. Discovery of Nirmatrelvir Resistance Mutations in SARS-CoV-2 3CLpro: A Computational-Experimental Approach. J Chem Inf Model. 2023. https://pubmed.ncbi.nlm.nih.gov/37947496/

[10] Patel D, Cox BD, Kasthuri M, et al. In silico design of a novel nucleotide antiviral agent by free energy perturbation. Chem Biol Drug Des. 2022. https://pubmed.ncbi.nlm.nih.gov/35313085/

[11] Li Z, Li X, Huang YY, et al. Identify potent SARS-CoV-2 main protease inhibitors via accelerated free energy perturbation-based virtual screening of existing drugs. Proc Natl Acad Sci U S A. 2020. https://pubmed.ncbi.nlm.nih.gov/33051297/

[12] Ngo ST, Quynh Anh Pham N, Thi Le L, et al. Computational Determination of Potential Inhibitors of SARS-CoV-2 Main Protease. J Chem Inf Model. 2020. https://pubmed.ncbi.nlm.nih.gov/32530282/

[13] Roca M, Maghsoud Y, Cisneros GA, et al. Comparing Force Field Treatments in QM/MM Studies of the SARS-CoV-2 RNA-Dependent RNA Polymerase (RdRp) Mechanism. J Chem Theory Comput. 2025. https://pubmed.ncbi.nlm.nih.gov/41285575/

[14] Yang Y, Jiang Y, Zhang D, et al. Integrating Physics-Based Simulations with Data-Driven Deep Learning Represents a Robust Strategy for Developing Inhibitors Targeting the Main Protease. J Chem Inf Model. 2025. https://pubmed.ncbi.nlm.nih.gov/40767530/

[15] Carney DW, Leffler AE, Bell JA, et al. Exploiting high-energy hydration sites for the discovery of potent peptide aldehyde inhibitors of the SARS-CoV-2 main protease with cellular antiviral activity. Bioorg Med Chem. 2024. https://pubmed.ncbi.nlm.nih.gov/38518735/

[16] Xia YL, Du WW, Li YP, et al. Computational Insights into SARS-CoV-2 Main Protease Mutations and Nirmatrelvir Efficacy: The Effects of P132H and P132H-A173V. J Chem Inf Model. 2024. https://pubmed.ncbi.nlm.nih.gov/38913174/