Zubair Khalid

Virologist/Molecular Biologist | Veterinarian | Bioinformatician

Conventional & Molecular Virology • Vaccine Development • Computational Biology

Dr. Zubair Khalid is a veterinarian and virologist specializing in conventional and molecular virology, vaccine development, and computational biology. He is dedicated to advancing animal health through innovative research and multi-omics approaches.


Section: Computational Biology

Structural Prediction of Bat Coronavirus Spike Proteins: Insights from AlphaFold2 and Molecular Dynamics

Introduction

Bats serve as reservoir hosts for a diverse array of coronaviruses, including alphacoronaviruses and betacoronaviruses that have demonstrated the capacity for cross-species transmission [1, 2, 3, 4, 5]. The spike glycoprotein (S protein) is the primary determinant of host range and tissue tropism, mediating viral entry via receptor recognition and membrane fusion [6, 7, 8]. Understanding the three-dimensional architecture of bat coronavirus spike proteins is therefore critical for assessing zoonotic spillover risk and for developing predictive models of host adaptation [9, 10, 11]. However, experimental structure determination through cryo-electron microscopy (cryo-EM) or X-ray crystallography remains resource-intensive and is not feasible for every novel bat coronavirus isolate [12, 8]. Computational approaches, particularly deep learning-based protein structure prediction and molecular dynamics (MD) simulations, have emerged as powerful alternatives for generating high-confidence structural models and for probing the dynamic behavior of spike proteins [13, 14, 15, 16]. This article reviews the application of AlphaFold2 and MD simulations to bat coronavirus spike proteins, focusing on receptor-binding domain (RBD) architecture, ACE2 interaction dynamics, and implications for zoonotic risk assessment.

AlphaFold2 in Structural Prediction of Bat Coronavirus Spikes

AlphaFold2, a deep learning algorithm that predicts protein structures from amino acid sequences with near-experimental accuracy for many well-folded domains, has been extensively applied to viral glycoproteins [13, 12, 15]. For bat coronaviruses, AlphaFold2 models have been used to generate full-length spike ectodomain structures, including the RBD, N-terminal domain (NTD), and the S1/S2 cleavage region [8, 16]. The algorithm combines multiple sequence alignment (MSA) features with an attention-based neural network to infer inter-residue geometry, producing a ranked set of structural models with per-residue confidence scores (predicted local distance difference test, pLDDT) [13, 14]. These models have been particularly valuable for sarbecoviruses and merbecoviruses for which experimental structures are absent or incomplete, such as SHC014-CoV and BtKY72 [2, 16]. Comparative analyses have shown that AlphaFold2 models of bat coronavirus RBDs recapitulate the core beta-sheet fold and receptor-binding motif (RBM) conformation observed in cryo-EM structures, with root-mean-square deviations (RMSD) typically below 2 Å over backbone atoms [8, 16]. However, loop regions, especially the RBM loop that contacts ACE2, often exhibit higher conformational variability and require refinement through MD simulations [17, 16]. AlphaFold2 does not itself model glycans, but its predictions position the asparagine residues of N-linked glycosylation sequons, onto which glycans can be built for analyses of glycan shielding and immune evasion [8]. For a broader discussion of deep learning-based structural prediction of viral envelope glycoproteins, readers may refer to the article Deep Learning-Driven Structural Prediction of Viral Envelope Glycoproteins: Implications for Receptor Binding and Antigenic Drift.
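
As a rough illustration of how low-confidence loops might be flagged before refinement, the sketch below reads per-residue pLDDT values from an AlphaFold2 PDB file, where pLDDT is stored in the B-factor column, and groups consecutive residues that fall below a cutoff. The function names, the helper that builds demonstration records, and the cutoff of 70 are assumptions made for this example and are not taken from any of the cited studies.

```python
def format_ca_line(serial, resname, chain, resnum, plddt):
    """Hypothetical helper: build one fixed-width PDB ATOM record for a CA atom.

    Coordinates are set to zero because only the B-factor (pLDDT) matters here.
    """
    return "ATOM  {:>5d}  CA  {:>3s} {}{:>4d}    {:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}           C".format(
        serial, resname, chain, resnum, 0.0, 0.0, 0.0, 1.0, plddt
    )


def read_plddt(pdb_text):
    """Return {(chain, residue_number): pLDDT} using the CA atom of each residue."""
    scores = {}
    for line in pdb_text.splitlines():
        # PDB columns 13-16 hold the atom name, 22 the chain, 23-26 the
        # residue number, and 61-66 the B-factor field.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain = line[21]
            resnum = int(line[22:26])
            scores[(chain, resnum)] = float(line[60:66])
    return scores


def low_confidence_segments(scores, cutoff=70.0):
    """Group consecutive residues with pLDDT below cutoff into (chain, start, end)."""
    segments = []
    for chain, resnum in sorted(scores):
        if scores[(chain, resnum)] >= cutoff:
            continue
        last = segments[-1] if segments else None
        if last and last[0] == chain and last[2] == resnum - 1:
            segments[-1] = (chain, last[1], resnum)  # extend the open segment
        else:
            segments.append((chain, resnum, resnum))  # start a new segment
    return segments
```

Segments returned by `low_confidence_segments` could then be compared against the known RBM loop boundaries to decide which regions deserve closer attention during MD refinement.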

Molecular Dynamics Simulations of Spike Dynamics

MD simulations provide atomic-level insights into the conformational flexibility and dynamics of spike proteins that are inaccessible from static experimental or predicted structures [18, 10, 17]. All-atom MD simulations of bat coronavirus spike models, including those from AlphaFold2, are typically performed with explicit solvent and physiological ionic strength using force fields such as CHARMM or AMBER [18, 14, 16]. Simulation timescales ranging from hundreds of nanoseconds to several microseconds have been employed to investigate RBD opening, the transition from closed (RBD-down) to open (RBD-up) conformations, and the allosteric communication between the RBD and the subdomain-2 (SD2) region [16]. For example, MD simulations of SHC014-CoV spike variants carrying the F294L and A835D mutations revealed epistatic effects that increase RBD opening propensity, a prerequisite for ACE2 binding [16]. The simulations also identified a conserved salt-bridge network involving the fusion peptide proximal region (FPPR) that modulates large-scale conformational changes [16]. Free energy perturbation (FEP) and binding free energy calculations using methods like MM-GBSA have been applied to quantify the impact of RBD mutations on ACE2 binding affinity [18, 17, 19]. These computational predictions have been validated against experimental binding assays for several bat coronaviruses, including the recently described heart-nosed bat alphacoronavirus that utilizes human CEACAM6 as an entry receptor [1]. A detailed overview of MD simulation methodologies for viral spike glycoproteins can be found in the article Molecular Dynamics Simulations of Viral Spike Glycoproteins: Insights into Host Receptor Binding and Antibody Escape.
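
For orientation, the end-point estimate that MM-GBSA calculations typically report can be written as follows. This is the generic textbook form rather than the specific protocol of any cited study; the choice of RBD and ACE2 as the two binding partners, and the decision to include the entropy term, are assumptions for illustration (the entropy term is often omitted when only relative affinities between mutants are compared).

```latex
\Delta G_{\mathrm{bind}} \approx
  \langle G_{\mathrm{complex}} \rangle
  - \langle G_{\mathrm{RBD}} \rangle
  - \langle G_{\mathrm{ACE2}} \rangle ,
\qquad
G = E_{\mathrm{MM}} + G_{\mathrm{GB}} + G_{\mathrm{SA}} - T S ,
\qquad
E_{\mathrm{MM}} = E_{\mathrm{bonded}} + E_{\mathrm{elec}} + E_{\mathrm{vdW}}
```

Here the angle brackets denote averages over MD snapshots, $E_{\mathrm{MM}}$ is the molecular mechanics energy, $G_{\mathrm{GB}}$ and $G_{\mathrm{SA}}$ are the polar (generalized Born) and nonpolar (surface area) solvation terms, and $TS$ is the solute entropy contribution at temperature $T$.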

Receptor-Binding Domain and ACE2 Interactions

The RBD of bat coronavirus spike proteins interacts with host cell receptors, predominantly angiotensin-converting enzyme 2 (ACE2) for sarbecoviruses, although alternative receptors such as CEACAM6 and dipeptidyl peptidase 4 (DPP4) are used by some lineages [1, 6, 7]. Structural prediction of the RBD-ACE2 complex is essential for assessing cross-species compatibility and zoonotic potential [9, 20, 11]. AlphaFold2 has been used to model the RBDs of bat coronaviruses such as BtKY72, RsSHC014, and WIV1 in complex with ACE2 orthologs from humans, bats, and other mammals [2, 17, 21]. These models identify key contact residues in the RBM, including those that form hydrogen bonds and hydrophobic interactions with ACE2 [2, 19]. Comparative studies have shown that bat ACE2 polymorphisms at positions 31, 41, and 353, among others, critically affect binding affinity [7]. For example, the bat ACE2 residue asparagine at position 31 (N31) is conserved across many bat species and is compatible with sarbecovirus RBD binding, whereas substitutions such as histidine or serine reduce affinity [7]. MD simulations of these RBD-ACE2 complexes provide dynamic binding free energies and reveal that conformational fluctuations in the RBM loop can compensate for unfavorable substitutions [17, 19]. The predictive capacity of these models was demonstrated in the identification of a critical residue motif YYDRxxG in the RBD that determines neutralization breadth of pan-sarbecovirus antibodies [22]. Additional insights into receptor-binding dynamics and cross-species transmission are discussed in the article Computational Prediction of Zoonotic Spillover: Receptor-Binding Dynamics and Structural Modeling of Bat Coronavirus Spike Proteins.
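
The idea of identifying contact residues at the RBD-ACE2 interface can be sketched with a simple distance cutoff. The example below is a minimal illustration, assuming atoms are supplied as (residue label, coordinate) pairs and that a 4 Å heavy-atom cutoff defines a contact; the function name, the labels, and the cutoff are hypothetical choices and not values reported in the cited work.

```python
import math


def interface_contacts(rbd_atoms, ace2_atoms, cutoff=4.0):
    """Return sorted (RBD residue, ACE2 residue) pairs with any atom pair within cutoff.

    Each atom is a tuple of (residue_label, (x, y, z)) with coordinates in angstroms.
    """
    contacts = set()
    for rbd_res, rbd_xyz in rbd_atoms:
        for ace2_res, ace2_xyz in ace2_atoms:
            if math.dist(rbd_xyz, ace2_xyz) <= cutoff:
                contacts.add((rbd_res, ace2_res))
    return sorted(contacts)


# Synthetic coordinates for illustration only; they do not come from a real structure.
demo_rbd = [("RBD:a", (0.0, 0.0, 0.0)), ("RBD:b", (10.0, 0.0, 0.0))]
demo_ace2 = [
    ("ACE2:31", (3.0, 0.0, 0.0)),
    ("ACE2:353", (10.0, 3.5, 0.0)),
    ("ACE2:41", (20.0, 20.0, 20.0)),
]
demo_contacts = interface_contacts(demo_rbd, demo_ace2)
```

Running the same function on every frame of an MD trajectory and counting how often each pair appears would give a contact occupancy, which is one common way to separate persistent interactions from transient ones.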

Mutations and Zoonotic Spillover Potential

Spike protein mutations that accumulate in bat coronaviruses during evolution in reservoir and intermediate hosts can enhance human ACE2 binding and immune evasion [23, 24, 3, 8]. Bioinformatic and computational approaches have identified several adaptive RBD mutations, such as N501Y, T487A, and K417N, that alter human ACE2 binding affinity or contribute to immune evasion [24, 3]. For bat coronaviruses, substitutions in the RBM that introduce positively charged residues or alter loop flexibility are associated with increased spillover risk [10, 25]. The furin cleavage site at the S1/S2 boundary, which is present in some sarbecoviruses but absent in many bat coronaviruses, has been investigated through sequence analysis and structural modeling [25, 12]. Structural modeling suggests that insertion of a furin motif can alter proteolytic activation and fusion kinetics [12]. Furthermore, multi-epitope vaccine design studies have utilized predicted spike protein structures from bat coronaviruses to identify conserved epitopes that may confer cross-protection [23, 26, 27, 28]. These in silico approaches rely on accurate structural models from AlphaFold2 to map epitope accessibility and stability [26, 27]. The role of accessory proteins in coronavirus evolution and host adaptation has also been characterized genetically, though structural predictions for these smaller proteins remain less common [29]. For a comprehensive treatment of spike protein evolution and receptor binding adaptation, see the article Computational Prediction of Cross-Species Receptor Binding: Bat Coronavirus Spike Protein Evolution and Human Pandemic Risk.
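
The sequence-analysis side of furin site screening can be illustrated with a short motif scan. The sketch below searches a spike sequence for the minimal furin recognition pattern R-X-X-R, including overlapping matches. The function name is hypothetical, the minimal pattern is a deliberate simplification (a match indicates a candidate only, since cleavage also depends on flanking residues and loop accessibility), and the test fragment is a short illustrative string resembling the SARS-CoV-2 S1/S2 junction rather than a full sequence.

```python
import re

# Minimal furin recognition pattern R-X-X-R; the lookahead allows overlapping hits.
FURIN_MINIMAL = re.compile(r"(?=(R..R))")


def find_furin_like_motifs(sequence):
    """Return (1-based start position, matched motif) for every R-X-X-R occurrence."""
    return [
        (match.start() + 1, match.group(1))
        for match in FURIN_MINIMAL.finditer(sequence.upper())
    ]


# Illustrative fragment only, not a complete spike sequence.
demo_hits = find_furin_like_motifs("NSPRRARSV")
```

A stricter scan could require a basic residue at the third position (R-X-K/R-R), at the cost of missing polybasic insertions such as the one in the demonstration fragment.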

Comparison with Experimental Structures

Validation of predicted structures against experimentally determined cryo-EM or X-ray crystallography models is a critical step in establishing the reliability of computational predictions [8, 16]. For bat coronavirus spikes, several cryo-EM structures have been solved for close relatives of SARS-CoV-2, including WIV1, SHC014, and civet coronaviruses [8]. These experimental structures reveal detailed features such as glycan trees, water molecule coordination networks, and the presence of a fatty acid (linoleic acid) binding pocket that stabilizes the RBD in the down conformation [8]. AlphaFold2 predictions generally agree well with these experimental models for ordered secondary structure elements and core beta-sheets, but differences arise in flexible loops, the SD2 region, and the NTD [12, 8]. Additionally, cryo-EM structures have resolved biliverdin binding in the NTD, a feature that is not consistently predicted by AlphaFold2 without ligand information [8]. MD simulations initiated from AlphaFold2 models can be used to relax the structure and improve agreement with experimental data, particularly for water-mediated hydrogen bonding networks [16]. Comparative analysis between predicted and experimental structures has informed the design of pan-sarbecovirus therapeutics and vaccines [22, 14]. Further discussion of integrating cryo-EM and computational modeling is available in the article Structural and Evolutionary Dynamics of Coronavirus Spike Protein: Integrating Cryo-EM, Molecular Dynamics, and Phylogenetic Surveillance.
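
A region-wise comparison of this kind can be sketched as follows. The example computes Cα RMSD separately for named regions, assuming the predicted and experimental structures have already been superposed (for instance on the RBD core) and are supplied as residue-number-to-coordinate mappings. The function names, region names, and residue ranges are placeholders for illustration, and no superposition step is performed here.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD in angstroms between two equally long lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def rmsd_by_region(predicted, experimental, regions):
    """Return {region name: CA RMSD} over residues present in both structures.

    predicted / experimental: {residue_number: (x, y, z)} for pre-superposed CA atoms.
    regions: {name: iterable of residue numbers}; the names are user-defined.
    """
    result = {}
    for name, residues in regions.items():
        shared = [r for r in residues if r in predicted and r in experimental]
        result[name] = rmsd(
            [predicted[r] for r in shared],
            [experimental[r] for r in shared],
        )
    return result
```

Reporting the core and loop values separately makes it easier to see whether a low global RMSD hides a poorly predicted RBM loop.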

Workflow for Structural Prediction and Dynamics Analysis

The following Mermaid diagram outlines a typical computational pipeline combining AlphaFold2 and MD simulations to investigate bat coronavirus spike protein structure and dynamics.

flowchart TD
    A[Bat Coronavirus Spike Sequence] --> B[AlphaFold2 Structure Prediction]
    B --> C{Model Evaluation}
    C -->|pLDDT > 70| D[Model Refinement<br>Energy Minimization]
    C -->|pLDDT < 50| E[Iterative Template Selection<br>or Metagenomic Assembly]
    D --> F[System Setup<br>Solvation, Ions, Glycosylation]
    F --> G[All-Atom MD Production Run<br>100 ns - 1 μs]
    G --> H[Trajectory Analysis]
    H --> I[RMSD / RMSF / Principal Component Analysis]
    H --> J[Dynamic Cross-Correlation<br>& Community Network Analysis]
    H --> K[Binding Free Energy Calculation<br>MM-GBSA / FEP]
    I --> L[Identify Conformational States<br>RBD Open/Closed Populations]
    J --> M[Allosteric Pathways<br>e.g., SD2 to RBM]
    K --> N[Binding Affinity Predictions<br>for ACE2 Orthologs]
    L --> O[Zoonotic Risk Assessment]
    M --> O
    N --> O
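
The trajectory analysis step of this pipeline can be illustrated with a per-atom root-mean-square fluctuation (RMSF) calculation. The sketch below assumes the frames have already been aligned to a common reference and are held in memory as plain lists of coordinate tuples; in practice a trajectory library would supply them. The function name and data layout are assumptions for this example.

```python
import math


def rmsf(frames):
    """Return per-atom RMSF in angstroms.

    frames: list of frames, each a list of (x, y, z) tuples in the same atom
    order, already aligned to a common reference structure.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    n_frames = len(frames)
    n_atoms = len(frames[0])
    fluctuations = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean_pos = tuple(
            sum(frame[i][axis] for frame in frames) / n_frames for axis in range(3)
        )
        # Mean squared deviation from the average position.
        msd = sum(math.dist(frame[i], mean_pos) ** 2 for frame in frames) / n_frames
        fluctuations.append(math.sqrt(msd))
    return fluctuations
```

Residues with high RMSF in the RBM loop or SD2 region would be natural candidates for the correlation and network analyses listed in the diagram.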

Summary of Key Bat Coronavirus Spike Proteins Studied Using Computational Approaches

| Bat Coronavirus | Genus | RBD Features | Predicted Structures (AlphaFold2) | Experimental Structures (Cryo-EM) | Key References |
| --- | --- | --- | --- | --- | --- |
| SHC014-CoV | Sarbecovirus | RBM loop with two key mutations (F294L, A835D) | Full ectodomain; MD simulations show epistasis | Partial cryo-EM (unpublished) | [3, 16] |
| WIV1 | Sarbecovirus | High homology to SARS-CoV; linoleic acid binding pocket | RBD-ACE2 complex | Cryo-EM at 3.5 Å (2024) | [8] |
| BtKY72 | Sarbecovirus | Unique RBD residues; bat ACE2 compatibility | RBD-ACE2 complex with multiple orthologs | None available | [2] |
| HKU5-CoV-2 | Merbecovirus | S1 subunit targeting; DPP4 binding | RBD and S1 domain; drug docking | None available | [23, 18] |
| MERS-CoV related (bat) | Merbecovirus | Conserved antigenic sites | Spike monomer and trimer models | Partial cryo-EM for MERS-CoV | [30, 31] |
| Heart-nosed bat alphacoronavirus | Alphacoronavirus | Uses CEACAM6 receptor | Predicted RBD | None available | [1] |

Conclusion

The integration of AlphaFold2-based structure prediction with all-atom molecular dynamics simulations provides a robust framework for characterizing bat coronavirus spike proteins in the absence of experimental structures [13, 14, 16]. These computational approaches enable the identification of critical residues governing receptor specificity, the dynamic conformational changes required for viral entry, and the evolutionary pathways that may facilitate zoonotic spillover [18, 2, 24, 10, 3]. The continued development of deep learning models and enhanced sampling techniques in MD will further improve the accuracy and predictive power of these methods [15, 28]. Future efforts should focus on expanding coverage to understudied bat coronavirus lineages, incorporating glycan heterogeneity into simulations, and coupling structural predictions with machine learning-based risk assessment [1, 23, 32, 33, 34, 35, 4]. Such integrated computational virology approaches will be instrumental in pandemic preparedness and the design of broad-spectrum countermeasures [22, 26, 27, 30].

References

[1] Gallo G, Di Nardo A, Lugano D et al. Heart-nosed bat alphacoronaviruses use human CEACAM6 to enter cells. Nature. 2026. URL: https://pubmed.ncbi.nlm.nih.gov/42020746/

[2] Su C, He J, Wang L et al. Structural characteristics of BtKY72 RBD bound to bat ACE2 reveal multiple key residues affecting ACE2 usage of sarbecoviruses. mBio. 2024. URL: https://pubmed.ncbi.nlm.nih.gov/39082798/

[3] Guo H, Hu BJ, Yang XL et al. Evolutionary Arms Race between Virus and Host Drives Genetic Diversity in Bat Severe Acute Respiratory Syndrome-Related Coronavirus Spike Genes. J Virol. 2020. URL: https://pubmed.ncbi.nlm.nih.gov/32699095/

[4] Lau SKP, Luk HKH, Wong ACP et al. Identification of a Novel Betacoronavirus (Merbecovirus) in Amur Hedgehogs from China. Viruses. 2019. URL: https://pubmed.ncbi.nlm.nih.gov/31653070/

[5] Moreno A, Lelli D, de Sabato L et al. Detection and full genome characterization of two beta CoV viruses related to Middle East respiratory syndrome from bats in Italy. Virol J. 2017. URL: https://pubmed.ncbi.nlm.nih.gov/29258555/

[6] Wan Y, Shang J, Graham R et al. Receptor Recognition by the Novel Coronavirus from Wuhan: an Analysis Based on Decade-Long Structural Studies of SARS Coronavirus. J Virol. 2020. URL: https://pubmed.ncbi.nlm.nih.gov/31996437/

[7] Hou Y, Peng C, Yu M et al. Angiotensin-converting enzyme 2 (ACE2) proteins of different bat species confer variable susceptibility to SARS-CoV entry. Arch Virol. 2010. URL: https://pubmed.ncbi.nlm.nih.gov/20567988/

[8] Hills FR, Eruera AR, Hodgkinson-Bean J et al. Variation in structural motifs within SARS-related coronavirus spike proteins. PLoS Pathog. 2024. URL: https://www.semanticscholar.org/paper/b978c5075e0b5257fafa6abf579e1a7b5ef8750c

[9] Kaushik R, Kumar N, Zhang KYJ et al. A novel structure-based approach for identification of vertebrate susceptibility to SARS-CoV-2: Implications for future surveillance programmes. Environ Res. 2022. URL: https://pubmed.ncbi.nlm.nih.gov/35460633/

[10] Khaledian E, Ulusan S, Erickson J et al. Sequence determinants of human-cell entry identified in ACE2-independent bat sarbecoviruses: A combined laboratory and computational network science approach. EBioMedicine. 2022. URL: https://pubmed.ncbi.nlm.nih.gov/35405384/

[11] Damas J, Hughes GM, Keough KC et al. Broad host range of SARS-CoV-2 predicted by comparative and structural analysis of ACE2 in vertebrates. Proc Natl Acad Sci U S A. 2020. URL: https://pubmed.ncbi.nlm.nih.gov/32826334/

[12] Jaimes JA, Andre NM, Millet JK et al. Structural modeling of 2019-novel coronavirus (nCoV) spike protein reveals a proteolytically-sensitive activation loop as a distinguishing feature compared to SARS-CoV and related SARS-like coronaviruses. ArXiv. 2020. URL: https://pubmed.ncbi.nlm.nih.gov/32550245/

[13] Math RK, Mudennavar N, Javaregowda PK et al. In Silico Comparative Analysis of the Functional, Structural, and Evolutionary Properties of SARS-CoV-2 Variant Spike Proteins. JMIR Bioinform Biotechnol. 2022. URL: https://pubmed.ncbi.nlm.nih.gov/35669291/

[14] Trigueiro-Louro J, Correia V, Figueiredo-Nunes I et al. Unlocking COVID therapeutic targets: A structure-based rationale against SARS-CoV-2, SARS-CoV and MERS-CoV Spike. Comput Struct Biotechnol J. 2020. URL: https://pubmed.ncbi.nlm.nih.gov/32913581/

[15] Sharma SK, Maheshwari S. Protein Structure and Function Prediction of SARS-CoV 2: Prospective Antivirus Active Drug Binding Sites. IOP Conference Series: Materials Science and Engineering. 2021. URL: https://www.semanticscholar.org/paper/4e43ae78400f26c83a4593483e34f17bafa79f40

[16] Balogun T, Kearns FL, Calvó-Tusell C et al. Structural Dynamics and Allosteric Communication of a SARS-Like Bat Coronavirus Spike Glycoprotein. bioRxiv. 2025. URL: https://www.semanticscholar.org/paper/82afae5c3629fc34425135df3187d4de9e56f0e4

[17] Lupala CS, Kumar V, Su XD et al. Computational insights into differential interaction of mammalian angiotensin-converting enzyme 2 with the SARS-CoV-2 spike receptor binding domain. Comput Biol Med. 2022. URL: https://pubmed.ncbi.nlm.nih.gov/34758907/

[18] Dubey A, Kumar M, Tufail A. Inhibiting viral entry of bat-derived coronavirus HKU5-CoV-2: Targeting spike protein S1 subunit with FDA-approved antivirals-A structural dynamics and energetics study. Bioorg Chem. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/40865231/

[19] Armijos-Jaramillo V, Yeager J, Muslin C et al. SARS-CoV-2, an evolutionary perspective of interaction with human ACE2 reveals undiscovered amino acids necessary for complex stability. Evol Appl. 2020. URL: https://pubmed.ncbi.nlm.nih.gov/32837536/

[20] Sang ER, Tian Y, Gong Y et al. Integrate structural analysis, isoform diversity, and interferon-inductive propensity of ACE2 to predict SARS-CoV2 susceptibility in vertebrates. Heliyon. 2020. URL: https://pubmed.ncbi.nlm.nih.gov/32904785/

[21] Zhai X, Sun J, Yan Z et al. Comparison of Severe Acute Respiratory Syndrome Coronavirus 2 Spike Protein Binding to ACE2 Receptors from Human, Pets, Farm Animals, and Putative Intermediate Hosts. J Virol. 2020. URL: https://pubmed.ncbi.nlm.nih.gov/32404529/

[22] Stein SC, Ssebyatika G, Benecke T et al. A critical residue in a conserved RBD epitope determines neutralization breadth of pan-sarbecovirus antibodies with recurring YYDRxxG motifs. mBio. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/40742150/

[23] Zheng N, Xu Y. In silico design of a novel multi-epitope mRNA vaccine candidate for BtHKU5-CoV-2 using immunoinformatics. PLoS Negl Trop Dis. 2026. URL: https://pubmed.ncbi.nlm.nih.gov/41931545/

[24] Telenti A, Hodcroft EB, Robertson DL. The Evolution and Biology of SARS-CoV-2 Variants. Cold Spring Harb Perspect Med. 2022. URL: https://pubmed.ncbi.nlm.nih.gov/35444005/

[25] Nagy A, Basiouni S, Parvin R et al. Evolutionary insights into the furin cleavage sites of SARS-CoV-2 variants from humans and animals. Arch Virol. 2021. URL: https://pubmed.ncbi.nlm.nih.gov/34258664/

[26] Kathwate GH. In silico Design and Characterization of Multi-epitopes Vaccine for SARS-CoV2 from Its Spike Protein. Int J Pept Res Ther. 2022. URL: https://pubmed.ncbi.nlm.nih.gov/35002585/

[27] Pagadala NS, Landi A, Maturu P et al. In silico identification of RBD subdomain of spike protein from Pro(322)-Thr(581) for applications in vaccine development against SARS-CoV2. J Mol Struct. 2021. URL: https://pubmed.ncbi.nlm.nih.gov/33967342/

[28] Akhand MRN, Azim KF, Hoque SF et al. Genome based Evolutionary study of SARS-CoV-2 towards the Prediction of Epitope Based Chimeric Vaccine. bioRxiv. 2020. URL: https://www.semanticscholar.org/paper/c9d0167a016211160a57d0dc9eb5f7c1566ab16a

[29] Michel CJ, Mayer C, Poch O et al. Characterization of accessory genes in coronavirus genomes. Virol J. 2020. URL: https://pubmed.ncbi.nlm.nih.gov/32854725/

[30] Sharmin R, Islam AB. Conserved antigenic sites between MERS-CoV and Bat-coronavirus are revealed through sequence analysis. Source Code Biol Med. 2016. URL: https://pubmed.ncbi.nlm.nih.gov/26962326/

[31] Al-Shomrani BM, Manee MM, Alharbi SN et al. Genomic Sequencing and Analysis of Eight Camel-Derived Middle East Respiratory Syndrome Coronavirus (MERS-CoV) Isolates in Saudi Arabia. Viruses. 2020. URL: https://pubmed.ncbi.nlm.nih.gov/32503352/

[32] Vale FF, Vítor JMB, Marques AT et al. Origin, phylogeny, variability and epitope conservation of SARS-CoV-2 worldwide. Virus Res. 2021. URL: https://pubmed.ncbi.nlm.nih.gov/34339772/

[33] Pérez de la Lastra JM, Asensio-Calavia P, González-Acosta S et al. Bioinformatic Analysis of Genome-Predicted Bat Cathelicidins. Molecules. 2021. URL: https://pubmed.ncbi.nlm.nih.gov/33806967/

[34] Lu S, Xie XX, Zhao L et al. The immunodominant and neutralization linear epitopes for SARS-CoV-2. Cell Rep. 2021. URL: https://pubmed.ncbi.nlm.nih.gov/33503420/

[35] Akhand MRN, Azim KF, Hoque SF et al. Genome based evolutionary lineage of SARS-CoV-2 towards the development of novel chimeric vaccine. Infect Genet Evol. 2020. URL: https://pubmed.ncbi.nlm.nih.gov/32882432/