Computational Prediction of Host Tropism and Receptor Binding Dynamics in Emerging Zoonotic Coronaviruses
Introduction
The continuous emergence of zoonotic coronaviruses from animal reservoirs poses a persistent threat to animal health and global biosecurity. Understanding the molecular determinants that govern host range and cross‑species transmission is essential for pre‑emptive risk assessment [1, 2]. The spike (S) glycoprotein, particularly its receptor‑binding domain (RBD), is the primary determinant of host tropism through its interaction with species‑specific cellular receptors such as angiotensin‑converting enzyme 2 (ACE2) or dipeptidyl peptidase 4 (DPP4) [3, 4]. Recent advances in computational structural biology, phylogenetics, and machine learning now enable the systematic prediction of these interactions, providing a quantitative framework for evaluating zoonotic potential [5, 6].
This article reviews the biophysical principles, computational algorithms, and integrative workflows used to forecast host tropism and receptor binding dynamics in coronaviruses circulating among bats, pangolins, and other potential reservoir hosts. Emphasis is placed on veterinary applications, including surveillance in livestock and wildlife populations, and on methods that can be deployed without reliance on commercial platforms.
Molecular Basis of Receptor Binding
Coronavirus entry into host cells is initiated by the binding of the RBD to a species‑specific receptor. For betacoronaviruses of the subgenus Sarbecovirus, ACE2 serves as the primary entry receptor, whereas lineages within the subgenus Merbecovirus (e.g., Middle East respiratory syndrome coronavirus, MERS‑CoV) utilize DPP4 [7, 8]. The structural compatibility between the RBD and host receptor is governed by shape complementarity, electrostatic potential, and hydrogen‑bond networks at the interface [9]. Mutations within the RBD can expand or restrict the range of receptors bound, thereby altering host tropism [10, 11]. For example, cryo‑electron microscopy (cryo‑EM) structures of bat SARS‑like coronavirus WIV1 spike reveal that even single amino acid substitutions in the receptor‑binding motif (RBM) can markedly alter the dynamics of the RBD and enhance ACE2 binding across different species [12, 13].
In parallel, the co‑evolution of spike and hemagglutinin‑esterase (HE) proteins in betacoronaviruses modulates virion avidity and receptor usage [14, 15]. The progressive loss of HE lectin activity has been associated with adaptation to human hosts, and similar evolutionary transitions may be occurring in companion and production animals [16, 17]. Computational models of HE‑encoding betacoronaviruses should therefore account for the functional balance between spike and HE as well as for spike‑receptor binding.
Computational Methods for Tropism Prediction
Sequence Surveillance and Phylogenetic Analysis
The first layer of computational prediction relies on extensive sequence data from animal reservoirs. Global repositories such as GISAID and NCBI GenBank archive thousands of coronavirus genomes collected from bats, pangolins, birds, and livestock [18, 19]. Phylogenetic reconstruction of spike genes allows the identification of clades that harbour RBD sequences with known or predicted receptor affinity [20, 21]. Ancestral sequence reconstruction methods can infer the sequence of intermediate viral ancestors and evaluate whether they possessed latent capacity to bind a given receptor [22]. For example, maximum‑likelihood reconstruction of SARS‑CoV‑2 ancestral RBD sequences revealed that the progenitor already contained residues compatible with human ACE2 binding, suggesting a pre‑existing risk [22].
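The idea behind ancestral sequence reconstruction can be illustrated with a minimal sketch. Note that it uses Fitch parsimony rather than the maximum‑likelihood approach cited above, because parsimony fits in a few lines; the tree topology, taxon names, and four‑residue alignment are toy assumptions and do not come from any real dataset.

```python
from typing import Dict, List, Set, Tuple, Union

# A tree is either a leaf name or a (left, right) pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_set(tree: Tree, site_states: Dict[str, str]) -> Set[str]:
    """Bottom-up Fitch pass: candidate residues at the root of `tree` for one site."""
    if isinstance(tree, str):
        return {site_states[tree]}
    left = fitch_set(tree[0], site_states)
    right = fitch_set(tree[1], site_states)
    shared = left & right
    # Keep the intersection when the children agree, otherwise the union.
    return shared if shared else left | right


def reconstruct_root(tree: Tree, alignment: Dict[str, str]) -> List[Set[str]]:
    """Return the candidate residue set at the root for every alignment column."""
    length = len(next(iter(alignment.values())))
    return [
        fitch_set(tree, {name: seq[i] for name, seq in alignment.items()})
        for i in range(length)
    ]


# Hypothetical taxa and a toy, pre-aligned four-residue fragment.
toy_tree: Tree = (("bat_A", "bat_B"), ("pangolin_A", "human_A"))
toy_alignment = {
    "bat_A": "NLYQ",
    "bat_B": "NLFQ",
    "pangolin_A": "NFYQ",
    "human_A": "NFYQ",
}
root_sets = reconstruct_root(toy_tree, toy_alignment)
```

Columns where the root set contains more than one residue are ambiguous under parsimony. A likelihood‑based method would instead assign a posterior probability to each candidate residue.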
Protein Structure Prediction and Docking
High‑resolution structures of RBD‑receptor complexes determined by X‑ray crystallography or cryo‑EM serve as templates for computational docking [12, 13]. In the absence of experimental structures, tools such as AlphaFold2 and Rosetta can generate models of RBDs and ACE2 orthologs across host species, although the accuracy of such models varies. AlphaFold2, in particular, has been used to model bat coronavirus spike glycoproteins and to predict their interaction interfaces with ACE2 molecules from different mammals [1, 2]. Docking algorithms (e.g., HADDOCK, ZDOCK, ClusPro) then compute interface scores and approximate binding energies, which can be compared against experimentally measured affinities to rank candidate virus‑receptor pairs [23]. An improved fuzzy‑based grey wolf optimizer (GWO) has also been proposed for predicting potential host receptors of viral infection [6].
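To relate a predicted binding free energy to a measurable affinity, the standard thermodynamic relation ΔG = RT ln(Kd), with Kd expressed relative to a 1 M standard state, can be applied. The sketch below assumes the docking output has already been calibrated to kcal/mol, which raw docking scores generally are not. The ortholog names and scores are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def dg_to_kd(delta_g_kcal: float, temperature_k: float = 298.15) -> float:
    """Convert a binding free energy (kcal/mol) to a dissociation constant (M)."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


def kd_to_dg(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Convert a dissociation constant (M) to a binding free energy (kcal/mol)."""
    return R_KCAL * temperature_k * math.log(kd_molar)


def rank_orthologs(scores: dict) -> list:
    """Order receptor orthologs from strongest (most negative) to weakest score."""
    return sorted(scores, key=scores.get)


# Hypothetical calibrated energies for three receptor orthologs (kcal/mol).
toy_scores = {"ortholog_A": -11.2, "ortholog_B": -7.9, "ortholog_C": -9.4}
ranking = rank_orthologs(toy_scores)
kd_example = dg_to_kd(-10.0)  # falls in the tens-of-nanomolar range at 298.15 K
```

Because of the exponential relation, a difference of about 1.4 kcal/mol at room temperature corresponds to roughly a tenfold change in Kd, so small scoring errors translate into large affinity errors.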
Molecular Dynamics Simulations
Static docking is insufficient to capture the conformational flexibility inherent in viral glycoproteins. Molecular dynamics (MD) simulations provide a time‑resolved view of RBD‑ACE2 interactions, revealing transient contacts, key stabilizing residues, and dissociation kinetics [17]. All‑atom MD simulations of bat sarbecovirus spikes have shown that adaptive mutations in the RBD increase the frequency of closed‑to‑open conformational transitions, thereby enhancing receptor accessibility [13]. Similarly, long‑timescale simulations of MERS‑CoV and mink coronaviruses have clarified how DPP4 orthologs from different species influence binding stability [20, 23]. Coarse‑grained MD methods extend these simulations to larger systems, such as full trimeric spike assemblies [12, 13].
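One common way to summarise transient contacts from a trajectory is contact occupancy: the fraction of frames in which a residue pair lies within a distance cutoff. The following minimal sketch assumes each frame has already been reduced to one representative coordinate per residue (for example the Cα atom). The residue labels, coordinates, and 8 Å cutoff are illustrative assumptions, not values from the cited studies.

```python
import math
from typing import Dict, List, Sequence, Tuple

Coord = Tuple[float, float, float]
Frame = Dict[str, Coord]  # residue label -> representative coordinate (angstroms)


def contact_occupancy(
    frames: List[Frame],
    rbd_residues: Sequence[str],
    receptor_residues: Sequence[str],
    cutoff: float = 8.0,
) -> Dict[Tuple[str, str], float]:
    """Fraction of frames in which each RBD-receptor residue pair is within `cutoff`."""
    counts = {(a, b): 0 for a in rbd_residues for b in receptor_residues}
    for frame in frames:
        for a, b in counts:
            if math.dist(frame[a], frame[b]) <= cutoff:
                counts[(a, b)] += 1
    return {pair: n / len(frames) for pair, n in counts.items()}


# Toy four-frame "trajectory" with one hypothetical residue on each side.
toy_frames: List[Frame] = [
    {"RBD:1": (0.0, 0.0, 0.0), "REC:1": (5.0, 0.0, 0.0)},
    {"RBD:1": (0.0, 0.0, 0.0), "REC:1": (7.0, 0.0, 0.0)},
    {"RBD:1": (0.0, 0.0, 0.0), "REC:1": (12.0, 0.0, 0.0)},
    {"RBD:1": (0.0, 0.0, 0.0), "REC:1": (6.0, 0.0, 0.0)},
]
occupancy = contact_occupancy(toy_frames, ["RBD:1"], ["REC:1"])
```

In practice the frames would come from a trajectory reader, and residue pairs with high occupancy would be candidates for the key stabilizing contacts described above.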
Machine Learning and Foundation Models
Recent developments in protein language models (PLMs) and deep learning enable prediction of host tropism directly from sequence. Foundation models such as HViLM combine sequence embeddings with downstream task heads to predict pathogenicity, transmissibility, and host tropism, while evaluation frameworks such as PRIME assess how well protein representations generalize across viral mutation space [1, 2]. Convolutional neural networks can discriminate between avian‑type and mammalian‑type receptor binding preferences from surface‑protein sequences, an approach demonstrated for influenza A virus [5, 9, 10] that could in principle be applied to coronavirus spike sequences. Random forest classifiers applied to protein signatures have likewise distinguished zoonotic from non‑zoonotic influenza A viruses [11, 18, 19], and similar approaches are now being adapted to coronaviruses.
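The general pattern of "sequence features plus downstream classifier" can be sketched without any machine learning library. The example below is a deliberately simplified stand‑in: k‑mer frequencies replace learned PLM embeddings, and a nearest‑centroid rule replaces the random forest or CNN classifiers discussed above. The sequences and host labels are toy values chosen only to make the example run.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Vector = Dict[str, float]


def kmer_vector(seq: str, k: int = 2) -> Vector:
    """Normalised k-mer frequency vector for one protein sequence."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values()) or 1
    return {kmer: n / total for kmer, n in counts.items()}


def centroid(vectors: List[Vector]) -> Vector:
    """Element-wise mean of sparse feature vectors."""
    acc: Counter = Counter()
    for vec in vectors:
        for key, value in vec.items():
            acc[key] += value
    return {key: value / len(vectors) for key, value in acc.items()}


def distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two sparse vectors."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(key, 0.0) - b.get(key, 0.0)) ** 2 for key in keys))


def fit(labelled: List[Tuple[str, str]]) -> Dict[str, Vector]:
    """Compute one centroid per host label from (sequence, label) pairs."""
    groups: Dict[str, List[Vector]] = {}
    for seq, label in labelled:
        groups.setdefault(label, []).append(kmer_vector(seq))
    return {label: centroid(vecs) for label, vecs in groups.items()}


def predict(model: Dict[str, Vector], seq: str) -> str:
    """Assign the label whose centroid is closest to the query sequence."""
    query = kmer_vector(seq)
    return min(model, key=lambda label: distance(model[label], query))


# Toy training set with hypothetical host labels.
toy_training = [
    ("AAAAAC", "host_A"), ("AAACAA", "host_A"),
    ("GGGGGT", "host_B"), ("GGTGGG", "host_B"),
]
toy_model = fit(toy_training)
```

A real workflow would replace `kmer_vector` with embeddings from a pretrained model and would evaluate the classifier on held‑out viral lineages, since random train/test splits tend to overestimate performance on closely related sequences.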
Gene expression profiling of host tissues can further refine tropism predictions. For bovine coronavirus (BCoV), artificial intelligence models trained on transcriptomic data have identified candidate receptors and host cell enzymes that may facilitate tissue‑specific infection [4].
Predictive Frameworks for Zoonotic Risk
An integrated computational pipeline for spillover risk assessment typically proceeds through the following stages, as illustrated in Figure 1.
flowchart TD
A[Field surveillance: bats, livestock, wildlife] --> B[Sequencing of coronavirus genomes]
B --> C[Phylogenetic analysis and clade assignment]
C --> D[Ancestral sequence reconstruction of RBD]
D --> E["Structure prediction (AlphaFold2 / Rosetta)"]
E --> F[Molecular docking against orthologous receptors]
F --> G[Molecular dynamics simulation of interface]
G --> H[Binding energy scoring and interaction network analysis]
H --> I["Machine learning classifiers (PLM, RF, CNN)"]
I --> J[Risk tier: low / moderate / high spillover potential]
J --> K[Targeted veterinary surveillance and biosecurity planning]
Figure 1. Integrated computational workflow for predicting host tropism and zoonotic risk in emerging coronaviruses.
Such frameworks have been applied to predict the zoonotic potential of bat sarbecoviruses such as WIV1 and to identify adaptive mutations that enhance ACE2 binding [12, 13, 22]. The same logic can be extended to mink coronaviruses that have been shown to use ACE2 despite belonging to a different phylogenetic lineage, indicating convergent evolution [23].
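The final tiering step of such a pipeline can be sketched as a simple voting rule over the upstream outputs. Everything specific here is an assumption made for illustration: the `Evidence` fields, the three cutoffs, and the mapping from votes to tiers are hypothetical and are not taken from any of the cited frameworks.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    docking_score: float      # more negative = stronger predicted binding (arbitrary units)
    contact_occupancy: float  # fraction of MD frames with key interface contacts, 0..1
    classifier_prob: float    # classifier probability for the host of interest, 0..1


# Hypothetical cutoffs; real values would need calibration against experiments.
DOCKING_CUTOFF = -10.0
OCCUPANCY_CUTOFF = 0.5
CLASSIFIER_CUTOFF = 0.7


def assign_tier(ev: Evidence) -> str:
    """Map three independent lines of evidence to a low/moderate/high tier."""
    votes = sum([
        ev.docking_score <= DOCKING_CUTOFF,
        ev.contact_occupancy >= OCCUPANCY_CUTOFF,
        ev.classifier_prob >= CLASSIFIER_CUTOFF,
    ])
    if votes == 3:
        return "high"
    if votes == 2:
        return "moderate"
    return "low"


tier_strong = assign_tier(Evidence(-12.0, 0.8, 0.9))
tier_mixed = assign_tier(Evidence(-12.0, 0.8, 0.2))
tier_weak = assign_tier(Evidence(-6.0, 0.1, 0.2))
```

Requiring agreement between structural, dynamical, and sequence‑based evidence is one way to reduce the influence of any single method's systematic error, at the cost of missing viruses that score highly on only one axis.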
Case Studies in Receptor Binding Dynamics
Bat Sarbecovirus WIV1
The bat coronavirus WIV1, an isolate nearly identical in sequence to Rs3367, possesses a spike RBD that is structurally similar to that of SARS‑CoV‑1 [12]. Cryo‑EM structures and MD simulations reveal that A368S and L472P mutations in the WIV1 RBM increase electrostatic compatibility with human ACE2 relative to the ancestral bat sequence [13]. These mutations also alter the conformational dynamics of the spike, favouring the open state and thus enhancing receptor binding [13]. Computational docking recapitulates the effect of these changes and predicts that additional substitutions (e.g., T487S) could further broaden host range [12].
MERS‑CoV and Mink Coronavirus
MERS‑CoV requires DPP4 for entry, and its ability to infect a given host species depends on the conservation of key contact residues in DPP4 [20]. Computational modeling of binding energies, combined with receptor expression experiments, has shown that the DPP4 of dromedary camels, the main animal reservoir for human infection, supports MERS‑CoV replication, as does the DPP4 of goats, cattle, and sheep, whereas the DPP4 of hamsters and ferrets does not [20]. Unexpectedly, a MERS‑like coronavirus isolated from mink was found to use ACE2 rather than DPP4 as its entry receptor [23]. This finding underscores the need to consider receptor switching in spillover predictions.
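A first‑pass screen for receptor compatibility often compares only the residues known to contact the RBD. The sketch below computes the fraction of contact positions at which an ortholog matches a permissive reference receptor. The sequences and contact positions are toy placeholders, not real DPP4 or ACE2 data, and the function assumes the two sequences are already aligned to equal length.

```python
from typing import Sequence


def contact_identity(
    reference: str,
    ortholog: str,
    contact_positions: Sequence[int],
) -> float:
    """Fraction of contact positions (0-based alignment columns) with identical residues.

    Alignment gaps ('-') in either sequence count as mismatches.
    """
    if len(reference) != len(ortholog):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(
        1
        for pos in contact_positions
        if reference[pos] == ortholog[pos] and reference[pos] != "-"
    )
    return matches / len(contact_positions)


# Toy aligned fragments and hypothetical contact columns.
toy_reference = "MKTPWKVLLG"
toy_ortholog = "MKTPWRVLLG"
toy_contacts = [1, 5, 8]
identity = contact_identity(toy_reference, toy_ortholog, toy_contacts)
```

Identity at contact positions is only a coarse proxy: conservative substitutions may preserve binding while a single charge reversal can abolish it, which is why such screens are usually followed by docking or free‑energy calculations.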
Endemic Betacoronaviruses
The adaptation of the endemic human coronaviruses HCoV‑OC43 and HCoV‑229E from animal ancestors involved progressive changes in the spike protein and, for the betacoronavirus HCoV‑OC43, in the HE lectin domain (HCoV‑229E is an alphacoronavirus and does not encode HE) [15, 16]. Comparative computational modeling of animal betacoronaviruses, such as bovine coronavirus (BCoV) and canine respiratory coronavirus, suggests analogous evolutionary trajectories that may currently be underway in livestock and companion animals [4, 14].
Limitations and Future Directions
Despite impressive progress, computational tropism prediction faces several hurdles. First, the availability of high‑quality experimental structures for host receptors from diverse species remains limited; many simulations rely on homology models with uncertain accuracy [7, 17]. Second, machine learning models trained on influenza virus data may not directly transfer to coronaviruses due to differences in receptor biochemistry (e.g., sialic acid versus ACE2) [24]. Third, predictions of binding affinity do not always correlate with actual infection because other viral and host factors (e.g., proteolytic activation of spike, innate immune barriers) modulate entry efficiency [8, 25].
Future work should integrate proteome‑scale modeling of host range, incorporate data from deep mutational scanning of RBD‑antibody escape, and link computational outputs with field epidemiological data. Foundation models that unify sequence, structure, and fitness predictions hold particular promise [2]. Additionally, the inclusion of glycan shields and their effect on RBD accessibility will improve the realism of MD simulations [24].
Conclusion
Computational prediction of host tropism and receptor binding dynamics has matured into a quantitative discipline that complements traditional virological surveillance. By combining phylogenetic analysis, protein structure prediction, molecular dynamics, and machine learning, researchers can now identify coronaviruses with elevated spillover risk before they become established in new host populations. These in silico tools are essential for veterinary biosecurity, enabling proactive monitoring in bat roosts, livestock herds, and wildlife markets. Continued integration of experimental validation and open‑source modeling will further strengthen our capacity to anticipate and mitigate zoonotic threats.
References
[1] Gibson K, Li PE, Li V, et al. PRIME: An evaluation framework for protein representation inference and generalization in viral mutation space. BMC Genomics. 2026. https://pubmed.ncbi.nlm.nih.gov/42215857/
[2] Dutta P, Vaska J, Surana P, et al. HViLM: A Foundation Model for Viral Genomics Enables Multi‑Task Prediction of Pathogenicity, Transmissibility, and Host Tropism. bioRxiv. 2026. https://pubmed.ncbi.nlm.nih.gov/41890006/
[3] Mellace M, Ceniti C, Cataldi M, et al. Avian Influenza Virus: Comparative Evolution as the Key for Predicting Host Tropism Expansion. Pathogens. 2025. https://pubmed.ncbi.nlm.nih.gov/40732656/
[4] Khan MY, Shah AU, Duraisamy N, et al. Leveraging Artificial Intelligence and Gene Expression Analysis to Identify Some Potential Bovine Coronavirus (BCoV) Receptors and Host Cell Enzymes Potentially Involved in the Viral Replication and Tissue Tropism. Int J Mol Sci. 2025. https://pubmed.ncbi.nlm.nih.gov/39941096/
[5] Borkenhagen LK, Runstadler JA. Examining the Influenza A Virus Sialic Acid Binding Preference Predictions of a Sequence‑Based Convolutional Neural Network. Influenza Other Respir Viruses. 2024. https://pubmed.ncbi.nlm.nih.gov/39663148/
[6] Achom A, Das R, Pakray P. An improved Fuzzy based GWO algorithm for predicting the potential host receptor of COVID‑19 infection. Comput Biol Med. 2022. https://pubmed.ncbi.nlm.nih.gov/36334362/
[7] Beaudoin CA, Jamasb AR, Alsulami AF, et al. Predicted structural mimicry of spike receptor‑binding motifs from highly pathogenic human coronaviruses. Comput Struct Biotechnol J. 2021. https://pubmed.ncbi.nlm.nih.gov/34234921/
[8] Murgolo N, Therien AG, Howell B, et al. SARS‑CoV‑2 tropism, entry, replication, and propagation: Considerations for drug discovery and development. PLoS Pathog. 2021. https://pubmed.ncbi.nlm.nih.gov/33596266/
[9] Scarafoni D, Telfer BA, Ricke DO, et al. Predicting Influenza A Tropism with End‑to‑End Learning of Deep Networks. Health Secur. 2019. https://pubmed.ncbi.nlm.nih.gov/31859569/
[10] Yin R, Zhou X, Zheng J, et al. Computational identification of physicochemical signatures for host tropism of influenza A virus. J Bioinform Comput Biol. 2018. https://pubmed.ncbi.nlm.nih.gov/30567479/
[11] Eng CLP, Tong JC, Tan TW. Predicting Zoonotic Risk of Influenza A Viruses from Host Tropism Protein Signature Using Random Forest. Int J Mol Sci. 2017. https://pubmed.ncbi.nlm.nih.gov/28587080/
[12] Liu C, Zheng J, Wang Y, et al. Cryo‑EM structure of locked spike glycoprotein from bat SARS‑like coronavirus WIV1, molecular dynamics and biophysics across host range. Proc Natl Acad Sci U S A. 2026. https://pubmed.ncbi.nlm.nih.gov/41706884/
[13] Tse AL, Lasso G, Berrigan J, et al. Bat sarbecovirus WIV1‑CoV bears an adaptive mutation that alters spike dynamics and enhances ACE2 binding. PLoS Pathog. 2025. https://pubmed.ncbi.nlm.nih.gov/41100556/
[14] Lang Y, Li W, Li Z, et al. Coronavirus hemagglutinin‑esterase and spike proteins coevolve for functional balance and optimal virion avidity. Proc Natl Acad Sci U S A. 2020. https://pubmed.ncbi.nlm.nih.gov/32994342/
[15] Bakkers MJ, Lang Y, Feitsma LJ, et al. Betacoronavirus Adaptation to Humans Involved Progressive Loss of Hemagglutinin‑Esterase Lectin Activity. Cell Host Microbe. 2017. https://pubmed.ncbi.nlm.nih.gov/28279346/
[16] Forni D, Cagliani R, Arrigoni F, et al. Adaptation of the endemic coronaviruses HCoV‑OC43 and HCoV‑229E to the human host. Virus Evol. 2021. https://pubmed.ncbi.nlm.nih.gov/34527284/
[17] Rynkiewicz P, Lynch ML, Cui F, et al. Functional binding dynamics relevant to the evolution of zoonotic spillovers in endemic and emergent Betacoronavirus strains. J Biomol Struct Dyn. 2022. https://pubmed.ncbi.nlm.nih.gov/34286673/
[18] Eng CL, Tong JC, Tan TW. Distinct Host Tropism Protein Signatures to Identify Possible Zoonotic Influenza A Viruses. PLoS One. 2016. https://pubmed.ncbi.nlm.nih.gov/26915079/
[19] Eng CL, Tong JC, Tan TW. Predicting host tropism of influenza A virus proteins using random forest. BMC Med Genomics. 2014. https://pubmed.ncbi.nlm.nih.gov/25521718/
[20] van Doremalen N, Miazgowicz KL, Milne‑Price S, et al. Host species restriction of Middle East respiratory syndrome coronavirus through its receptor, dipeptidyl peptidase 4. J Virol. 2014. https://pubmed.ncbi.nlm.nih.gov/24899185/
[21] Tønnessen R, Hauge AG, Hansen EF, et al. Host restrictions of avian influenza viruses: in silico analysis of H13 and H16 specific signatures in the internal proteins. PLoS One. 2013. https://pubmed.ncbi.nlm.nih.gov/23646204/
[22] Brintnell E, Gupta M, Anderson DW. Phylogenetic and Ancestral Sequence Reconstruction of SARS‑CoV‑2 Reveals Latent Capacity to Bind Human ACE2 Receptor. J Mol Evol. 2021. https://pubmed.ncbi.nlm.nih.gov/34739551/
[23] Wang N, Ji W, Jiao H, et al. A MERS‑CoV‑like mink coronavirus uses ACE2 as an entry receptor. Nature. 2025. https://pubmed.ncbi.nlm.nih.gov/40306315/
[24] Kuchipudi SV, Nelli RK, Gontu A, et al. Sialic Acid Receptors: The Key to Solving the Enigma of Zoonotic Virus Spillover. Viruses. 2021. https://pubmed.ncbi.nlm.nih.gov/33567791/
[25] Ou J, Zhou Z, Dai R, et al. V367F Mutation in SARS‑CoV‑2 Spike RBD Emerging during the Early Transmission Phase Enhances Viral Infectivity through Increased Human ACE2 Receptor Binding Affinity. J Virol. 2021. https://pubmed.ncbi.nlm.nih.gov/34105996/