Zubair Khalid

Virologist/Molecular Biologist | Veterinarian | Bioinformatician

Conventional & Molecular Virology • Vaccine Development • Computational Biology

Dr. Zubair Khalid is a veterinarian and virologist specializing in conventional and molecular virology, vaccine development, and computational biology. Dedicated to advancing animal health through innovative research and multi-omics approaches.


Section: Computational Biology

Predicting Cross-Species Viral Spillover: Integrating Structural Modeling, Receptor Binding Dynamics, and Genomic Surveillance

Introduction

The emergence of viral pathogens from animal reservoirs into new host species represents a persistent threat to animal health and agricultural biosecurity [1, 2]. Predicting cross-species spillover requires an integrated understanding of viral molecular determinants, host receptor landscapes, and ecological exposure pathways [3, 4]. Computational virology has advanced to provide quantitative frameworks that combine structural biology, biophysical simulation, and high-throughput sequence analysis to assess spillover risk before a virus establishes sustained transmission in a novel host [5, 6]. This review focuses on the integration of three core methodologies: structural modeling of viral receptor-binding proteins, analysis of receptor binding dynamics across host orthologs, and genomic surveillance of viral diversity in reservoir populations. Emphasis is placed on veterinary-relevant pathogens, including coronaviruses, influenza A viruses, and henipaviruses, with bat and avian reservoirs as primary examples [7, 8, 9].

Structural Modeling of Viral Receptor-Binding Proteins

The initial molecular event in cross-species transmission is the binding of a viral attachment protein to a host cell surface receptor [2, 10]. For enveloped RNA viruses, the receptor-binding protein (e.g., coronavirus spike protein, influenza hemagglutinin, henipavirus attachment glycoprotein) determines host range and tissue tropism [11, 12]. Structural modeling using homology-based approaches and deep learning methods such as AlphaFold2 enables the prediction of three-dimensional conformations for these proteins when experimental structures are unavailable [13, 12]. These models serve as templates for downstream docking simulations and binding affinity calculations.

Molecular docking algorithms, such as those implemented in AutoDock Vina, predict the orientation of a viral receptor-binding domain (RBD) relative to a host receptor ortholog and estimate the binding energy of the complex [11, 14]. For example, docking of porcine respiratory coronavirus spike protein with candidate intermediate host ACE2 orthologs identified key residue contacts that govern species susceptibility [11]. Similarly, systematic docking of bat sarbecovirus RBDs against a panel of vertebrate ACE2 sequences revealed that predicted binding affinity correlates with phylogenetic distance between host species [14, 15]. These studies demonstrate that structural compatibility at the RBD-receptor interface is a necessary but not sufficient condition for spillover [12, 16].

Molecular dynamics (MD) simulations extend static docking by modeling the conformational flexibility of both viral and host proteins over nanosecond to microsecond timescales [12, 10]. MD trajectories provide estimates of binding free energy, hydrogen bond occupancy, and induced-fit changes that static docking cannot capture [12]. Comparative MD simulations of SARS-related coronavirus spike proteins with ACE2 orthologs from multiple mammalian species have identified mutations in the RBD that enhance binding to non-reservoir hosts [12, 15]. These simulations can be used to rank the spillover potential of circulating viral variants as new sequences are reported, although their computational cost limits how many variants can be assessed at once [13, 12].
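To make the hydrogen bond occupancy metric concrete, the following minimal Python sketch computes the fraction of trajectory frames in which a single donor-acceptor pair satisfies a geometric criterion. It assumes the per-frame distances and donor-hydrogen-acceptor angles have already been extracted from the trajectory by an upstream tool. The function name and the default cutoffs (3.5 angstroms, 135 degrees) are illustrative assumptions, not values taken from the cited studies.

```python
from typing import Sequence


def hbond_occupancy(
    distances_angstrom: Sequence[float],
    angles_degrees: Sequence[float],
    max_distance: float = 3.5,  # assumed cutoff; conventions differ between tools
    min_angle: float = 135.0,   # assumed cutoff; conventions differ between tools
) -> float:
    """Return the fraction of frames in which the pair counts as hydrogen bonded."""
    if len(distances_angstrom) != len(angles_degrees):
        raise ValueError("distance and angle series must cover the same frames")
    if not distances_angstrom:
        raise ValueError("at least one frame is required")
    # A frame counts only if both the distance and the angle criteria hold.
    hits = sum(
        1
        for distance, angle in zip(distances_angstrom, angles_degrees)
        if distance <= max_distance and angle >= min_angle
    )
    return hits / len(distances_angstrom)


if __name__ == "__main__":
    # Toy values for four frames; not taken from a real simulation.
    toy_distances = [2.9, 3.1, 3.8, 3.0]
    toy_angles = [160.0, 150.0, 170.0, 120.0]
    print(hbond_occupancy(toy_distances, toy_angles))
```

Comparing occupancies for the same interface contact across host orthologs is one simple way to see which contacts persist and which are lost in a given species.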

Receptor Binding Dynamics Across Host Orthologs

The host receptor itself evolves under selective pressures that shape its interaction with viral ligands [14, 15]. For coronaviruses, ACE2 is the primary receptor, and its sequence variability across mammals determines the breadth of potential host range [14, 15]. Systematic multi-reference sequence alignment of ACE2 from over 100 vertebrate species, combined with structural mapping of contact residues, allows prediction of which species are likely to be susceptible to a given sarbecovirus [14]. This approach identified several bat species and select artiodactyls as having ACE2 orthologs with high predicted binding affinity to pandemic-capable spike RBDs [14, 15].
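The contact-residue comparison described above can be sketched in a few lines of Python. This is a simplified illustration rather than the method of the cited studies: it assumes the receptor sequences are already aligned to a reference (equal length, gaps as "-") and that the contact columns are known from a solved or modeled complex. The sequences, column indices, and function names below are hypothetical placeholders.

```python
from typing import Dict, List, Sequence, Tuple


def contact_identity(reference: str, ortholog: str, contact_columns: Sequence[int]) -> float:
    """Fraction of contact columns where the ortholog matches the reference residue."""
    if len(reference) != len(ortholog):
        raise ValueError("sequences must come from the same alignment (equal length)")
    if not contact_columns:
        raise ValueError("at least one contact column is required")
    matches = sum(
        1
        for column in contact_columns
        if reference[column] != "-" and reference[column] == ortholog[column]
    )
    return matches / len(contact_columns)


def rank_orthologs(
    reference: str, orthologs: Dict[str, str], contact_columns: Sequence[int]
) -> List[Tuple[str, float]]:
    """Sort species from highest to lowest contact-residue identity."""
    scores = {
        species: contact_identity(reference, sequence, contact_columns)
        for species, sequence in orthologs.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy aligned fragments; these are NOT real ACE2 sequences.
    toy_reference = "MKTQHELYDF"
    toy_orthologs = {
        "species_a": "MKTQHELYDF",
        "species_b": "MKSQHELFDF",
        "species_c": "MK-QNELFDF",
    }
    toy_contacts = [2, 4, 7, 9]  # hypothetical 0-based alignment columns
    for species, score in rank_orthologs(toy_reference, toy_orthologs, toy_contacts):
        print(species, score)
```

Simple identity ignores conservative substitutions, so a substitution matrix or a structure-based energy term would be a natural refinement.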

For influenza A viruses, the receptor is sialic acid linked to galactose by either alpha-2,3 or alpha-2,6 linkages [10]. Avian influenza viruses preferentially bind alpha-2,3 sialic acids found in the avian intestinal tract, while mammalian-adapted viruses, notably human-adapted strains, bind alpha-2,6 sialic acids, which predominate in the human upper respiratory tract [8, 10]. Computational modeling of hemagglutinin-receptor interactions using MD simulations and free energy perturbation can predict the amino acid substitutions that shift binding preference from avian-type to mammalian-type receptors [10, 17]. This structural insight is critical for assessing the pandemic potential of avian influenza viruses circulating in poultry [18, 19].
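Free energy perturbation typically reports a relative binding free energy, ΔΔG = ΔG(mutant) - ΔG(wild type). Under the standard thermodynamic relation ΔG = RT ln(Kd), this converts to a fold change in dissociation constant, Kd(mutant)/Kd(wild type) = exp(ΔΔG/RT). The Python sketch below performs that conversion; the function name and the default temperature of 298.15 K are assumptions chosen for illustration.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # gas constant R in kcal/(mol*K)


def kd_fold_change(ddg_kcal_per_mol: float, temperature_k: float = 298.15) -> float:
    """Convert a relative binding free energy into Kd(mutant) / Kd(wild type).

    Values below 1 mean tighter binding by the mutant (negative ddG);
    values above 1 mean weaker binding (positive ddG).
    """
    if temperature_k <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    return math.exp(ddg_kcal_per_mol / (GAS_CONSTANT_KCAL * temperature_k))


if __name__ == "__main__":
    # Hypothetical ddG values, not results from any cited study.
    for ddg in (-1.0, 0.0, 1.0):
        print(ddg, round(kd_fold_change(ddg), 2))
```

At room temperature, a ΔΔG of about 1.36 kcal/mol corresponds to a tenfold change in Kd, which is a useful yardstick when judging whether a predicted receptor-switching substitution is likely to matter.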

Henipaviruses, such as Nipah virus and the recently discovered Angavokely virus from fruit bats in Madagascar, utilize ephrin-B2 and ephrin-B3 as entry receptors [9]. Structural modeling of the henipavirus attachment glycoprotein in complex with ephrin orthologs from various mammalian species can identify species barriers [9, 5]. The conservation of ephrin-B2 across mammals suggests a broad potential host range, but subtle differences in binding interface residues may modulate spillover efficiency [9, 5].

Genomic Surveillance and Sequence-Based Risk Prediction

Genomic surveillance of viral populations in reservoir hosts provides the raw material for structural and evolutionary analyses [20, 21, 22]. Metagenomic sequencing of wildlife samples, including bats, rodents, and shrews, has uncovered a vast diversity of novel viruses with unknown zoonotic potential [20, 23, 24, 22]. For example, virome characterization of small mammals in West Africa, including Nigeria, identified numerous coronaviruses, paramyxoviruses, and orthomyxoviruses with sequence features suggestive of cross-species capability [20, 21]. Similarly, analysis of rodent and shrew viromes in suburban and high-altitude regions revealed novel viruses with phylogenetic proximity to known zoonotic pathogens [23, 24, 22].

Sequence-based risk prediction leverages features such as codon usage bias, genomic reassortment potential, and phylogenetic host range [19, 25, 6]. Codon usage patterns in merbecoviruses (a subgenus of the genus Betacoronavirus) differ between bat and non-bat hosts, and these patterns can be used to infer recent host adaptation [25]. For segmented viruses like influenza A, genomic similarity metrics can quantify the reassortment potential between circulating avian strains and mammalian-adapted strains [19, 17]. The H7N9 subtype, for instance, acquired internal gene segments from H9N2 viruses through reassortment, a process that can be modeled using sequence similarity networks [19, 17].
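One widely used codon usage statistic is relative synonymous codon usage (RSCU): the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The Python sketch below computes RSCU for an in-frame coding sequence using the standard genetic code. It is offered as an illustration of the general metric, not as the specific analysis used in [25], and the function name and toy sequence are assumptions.

```python
from collections import Counter, defaultdict
from itertools import product
from typing import Dict

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(coding_sequence: str) -> Dict[str, float]:
    """Return RSCU per sense codon for amino acids present in the sequence."""
    sequence = coding_sequence.upper().replace("U", "T")
    usable_length = len(sequence) - len(sequence) % 3  # drop a trailing partial codon
    codons = [sequence[i:i + 3] for i in range(0, usable_length, 3)]
    counts = Counter(codon for codon in codons if codon in CODON_TABLE)

    # Group sense codons into synonymous families, skipping stop codons.
    families = defaultdict(list)
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid != "*":
            families[amino_acid].append(codon)

    values: Dict[str, float] = {}
    for family in families.values():
        total = sum(counts[codon] for codon in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for codon in family:
            # Observed count divided by the expected count (total / family size).
            values[codon] = counts[codon] * len(family) / total
    return values


if __name__ == "__main__":
    toy_cds = "GCTGCTGCCAAA"  # toy fragment: Ala, Ala, Ala, Lys
    for codon, value in sorted(rscu(toy_cds).items()):
        print(codon, round(value, 3))
```

An RSCU of 1 indicates no bias, so comparing RSCU profiles of a virus against those of candidate hosts gives a rough, sequence-only indication of adaptation.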

Machine learning models trained on viral genomic features and host ecological traits can predict the probability of zoonotic spillover [3, 26, 5, 4]. A unified framework that integrates RNA virus genome composition, host phylogenetic distance, and reservoir contact networks has been developed to prioritize cross-species transmission risk across an expansive host landscape [3]. Predictive multi-omic biomarkers, including host transcriptomic signatures and viral mutation profiles, further refine spillover risk assessments in urban environments where human-animal interfaces are dense [26]. These models consistently identify bat-borne coronaviruses and avian influenza viruses as high-risk groups [3, 4].

Integrating Structural and Genomic Data: A Computational Workflow

The integration of structural modeling, receptor binding dynamics, and genomic surveillance requires a systematic computational pipeline. The following Mermaid diagram illustrates a typical workflow for predicting cross-species spillover risk.

flowchart TD
    A[Wildlife Sampling & Metagenomic Sequencing] --> B[Viral Genome Assembly & Annotation]
    B --> C[Phylogenetic Classification & Host Association]
    C --> D[Identification of Receptor-Binding Protein Genes]
    D --> E["Structural Modeling of RBD (Homology / AlphaFold2)"]
    E --> F[Molecular Docking with Host Receptor Orthologs]
    F --> G[Molecular Dynamics Simulations & Binding Free Energy Calculation]
    G --> H[Ranking of Host Species by Predicted Binding Affinity]
    C --> I[Codon Usage & Reassortment Analysis]
    I --> J[Machine Learning Risk Classification]
    H --> J
    J --> K[Spillover Risk Prioritization]
    K --> L[Targeted Surveillance & Experimental Validation]

The workflow begins with metagenomic sequencing of reservoir hosts, followed by genome assembly and phylogenetic classification [20, 21, 22]. Receptor-binding protein genes are extracted and modeled structurally using template-based or deep learning methods [13, 12]. Docking and MD simulations evaluate binding affinity against a panel of host receptor orthologs [11, 14, 12]. Concurrently, genomic features such as codon usage and reassortment potential are fed into machine learning classifiers [3, 25]. The combined output prioritizes host species and viral lineages for targeted surveillance and experimental validation [26, 5].
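The final prioritization step, where the structural and genomic branches converge, can be illustrated with a simple rank-aggregation sketch in Python. It assumes each virus-host pair already carries a predicted binding free energy from the docking and MD branch and a risk probability from the genomic classifier. The `Candidate` class, the equal weighting of the two ranks, and all values are hypothetical; a real pipeline would likely calibrate the weighting against validation data.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    lineage: str
    host: str
    binding_energy: float  # predicted binding free energy, kcal/mol; more negative is stronger
    genomic_risk: float    # classifier probability between 0 and 1


def prioritize(candidates: List[Candidate]) -> List[Candidate]:
    """Order candidates by the sum of their structural and genomic ranks."""
    by_binding = sorted(candidates, key=lambda c: c.binding_energy)  # strongest binder first
    by_genomic = sorted(candidates, key=lambda c: c.genomic_risk, reverse=True)
    binding_rank = {candidate: rank for rank, candidate in enumerate(by_binding)}
    genomic_rank = {candidate: rank for rank, candidate in enumerate(by_genomic)}
    # Lower combined rank means higher priority; ties keep their input order.
    return sorted(candidates, key=lambda c: binding_rank[c] + genomic_rank[c])


if __name__ == "__main__":
    # Placeholder lineages, hosts, and scores for illustration only.
    toy_candidates = [
        Candidate("lineage_A", "host_1", -10.0, 0.5),
        Candidate("lineage_B", "host_1", -8.0, 0.9),
        Candidate("lineage_C", "host_2", -5.0, 0.1),
        Candidate("lineage_D", "host_2", -9.0, 0.2),
    ]
    for candidate in prioritize(toy_candidates):
        print(candidate.lineage, candidate.host)
```

Working with ranks rather than raw scores avoids having to put binding energies and probabilities on a common scale, at the cost of discarding the size of the differences between candidates.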

Case Studies in Veterinary Virology

Several case studies illustrate the practical application of these integrated methods. For bat coronaviruses, structural modeling of the spike RBD from diverse sarbecoviruses against ACE2 orthologs of livestock species (e.g., swine, cattle, goats) has identified specific mutations that enhance binding to these hosts [14, 12]. These predictions align with experimental binding assays and suggest that certain bat coronaviruses have a higher propensity for spillover into domestic animals than others [7, 14].

For avian influenza, computational docking of hemagglutinin from H5N1 and H7N9 subtypes with avian and mammalian sialic acid receptors has pinpointed key amino acid substitutions (e.g., Q226L and G228S, in H3 numbering) that confer mammalian adaptation [10, 17]. MD simulations further indicate that these substitutions stabilize the complex between hemagglutinin and the alpha-2,6-linked receptors found in the mammalian respiratory tract [10]. Genomic surveillance of poultry flocks in endemic regions can detect the emergence of such mutations early, enabling preemptive biosecurity measures [18, 8].
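Screening surveillance sequences for these adaptation markers can be sketched as follows. This Python example assumes an upstream step has already aligned each hemagglutinin sequence and mapped its residues to H3 numbering, so that each isolate is represented as a mapping from position to residue. The `MARKERS` table contains only the two substitutions named above, and the function name and isolate data are hypothetical.

```python
from typing import Dict, List, Mapping, Tuple

# Position (H3 numbering) -> (avian-type residue, mammalian-type residue).
# Only the two substitutions named in the text; not an exhaustive marker list.
MARKERS: Dict[int, Tuple[str, str]] = {
    226: ("Q", "L"),
    228: ("G", "S"),
}


def flag_markers(residues_by_h3_position: Mapping[int, str]) -> List[str]:
    """Return the marker substitutions present in one isolate, e.g. ["Q226L"]."""
    flagged = []
    for position, (avian_residue, mammalian_residue) in sorted(MARKERS.items()):
        observed = residues_by_h3_position.get(position)  # None if position not covered
        if observed == mammalian_residue:
            flagged.append(f"{avian_residue}{position}{mammalian_residue}")
    return flagged


if __name__ == "__main__":
    # Hypothetical isolates, already converted to H3 numbering upstream.
    toy_isolates = {
        "isolate_1": {226: "Q", 228: "G"},
        "isolate_2": {226: "L", 228: "G"},
        "isolate_3": {226: "L", 228: "S"},
    }
    for name, residues in toy_isolates.items():
        print(name, flag_markers(residues))
```

The numbering conversion is the error-prone part in practice, because hemagglutinin subtypes differ in length and insertions, so a flagged marker is best confirmed against the alignment before it triggers any response.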

For henipaviruses, structural comparison of the attachment glycoprotein from Angavokely virus with that of Nipah virus reveals conserved ephrin-binding interfaces, suggesting that livestock species expressing ephrin-B2 are potentially susceptible [9]. Docking simulations with ephrin orthologs from pigs, horses, and dogs can rank spillover risk and inform surveillance priorities [9, 5].

Limitations and Future Directions

Despite significant progress, current computational approaches have limitations. Docking and MD simulations are computationally intensive and may not capture the full complexity of viral entry, including post-receptor binding events such as membrane fusion and endosomal escape [2, 12]. Additionally, the availability of high-quality structural templates for novel viral proteins remains a bottleneck [13]. Machine learning models require large, well-curated training datasets that are often biased toward well-studied viral families [3, 6]. Furthermore, ecological and behavioral factors that influence exposure frequency are not captured by molecular models alone [16, 4].

Future directions include the integration of structural modeling with deep mutational scanning data to predict escape mutations that alter receptor specificity [13, 12]. Structure predictors built on protein language models, such as ESMFold, can accelerate structure prediction for novel viral proteins without requiring multiple sequence alignments or homologous templates [13]. Real-time genomic surveillance using portable sequencing technologies, combined with automated structural modeling pipelines, could enable rapid risk assessment during outbreak investigations [20, 21]. Finally, incorporating host immune status and prior exposure history into predictive models will improve the accuracy of spillover risk assessments [26, 1].

Conclusion

Predicting cross-species viral spillover requires a multidisciplinary approach that integrates structural modeling of receptor-binding proteins, biophysical analysis of receptor binding dynamics, and genomic surveillance of viral diversity in reservoir populations. Computational methods, including molecular docking, molecular dynamics simulations, and machine learning classifiers, provide quantitative tools to prioritize viral lineages and host species for targeted surveillance. These methods have been successfully applied to bat coronaviruses, avian influenza viruses, and henipaviruses, demonstrating their utility in veterinary virology. Continued advances in structural prediction algorithms, high-throughput sequencing, and integrative modeling will further enhance our ability to anticipate and mitigate emerging zoonotic threats.

References

[1] Bhatia B, Sonar S, Khan S et al. Pandemic-Proofing: Intercepting Zoonotic Spillover Events. Pathogens. 2024. URL: https://pubmed.ncbi.nlm.nih.gov/39770327/

[2] Pauciullo S, Zulian V, La Frazia S et al. Spillover: Mechanisms, Genetic Barriers, and the Role of Reservoirs in Emerging Pathogens. Microorganisms. 2024. URL: https://pubmed.ncbi.nlm.nih.gov/39597581/

[3] Zhao D, Wang YF, Yin ZF et al. A Unified Framework to Prioritize RNA Virus Cross-Species Transmission Risk Across an Expansive Host Landscape. Viruses. 2026. URL: https://pubmed.ncbi.nlm.nih.gov/41754554/

[4] Olival KJ, Hosseini PR, Zambrana-Torrelio C et al. Host and viral traits predict zoonotic spillover from mammals. Nature. 2017. URL: https://pubmed.ncbi.nlm.nih.gov/28636590/

[5] Pandit PS, Anthony SJ, Goldstein T et al. Predicting the potential for zoonotic transmission and host associations for novel viruses. Commun Biol. 2022. URL: https://pubmed.ncbi.nlm.nih.gov/35986178/

[6] Geoghegan JL, Holmes EC. Predicting virus emergence amid evolutionary noise. Open Biol. 2017. URL: https://pubmed.ncbi.nlm.nih.gov/29070612/

[7] Li X, Kang M, Jiao XY et al. Addressing the zoonotic threat of merbecoviruses. Nat Microbiol. 2026. URL: https://pubmed.ncbi.nlm.nih.gov/42342927/

[8] Rather MA, Gul I, Aman M et al. Understanding the pandemic potential of the avian influenza Virus- key insights into pathogenesis, transmission, and host immunity. Arch Microbiol. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/40549164/

[9] Madera S, Kistler A, Ranaivoson HC et al. Discovery and Genomic Characterization of a Novel Henipavirus, Angavokely Virus, from Fruit Bats in Madagascar. J Virol. 2022. URL: https://pubmed.ncbi.nlm.nih.gov/36040175/

[10] Kuchipudi SV, Nelli RK, Gontu A et al. Sialic Acid Receptors: The Key to Solving the Enigma of Zoonotic Virus Spillover. Viruses. 2021. URL: https://pubmed.ncbi.nlm.nih.gov/33567791/

[11] Sootichote R, Chamkasem A, Toniti W et al. Screening candidate intermediate hosts for porcine respiratory coronavirus using molecular docking. Comp Immunol Microbiol Infect Dis. 2026. URL: https://pubmed.ncbi.nlm.nih.gov/42361779/

[12] Rajendran M, Babbitt GA. Persistent cross-species SARS-CoV-2 variant infectivity predicted via comparative molecular dynamics simulation. R Soc Open Sci. 2022. URL: https://pubmed.ncbi.nlm.nih.gov/36340517/

[13] Kim MC, Jung HJ, Jang SS et al. Evolutionary trajectory estimation via replication simulation of coronavirus spike gene based on random mutation and similarity-based selection. Comput Struct Biotechnol J. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/40895283/

[14] Frank JA, Gan EX, Hooper WB et al. Systematic multi-reference vertebrate ACE2 sequence similarity analysis predicts species susceptibility to SARS-related sarbecoviruses. Sci Rep. 2026. URL: https://pubmed.ncbi.nlm.nih.gov/41851226/

[15] Devaux CA, Pinault L, Osman IO et al. Can ACE2 Receptor Polymorphism Predict Species Susceptibility to SARS-CoV-2? Front Public Health. 2020. URL: https://pubmed.ncbi.nlm.nih.gov/33643982/

[16] Guth S, Visher E, Boots M et al. Host phylogenetic distance drives trends in virus virulence and transmissibility across the animal-human interface. Philos Trans R Soc Lond B Biol Sci. 2019. URL: https://pubmed.ncbi.nlm.nih.gov/31401961/

[17] Farooqui A, Leon AJ, Huang L et al. Genetic diversity of the 2013-14 human isolates of influenza H7N9 in China. BMC Infect Dis. 2015. URL: https://pubmed.ncbi.nlm.nih.gov/25880069/

[18] Wang M, Laison EKE, Philippsen T et al. Mechanistic modelling of highly pathogenic avian influenza: A scoping review revealing critical gaps in cross-species transmission models. PLoS One. 2026. URL: https://pubmed.ncbi.nlm.nih.gov/42060628/

[19] Gao Y, Liu Y, Cao ZW et al. Genomic similarity to quantitatively evaluate the reassortment potential of H7N9 with other subtypes of avian influenza viruses. Front Cell Infect Microbiol. 2026. URL: https://pubmed.ncbi.nlm.nih.gov/41969657/

[20] Zhang XA, Zhang MQ, Liu YW et al. Virome characterization of wild small mammals provides new insight into zoonotic pathogens in West Africa. Microbiome. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/41299624/

[21] Happi A, Sijuwola A, Omah IF et al. Metagenomics reveals cryptic circulation of zoonotic viruses in Nigeria. Res Sq. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/41041524/

[22] Zhang JT, Hu ZY, Tang F et al. Decoding the RNA viromes in shrew lungs along the eastern coast of China. NPJ Biofilms Microbiomes. 2024. URL: https://pubmed.ncbi.nlm.nih.gov/39117662/

[23] Wu H, Jiang X, Xi Y et al. Genetic characterization of the marmot gut virome in high-altitude Qinghai Province and identification of novel viruses with zoonotic potential. mSphere. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/40742122/

[24] Hu ZY, Tang F, Zhang MQ et al. Virome characterization of field-collected rodents in suburban city. NPJ Biofilms Microbiomes. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/40500267/

[25] Yan G, Li Y, Zhou H et al. Comparative Analysis of Codon Usage Patterns and Host Adaptation in Merbecoviruses. Viruses. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/41305501/

[26] Martínez-Ortiz IC, Garcia-Atutxa I, Sanchez-Villamil JI et al. Predictive multi-omic biomarkers for urban zoonotic spillover detection: an integrative review. Front Public Health. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/41613079/