Zubair Khalid

Virologist/Molecular Biologist | Veterinarian | Bioinformatician

Conventional & Molecular Virology • Vaccine Development • Computational Biology

Dr. Zubair Khalid is a veterinarian and virologist specializing in conventional and molecular virology, vaccine development, and computational biology. Dedicated to advancing animal health through innovative research and multi-omics approaches.


Section: Computational Biology

Structural and Evolutionary Analysis of Rabies Virus Glycoprotein: Implications for Vaccine Design

Abstract

Rabies virus (RABV), a neurotropic lyssavirus within the Rhabdoviridae family, remains a critical pathogen in veterinary medicine due to its high case fatality rate and broad mammalian host range. The viral glycoprotein (G protein) is the primary determinant of host cell receptor binding and membrane fusion, and it is the principal target of the neutralizing antibody response. This article presents a computational virology analysis of the RABV glycoprotein. It examines homology modeling to predict three-dimensional structure, molecular dynamics simulations to assess stability and receptor engagement, and evolutionary analyses across lyssavirus species to identify conserved epitopes. Machine learning approaches for immunogenic region prediction are also discussed. Together, these computational insights inform rational vaccine design, including mRNA and subunit vaccine candidates, and are contextualized within the broader field of structural bioinformatics. Interactive visualization of the glycoprotein structure via the 3D Protein Viewer is recommended for detailed examination of key structural features.

1. Introduction

Rabies virus is the type species of the genus Lyssavirus and causes a progressive encephalomyelitis in mammals that is almost invariably fatal once clinical signs appear [1]. The viral genome is a single-stranded, negative-sense RNA approximately 12 kilobases in length, encoding five structural proteins: nucleoprotein (N), phosphoprotein (P), matrix protein (M), glycoprotein (G), and the RNA-dependent RNA polymerase (L) [1]. Among these, the G protein is the sole envelope protein responsible for receptor recognition, pH-dependent membrane fusion, and viral entry [1, 2].

The G protein is a class III viral fusion glycoprotein that forms homotrimeric spikes on the virion surface [2]. Each monomer consists of an ectodomain, a transmembrane domain, and a short cytoplasmic tail [1]. The ectodomain is further divided into three subdomains: the receptor-binding domain (RBD), the fusion domain (FD), and a pleckstrin homology domain (PHD) that stabilizes the trimeric assembly [2]. Given its central role in infection and immunity, the G protein is the primary target for vaccine development and therapeutic antibody design [2].

The emergence of novel lyssaviruses and the persistent circulation of RABV in wildlife reservoirs necessitate computationally informed approaches to vaccine design [2]. This article provides a detailed examination of how structural, evolutionary, and computational methods can be applied to the RABV glycoprotein to identify conserved antigenic targets and to guide the rational design of next-generation veterinary vaccines. For foundational context on rabies virus biology and surveillance, readers are directed to the related articles Rabies Lyssavirus and Rabies Virus in Wildlife Reservoirs.

2. Structural Architecture of the RABV Glycoprotein

The RABV G protein is synthesized as a precursor of approximately 524 amino acids and is post-translationally modified by N-linked glycosylation and palmitoylation [1]. The mature protein is anchored in the viral envelope via a single transmembrane helix near the C-terminus [1]. The trimeric ectodomain extends approximately 10 nanometers from the viral membrane surface [2].

2.1 Domain Organization

Cryo-electron microscopy (cryo-EM) and X-ray crystallographic studies of related rhabdovirus glycoproteins have enabled high-resolution modeling of the RABV G protein [2]. The ectodomain folds into three distinct structural domains:

  • Receptor-Binding Domain (RBD): Located at the membrane-distal apex of the trimer, the RBD mediates attachment to host cell receptors, including the nicotinic acetylcholine receptor (nAChR), the neural cell adhesion molecule (NCAM), and the p75 neurotrophin receptor (p75NTR) [2].
  • Fusion Domain (FD): This domain undergoes a dramatic pH-dependent conformational rearrangement from a prefusion to a postfusion state, driving the merger of the viral and host endosomal membranes [2].
  • Pleckstrin Homology Domain (PHD): Positioned at the base of the ectodomain, this domain contributes to trimer stabilization and interacts with the fusion domain during the structural transition [2].

2.2 Homology Modeling and Structural Prediction

In the absence of a high-resolution crystal structure for the full-length RABV G protein, homology modeling has been extensively employed. The underlying principle is to align the RABV G protein sequence with a known template structure from a related rhabdovirus and to build the model on that template's backbone. Structural prediction tools such as those described in Structural Prediction of Viral Envelope Glycoproteins Using AlphaFold2 are directly applicable. AlphaFold2 and related deep learning methods can generate models with high global accuracy, enabling the mapping of conserved epitopes and the identification of energetically unfavorable residue substitutions [2].

2.3 Glycosylation and the Glycan Shield

The RABV G protein possesses multiple N-linked glycosylation sequons (asparagine-X-serine/threonine) that are variably occupied depending on the viral strain and host cell type [1]. The glycan shield influences receptor binding, immune evasion, and viral tropism [2]. Computational modeling of glycan dynamics, as reviewed in Glycan Shield Engineering and the Computational Prediction of Immune Escape in Enveloped Viruses, is critical for predicting how glycosylation patterns affect epitope accessibility.
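The sequon pattern described above can be located with a short scan. The following is a minimal sketch, assuming the common convention that the middle residue X may be anything except proline (the text does not state this exclusion); the toy sequence is hypothetical and is not a real RABV G fragment. Finding a sequon says nothing about whether it is actually occupied by a glycan.

```python
import re

# N-X-S/T sequon. The proline exclusion at X is an assumed convention.
# The lookahead lets overlapping sequons (e.g. "NNTT") all be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(seq):
    """Return 1-based positions of the asparagine in each N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(seq.upper())]


# Hypothetical toy sequence used only to exercise the function.
toy_sequence = "MKNATLLNPSGGNVSNNTT"
print(find_sequons(toy_sequence))
```

Positions are reported as 1-based residue numbers so that they can be compared directly with residue numbering used elsewhere in this article.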

3. Molecular Dynamics Simulations of Glycoprotein Dynamics

Molecular dynamics (MD) simulations provide atomic-level insights into the conformational plasticity of the RABV G protein [2]. By numerically integrating Newton's equations of motion with femtosecond-scale time steps, MD simulations can probe the structural transitions that occur during receptor binding and membrane fusion.
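To make the integration step concrete, the sketch below applies the velocity Verlet scheme, a standard MD integrator, to a one-dimensional harmonic oscillator. This is an illustration only: the spring constant, mass, and time step are arbitrary assumed values in reduced units, and a real simulation of the G protein would use a dedicated MD engine and a full force field rather than this toy system.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's equations of motion with the velocity Verlet scheme."""
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory


# Assumed toy parameters in reduced units (not protein-specific).
K = 1.0
MASS = 1.0


def spring_force(x):
    return -K * x


def total_energy(x, v):
    return 0.5 * MASS * v * v + 0.5 * K * x * x


traj = velocity_verlet(x=1.0, v=0.0, force=spring_force, mass=MASS, dt=0.01, n_steps=1000)
e_start = total_energy(*traj[0])
e_end = total_energy(*traj[-1])
print(f"relative energy drift: {abs(e_end - e_start) / e_start:.2e}")
```

Monitoring the energy drift is a simple sanity check on the chosen time step: if the step is too large relative to the fastest motion in the system, the integration becomes unstable.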

3.1 Conformational Transitions and pH Dependence

The RABV G protein is a pH-sensitive fusion machine. At neutral pH, the trimer adopts a prefusion conformation. Upon acidification within the endosome (pH below approximately 6.0), the protein refolds into a postfusion six-helix bundle structure [2]. MD simulations have elucidated the metastable equilibrium between these states and identified key electrostatic interactions that destabilize the prefusion conformation at low pH [2].
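The pH sensitivity described above is often rationalized through the protonation of titratable side chains. As a rough illustration, the Henderson-Hasselbalch relation gives the protonated fraction of a group at a given pH. The sketch below assumes a model pKa of 6.0 for a histidine side chain; this value is an assumption for illustration, since pKa values in a folded protein depend on the local environment and would normally be estimated with dedicated tools.

```python
def protonated_fraction(ph, pka):
    """Henderson-Hasselbalch: fraction of a titratable group that is protonated."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Assumed model pKa for a histidine side chain (illustrative, not measured).
HIS_PKA = 6.0

for ph in (7.4, 6.0, 5.5):
    print(f"pH {ph}: {protonated_fraction(ph, HIS_PKA):.2f} protonated")
```

By construction the fraction is exactly 0.5 when the pH equals the pKa, and it rises as the endosome acidifies, which is the qualitative behavior a pH-triggered switch requires.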

3.2 Receptor Binding Dynamics

The interaction between the RABV G protein RBD and host receptors has been studied using steered MD and free energy perturbation methods [2]. These simulations reveal that binding to the nAChR and p75NTR occurs through a conserved hydrophobic pocket on the RBD surface. Specifically, residues tyrosine 215, methionine 227, and valine 229 are critical for docking with the receptor [2]. The binding free energy landscape is influenced by protonation states of histidine residues in the fusion domain, linking receptor engagement to the pH-dependent fusion trigger [2].

3.3 Assessing Trimer Stability

The RABV G protein trimer is stabilized by inter-subunit contacts within the PHD and by interactions between the RBD and FD of adjacent protomers [2]. MD simulations of the full-length trimer have identified flexible hinge regions that permit breathing motions without dissociating the complex [2]. These dynamics have direct relevance for vaccine design: a hyperstable trimer that retains the prefusion conformation is likely to elicit a more potent neutralizing antibody response, a strategy successfully employed for class I fusion proteins of other viruses.

4. Evolutionary Analysis of Glycoprotein Sequences

Phylogenetic and evolutionary analyses of the G protein across lyssavirus species are essential for identifying conserved functional domains and for predicting antigenic drift.

4.1 Sequence Conservation Across Lyssaviruses

The genus Lyssavirus includes at least 16 recognized species, including rabies virus, Lagos bat virus, Mokola virus, Duvenhage virus, and Australian bat lyssavirus [2]. A multiple sequence alignment of G protein sequences reveals that the RBD and FD display moderate conservation, whereas the PHD and the cytoplasmic tail are more variable [2]. This differential conservation reflects strong purifying selection on critical entry functions and relaxed selective pressure on nonessential domains.
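Per-column conservation in a multiple sequence alignment is commonly summarized with Shannon entropy, where 0 bits means a fully conserved column and higher values mean more variability. The following is a minimal sketch of that calculation; the four aligned sequences are hypothetical toy strings, not lyssavirus G protein sequences, and gaps are simply ignored, which is one of several possible conventions.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy in bits of one alignment column, ignoring gap characters."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return sum((n / total) * log2(total / n) for n in counts.values())


def conservation_profile(alignment):
    """Return one entropy value per column of an equal-length alignment."""
    if len({len(seq) for seq in alignment}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy(col) for col in zip(*alignment)]


# Hypothetical toy alignment used only to exercise the functions.
toy_alignment = ["MKVLA", "MKILA", "MRVLA", "MKVLG"]
profile = conservation_profile(toy_alignment)
print([round(h, 3) for h in profile])
```

Low-entropy columns that also map to surface-exposed positions in the structure are the ones of interest for conserved epitope discovery.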

4.2 Positive Selection and Epitope Diversity

Codons within antigenic sites (e.g., antigenic sites I, II, III, and minor site a) exhibit elevated nonsynonymous-to-synonymous substitution ratios (dN/dS) compared to structural core residues [2]. This pattern is consistent with diversifying selection driven by host immune pressure. For example, site III, comprising residues 330 to 340, is a major neutralizing epitope and shows high variability among circulating field isolates [2]. Tracking these mutations using phylogenetic surveillance is critical for vaccine strain selection.
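The distinction between synonymous and nonsynonymous change can be shown with a small codon comparison. The sketch below only counts differing codons in each class between two aligned, in-frame, gap-free coding sequences using the standard genetic code. It is not a dN/dS estimator: a real analysis normalizes by the number of synonymous and nonsynonymous sites and typically uses codon substitution models in dedicated software. The two sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_changes(seq_a, seq_b):
    """Count differing codons as (synonymous, nonsynonymous)."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, equal length, and in frame")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Hypothetical toy coding sequences (Met-Lys-Leu-Gly vs Met-Arg-Leu-Gly).
print(classify_codon_changes("ATGAAACTGGGT", "ATGAGACTAGGC"))
```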

4.3 Conserved Epitopes as Vaccine Targets

Despite overall genetic diversity, several linear and conformational epitopes are highly conserved across divergent lyssaviruses [2]. These conserved regions are predominantly located in the fusion domain and in the core of the RBD. The identification of such epitopes via evolutionary conservation analyses provides a rational basis for designing broadly protective vaccines [2]. Consistent with approaches described in Structural and Evolutionary Dynamics of Zoonotic Viral Glycoproteins, these methods integrate sequence alignment with structural mapping to prioritize epitopes that are both conserved and surface-exposed.

5. Machine Learning for Immunogenic Region Prediction

Machine learning models have been applied to predict B-cell and T-cell epitopes from RABV G protein sequences [2]. These approaches integrate multiple feature types: physicochemical properties, secondary structure propensity, solvent accessibility, and evolutionary conservation scores.

5.1 B-Cell Epitope Prediction

B-cell epitope prediction algorithms (e.g., those based on support vector machines or random forests) use sliding windows of amino acids to score the likelihood that a given peptide is recognized by an antibody [2]. For the RABV G protein, these models consistently identify antigenic site II (residues 34 to 42) and site III as high-confidence epitopes [2]. When trained on known neutralizing antibody structures, deep learning models such as convolutional neural networks improve prediction accuracy by capturing local sequence-structure relationships [2]. This approach is closely related to the framework described in Deep Learning for Predicting MHC-Peptide Binding in Veterinary Vaccine Design, although that framework addresses T-cell rather than B-cell responses.
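The sliding-window idea can be illustrated without a trained model. The sketch below scores each window by its mean Kyte-Doolittle hydropathy, negated so that more hydrophilic windows score higher. This is a deliberately crude stand-in for the SVM, random forest, or neural network scorers mentioned above, chosen only because it needs no training data; the window length and the toy sequence are assumptions.

```python
# Kyte-Doolittle hydropathy values (higher means more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sliding_window_scores(seq, window=7):
    """Return (1-based start, segment, hydrophilicity score) for every window."""
    scores = []
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        mean_hydropathy = sum(KYTE_DOOLITTLE[aa] for aa in segment) / window
        scores.append((start + 1, segment, -mean_hydropathy))
    return scores


# Hypothetical toy sequence: a charged stretch flanked by leucines.
toy_sequence = "LLLLKDEKRDLLLL"
windows = sliding_window_scores(toy_sequence, window=5)
best = max(windows, key=lambda item: item[2])
print(best)
```

In a learned predictor, the per-window hydropathy mean would be replaced by a model score computed from several features, but the windowing logic is the same.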

5.2 T-Cell Epitope Prediction

For effective vaccine design, the induction of both humoral and cell-mediated immunity is necessary. T-cell epitope prediction for the G protein focuses on peptides that bind to major histocompatibility complex (MHC) class I and class II molecules across various mammalian species [2]. In silico binding affinities are calculated using neural network-based predictors. Candidate epitopes are further filtered for conservation and for absence of autoimmunity risk. This computational pipeline allows for the selection of peptides that are predicted to be immunogenic in dogs, foxes, and other target veterinary species [2].
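The conservation filter in this pipeline can be sketched as a search for peptides of fixed length that occur in every input sequence. The example below enumerates 9-mers, a length commonly used for MHC class I candidates, and keeps those shared by all sequences. It does not predict MHC binding, which would require an external predictor, and the three toy sequences are hypothetical.

```python
def kmers(seq, k=9):
    """Return the set of all overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def conserved_kmers(sequences, k=9):
    """Return the sorted k-mers that occur in every sequence."""
    if not sequences:
        return []
    shared = kmers(sequences[0], k)
    for seq in sequences[1:]:
        shared &= kmers(seq, k)
    return sorted(shared)


# Hypothetical toy sequences sharing one 9-residue stretch.
toy_sequences = ["AAAWYDGKLMNPQ", "CCWYDGKLMNPTT", "GWYDGKLMNPSSS"]
print(conserved_kmers(toy_sequences))
```

Exact matching is the strictest possible definition of conservation; a practical pipeline might instead tolerate a limited number of conservative substitutions.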

5.3 Integrating Epitope Predictions with Structure

Machine learning predictions are most powerful when mapped onto the three-dimensional structure of the glycoprotein [2]. Epitopes predicted as linear B-cell targets may be inaccessible in the native trimer due to steric occlusion by glycans or neighboring protomers. Structural filtering eliminates such false positives. The resulting epitope map can guide the design of subunit vaccines that present only the most promising antigenic regions.
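Structural filtering of this kind is often approximated with per-residue relative solvent accessibility (RSA). The sketch below keeps only predicted epitopes whose mean RSA meets a threshold. The threshold of 0.25 and the RSA values are assumptions for illustration; in practice RSA would be computed from the trimer model, and glycan occlusion would need separate treatment that this sketch does not attempt.

```python
def filter_exposed_epitopes(epitopes, rsa, min_mean_rsa=0.25):
    """Keep epitopes (1-based inclusive start, end) whose mean RSA passes the cutoff."""
    kept = []
    for start, end in epitopes:
        window = rsa[start - 1:end]
        if window and sum(window) / len(window) >= min_mean_rsa:
            kept.append((start, end))
    return kept


# Hypothetical per-residue RSA values: buried, exposed, then buried again.
toy_rsa = [0.05] * 5 + [0.6] * 5 + [0.1] * 5
toy_epitopes = [(1, 5), (6, 10), (9, 13)]
print(filter_exposed_epitopes(toy_epitopes, toy_rsa))
```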

6. Implications for Rational Vaccine Design

The computational analyses described above directly inform the rational design of next-generation rabies vaccines for veterinary use. Traditionally, rabies vaccines have been based on inactivated whole-virus preparations or live-attenuated recombinant viruses [1]. While effective, these platforms are limited by production safety concerns, cold-chain requirements, and a narrow breadth of protection across lyssavirus species.

6.1 Subunit and Virus-Like Particle Vaccines

Subunit vaccines based on the recombinant G protein ectodomain offer a safer alternative to inactivated virus [2]. Computational design ensures that the ectodomain is stabilized in the prefusion conformation, which is the target of most potently neutralizing antibodies [2]. The incorporation of conserved, computationally predicted epitopes from divergent lyssavirus strains can broaden protection. Co-expression with assembly domains (e.g., ferritin or hepatitis B core protein) can drive the formation of virus-like particles that display the G protein in a multivalent array, enhancing immunogenicity.

6.2 mRNA Vaccine Platforms

The emergence of mRNA vaccine technology offers a rapid and flexible platform for rabies vaccine development. As reviewed in In Silico Design of Lipid Nanoparticles for mRNA Vaccine Delivery, mRNA vaccines encode the full-length G protein or stabilized variants thereof. Codon optimization algorithms, informed by host species codon usage tables, increase translational efficiency [2]. Lipid nanoparticle formulations protect the mRNA and facilitate delivery to antigen-presenting cells. The ability to quickly update the mRNA sequence in response to observed antigenic drift in the G protein is a distinct advantage over traditional platforms.
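The simplest form of codon optimization replaces each amino acid with the codon most frequently used by the target host. The sketch below shows that back-translation step. The preferred-codon map is a hypothetical, partial table for illustration only; a real design would derive it from a codon usage table for the target species and would also consider GC content, mRNA secondary structure, and repeat avoidance.

```python
# Hypothetical preferred codons for a handful of residues (illustrative only).
PREFERRED_CODONS = {
    "M": "ATG",
    "K": "AAG",
    "L": "CTG",
    "G": "GGC",
    "*": "TGA",
}


def back_translate(protein, preferred):
    """Back-translate a protein using one preferred codon per amino acid."""
    missing = sorted(set(protein) - set(preferred))
    if missing:
        raise ValueError(f"no preferred codon defined for: {missing}")
    return "".join(preferred[aa] for aa in protein)


# Hypothetical toy peptide ending in a stop codon.
print(back_translate("MKLG*", PREFERRED_CODONS))
```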

6.3 Conserved Epitope Vaccines and Pan-Lyssavirus Protection

A computational vaccine design strategy prioritizes conserved epitopes that are predicted to be immunogenic across multiple lyssaviruses [2]. By selecting peptide sequences from the fusion domain and core RBD that are invariant across species, a synthetic vaccine can be constructed. These epitopes are linked using peptide linkers and expressed on a scaffold protein. Such a design, derived from the evolutionary analysis described in Section 4, is hypothesized to confer protection against a broad spectrum of lyssaviruses, not just classical RABV.
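Assembling a multi-epitope construct from selected peptides is, at the sequence level, a join operation with a linker between epitopes. The sketch below removes duplicate epitopes while preserving order and joins them with a flexible linker. The linker "GPGPG" is an assumed example rather than one named in this article, and the epitope strings are hypothetical.

```python
def assemble_multi_epitope(epitopes, linker="GPGPG"):
    """Join unique epitopes, in first-seen order, with a peptide linker."""
    unique_epitopes = list(dict.fromkeys(epitopes))
    return linker.join(unique_epitopes)


# Hypothetical epitope sequences; the repeated entry is dropped.
toy_epitopes_for_construct = ["WYDGKLMNP", "KDEKR", "WYDGKLMNP"]
construct = assemble_multi_epitope(toy_epitopes_for_construct)
print(construct)
```

The order of epitopes and the choice of linker can both affect processing and folding, so several arrangements would normally be compared before selecting a construct.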

7. Workflow for Glycoprotein-Based Vaccine Design

The following Mermaid diagram outlines the integrated computational workflow from sequence acquisition to vaccine candidate selection.

flowchart TD
    A["RABV G Protein Sequence Acquisition"] --> B["Multiple Sequence Alignment / Phylogenetic Analysis"]
    B --> C["Evolutionary Conservation Score Mapping"]
    B --> D["Positive Selection Detection (dN/dS)"]
    C --> E["Structure Prediction (Homology Modeling / AlphaFold2)"]
    D --> E
    E --> F["Molecular Dynamics Simulations"]
    F --> G["Conformational Stability & Receptor Binding Assessment"]
    G --> H["Identification of Conserved, Surface-Exposed Epitopes"]
    H --> I["Machine Learning Epitope Prediction"]
    I --> J["B-Cell and T-Cell Epitope Prioritization"]
    J --> K["Structural Filtering: Remove Occluded Epitopes"]
    K --> L["Design of Stabilized Prefusion Trimer Construct"]
    L --> M["Vaccine Platform Selection: Subunit / VLP / mRNA"]
    M --> N["In Vivo Validation in Model Species"]

This computational pipeline is generalizable to other viral glycoproteins, as demonstrated in Structural and Evolutionary Dynamics of Norovirus Capsid Protein: Implications for Vaccine Design and other envelope glycoprotein studies.

8. Cross-Linking and Interactive Visualization

Interactive structural visualization of the RABV glycoprotein is a valuable educational and research tool. The 3D Protein Viewer can be used to examine the molecular architecture of the prefusion trimer, to highlight the location of conserved epitopes, and to observe the predicted conformational changes associated with pH reduction. Readers are encouraged to load the relevant RABV G protein model (available from the Protein Data Bank) into the viewer for detailed inspection.

Additional related resources within this portal provide complementary computational virology perspectives. For a broader discussion on glycoprotein evolution and zoonotic dynamics, see Structural and Evolutionary Dynamics of Zoonotic Viral Glycoproteins. The role of molecular dynamics in viral entry is further detailed in Molecular Dynamics Simulations of Viral Spike Glycoproteins. For flow cytometry and structural studies of related veterinary pathogens, see Structural and Computational Analysis of African Swine Fever Virus Capsid Proteins for Antiviral Drug Design. The computational design of mRNA vaccines is explored in In Silico Design of Lipid Nanoparticles for mRNA Vaccine Delivery. For diagnostic approaches relevant to rabies detection, see Recombinase Polymerase Amplification (RPA) for Field Detection of Rabies Virus in Saliva Samples. Finally, the evolutionary dynamics of RNA viruses are discussed in Evolutionary Dynamics of RNA Viruses.

References

[1] Sokol F, Koprowski H. Structure-function relationships and mode of replication of animal rhabdoviruses. Proc Natl Acad Sci U S A. 1975. https://pubmed.ncbi.nlm.nih.gov/165494/

[2] Khalifa ME, Unterholzner L, Munir M. Structural and Evolutionary Insights Into the Binding of Host Receptors by the Rabies Virus Glycoprotein. Front Cell Infect Microbiol. 2021. https://pubmed.ncbi.nlm.nih.gov/34708003/

Disclaimer: This article is for educational and informational purposes only. It is not intended to substitute for professional veterinary advice, diagnosis, treatment, or regulatory guidance. Always consult a licensed veterinarian or qualified specialist regarding animal health, disease diagnosis, and therapeutic decisions.