What is Dr. Zubair Khalid's research focus?

Dr. Zubair Khalid specializes in molecular virology, mRNA vaccine development, and computational biology, with a focus on avian pathogens like IBDV and Avian Reovirus.

Where is Dr. Zubair Khalid currently working?

Dr. Zubair Khalid is a Postdoctoral Research Associate at the University of Maryland (UMD), specifically within the Department of Animal and Avian Sciences.

Computational Modeling of T-Cell Receptor Binding Specificity

Introduction

The adaptive immune system in vertebrates relies on T-cell receptors (TCRs) precisely recognizing peptide antigens presented by major histocompatibility complex (MHC) molecules [1, 2]. This interaction, known as TCR-pMHC binding, is the central event initiating T-cell activation and subsequent cellular immune responses [3, 4]. In veterinary species, understanding TCR binding specificity is critical for developing vaccines against pathogens such as highly pathogenic avian influenza (H5N1) in poultry and wild birds, and for designing immunotherapies for conditions like porcine reproductive and respiratory syndrome. The extreme diversity of TCR sequences, generated through V(D)J recombination, poses a formidable computational challenge for predicting binding specificity [5, 6].

Biophysical Basis of TCR-pMHC Recognition

The TCR is a heterodimeric cell surface protein composed of alpha and beta chains (or gamma and delta chains in a minority of T cells), each containing variable (V) and constant (C) domains [7, 8]. The antigen-binding site is formed by six complementarity-determining regions (CDRs): three from the alpha chain (CDR1α, CDR2α, CDR3α) and three from the beta chain (CDR1β, CDR2β, CDR3β) [9, 10]. The CDR3 loops, encoded by the V(D)J junctional regions, exhibit the highest sequence diversity and make the most extensive contacts with the peptide presented in the MHC groove [11, 12].

The binding interface between TCR and pMHC is characterized by a relatively flat, interdigitating topology [10]. The TCR typically docks diagonally over the pMHC complex, with the Vα domain positioned over the N-terminal half of the peptide and the Vβ domain over the C-terminal half [13, 14]. Binding affinity and specificity are governed by a combination of hydrogen bonds, van der Waals interactions, electrostatic complementarity, and hydrophobic packing at the interface [3, 9]. The kinetic parameters of this interaction, particularly the dissociation rate, have been shown to be critical determinants of T-cell activation efficacy [3, 4]. Catch bond behavior, where bond lifetime increases under mechanical force, has been observed in TCR-pMHC interactions and nonlinearly modulates CD8 cooperation to shape T-cell specificity [4].
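The kinetic quantities mentioned above can be made concrete with a short sketch. It assumes simple 1:1 binding, so that the equilibrium dissociation constant is K_D = k_off / k_on and the bond half-life is ln(2) / k_off. The rate constants and TCR names below are illustrative placeholders, not measured values. The Bell slip-bond expression is included only as a contrast: a catch bond is the case where lifetime rises with force instead of falling, and that behavior is not modeled here.

```python
import math

KT_PN_NM = 4.11  # thermal energy near 298 K, in pN*nm


def dissociation_constant(k_on, k_off):
    """K_D in M, given k_on in M^-1 s^-1 and k_off in s^-1 (1:1 binding)."""
    return k_off / k_on


def half_life(k_off):
    """Half-life (s) of a bond that dissociates with first-order rate k_off."""
    return math.log(2) / k_off


def slip_bond_lifetime(tau0, force_pn, x_beta_nm, kt=KT_PN_NM):
    """Bell model: lifetime of an ordinary slip bond decays with applied force."""
    return tau0 * math.exp(-force_pn * x_beta_nm / kt)


# Hypothetical rate constants for two TCRs binding the same pMHC.
tcr_rates = {"TCR_A": (1.0e4, 0.05), "TCR_B": (1.0e4, 1.0)}
for name, (k_on, k_off) in tcr_rates.items():
    print(name, dissociation_constant(k_on, k_off), half_life(k_off))
```

With equal on-rates, the TCR with the smaller k_off has both the lower K_D and the longer half-life, which is the sense in which dissociation rate acts as a determinant of activation in the studies cited above.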

V(D)J Recombination Modeling

The generation of TCR diversity begins with somatic recombination of variable (V), diversity (D, for beta chains), and joining (J) gene segments [5, 15]. Computational models of V(D)J recombination aim to predict the repertoire of possible TCR sequences from germline gene segments and to simulate the probability of generating specific CDR3 sequences [5, 6]. The TRain platform provides automated immunoinformatics tools for analyzing TCR sequences, including V(D)J segment assignment and CDR loop identification [5]. These models are essential for understanding the baseline repertoire from which antigen-specific TCRs are selected during an immune response [6, 15].
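A toy simulation can illustrate what a generative V(D)J model does. The sketch below is not the TRain platform or any published model: the germline segments are hypothetical, shortened placeholders, and gene choice, deletion lengths, and insertion lengths are all drawn uniformly, whereas real models learn these distributions from repertoire data.

```python
import random
from collections import Counter

# Hypothetical, shortened germline segments (NOT real TRBV/TRBD/TRBJ sequences).
V_SEGMENTS = {"V_a": "TGTGCCAGCAGC", "V_b": "TGTGCCAGCAGT"}
D_SEGMENTS = {"D_a": "GGGACAGGG", "D_b": "GGGACTAGC"}
J_SEGMENTS = {"J_a": "AATGAGCAGTTC", "J_b": "TACGAGCAGTAC"}


def random_insert(rng, max_len):
    """Non-templated (N) nucleotides added at a junction."""
    return "".join(rng.choice("ACGT") for _ in range(rng.randint(0, max_len)))


def recombine(rng, max_del=3, max_ins=4):
    """Simulate one beta-chain rearrangement; returns (gene names, junction)."""
    v_name = rng.choice(sorted(V_SEGMENTS))
    d_name = rng.choice(sorted(D_SEGMENTS))
    j_name = rng.choice(sorted(J_SEGMENTS))
    v, d, j = V_SEGMENTS[v_name], D_SEGMENTS[d_name], J_SEGMENTS[j_name]

    v = v[: len(v) - rng.randint(0, max_del)]   # trim the 3' end of V
    d_start = rng.randint(0, max_del)           # trim the 5' end of D
    d_end = len(d) - rng.randint(0, max_del)    # trim the 3' end of D
    d = d[d_start:d_end]
    j = j[rng.randint(0, max_del):]             # trim the 5' end of J

    junction = v + random_insert(rng, max_ins) + d + random_insert(rng, max_ins) + j
    return (v_name, d_name, j_name), junction


def estimate_generation_frequency(n_samples, seed=0):
    """Monte Carlo estimate of how often each junction sequence is produced."""
    rng = random.Random(seed)
    counts = Counter(recombine(rng)[1] for _ in range(n_samples))
    return {seq: count / n_samples for seq, count in counts.items()}


genes, junction = recombine(random.Random(42))
print(genes, junction)
```

Counting simulated outcomes, as in `estimate_generation_frequency`, is the simplest way to approximate a generation probability; it becomes impractical for realistic segment sets, where most junctions are individually rare.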

Structure-Based Modeling of TCR-pMHC Complexes

Docking and Scoring Approaches

Computational docking of TCRs onto pMHC complexes requires sampling of the conformational space of the CDR loops and the pMHC surface [1, 10]. SwiftTCR introduced an efficient docking protocol using restricted rotation matrices to reduce the search space for TCR-pMHC-I complexes [1]. This method leverages the observation that the docking orientation of TCRs is relatively constrained, allowing for rapid enumeration of plausible binding geometries [1, 10]. Structure-based prediction methods, such as those described by Bradley, use Rosetta-based scoring functions to evaluate the binding energy of docked TCR-pMHC complexes [10].
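The idea of restricting the rotational search can be sketched generically. The code below is not the SwiftTCR protocol: it only illustrates, under stated assumptions, how candidate rotations can be filtered by their angular deviation from a reference docking orientation (taken here as the identity). The 30-degree limit and the naive candidate sampling are hypothetical choices.

```python
import math
import random


def rotation_matrix(axis, angle):
    """Rodrigues' formula: 3x3 rotation by `angle` radians about `axis`."""
    norm = math.sqrt(sum(a * a for a in axis))
    x, y, z = (a / norm for a in axis)
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [c + x * x * t, x * y * t - z * s, x * z * t + y * s],
        [y * x * t + z * s, c + y * y * t, y * z * t - x * s],
        [z * x * t - y * s, z * y * t + x * s, c + z * z * t],
    ]


def rotation_angle(matrix):
    """Rotation angle (radians) recovered from the trace of a rotation matrix."""
    trace = matrix[0][0] + matrix[1][1] + matrix[2][2]
    return math.acos(max(-1.0, min(1.0, (trace - 1.0) / 2.0)))


def restricted_rotations(n_candidates, max_angle_deg, seed=0):
    """Keep only candidate rotations within max_angle_deg of the reference.

    Candidates use a random axis and an angle uniform in [0, pi]; this is a
    simplification and does not sample rotation space uniformly.
    """
    rng = random.Random(seed)
    limit = math.radians(max_angle_deg)
    kept = []
    for _ in range(n_candidates):
        axis = [rng.gauss(0.0, 1.0) for _ in range(3)]
        candidate = rotation_matrix(axis, rng.uniform(0.0, math.pi))
        if rotation_angle(candidate) <= limit:
            kept.append(candidate)
    return kept


kept = restricted_rotations(n_candidates=1000, max_angle_deg=30.0)
print(f"{len(kept)} of 1000 candidate rotations retained")
```

Only the retained rotations would then be passed to the translational search and scoring stages, which is where the reduction in search space pays off.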

NetTCR-struc represents a structure-driven approach that integrates structural features of the TCR and pMHC into a predictive model [12]. This method uses structural descriptors of the CDR loops and the peptide-MHC binding groove to predict interaction outcomes [12]. The PANDORA v2.0 platform provides benchmarking of peptide-MHC II models, which is critical for evaluating the accuracy of structural predictions for class II MHC molecules [16].

Highlighting CDR Loop Contacts in 3D Structural Viewers

To visualize TCR-pMHC interactions in a 3D structural viewer, specific contacts between TCR CDR loops and the MHC-bound peptide must be highlighted. This is typically achieved by calculating interatomic distances between atoms of the CDR loops and atoms of the peptide or MHC alpha helices [9, 10]. A common threshold for defining a contact is a distance of less than 4.0 angstroms between any two non-hydrogen atoms [9]. In structural viewers such as PyMOL or ChimeraX, these contacts can be visualized as dashed lines or colored spheres. The CDR3 loops are typically the primary focus, as they make the most intimate contacts with the peptide [11, 12]. However, CDR1 and CDR2 loops also contribute to binding by contacting the MHC helices, thereby stabilizing the overall complex [13, 10]. Residue correlation analysis and interaction energy calculations can further rationalize the contributions of individual amino acids to binding affinity [9].
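The distance criterion described above can be sketched as follows. The atom records are hypothetical tuples of (residue label, atom name, element, coordinates) with made-up coordinates; in practice they would be parsed from a PDB or mmCIF file. The 4.0 angstrom cutoff and the exclusion of hydrogens follow the definition given in the text.

```python
import math

CONTACT_CUTOFF = 4.0  # angstroms, between non-hydrogen atoms


def find_contacts(cdr_atoms, peptide_atoms, cutoff=CONTACT_CUTOFF):
    """Return (cdr_residue, cdr_atom, pep_residue, pep_atom, distance) tuples."""
    contacts = []
    for res_a, name_a, elem_a, xyz_a in cdr_atoms:
        if elem_a == "H":
            continue  # skip hydrogens
        for res_b, name_b, elem_b, xyz_b in peptide_atoms:
            if elem_b == "H":
                continue
            distance = math.dist(xyz_a, xyz_b)
            if distance < cutoff:
                contacts.append((res_a, name_a, res_b, name_b, round(distance, 2)))
    return contacts


# Hypothetical atoms: one CDR3-beta residue and two peptide residues.
cdr3_atoms = [
    ("CDR3b:GLY98", "CA", "C", (0.0, 0.0, 0.0)),
    ("CDR3b:GLY98", "HA2", "H", (1.0, 0.0, 0.0)),
]
peptide_atoms = [
    ("PEP:PHE5", "CZ", "C", (3.0, 0.0, 0.0)),
    ("PEP:LEU9", "CD1", "C", (5.0, 0.0, 0.0)),
]

for contact in find_contacts(cdr3_atoms, peptide_atoms):
    print(contact)
```

The resulting residue pairs are what a viewer would be asked to draw as dashed lines; the nested loop is quadratic in atom count, which is acceptable for a single interface but would call for a spatial index on larger systems.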

Deep Learning for TCR-Antigen Pairing Prediction

Sequence-Based Models

Deep learning has become a leading approach to predicting TCR-epitope binding specificity [17, 18, 19]. These models typically take as input the amino acid sequences of the TCR CDR3 regions and the target peptide, and output a binding probability or affinity score [15]. NetTCR-2.1 provided foundational guidance on developing such models, emphasizing the importance of data quality, cross-validation, and appropriate negative sampling strategies [15].
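Two of the ingredients named above, sequence inputs and negative sampling, can be sketched without any deep learning framework. This is a generic illustration, not the NetTCR-2.1 implementation: one-hot encoding is only one possible input representation, the shuffling scheme is one common way to build negatives, and the CDR3-peptide pairings below are toy placeholders rather than curated data.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(sequence, max_len):
    """Encode a sequence as max_len rows of 20 values; padding rows are zeros."""
    rows = []
    for position in range(max_len):
        row = [0] * len(AMINO_ACIDS)
        if position < len(sequence):
            row[AMINO_ACIDS.index(sequence[position])] = 1
        rows.append(row)
    return rows


def shuffled_negatives(positives, seed=0):
    """Pair each CDR3 with a peptide it was not observed binding.

    Assumes that unobserved pairs are non-binders, which is not guaranteed.
    """
    rng = random.Random(seed)
    observed = set(positives)
    peptides = sorted({peptide for _, peptide in positives})
    negatives = []
    for cdr3, _ in positives:
        candidates = [p for p in peptides if (cdr3, p) not in observed]
        if candidates:
            negatives.append((cdr3, rng.choice(candidates)))
    return negatives


# Toy positive pairs (hypothetical pairings, for illustration only).
positives = [
    ("CASSLGQAYEQYF", "GILGFVFTL"),
    ("CASSPDRGNTEAFF", "NLVPMVATV"),
]
negatives = shuffled_negatives(positives)
encoded = one_hot(positives[0][0], max_len=20)
print(negatives)
print(len(encoded), len(encoded[0]))
```

Because shuffled negatives reuse the same peptides as the positives, a model can look accurate on held-out TCRs while still failing on unseen epitopes, which is one reason the choice of negative sampling strategy matters for evaluation.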

GRAPE (graph-regularized protein language modeling) uses a graph-based regularization approach to learn TCR-epitope binding specificity from sequence data [17]. This method leverages protein language models to capture the biophysical properties of amino acid sequences while incorporating structural constraints through graph regularization [17]. Weakly supervised learning approaches have been developed to address the scarcity of experimentally validated TCR-pMHC binding data [18]. These methods use large-scale TCR repertoire sequencing data to infer binding specificity without requiring explicit labels for every TCR-epitope pair [18].

THLANet is a deep learning framework specifically designed for predicting TCR-pHLA binding in immunotherapy applications [19]. This model incorporates attention mechanisms to focus on critical residues within the CDR3 regions and the peptide [19]. Comprehensive benchmarking studies have evaluated different TCR embedding strategies, comparing sequence-based, structure-based, and hybrid approaches for modeling TCR-epitope interactions [6].

Structure-Aware and Equivariant Models

Recent advances have integrated structural information directly into deep learning architectures [2, 13, 14]. TCRLens uses structure-aware equivariant graph learning to model TCR-pMHC-I recognition [2]. This approach represents the TCR and pMHC as graphs, where nodes correspond to amino acid residues and edges represent spatial proximity [2]. Equivariant neural networks transform their internal geometric features consistently when the input structure is rotated or translated, so the final binding prediction does not depend on the arbitrary coordinate frame of the input, which is critical for generalizing across different TCR-pMHC complexes [2, 14].
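The graph representation and the role of rigid-motion symmetry can be illustrated with a minimal sketch. It is not TCRLens and contains no neural network: it only builds a residue graph from hypothetical C-alpha coordinates using an assumed 8.0 angstrom cutoff, then checks that the graph is unchanged when the whole structure is rotated and translated.

```python
import math


def build_residue_graph(ca_coords, cutoff=8.0):
    """Edges (i, j) between residues whose C-alpha atoms lie within cutoff."""
    edges = set()
    for i in range(len(ca_coords)):
        for j in range(i + 1, len(ca_coords)):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                edges.add((i, j))
    return edges


def rigid_transform(coords, angle, shift):
    """Rotate all points about the z axis by `angle` radians, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in coords
    ]


# Hypothetical C-alpha positions (angstroms) for four residues.
ca_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]

original_edges = build_residue_graph(ca_coords)
moved_edges = build_residue_graph(rigid_transform(ca_coords, 1.2, (5.0, -3.0, 9.0)))
print(sorted(original_edges))
print(original_edges == moved_edges)
```

A graph built from pairwise distances is invariant by construction; equivariant architectures go further by also carrying directional features that rotate along with the input, while still producing a frame-independent final score.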

Structure-directed pan-specific interaction prediction methods combine sequence and structural features to achieve cross-reactivity predictions across different MHC alleles and peptide sequences [13]. These models are trained on large datasets of known TCR-pMHC structures and can predict binding for unseen TCR-peptide combinations [13, 10].

Multi-Modal and Generative Models

Multi-modal generative models have been developed for joint analysis of single-cell TCR and gene expression data [20]. These models can simultaneously learn representations of TCR sequences and transcriptional states, enabling the identification of clonally expanded T cells with specific functional phenotypes [20]. Generative approaches also show promise for designing novel TCRs with desired binding specificities [21, 14]. De novo-designed pMHC binders have been shown to facilitate T-cell mediated cytotoxicity, demonstrating the potential of computational design for therapeutic applications [21].

Computational Pipelines for TCR Specificity Prediction

A typical computational pipeline for TCR specificity prediction involves several sequential steps [22, 23]. First, TCR sequencing data are processed to identify V(D)J segments and extract CDR3 sequences [5]. Second, the target peptide is defined, and its binding to MHC molecules is predicted using peptide-MHC binding algorithms [16]. Third, the TCR-pMHC interaction is modeled using either docking or deep learning approaches [1, 12]. Finally, the predicted binding is validated experimentally or through benchmarking against known datasets [6, 15].

The following table summarizes the key computational approaches for TCR-pMHC binding prediction:

Approach	Input Features	Output	Key Advantages	Limitations
Structure-based docking	3D coordinates of TCR and pMHC	Binding energy, docked pose	Mechanistic insight, atomic resolution	Computationally expensive, requires structural data
Sequence-based deep learning	TCR CDR3 and peptide sequences	Binding probability	High throughput, no structure required	Limited generalization to unseen epitopes
Structure-aware graph learning	TCR and pMHC graphs	Binding classification	Incorporates structural constraints, equivariant	Requires structural templates or predicted models
Multi-modal generative models	TCR sequences and gene expression	Joint embeddings	Integrates functional data	Complex training, data-hungry

The following Mermaid diagram illustrates a decision tree for selecting an appropriate computational approach:

flowchart TD
    A[Input: TCR sequence and peptide] --> B{Is TCR-pMHC structure available?}
    B -->|Yes| C[Structure-based docking]
    B -->|No| D{Is peptide-MHC binding known?}
    D -->|Yes| E[Sequence-based deep learning]
    D -->|No| F[Predict peptide-MHC binding]
    F --> E
    C --> G[Evaluate binding energy and contacts]
    E --> H[Predict binding probability]
    G --> I[Validate with experimental data]
    H --> I

Applications in Veterinary Immunology

Computational modeling of TCR binding specificity has direct applications in veterinary medicine [7, 24]. In livestock and poultry species, such models can aid in the design of vaccines against pathogens like Mycoplasma bovis in feedlot cattle and Pasteurella multocida, the agent of avian cholera, in wild and domestic birds. By predicting which T-cell epitopes are likely to elicit strong immune responses, computational pipelines can prioritize vaccine candidates for experimental testing [25, 24].

In the context of African swine fever, TCR specificity modeling can help identify conserved epitopes that are recognized across different viral strains [25]. Similarly, for highly pathogenic avian influenza (HPAI) H5N1 in poultry, understanding TCR binding to hemagglutinin and neuraminidase epitopes can inform vaccine design and surveillance efforts.

Challenges and Future Directions

Several challenges remain in computational TCR specificity modeling [6, 15]. The extreme diversity of the TCR repertoire means that most TCR sequences have never been experimentally characterized [5, 6]. Data scarcity is particularly acute for non-human species, where TCR databases are far less comprehensive than for humans [7, 24]. Transfer learning from human to veterinary species may be possible for conserved MHC molecules, but this requires careful validation [24, 16].

Another challenge is the prediction of TCR cross-reactivity, where a single TCR recognizes multiple peptide-MHC complexes [11, 13]. This phenomenon is important for understanding autoimmune responses and for designing vaccines that elicit broadly reactive T cells [11]. Biophysical frameworks for identifying antigen escape variants and corresponding compensatory TCR sequences have been developed, but their application to veterinary pathogens remains limited [11].

Future directions include the integration of TCR specificity models with systems biology approaches [26, 20]. Mathematical models of T-cell activation that incorporate TCR-pMHC binding kinetics can predict the functional outcomes of T-cell stimulation [26, 3]. The combination of single-cell TCR sequencing with transcriptomic and proteomic data will enable more comprehensive models of T-cell responses in veterinary species [20].

Conclusion

Computational modeling of T-cell receptor binding specificity has advanced significantly through the integration of structural biology, deep learning, and immunoinformatics [1, 2, 17]. Structure-based docking methods provide atomic-level insights into TCR-pMHC interactions, while sequence-based deep learning models enable high-throughput prediction of binding specificity [12, 10, 15]. The development of structure-aware equivariant graph learning and multi-modal generative models represents the current frontier in the field [2, 14, 20]. These computational approaches have direct applications in veterinary vaccine design, immunotherapy development, and the study of immune responses to infectious diseases in animal populations [25, 7, 24].

References

[1] Parizi FM, Aarts YJM, Smit N et al. SwiftTCR: efficient computational docking protocol of TCRpMHC-I complexes using restricted rotation matrices. Brief Bioinform. 2026. URL: https://pubmed.ncbi.nlm.nih.gov/42297379/

[2] Siriarchawatana P, Ingsriswang S, Kaewborisuth C et al. TCRLens: structure-aware equivariant graph learning for TCR-pMHC-I recognition and immunogenic epitope discovery. Bioinform Adv. 2026. URL: https://pubmed.ncbi.nlm.nih.gov/41890808/

[3] Yan M, Wang J, Zhang JZH. A kinetic gatekeeper: TCR-pMHC dissociation dynamics determine T cell activation efficacy. Protein Sci. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/41108568/

[4] Qin R, Zhang Y, Shi J et al. TCR catch bonds nonlinearly control CD8 cooperation to shape T cell specificity. Cell Res. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/40011760/

[5] Seamann A, Bennett-Boehm M, Ehrlich R et al. TRain: T-cell receptor automated immunoinformatics. BMC Bioinformatics. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/40050726/

[6] Feng X, Huo M, Li H et al. A comprehensive benchmarking for evaluating TCR embeddings in modeling TCR-epitope interactions. Brief Bioinform. 2024. URL: https://pubmed.ncbi.nlm.nih.gov/39883514/

[7] Bravo F, Kossack C, Galleguillos-Becerra C et al. Structural characterization of T-cell kinases Lck1 and ZAP-70 in rainbow trout (Oncorhynchus mykiss) through homology modeling. Dev Comp Immunol. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/41240985/

[8] Nelson AD, Wang L, Laffey KG et al. Rigid crosslinking of the CD3 complex leads to superior T cell stimulation. Front Immunol. 2024. URL: https://pubmed.ncbi.nlm.nih.gov/39281668/

[9] Frezzini M, Narzi D. Rationalizing Enhanced Affinity of Engineered T-Cell Receptors in Cancer Immunotherapy Through Interaction Energy Calculations and Residue Correlation Analysis. Proteins. 2026. URL: https://pubmed.ncbi.nlm.nih.gov/40747799/

[10] Bradley P. Structure-based prediction of T cell receptor:peptide-MHC interactions. Elife. 2023. URL: https://pubmed.ncbi.nlm.nih.gov/36661395/

[11] Ghoreyshi ZS, Levine H, Lin X et al. A biophysical framework for accurately identifying antigen single-amino acid escape variants and corresponding variant-specific compensatory TCR sequences. bioRxiv. 2026. URL: https://pubmed.ncbi.nlm.nih.gov/41676605/

[12] Deleuran SN, Nielsen M. NetTCR-struc, a structure driven approach for prediction of TCR-pMHC interactions. Front Immunol. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/40746534/

[13] Gao L, Zhang Y, Ge F et al. Structure-Directed Pan-Specific T-Cell Receptor-Peptide-Major Histocompatibility Complex Interaction Prediction. J Chem Inf Model. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/40297927/

[14] Ribeiro-Filho HV, Jara GE, Guerra JVS et al. Exploring the potential of structure-based deep learning approaches for T cell receptor design. PLoS Comput Biol. 2024. URL: https://pubmed.ncbi.nlm.nih.gov/39348412/

[15] Montemurro A, Jessen LE, Nielsen M. NetTCR-2.1: Lessons and guidance on how to develop models for TCR specificity predictions. Front Immunol. 2022. URL: https://pubmed.ncbi.nlm.nih.gov/36561755/

[16] Parizi FM, Marzella DF, Ramakrishnan G et al. PANDORA v2.0: Benchmarking peptide-MHC II models and software improvements. Front Immunol. 2023. URL: https://pubmed.ncbi.nlm.nih.gov/38143769/

[17] Fu X, Peng L, Chen H et al. GRAPE: graph-regularized protein language modeling unlocks TCR-epitope binding specificity. Brief Bioinform. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/41052278/

[18] Gao Y, Gao Y, Wu S et al. Weakly supervised peptide-TCR binding prediction facilitates neoantigen identification. Cell Syst. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/40987291/

[19] Long X, Yang Q, Dong W et al. THLANet: A deep learning framework for predicting TCR-pHLA binding in immunotherapy applications. PLoS Comput Biol. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/40939018/

[20] Drost F, An Y, Bonafonte-Pardàs I et al. Multi-modal generative modeling for joint analysis of single-cell T cell receptor and gene expression data. Nat Commun. 2024. URL: https://pubmed.ncbi.nlm.nih.gov/38956082/

[21] Johansen KH, Wolff DS, Scapolo B et al. De novo-designed pMHC binders facilitate T cell-mediated cytotoxicity toward cancer cells. Science. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/40705893/

[22] Song K, Xu H, Shi Y et al. Investigating TCR-pMHC interactions for TCRs without identified epitopes by constructing a computational pipeline. Int J Biol Macromol. 2024. URL: https://pubmed.ncbi.nlm.nih.gov/39423970/

[23] Rollins ZA, Curtis MB, George SC et al. A Computational Strategy for the Rapid Identification and Ranking of Patient-Specific T Cell Receptors Bound to Neoantigens. Macromol Rapid Commun. 2024. URL: https://pubmed.ncbi.nlm.nih.gov/38839076/

[24] Putignano G, Ruipérez-Campillo S, Yuan Z et al. Mathematical models and computational approaches in CAR-T therapeutics. Front Immunol. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/40821789/

[25] Srivastava R. AI-powered mapping of tumor immunity for optimized mRNA vaccine engineering. Front Oncol. 2026. URL: https://pubmed.ncbi.nlm.nih.gov/41853314/

[26] Bali Y, Rendall AD. A comparative mathematical modeling study of phenotypic approaches to T cell activation. Sci Rep. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/41422159/

[27] Sakhnini LI, Beltrame L, Fulle S et al. Prediction of antibody non-specificity using protein language models and biophysical parameters. MAbs. 2026. URL: https://pubmed.ncbi.nlm.nih.gov/42200306/

[28] Wei S, Zhang J, Chen Y et al. GeoGAD: geometry-aware antibody design framework for complementarity-determining region precision engineering. Bioinformatics. 2026. URL: https://pubmed.ncbi.nlm.nih.gov/41580967/

[29] Fukushima Y, Wang Y, Yada-Makino Y et al. Functional and structural analysis of KK-LC-1-specific T cell receptors from patients with lung Cancer for immunotherapy. Cell Immunol. 2026. URL: https://pubmed.ncbi.nlm.nih.gov/41500040/

[30] Khodke P, Kumbhar BV. Computational design of CAR-scFv linker variants for enhanced CD20 binding against B-cell malignancies. Comput Biol Med. 2026. URL: https://pubmed.ncbi.nlm.nih.gov/41319471/

[31] Ye R, Yang Y, Chan KC et al. In Silico Design of High-Affinity gp100 Antigenic Mimotopes for Targeting Melanoma. J Phys Chem B. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/40692510/

[32] Du H, Liu J, Jude KM et al. A general system for targeting MHC class II-antigen complex via a single adaptable loop. Nat Biotechnol. 2025. URL: https://pubmed.ncbi.nlm.nih.gov/39672953/

[33] Fernández-Quintero ML, Kroell KB, Bacher LM et al. Germline-Dependent Antibody Paratope States and Pairing Specific V(H)-V(L) Interface Dynamics. Front Immunol. 2021. URL: https://pubmed.ncbi.nlm.nih.gov/34447370/

[34] Armony G, Heck AJR, Wu W. Extracellular crosslinking mass spectrometry reveals HLA class I - HLA class II interactions on the cell surface. Mol Immunol. 2021. URL: https://pubmed.ncbi.nlm.nih.gov/34052579/

[35] Tasker C, Patel J, Jawa V et al. Competition-Based Cell Assay Employing Soluble T Cell Receptors to Assess MHC Class II Antigen Processing and Presentation. AAPS J. 2021. URL: https://pubmed.ncbi.nlm.nih.gov/33459871/