Molecular Dynamics Simulations of Proteins and Force Fields: Principles, Algorithms, and Applications in Veterinary Structural Biology
Introduction
Molecular dynamics (MD) simulation is a computational technique that models the time-dependent behavior of atoms and molecules based on classical mechanics. In structural bioinformatics, MD simulations provide atomic-level insights into protein conformational dynamics, ligand binding, and folding pathways that are inaccessible to static experimental methods such as X-ray crystallography or cryo-electron microscopy. For veterinary virology and pathogen biology, MD simulations enable the study of viral glycoprotein flexibility, receptor binding interfaces, and the effects of mutations on host range, as exemplified by analyses of avian influenza hemagglutinin or coronavirus spike proteins [1, 2]. This article reviews the fundamental physical principles, integration algorithms, thermodynamic controls, and force field parameterizations that underpin modern MD simulations, with emphasis on their application to protein systems relevant to veterinary medicine.
Newtonian Equations of Motion and the Verlet Algorithm
MD simulations propagate atomic positions over time by numerically integrating Newton's second law of motion. For a system of N atoms, the force on atom i is given by the negative gradient of the potential energy function U:
[ \mathbf{F}i = -\nabla{\mathbf{r}_i} U = m_i \mathbf{a}_i ]
where (m_i) is the mass and (\mathbf{a}_i) is the acceleration. The acceleration is integrated to yield velocities and positions at successive time steps. The most widely used integration scheme is the Verlet algorithm, which updates positions using a Taylor expansion:
[ \mathbf{r}_i(t + \Delta t) = 2\mathbf{r}_i(t) - \mathbf{r}_i(t - \Delta t) + \mathbf{a}_i(t) \Delta t^2 ]
This formulation avoids explicit velocity calculation, though a velocity Verlet variant is commonly implemented for better energy conservation [1]. The time step (\Delta t) is typically 1–2 femtoseconds (fs) for atomistic simulations, constrained by the fastest vibrational frequencies (e.g., bond stretching). Larger time steps require constraints such as the SHAKE algorithm to fix bond lengths involving hydrogen atoms.
Thermodynamic Ensembles: Thermostats and Barostats
Biological systems are simulated under controlled temperature and pressure conditions to mimic physiological environments. The canonical (NVT) ensemble maintains constant number of particles, volume, and temperature. The isothermal-isobaric (NPT) ensemble additionally controls pressure. Temperature regulation is achieved through thermostats, which couple the system to a heat bath.
Common thermostats include the Nosé-Hoover thermostat, which introduces an extended variable with a fictitious mass to produce a canonical distribution, and the Berendsen thermostat, which rescales velocities to match a target temperature. The Nosé-Hoover chain thermostat improves ergodicity for small systems [2]. Pressure control uses barostats such as the Berendsen barostat or the Parrinello-Rahman barostat, which allow isotropic or anisotropic box scaling. For membrane simulations, semi-isotropic coupling is often employed to maintain a constant area per lipid [3].
Force Fields: CHARMM and AMBER
The potential energy function U in MD simulations is defined by a force field, a mathematical model that approximates interatomic interactions. The general form includes bonded terms (bond stretching, angle bending, dihedral torsion) and nonbonded terms (van der Waals and electrostatic interactions):
[ U = \sum_{\text{bonds}} k_b (r - r_0)^2 + \sum_{\text{angles}} k_\theta (\theta - \theta_0)^2 + \sum_{\text{dihedrals}} \frac{V_n}{2} [1 + \cos(n\phi - \gamma)] + \sum_{i<j} \left[ \frac{q_i q_j}{4\pi\epsilon_0 r_{ij}} + 4\epsilon_{ij} \left( \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{12} - \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{6} \right) \right] ]
Two of the most extensively used force fields for protein simulations are CHARMM (Chemistry at HARvard Macromolecular Mechanics) and AMBER (Assisted Model Building with Energy Refinement). Both are all-atom force fields with parameters derived from quantum mechanical calculations and experimental data. CHARMM uses a slightly different functional form for dihedrals and includes CMAP corrections for backbone dihedral angles to improve protein folding predictions [1]. AMBER employs a similar functional form but with distinct parameter sets (e.g., ff14SB, ff19SB) optimized for proteins and nucleic acids.
Recent evaluations have compared these force fields for intrinsically disordered proteins (IDPs) and amyloidogenic peptides. Dong et al. [1] benchmarked multiple force fields against NMR and fluorescence correlation spectroscopy data for the intrinsically disordered domain of protein 4.1G, finding that CHARMM36m with TIP4P-D water best reproduced experimental ensemble dimensions. Rosenman et al. [2] characterized Aβ monomer ensembles across multiple force fields, demonstrating that convergence of structural properties (e.g., radius of gyration, secondary structure propensities) requires careful assessment of sampling and force field bias.
Lipid Force Fields and Membrane Simulations
For simulations of membrane proteins or viral envelopes, lipid force fields are essential. The CHARMM36 lipid force field is widely used for bilayer simulations. Demerdash et al. [3] reparameterized a lipid force field using small-angle X-ray scattering (SAXS) data to improve predictions for multicomponent membranes under organic solvent stress. Their approach, implemented in the ForceBalance-SAS framework, reduced the discrepancy between experimental and computed SAXS intensities by a factor of 10 for 1-butanol stress and 7.5 for tetrahydrofuran stress. This work highlights the importance of optimizing force fields against condensed-phase properties to capture the behavior of complex biological membranes, such as those found in Gram-positive bacterial envelopes or enveloped viruses.
Structural Trajectories and Visualization
MD simulations produce trajectories: time-ordered series of atomic coordinates. These trajectories are stored in formats such as DCD, XTC, or NetCDF, often accompanied by topology files that define molecular connectivity and force field parameters. Analysis of trajectories involves computing root-mean-square deviation (RMSD), root-mean-square fluctuation (RMSF), radius of gyration, solvent-accessible surface area, and hydrogen bond networks.
Visualization and animation of trajectories are performed using molecular graphics software such as VMD, PyMOL, or ChimeraX. These tools allow rendering of protein structures as cartoons, surfaces, or licorice representations, with color coding by residue properties or dynamic fluctuations. Animations can be generated by rendering each frame of the trajectory and concatenating images into a movie file. For veterinary applications, visualizing the conformational changes of viral surface proteins (e.g., influenza hemagglutinin pH-induced fusion) or antibody binding interfaces provides mechanistic understanding of host-pathogen interactions.
Workflow of a Typical MD Simulation
The following Mermaid diagram outlines the standard workflow for setting up and running an MD simulation of a protein system.
flowchart TD
A[Obtain protein structure: X-ray, cryo-EM, or AlphaFold model], > B[Prepare system: add hydrogens, assign force field, solvate in water box]
B, > C[Add ions to neutralize and set ionic strength]
C, > D[Energy minimization: steepest descent or conjugate gradient]
D, > E[Equilibration: NVT then NPT ensembles with position restraints]
E, > F[Production MD: NPT or NVT, 10-1000 ns, save trajectory every 1-10 ps]
F, > G[Analysis: RMSD, RMSF, clustering, free energy calculations]
G, > H[Visualization and animation: render frames, create movie]
Applications in Veterinary Structural Biology
MD simulations have been applied to study a range of veterinary pathogens. For example, simulations of avian influenza hemagglutinin have elucidated the pH-dependent conformational changes required for membrane fusion, informing vaccine design. Simulations of the spike protein of porcine epidemic diarrhea virus (PEDV) have identified key residues for receptor binding. The force field evaluation by Dong et al. [1] is directly relevant to modeling intrinsically disordered regions of viral proteins, which often mediate host interactions. The lipid force field reparameterization by Demerdash et al. [3] improves simulations of bacterial membranes, aiding studies of antimicrobial peptide mechanisms.
Cross-referencing with other structural bioinformatics resources on this portal, such as AlphaFold 3 in Molecular Biology and Relion and cryoSPARC, provides a comprehensive toolkit for veterinary structural biology.
Limitations and Future Directions
Despite their power, MD simulations face limitations in timescale (microseconds to milliseconds are computationally expensive) and force field accuracy. Force fields are continuously refined against experimental data, as demonstrated by the SAXS-based reparameterization of lipid parameters [3]. For intrinsically disordered proteins, specialized force fields such as CHARMM36m and AMBER ff99SB-ILDN with optimized water models improve agreement with NMR data [1, 2]. The convergence of ensemble properties across multiple force fields, as shown for Aβ monomers [2], underscores the need for rigorous sampling and cross-validation.
Conclusion
Molecular dynamics simulations, grounded in Newtonian mechanics and driven by empirical force fields, provide a powerful computational microscope for studying protein dynamics at atomic resolution. The Verlet integration algorithm, combined with thermostats and barostats, enables realistic simulations under physiological conditions. Force fields such as CHARMM and AMBER, continuously refined through benchmarks against experimental data [1, 2, 3], are essential tools for veterinary structural biology. Visualization of trajectories in 3D protein viewers allows researchers to animate conformational changes and generate hypotheses about molecular mechanisms. As computational resources expand and force field accuracy improves, MD simulations will play an increasingly central role in understanding host-pathogen interactions and guiding the development of veterinary interventions.
References
[1] Dong X, Wang D, Yu S, et al. Force Field Evaluation for an Intrinsically Disordered Domain: MD-NMR-FCS Benchmarking of Protein 4.1G Headpiece Ensembles. J Chem Inf Model. 2026. URL: https://pubmed.ncbi.nlm.nih.gov/42299722/
[2] Rosenman D, Wang C, Garcia A. Characterization of Aβ Monomers through the Convergence of Ensemble Properties among Simulations with Multiple Force Fields. J Phys Chem B. 2016. URL: https://www.semanticscholar.org/paper/515414edd9ab0bc1c0ecd3ea78ecebccdc00e73e
[3] Demerdash O, Smith M, Wang LP, et al. Reparameterizing a Lipid Force Field Using Small-Angle X-ray Scattering to Improve Predictions of Multicomponent Membranes under Organic Solvent Stress. J Phys Chem B. 2026. URL: https://www.semanticscholar.org/paper/0c8e6c2b78673a07988a8efe315207b5374bcfff *** Disclaimer: This article is for educational and informational purposes only. It is not intended to substitute for professional veterinary advice, diagnosis, treatment, or regulatory guidance. Always consult a licensed veterinarian or qualified specialist regarding animal health, disease diagnosis, and therapeutic decisions.