Frederick Sanger and the First Sequenced Genomes
Frederick Sanger's contributions to molecular biology are foundational to the field of genomics. For veterinary virology and molecular diagnostics, the techniques he developed and the first genomes he sequenced provided the intellectual and methodological substrate upon which modern pathogen detection, characterization, and computational analysis are built. This article reviews Sanger's methodological innovations and the landmark genome sequences that resulted from his work, with an emphasis on their relevance to understanding viral pathogens and host-pathogen interactions in the animal health context.
The Methodological Foundation: Thin Acrylamide Gels and Primed Synthesis
The development of DNA sequencing was a stepwise process that built on earlier nucleic acid chemistry. Sanger's early work focused on sequencing RNA, specifically the low molecular weight (5S) ribosomal RNA of Escherichia coli [1] and phenylalanine tRNA from E. coli [2]. These studies relied on ribonuclease digestion and two-dimensional electrophoresis of the resulting fragments, a technique often referred to as fingerprinting. The approach was extended to specific areas of the 16S and 23S ribosomal RNAs [3], and polynucleotide kinase was used to label non-radioactive nucleic acids for fingerprinting [4].
A major methodological breakthrough was the use of thin acrylamide gels for DNA sequencing [5]. These gels provided a higher resolution separation of DNA fragments differing by a single nucleotide in length. The thin gel format allowed for greater heat dissipation and more uniform electric fields, which translated into sharper bands and the ability to read longer sequences from a single electrophoretic run. This innovation was critical for scaling sequencing from short oligonucleotides to entire viral genomes.
The fundamental chemistry of Sanger's approach was the primed synthesis reaction with DNA polymerase [6, 7, 8]. In this method, a synthetic oligonucleotide primer is annealed to a single-stranded DNA template. DNA polymerase I then extends the primer by incorporating deoxynucleotide triphosphates. The key innovation was the use of dideoxynucleotide triphosphates (ddNTPs) as chain terminators. A ddNTP lacks the 3'-hydroxyl group required for phosphodiester bond formation, so its incorporation terminates chain elongation at that specific base. By performing four separate reactions, each containing one ddNTP (ddATP, ddCTP, ddGTP, ddTTP) in addition to the four normal dNTPs, a nested set of fragments is generated. These fragments, separated by size on thin acrylamide gels, reveal the exact nucleotide sequence when read from the autoradiograph [5, 6].
Sanger described this methodology comprehensively in his later reviews, emphasizing the universal applicability of the chain termination method for determining nucleotide sequences in DNA [9, 10]. The process is summarized below.
Workflow of the Sanger Chain Termination Method
flowchart TD
A["Single-stranded DNA template"] --> B["Annealing of synthetic oligonucleotide primer"]
B --> C["Four separate reaction mixtures<br>Each contains DNA polymerase I<br>All four dNTPs (one radiolabeled) + one ddNTP"]
C --> D["Reaction A: ddATP"]
C --> E["Reaction C: ddCTP"]
C --> F["Reaction G: ddGTP"]
C --> G["Reaction T: ddTTP"]
D --> H["Nested fragments terminated at A positions"]
E --> I["Nested fragments terminated at C positions"]
F --> J["Nested fragments terminated at G positions"]
G --> K["Nested fragments terminated at T positions"]
H --> L["Electrophoresis on thin acrylamide gel"]
I --> L
J --> L
K --> L
L --> M["Autoradiograph read from bottom to top"]
M --> N["Full DNA sequence determined"]
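The chain termination workflow above can be mimicked in a few lines of Python. This is a minimal sketch, not a model of the real chemistry: it assumes an idealised reaction in which every position yields a terminated fragment, measures fragment length from the end of the primer, and uses hypothetical function names (`synthesized_strand`, `termination_lengths`, `read_gel`) and a made-up template.

```python
from typing import Dict, List

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def synthesized_strand(template_3to5: str) -> str:
    """Return the strand the polymerase builds (5'->3') on a template written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3to5)


def termination_lengths(new_strand: str) -> Dict[str, List[int]]:
    """For each ddNTP reaction, list the fragment lengths that end in that base.

    Assumption: termination is observed at every occurrence of the base.
    """
    lanes: Dict[str, List[int]] = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes: Dict[str, List[int]]) -> str:
    """Read the four lanes from the shortest fragment (bottom) to the longest (top)."""
    length_to_base = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(length_to_base[n] for n in sorted(length_to_base))


if __name__ == "__main__":
    template = "TACGGTCA"  # hypothetical template, written 3'->5'
    lanes = termination_lengths(synthesized_strand(template))
    for base in "ACGT":
        print(f"dd{base}TP lane fragment lengths: {lanes[base]}")
    print("Sequence read from gel:", read_gel(lanes))
```

Each lane holds only the fragment lengths ending in one base, so ordering all bands by length across the four lanes recovers the synthesized strand.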
The First Sequenced Genome: Bacteriophage phiX174
The first complete DNA genome sequence was that of bacteriophage phiX174, published by Sanger, Coulson, and colleagues [11]. PhiX174 is a small icosahedral virus with a single-stranded circular DNA genome of 5,386 nucleotides. The sequence revealed several features that were unprecedented at the time and that have had profound implications for molecular biology and virology.
The phiX174 genome contained overlapping genes. Specifically, gene E lies entirely within gene D and is read in a different frame, and gene B lies within gene A; sequence data were reported for the C termini of the overlapping genes A and B [12]. This finding fundamentally altered the understanding of genetic information density in viral genomes. For veterinary virology, the principle matters because overlapping reading frames have since been described in a range of animal viruses, and the possibility should be considered when annotating genomes of pathogens such as Canine Parvovirus variants and Feline Calicivirus strains. The phiX174 sequence also allowed the amino acid sequences of the viral proteins to be deduced through the genetic code, as illustrated by the analysis of gene G of phiX174 [13] and the earlier work on the overlapping genes A and B [12].
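The idea of one reading frame nested inside another can be illustrated with a simple open reading frame scan. The sketch below is an assumption-laden toy: it searches only the three forward frames, treats every ATG as a start codon, and runs on an invented 21-nucleotide sequence rather than on phiX174 itself. The helper names `find_orfs` and `nested_pairs` are hypothetical.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
Orf = Tuple[int, int, int]  # (start index, end index exclusive, frame)


def find_orfs(seq: str, min_codons: int = 2) -> List[Orf]:
    """Find ATG...stop reading frames in the three forward frames."""
    orfs: List[Orf] = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Count coding codons only (the stop codon is excluded).
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


def nested_pairs(orfs: List[Orf]) -> List[Tuple[Orf, Orf]]:
    """Return (outer, inner) pairs where inner lies within outer in another frame."""
    return [
        (outer, inner)
        for outer in orfs
        for inner in orfs
        if outer != inner
        and outer[2] != inner[2]
        and outer[0] <= inner[0]
        and inner[1] <= outer[1]
    ]


if __name__ == "__main__":
    toy = "ATGCATGCTCGACTGAAGTAA"  # invented sequence, not a real genome
    orfs = find_orfs(toy)
    print("ORFs:", orfs)
    print("Nested pairs:", nested_pairs(orfs))
```

Real gene finders also consider the reverse strand, alternative start codons, and circular genomes, none of which this sketch handles.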
The earlier sequences of the low molecular weight ribosomal RNA of E. coli [1] and of phenylalanine tRNA [2] predate the DNA methods. They were obtained by RNA fingerprinting rather than by chain termination, but the experience of fractionating and ordering labeled fragments carried over to the gel-based approaches later applied to larger, more complex genomes.
The Bacteriophage f1 Genome and the Plus-Minus Method
Before the complete phiX174 sequence, Sanger and colleagues determined a nucleotide sequence in bacteriophage f1 DNA using primed synthesis with DNA polymerase [7, 8]. Bacteriophage f1 is a filamentous phage with a single-stranded DNA genome. The sequence of a short, defined region of f1, a few tens of nucleotides long, demonstrated that primed synthesis could be used to read DNA directly. This work used a synthetic oligonucleotide primer to initiate DNA synthesis at a defined point on the viral genome.
The plus-minus method, a precursor to the dideoxy chain termination technique, grew out of this primed synthesis work [6, 8]. Both systems start from a population of primer extension products of varying length. In the minus system, these products are extended further in a reaction that lacks one of the four dNTPs, so each chain stops just before a position where the missing nucleotide is required. In the plus system, the products are incubated with T4 DNA polymerase and only a single dNTP; the enzyme's exonuclease activity trims each chain back from its 3' end until it reaches a residue of that nucleotide, so each chain ends with that base. Together, the two systems yield nested fragment sets that can be resolved on gels. While the plus-minus method was effective, the dideoxy chain termination method proved more robust and became the standard technique.
The complete genome sequence of phiX174 was determined using both the plus-minus method and the newer dideoxy method, with the latter providing the final, unambiguous sequence [11]. The later comparison of the human and bovine mitochondrial genomes demonstrated the power of this sequencing approach for comparative genomics across species [14]. In veterinary medicine, the ability to sequence mitochondrial genomes has been important for studies of evolutionary biology and population genetics, and for identifying species-specific markers used in food authentication and traceability.
Implications for Veterinary Bioinformatics and Diagnostics
The methodological and conceptual advances from Sanger's work underpin several core activities in veterinary bioinformatics and diagnostics.
Infectious Disease Diagnostics: The detection of viral and bacterial pathogens in clinical samples often relies on polymerase chain reaction (PCR) and sequencing. For infections such as canine monocytic ehrlichiosis caused by Ehrlichia canis, or Anaplasma phagocytophilum infection in livestock and companion animals, Sanger sequencing of conserved bacterial genes (e.g., 16S rRNA) can support species-level identification when the amplicon is compared against reference sequences. The overlapping gene concept from phiX174 is also relevant when designing PCR primers for viruses such as West Nile Virus or Feline Leukemia Virus: a region read in more than one frame is constrained in each of them, so its rate of variation can differ from that of neighbouring sequence, which affects how stable a candidate primer binding site is likely to be.
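Comparing a sequenced amplicon against reference sequences can be sketched as a percent identity calculation. The example below is illustrative only: it assumes the sequences are already aligned to equal length (with `-` for gaps), the reference names and sequences are invented placeholders rather than real 16S rRNA data, and `percent_identity` and `best_match` are hypothetical helpers, not part of any diagnostic software.

```python
from typing import Dict, Tuple


def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two pre-aligned sequences.

    Columns where either sequence has a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


def best_match(query: str, references: Dict[str, str]) -> Tuple[str, float]:
    """Return the reference name with the highest identity to the query."""
    name = max(references, key=lambda n: percent_identity(query, references[n]))
    return name, percent_identity(query, references[name])


if __name__ == "__main__":
    # Invented toy sequences standing in for aligned reference fragments.
    references = {
        "reference_A": "ACGTACGTAC",
        "reference_B": "ACGTTCGTTC",
    }
    query = "ACGTACGTAA"
    name, identity = best_match(query, references)
    print(f"Closest reference: {name} ({identity:.1f}% identity)")
```

In practice the alignment step, the choice of reference database, and the identity threshold used to call a species all need validation for the assay in question.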
Computational Genomics and Phylogenetics: The first viral genomes provided early test data for sequence alignment algorithms, phylogenetic tree construction methods, and tools for reading frame prediction. The phiX174 genome, with its overlapping genes, is a natural test case for gene-finding software because a single stretch of sequence encodes more than one protein. This work is analogous to modern computational analyses of pathogens like Lumpy Skin Disease Virus or African Swine Fever Virus, where genomic sequences are used to track transmission networks and identify virulence markers.
Vaccine Design and Antiviral Development: Knowledge of the exact nucleotide and amino acid sequences of viral proteins, first made possible by the early genome sequences, enables the rational design of vaccines. For example, the identification of conserved epitopes in viral surface proteins for pathogens like Bovine Coronavirus or Rabbit Coronavirus is predicated on sequence data obtained with methodologies that trace back to those pioneered by Sanger.
The Transition to High-Throughput Sequencing
While Sanger sequencing remains the gold standard for low-volume, high-accuracy applications, the field has largely transitioned to high-throughput sequencing platforms for genome-scale projects. Several of these platforms, particularly sequencing-by-synthesis systems, still rely on polymerase extension and the incorporation of terminator nucleotides, albeit reversibly and in massively parallel formats. Within Sanger sequencing itself, the thin acrylamide gel [5] has been replaced by automated capillary electrophoresis instruments that run many samples in parallel. For metagenomic studies of the gut microbiome in swine or poultry, high-throughput sequencing of 16S rRNA amplicons targets one of the ribosomal RNAs that Sanger and colleagues examined in their early RNA work [3].
In a veterinary diagnostic setting, the choice between Sanger sequencing and high-throughput sequencing often depends on the diagnostic question. For confirmation of a single genetic marker, such as a specific virulence factor in Escherichia coli isolated from chickens or poultry products, Sanger sequencing is efficient and cost-effective. For discovery of novel pathogens or characterization of outbreak strains, high-throughput sequencing provides the depth of coverage required.
Continuing Relevance in the Veterinary Diagnostic Laboratory
The legacy of Frederick Sanger's work is evident in every molecular diagnostic test that relies on sequence-specific detection. The principles of primed synthesis [7, 8] are the basis of real-time PCR assays for the detection of pathogens such as Mycoplasma synoviae or Histomonas meleagridis. The fidelity of DNA polymerase and the specificity of oligonucleotide primers, both central to Sanger's method, are the critical parameters that determine the accuracy of these diagnostic tests.
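Primer specificity can be explored with a simple scan for binding sites on a template. This is a rough sketch under stated assumptions: it checks only the forward strand, scores sites by mismatch count alone, ignores melting temperature and 3'-end effects, and uses an invented template. The function name `binding_sites` is hypothetical.

```python
from typing import List


def binding_sites(template: str, primer: str, max_mismatches: int = 0) -> List[int]:
    """Return start positions where the primer matches the template.

    A position counts as a site if the number of mismatched bases is at
    most `max_mismatches`. Only the given strand is scanned.
    """
    sites: List[int] = []
    for start in range(len(template) - len(primer) + 1):
        window = template[start:start + len(primer)]
        mismatches = sum(p != t for p, t in zip(primer, window))
        if mismatches <= max_mismatches:
            sites.append(start)
    return sites


if __name__ == "__main__":
    template = "ACGTTGCAACGTAGCA"  # invented template sequence
    for primer in ("ACGT", "TTGCA"):
        sites = binding_sites(template, primer)
        status = "unique" if len(sites) == 1 else f"{len(sites)} sites"
        print(f"Primer {primer}: positions {sites} ({status})")
```

A primer with more than one binding site on the target, or with sites on non-target genomes, is a candidate for redesign; dedicated primer design tools apply far more detailed thermodynamic criteria than this sketch.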
The sequencing of the phiX174 genome [11] demonstrated that a complete viral genome could be deduced from a set of overlapping sequencing reactions. This principle is now standard practice in the characterization of newly emerging animal viruses. For example, characterizing a novel chicken astrovirus associated with white chick syndrome would involve sequencing the complete genome from clinical samples, using primer walking, which applies Sanger's chain termination method directly, or high-throughput sequencing approaches that build on the same polymerase-based principles.
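Deducing a longer sequence from overlapping reads can be sketched with a greedy merge. The code below is a toy under clear assumptions: reads are error-free, all come from the same strand of a linear sequence, and the longest suffix-prefix overlap is always the correct join. Real assemblers handle errors, repeats, and both strands; the names `overlap` and `greedy_assemble` and the reads themselves are invented for illustration.

```python
from typing import List


def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: List[str], min_len: int = 3) -> List[str]:
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no remaining overlaps; leave contigs separate
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [
            c for k, c in enumerate(contigs) if k not in (best_i, best_j)
        ] + [merged]
    return contigs


if __name__ == "__main__":
    reads = ["ATGCCAGT", "CAGTTGAC", "TGACCTGA"]  # invented overlapping reads
    print(greedy_assemble(reads))
```

If no overlaps of at least `min_len` remain, the function returns several contigs, which corresponds to a gap that further reads, for example from primer walking, would need to close.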
Conclusion
Frederick Sanger's development of the chain termination DNA sequencing method, the thin acrylamide gel system, and the first complete DNA genome sequence, that of bacteriophage phiX174, together with the earlier partial sequence from bacteriophage f1, represent a transformative period in biological science. These achievements provided the foundational tools for subsequent genomic work, including the characterization of veterinary pathogens. The understanding of overlapping genes, the ability to read nucleotide sequences directly, and the computational frameworks developed to analyze these data are embedded in the daily practice of veterinary virology, diagnostics, and bioinformatics. The first sequenced genomes remain historical reference points for comparative genomic analyses in animal health.
References
[1] Brownlee, G., & Sanger, F. (1967). Nucleotide sequences from the low molecular weight ribosomal RNA of Escherichia coli. Journal of Molecular Biology. https://www.semanticscholar.org/paper/5fcbe265f1eb25763152e9f6038d66f1bbfc1940
[2] Barrell, B., & Sanger, F. (1969). The sequence of phenylalanine tRNA from E. coli. FEBS Letters. https://www.semanticscholar.org/paper/a3157f7e46cb744952637a3be186392d7300ed49
[3] Fellner, P., & Sanger, F. (1968). Sequence Analysis of Specific Areas of the 16S and 23S Ribosomal RNAs. Nature. https://www.semanticscholar.org/paper/e8ff083a96e9590d3cf8626b1c324c0b8c8dd726
[4] Székely, M., & Sanger, F. (1969). Use of polynucleotide kinase in fingerprinting non-radioactive nucleic acids. Journal of Molecular Biology. https://www.semanticscholar.org/paper/32ea9a63bc7a780cb4d73f8d635dead63d261e4b
[5] Sanger, F., & Coulson, A. (1978). The use of thin acrylamide gels for DNA sequencing. FEBS Letters. https://www.semanticscholar.org/paper/24bc37a5f7cc759af1b2032004c1151a9dfb57fd
[6] Sanger, F., & Coulson, A. (1989). A Rapid Method for Determining Sequences in DNA by Primed Synthesis with DNA Polymerase. Journal. https://www.semanticscholar.org/paper/a6c2b2fa546a4d7d0d449293455daff0b75cd20f
[7] Sanger, F., Donelson, J., Coulson, A., et al. (1973). Use of DNA polymerase I primed by a synthetic oligonucleotide to determine a nucleotide sequence in phage f1 DNA. Proceedings of the National Academy of Sciences of the United States of America. https://www.semanticscholar.org/paper/1050fa3b8ddf2c9010483a2780507ee7144caadc
[8] Sanger, F., Donelson, J., Coulson, A., et al. (1974). Determination of a nucleotide sequence in bacteriophage f1 DNA by primed synthesis with DNA polymerase. Journal of Molecular Biology. https://www.semanticscholar.org/paper/16d7ff3fd880c6b2d32e873429c7ee9ecbc892d0
[9] Sanger, F. (1981). Determination of nucleotide sequences in DNA. Bioscience Reports. https://www.semanticscholar.org/paper/59417f667f24f456aedccefddee52a0d64d572d9
[10] Sanger, F. (1988). Sequences, sequences, and sequences. Annual Review of Biochemistry. https://www.semanticscholar.org/paper/bedb496c22cc33b4e48f265e6c14b89dfd5808ef
[11] Sanger, F., Coulson, A., Friedmann, T., et al. (1978). The nucleotide sequence of bacteriophage phiX174. Journal of Molecular Biology. https://www.semanticscholar.org/paper/9eb93aa0845d09c95a451066f381967f8cabcf51
[12] Smith, M., Brown, N., Air, G., et al. (1977). DNA sequence at the C termini of the overlapping genes A and B in bacteriophage phiX174. Nature. https://www.semanticscholar.org/paper/96de8fc863a89ba27c2c6341bcf4a76f0a2434ff
[13] Air, G., Sanger, F., & Coulson, A. (1976). Nucleotide and amino acid sequences of gene G of phiX174. Journal. https://www.semanticscholar.org/paper/b4a5a42cd60c0f0bf1e9cbbebe10360cdafb5ffe
[14] Anderson, S., Bankier, A., Barrell, B., et al. (1982). Comparison of the Human and Bovine Mitochondrial Genomes. Journal. https://www.semanticscholar.org/paper/4d6b092a79ff720a144e05f08ee09504866a11fd
[15] Sanger, F. (1975). The Croonian Lecture, 1975: Nucleotide sequences in DNA. Proceedings of the Royal Society of London. Series B. Biological Sciences. https://www.semanticscholar.org/paper/2238a01fecf94fd4f0fc9bf741e904261fbda4a9
[16] Sanger, F. (1971). Nucleotide sequences in bacteriophage ribonucleic acid. The Eighth Hopkins Memorial Lecture. Biochemical Journal. https://www.semanticscholar.org/paper/a6eb633d8c282e94f89543b506b1bd30b149fbfc
Disclaimer: This article is for educational and informational purposes only. It is not intended to substitute for professional veterinary advice, diagnosis, treatment, or regulatory guidance. Always consult a licensed veterinarian or qualified specialist regarding animal health, disease diagnosis, and therapeutic decisions.