Molecular markers – the key to molecular breeding strategies

Plant breeders and geneticists are keen to know how many genes determine important traits, where these genes are located, how the genes interact, and how the genes interact with the environment. The processes of gene discovery, characterization, and selection using molecular tools can be described as "molecular breeding".

I. Given the large number of genes per genome, huge genome sizes, as well as the technical challenges and high costs of sequencing whole genomes, it quickly became apparent that the first steps towards implementing molecular breeding strategies would involve - in many cases - surrogates for the genes themselves.  It may seem obvious, but it is worth reiterating that the search for DNA polymorphisms was not driven by a desire to complicate things, but rather by the paucity of naked eye polymorphisms (NEPs).

Revisiting the definition of an allele

These surrogates – polymorphisms in DNA sequences linked to the target genes – are known as molecular markers.

If and when the target gene is identified, polymorphisms in the gene become the perfect markers.

With whole genome sequences available in some crop plants, and soon to be available for many more, there will be a shift from linked (surrogate) markers to perfect markers. Ultimately, there will be a need for both types of markers. For example, in cases where gene deletion is the cause of allelic variation at a locus, a linked marker will be necessary.

II. Polymorphisms can be visualized at the metabolome, proteome, or transcriptome level but for a number of reasons (both technical and biological) DNA-level polymorphisms are currently the most targeted.

Regardless of whether it is a “perfect” or a “linked” DNA marker, there are two key considerations that need to be addressed in order for the researcher/user to visualize the underlying genetic polymorphism.

 

  1. Finding and understanding the genetic basis of the DNA-level polymorphism, which may be as small as a single nucleotide polymorphism (SNP) or as large as an insertion/deletion (INDEL) of thousands of nucleotides.
  2. Detecting the polymorphism via a specific assay or "platform". The same DNA polymorphism may be amenable to different detection assays.
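The first consideration can be illustrated with a minimal sketch that compares two allele sequences and labels each difference as a SNP or an INDEL. It assumes the two sequences have already been aligned (gaps written as "-"); the function name and the example sequences are hypothetical and chosen only for illustration.

```python
def classify_polymorphisms(ref, alt):
    """Compare two pre-aligned sequences column by column.

    Returns a list of (kind, start, length) tuples, where kind is
    "SNP" or "INDEL" and start is a 0-based alignment column.
    """
    if len(ref) != len(alt):
        raise ValueError("aligned sequences must have equal length")
    events = []
    i = 0
    while i < len(ref):
        a, b = ref[i], alt[i]
        if a == b:
            i += 1
        elif a == "-" or b == "-":
            # Extend over the whole gap run so a multi-base INDEL is reported once.
            j = i
            while j < len(ref) and ref[j] != alt[j] and "-" in (ref[j], alt[j]):
                j += 1
            events.append(("INDEL", i, j - i))
            i = j
        else:
            events.append(("SNP", i, 1))
            i += 1
    return events


# Hypothetical aligned alleles: one base substitution and one 2-bp deletion.
ref_allele = "ACGTTAGCCA"
alt_allele = "ACATT--CCA"
print(classify_polymorphisms(ref_allele, alt_allele))
```

In practice the alignment step itself is the harder problem; this sketch only shows how the two classes of polymorphism differ once an alignment is available.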

III. The two main assay tools are amplification (via the Polymerase Chain Reaction, PCR) and hybridization. Both principles may incorporate the use of restriction enzymes, and in some assays amplification and hybridization are combined.

A. Amplification - the Polymerase Chain Reaction (PCR schematic-1; PCR text; PCR animation-1; PCR animation-2): DNA replication can be harnessed to increase the abundance of specific sequences, and these selectively amplified sequences can then be identified by the use of an appropriate label, e.g. radioactivity or fluorescence. The invention of PCR has had such an impact on biology that its inventor, Kary Mullis, received a Nobel Prize (Chemistry, 1993). Mullis, who died in 2019, was also known as a keen surfer in California.

PCR is based on using a special DNA polymerase to make copies of a specific DNA fragment. The choice of what DNA will be amplified by the polymerase is determined by the primers (short pieces of synthesized DNA called oligonucleotides) that prime the polymerase reaction. The DNA between the primers is amplified by the polymerase: in subsequent cycles the original template, plus the newly amplified fragments, serve as templates. The steps of the reaction are denaturing the target DNA to make it single-stranded, adding the single-stranded oligonucleotides, hybridization of the primers to the template, and primer extension. The process is repeated as necessary until the target fragment is sufficiently amplified that it can be isolated, visualized, or manipulated as desired. A key component of the technique is a thermostable polymerase, such as Taq polymerase. PCR can be used to amplify rare fragments from a pool of DNA and to generate an abundance of a particular fragment from a single copy in a small sample (even fossil DNA), and it is the foundation for many types of molecular markers.
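The way primers define the amplified region can be sketched in a few lines of code. This is an idealized illustration, not a primer-design tool: it assumes perfect primer matches, a template written as its top strand 5' to 3', and a doubling of copies at every cycle. The function names, template, and primer sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def in_silico_pcr(template, forward, reverse):
    """Return the amplicon bounded by the two primers, or None.

    The forward primer matches the top strand directly. The reverse primer
    anneals to the top strand, so its reverse complement is what appears
    in the top-strand sequence downstream of the forward primer.
    """
    start = template.find(forward)
    if start == -1:
        return None
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(site)]


def ideal_copies(initial_copies, cycles):
    """Copies after a number of cycles, assuming perfect doubling each cycle."""
    return initial_copies * 2 ** cycles


template = "TTGACCATGGCTAGCTAGGATCCGATTACAGGCTT"
forward_primer = "ACCATGG"
reverse_primer = "TGTAATC"  # reverse complement of GATTACA

print(in_silico_pcr(template, forward_primer, reverse_primer))
print(ideal_copies(1, 30))
```

Real reactions fall short of perfect doubling, so the copy number from ideal_copies is best read as an upper bound.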



B. Hybridization: Single-stranded nucleic acids have a natural tendency to find and pair with other single-stranded nucleic acids that have a complementary sequence. An application of this affinity is to label one single strand with a tag (radioactivity and fluorescent dyes are often used) and then to use this probe to find complementary sequences in a population of single-stranded nucleic acids. For example, if you have a cloned gene, either a cDNA or a genomic clone, you could use it as a probe to look for a homologous sequence in another DNA sample. By denaturing the DNA in the sample and using your labeled single-stranded probe, you can search the sample for the complementary sequence. Pairing of probe and sample can be visualized via the label, e.g. on X-ray film or by measuring fluorescence. The principle of hybridization can be applied to pairing events involving DNA: DNA; DNA: RNA; and protein: antibody. The following "blotting" techniques are used to find specific targets in populations of targets separated by electrophoresis.

Hybridization procedure    Participants
Southern                   DNA: DNA
Northern                   DNA: RNA
Western                    Protein: antibody

Hybridization is now used in various array applications, such as DArT.
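The pairing of a labeled probe with its complementary target can be sketched as a string search. This is a simplified picture that assumes perfect base pairing; real hybridization tolerates some mismatches depending on the stringency of the conditions. The function names, probe, and sample strands are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def hybridization_sites(probe, target_strand):
    """Return 0-based positions on target_strand where the probe pairs perfectly.

    A probe pairs with the stretch of the target that equals the probe's
    reverse complement (both sequences written 5' to 3').
    """
    site = reverse_complement(probe)
    positions = []
    pos = target_strand.find(site)
    while pos != -1:
        positions.append(pos)
        pos = target_strand.find(site, pos + 1)
    return positions


probe = "TGTAATC"
samples = {
    "sample_1": "CCGATTACAGG",  # contains the complement of the probe
    "sample_2": "CCGGTTACAGG",  # one base different, so no perfect pairing
}
for name, strand in samples.items():
    print(name, hybridization_sites(probe, strand))
```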

Restriction enzymes: Restriction enzymes make cuts at defined recognition sites in DNA. In nature, they are a defense system for bacteria, where they attack and degrade the DNA of invading bacteriophages. They have been harnessed for the task of systematically breaking up DNA into fragments of tractable size and for various polymorphism detection assays. Each enzyme recognizes a particular DNA sequence and cuts in a specified fashion at that sequence. Assuming the four bases are equally frequent and randomly distributed, an enzyme with a four-base recognition site will cut on average once every 256 bp (4^4), which is more frequent than one with a six-base recognition site (once every 4^6 = 4,096 bp), which in turn will cut more often than one with an eight-base recognition site (once every 4^8 = 65,536 bp). Restriction enzymes are named for the organism from which they were isolated. Examples of restriction enzymes. Note the palindromes: the same sequence is specified when each strand of the double helix is read in the opposite direction.
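The cutting-frequency arithmetic and the palindrome property can be checked with a short sketch. EcoRI (recognition site GAATTC, cutting after the first G) is used as the example enzyme; the function names and the test sequence are hypothetical, and the spacing calculation assumes equal, random base frequencies.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def expected_spacing(site_length):
    """Average distance in bp between sites of a given length (4^n)."""
    return 4 ** site_length


def is_palindromic(site):
    """True if the site reads the same on both strands (5' to 3')."""
    return site == site.translate(COMPLEMENT)[::-1]


def digest(sequence, site, cut_offset):
    """Cut a linear sequence at every occurrence of site.

    cut_offset is the number of bases of the recognition site that stay
    on the upstream fragment (1 for EcoRI, which cuts G^AATTC).
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        fragments.append(sequence[start:pos + cut_offset])
        start = pos + cut_offset
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])
    return fragments


for n in (4, 6, 8):
    print(n, expected_spacing(n))

print(is_palindromic("GAATTC"))
print(digest("AAGAATTCTTTTGAATTCGG", "GAATTC", 1))
```

Only the top strand is modeled here; the staggered cut on the opposite strand, which produces the sticky ends, is not shown.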

IV. Polymorphisms and assays: An ever-increasing number of technology platforms have been, and are being, developed to address these two key considerations, and these platforms lead to a bewildering array of acronyms for different types of molecular markers. To add to the complexity, the same type of marker may be assayed on a variety of platforms. The following is an alphabetical listing of various types of polymorphisms and/or assays.

The optimum marker is one based on known DNA sequence variations, since new and more efficient assays can be designed for the same polymorphism. However, in many species, molecular toolkits are still poorly stocked. Therefore, there is still a place for "anonymous" marker technologies.

Acronym: Meaning
    Details

AFLP: Amplified Fragment Length Polymorphism
    A combination of restriction enzymes and oligonucleotides (used as adapters and amplification primers) that can generate large numbers of data points from a single reaction and thus reveal amplified fragments of different sizes in two or more individuals.
    A defined assay, upgraded with advances in technology.
    Genetic basis of polymorphisms not known beforehand.
    Amplicons can be cloned and sequenced to establish the basis of the polymorphism.

EST: Expressed Sequence Tag
    Partial sequence data from a cDNA clone, which provide a sequence tag for a gene. Since ESTs are DNA sequences, they are amenable to a variety of assays.

RAPD: Randomly Amplified Polymorphic DNA
    A PCR primer of arbitrary sequence that amplifies products of different sizes from the DNA templates of two or more individuals.
    An easy, anonymous marker assay to apply, since no prior sequence information is required.
    Serious limitations due to data quality.

RFLP: Restriction Fragment Length Polymorphism
    A labeled DNA probe and restriction enzyme combination that reveals DNA fragments of different sizes in two or more individuals.
    An "old school" technology that still has its place.
    Requires a probe sequence, but basis of polymorphism not known beforehand.

SNP: Single Nucleotide Polymorphism
    A single site in a nucleotide sequence at which two to four alleles occur within a population at relatively high frequencies.
    Since SNPs are sequence polymorphisms, they are amenable to a variety of assays. Known SNPs may be targeted for assay (e.g. Illumina Bead Station) or unknown SNPs discovered by the assay (e.g. Floragenex RADs).

SSR: Simple Sequence Repeat (Microsatellite)
    Examples: SSRs in barley; SSRs in hazelnut.
    PCR primers based on conserved regions flanking a region of short tandem repeats amplify the repeat region, which is variable between two or more individuals.
    Requires up-front discovery.
    A principal advantage is that SSRs are multi-allelic; they are amenable to a range of assay platforms.

STS: Sequence Tagged Site
    A PCR primer set based on a known sequence that amplifies a specified genome region, revealing a polymorphism in two or more individuals.
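As an illustration of the SSR entry above, the following sketch scans a sequence for short tandem repeats and shows how two alleles with different repeat numbers would give amplicons of different lengths. It is a simplified sketch: the function name, thresholds, and allele sequences are hypothetical, and a long run of a single base would be reported with a two-base motif because the minimum motif length is set to 2.

```python
import re


def find_ssrs(sequence, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, repeat_count) for each tandem repeat found.

    The motif group is matched lazily so that the shortest repeating
    unit is reported (e.g. "GA" rather than "GAGA").
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    results = []
    for match in pattern.finditer(sequence):
        motif = match.group(1)
        repeats = len(match.group(0)) // len(motif)
        results.append((match.start(), motif, repeats))
    return results


# Hypothetical amplicons from two individuals: same flanking sequence,
# different numbers of the GA repeat unit.
allele_a = "TTCCG" + "GA" * 8 + "CATGG"
allele_b = "TTCCG" + "GA" * 11 + "CATGG"

print(find_ssrs(allele_a))
print(find_ssrs(allele_b))
print(len(allele_b) - len(allele_a), "bp size difference between alleles")
```

Because the flanking primer sites are conserved, the length difference between amplicons reflects only the difference in repeat number, which is what makes SSRs multi-allelic.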

A summary of marker types

Marker design based on sequence: Mapping MFT1 in the OWBs

IX. Utility of linkage maps

  1. Establish evolutionary relationships: homoeology, synteny and orthology.
    The classic: comparative genetics in grasses (Gale and Devos, 1998). Gramene - go to comparative maps
    Homoeology: Refers to chromosomes, or chromosome segments, that are similar in terms of the order and function of their genetic loci. Homoeologous chromosomes may occur within a single allopolyploid individual (e.g. the A, B, and D genomes in wheat), or they may be found in related species (e.g. the 1A, 1B, 1D series in wheat and the 1H of barley).
    Orthology: Refers to genes in different species which are so similar in sequence that they are assumed to have originated from a single ancestral gene.
    Synteny: Refers to genetic loci that are linked on the same chromosome. PPD-H1 in barley and rice
  2. Determine if trait associations are due to linkage or pleiotropy.
  3. Find genes determining qualitative and quantitative phenotypes (Marker assisted selection for barley stripe rust)
  4. Map-based cloning (tomato; barley disease resistance; Vrs1).
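Linkage maps of this kind are built from recombination between loci, and that calculation can be sketched briefly. The example assumes a doubled haploid population in which each line carries either the "A" or the "B" parental allele at every locus, and it uses the Kosambi mapping function to convert a recombination fraction into centimorgans. The function names and marker scores are hypothetical.

```python
import math


def recombination_fraction(locus1, locus2):
    """Fraction of lines carrying different parental alleles at two loci.

    Lines with a missing score (anything other than "A" or "B") at
    either locus are skipped.
    """
    valid = {"A", "B"}
    pairs = [(a, b) for a, b in zip(locus1, locus2) if a in valid and b in valid]
    if not pairs:
        raise ValueError("no lines scored at both loci")
    recombinants = sum(1 for a, b in pairs if a != b)
    return recombinants / len(pairs)


def kosambi_cm(r):
    """Kosambi map distance in cM for a recombination fraction r < 0.5."""
    if not 0 <= r < 0.5:
        raise ValueError("r must be in the range [0, 0.5)")
    return 25 * math.log((1 + 2 * r) / (1 - 2 * r))


# Hypothetical scores for 20 doubled haploid lines at two marker loci.
marker_1 = "AABBABABBAABABBABABA"
marker_2 = "AABBBBABBAABABBAAABA"

r = recombination_fraction(marker_1, marker_2)
print(r, round(kosambi_cm(r), 2))
```

A recombination fraction near zero between a marker and a trait locus is consistent with tight linkage, although distinguishing very tight linkage from pleiotropy generally needs larger populations or the gene itself.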

Useful Links:

PCR: More text, animations, and technical aspects

Plant Molecular Genetics Classes