In this post we’ll introduce a few key ideas relating to the molecular clock.
What is the Molecular Clock?
The molecular clock is a framework to translate genetic change into time.
Through the process of evolution, mutations (changes in the DNA sequence of A, T, C and G) accumulate along lineages.
If these mutations accumulate in a ‘clock-like’ manner, the rate of evolutionary change is roughly constant through time.
Under this assumption genetic change becomes proportional to time, allowing us to date divergence events in units of time, e.g. millions of years.
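As a minimal sketch of that proportionality, assume a strict clock in which each lineage accumulates substitutions at the same rate, so that the pairwise distance between two sequences is twice the rate multiplied by the time since they split. The function name and the example numbers are illustrative assumptions, not values from a real dataset.

```python
def divergence_time(distance, rate):
    """Time since two lineages split, assuming a strict molecular clock.

    distance: pairwise genetic distance (substitutions per site)
    rate: substitution rate per lineage (substitutions per site per million years)
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    # Both lineages keep accumulating change after the split,
    # so the pairwise distance grows at twice the per-lineage rate.
    return distance / (2 * rate)


# Hypothetical values: 0.02 substitutions per site between two species,
# and 0.01 substitutions per site per million years along each lineage.
print(divergence_time(0.02, 0.01), "million years")
```

In practice the rate itself has to be estimated, which is where calibrations come in.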
Why use trees?
Trees allow us to translate the continuous process of evolution into discrete events, such as the point at which two lineages diverge from a shared common ancestor.
In a tree, branch lengths reflect accumulated genetic change while topology reflects the order of divergence events.
When a phylogenetic tree is calibrated using known dates, such as fossil ages or biogeographic events, the tree becomes a chronogram or time tree.
In a chronogram, branch lengths are scaled to time, so divergence events can be estimated in units such as millions of years.
See Introduction to tree diagrams for a recap on different types of trees.
Different Genes, Different Clocks
Genes in a genome do not all tick at the same rate.
Mutation rates can vary between populations, between genomes (e.g. nuclear vs mitochondrial), between genes and even between regions of the same gene.
As a result we need to choose our molecular clock markers carefully, depending on the questions we want to answer.
Slowly evolving markers are better suited to resolving deep events, whereas fast-evolving markers are better suited to resolving recent events.
Below is a table of commonly used molecular markers (generalised for eukaryotes):
| Molecular marker | Genome | Clock rate (relative) | Temporal resolution | Saturation risk |
| --- | --- | --- | --- | --- |
| rRNA genes (18S, 28S) | Nuclear | Very slow | Deep time | Low |
| cox1 (COI) | Mitochondrial | Moderate | Mid to recent time | Moderate |
| ITS (ITS1 / ITS2) | Nuclear | Moderate to fast | Recent to mid time | Moderate |
| Introns | Nuclear | Fast | Recent time | High |
| Microsatellites (STRs) | Nuclear | Very fast | Recent time | Very high |
Slowly evolving markers (e.g. cox1, rRNA genes) resolve deep events but lack resolution for fine-scale population structure.
Conversely, fast-evolving markers (e.g. microsatellites) resolve recent events but lose information over long timescales.
How is information lost over time?
Genetic change is most informative about time when it increases approximately linearly with time.
Fast-evolving genes, however, are susceptible to substitution saturation.
Saturation occurs over long timescales, when the same nucleotide positions mutate multiple times.
As a result, older substitutions are overwritten by newer ones, obscuring the historical signal and leading to underestimation of divergence times.
Substitution models can account for some degree of saturation by modelling hidden substitution events. However, their effectiveness is limited when substitution saturation is high.
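To make saturation concrete, here is a small sketch using the Jukes-Cantor (JC69) model, chosen here only because it is the simplest substitution model; it is an assumption for illustration, not the model recommended for any particular dataset. Under JC69 the observed proportion of differing sites can never exceed 0.75, however much true change has occurred, and the correction becomes unstable as the observed value approaches that ceiling.

```python
import math


def jc69_expected_p(d):
    """Expected proportion of differing sites for a true distance d under JC69."""
    return 0.75 * (1 - math.exp(-4 * d / 3))


def jc69_distance(p):
    """JC69-corrected distance from the observed proportion of differing sites p."""
    if not 0 <= p < 0.75:
        raise ValueError("JC69 correction is undefined for p >= 0.75 (fully saturated)")
    return -0.75 * math.log(1 - 4 * p / 3)


# True distances in substitutions per site (hypothetical values).
for d in (0.05, 0.5, 1.0, 2.0, 5.0):
    p = jc69_expected_p(d)
    print(f"true d = {d:4.2f}   observed p = {p:.4f}   corrected d = {jc69_distance(p):.4f}")
```

At small distances the observed and true values are close. At large distances the observed proportion flattens out near 0.75, so a tiny sampling error in the observed value translates into a very large error in the corrected distance.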
We’ll cover more on substitution models and states in a later post.
The Relative Rate Test

Figure 1.
Representing the three-point condition as (I) an isosceles triangle (II) a bifurcating tree.
So how do we know if a gene is ‘clock-like’?
Remember the three-point condition and ultrametric trees? For a quick recap browse through Foundations of phylogenetics.
Consider our three sequences, x, y and z, and let z be the out-group.
We can see in Figure 1 that x and y are more closely related to each other than either is to z.
Under the molecular clock hypothesis d(x,z) = d(z,y), irrespective of the substitution model and whether or not the substitution rate varies among sites.
In our isosceles triangle, Figure 1(I), we can see that the distances d(x,z) and d(z,y) are equal and longer than the distance between x and y.
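The three-point condition can be checked directly from the pairwise distances: of the three, the two largest must be equal. The sketch below assumes we already have the distances as numbers; the function name, tolerance and example values are illustrative.

```python
import math


def satisfies_three_point(d_xy, d_xz, d_yz, rel_tol=1e-9):
    """Return True if the two largest of the three pairwise distances are equal.

    Real data are noisy, so equality is tested within a relative tolerance.
    """
    _, middle, largest = sorted((d_xy, d_xz, d_yz))
    return math.isclose(middle, largest, rel_tol=rel_tol)


# x and y are equally distant from the outgroup z: consistent with a clock.
print(satisfies_three_point(0.10, 0.30, 0.30))  # True

# x is closer to z than y is: the condition fails.
print(satisfies_three_point(0.10, 0.25, 0.35))  # False
```

With estimated distances an exact match is unlikely, which is why a statistical test is needed rather than a simple equality check.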
The relative rate test (Tajima 1993) builds on this idea, using site-by-site comparisons between lineages with the outgroup (in our example z) as a reference.
Under the molecular clock hypothesis, x and y have both existed for the same amount of time since diverging from their most recent common ancestor (MRCA), and should therefore have accumulated the same number of substitutions.
If one lineage, e.g. x, has accumulated significantly more unique changes than y than would be expected by chance, then the null hypothesis (i.e. the molecular clock hypothesis) is rejected.
This is evidence of rate variation, or rate heterogeneity, between lineages.
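As a rough sketch of how the test works, the function below counts the sites where only x differs and the sites where only y differs, then compares the two counts with a chi-square statistic on one degree of freedom. It assumes three pre-aligned DNA sequences of equal length, skips any site containing a gap or ambiguous base, and uses toy sequences invented for illustration. The chi-square approximation is crude when the counts are small.

```python
import math


def tajima_relative_rate(x, y, z):
    """Relative rate test on three aligned sequences, with z as the outgroup.

    Returns (m_x, m_y, chi2, p_value), where m_x and m_y are the numbers of
    sites at which only x, or only y, differs from the other two sequences.
    """
    if not (len(x) == len(y) == len(z)):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    m_x = m_y = 0
    for a, b, c in zip(x.upper(), y.upper(), z.upper()):
        if not {a, b, c} <= valid:
            continue  # skip gaps and ambiguous bases
        if a != b and b == c:
            m_x += 1  # change unique to lineage x
        elif a != b and a == c:
            m_y += 1  # change unique to lineage y
    if m_x + m_y == 0:
        return m_x, m_y, 0.0, 1.0  # no informative sites, nothing to reject
    chi2 = (m_x - m_y) ** 2 / (m_x + m_y)
    # Upper-tail probability of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return m_x, m_y, chi2, p_value


# Toy alignment (hypothetical): x carries 8 unique changes, y carries 2.
z = "A" * 20
x = "C" * 8 + "A" * 12
y = "A" * 18 + "G" * 2
m_x, m_y, chi2, p = tajima_relative_rate(x, y, z)
print(f"m_x = {m_x}, m_y = {m_y}, chi2 = {chi2:.2f}, p = {p:.3f}")
```

A small p-value suggests that x and y have evolved at different rates since their split.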
Rate Heterogeneity
Rate heterogeneity can arise due to a number of factors. We’ll cover these in the next post, where we’ll explore evolutionary forces and molecular evolution.
There are a number of different molecular clock models designed to handle genetic data with heterogeneous rates.
Such models account for rate heterogeneity between and/or within lineages by allowing the rate to vary.
Alongside the global (or strict) clock, which assumes a single rate across the whole tree, these include local clocks, autocorrelated clocks and relaxed clocks. The figure below illustrates some of the available models and how they describe rate variation:

Figure 2.
Taken from Ho and Duchêne (2014), illustrating types of molecular clock models, where each evolutionary rate is assigned a different colour.
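The difference between a strict clock and one kind of relaxed clock can be sketched in a few lines. Under a strict clock every branch shares a single rate; in the relaxed variant assumed here, each branch draws its own rate independently from a lognormal distribution. The branch durations, mean rate and spread are made-up values, and the lognormal is only one of several distributions used for this purpose.

```python
import math
import random


def strict_clock_rates(n_branches, rate):
    """Strict (global) clock: every branch shares one rate."""
    return [rate] * n_branches


def relaxed_clock_rates(n_branches, mean_rate, sigma, rng):
    """Uncorrelated relaxed clock: one independent lognormal rate per branch."""
    # Shift the log-scale mean so that the expected rate equals mean_rate.
    mu = math.log(mean_rate) - sigma ** 2 / 2
    return [rng.lognormvariate(mu, sigma) for _ in range(n_branches)]


def branch_lengths(durations, rates):
    """Expected substitutions per site on each branch: duration times rate."""
    return [t * r for t, r in zip(durations, rates)]


# Hypothetical branch durations in millions of years.
durations = [2.0, 2.0, 5.0, 3.0]
rng = random.Random(1)  # fixed seed so the sketch is repeatable

strict = branch_lengths(durations, strict_clock_rates(len(durations), 0.01))
relaxed = branch_lengths(durations, relaxed_clock_rates(len(durations), 0.01, 0.5, rng))
print("strict: ", strict)
print("relaxed:", relaxed)
```

Under the strict clock two branches of equal duration always have equal length. Under the relaxed clock they generally do not, which is the pattern the relative rate test picks up.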
To summarise, the molecular clock turns genetic change into evolutionary time.
Not all clocks tick at the same rate, but this doesn’t prevent us from using the molecular clock to estimate divergence times.
The molecular clock has become an invaluable tool for tracing historic evolutionary events and for monitoring disease outbreaks.
We’ll explore more about the molecular clock in future posts!