We’ve previously discussed the sheer size of tree space.
As the number of taxa (or leaves) increases, the number of possible trees grows so large that evaluating every possible topology becomes computationally impossible.
In maximum liklihood phylogenetics the goal is to find the tree with the greatest likelihood. As we cannot evaluate all possible topologies, we instead need to explore smaller regions of tree space.
To tackle this, phylogenetic algorithms use search strategies to explore tree space in structured moves, without examining every possible tree.
One such method is Subtree Pruning and Regrafting (SPR).
If we randomly sampled trees without SPR, the search would be very inefficient.
This is because tree space has local neighbourhood structure. Small, local changes are informative as likelihood changes gradually across nearby trees, allowing you to identify which structural modifications improve the fit.
In other words, we gain information on which region of tree space is better or worse (something you would not get through random exploration of tree space).
Stepping stones
Imagine tree space as stepping stones in a river at night, your goal is to get across safely but there’s a catch.
You can see the first stone (your starting tree) but it is too dark to see the other stones.
You step off the riverbank and land on the first stone. From there you need to decide where to go next.
Luckily as you land, SPR lights up the possible stones around you, where each illuminated stone represents a neighbouring hypotheses you can reach.
Next you need to find a way of choosing which stone to hop onto next.
You have one tool to help you on your journey, your handy likelihood meter.
Using your meter, you evaluate each of the illuminated stones, measuring how well your data supports each possible move (i.e. how well that hypothesis fits the data).
You repeat this process as you make your way across the river.
Every time you land on a new stone, you re-evaluate your choice by checking the likelihood of the next move, gradually exploring different evolutionary hypotheses as you go.
SPR operations
SPR is a tree rearrangement operation, meaning we always need an initial starting tree before we can make rearrangements. This starting tree can be neighbour-joining (see this previous post for more on NJ) , a random tree or a parsimony tree.
During SPR, a subtree is pruned (removed) from the tree and then regrafted onto a different branch, creating a new topology. The likelihood is then recalculated for this new tree.

Each regrafting position represents a new evolutionary hypothesis that can be evaluated using the likelihood function. If the new arrangement provides a better fit to the data, the search can continue from this updated tree.
By repeatedly pruning and regrafting subtrees, phylogenetic algorithms can explore large regions of tree space while evaluating only a tiny fraction of all possible topologies.
SPR is also widely used in Bayesian analyses. In this setting, rather than searching for a single best tree, the algorithm proposes new topologies using SPR moves and samples trees according to their posterior probability, allowing measurement of uncertainty in evolutionary relationships.
We’ll explore bayesian phylogenetic methods some more in the next post.