# Akemi Kashiwada

Harvey Mudd College Mathematics 2005

Thesis Proposal: | Game Theoretic Indices for Phylogenetic Trees |
---|---|

Thesis Advisor: | Prof. Francis E. Su |

Second Reader: | Prof. Stephen Adolph |

E-Mail: | akashiwada@hmc.edu |

Senior Thesis: | Constructing Phylogenetic Trees from Subsplits |

Poster: | Poster presented at Harvey Mudd College 2005 Presentation Days |

Presentation Slides: | Presentation Slides |

## Constructing Phylogenetic Trees from Subsplits

Phylogenetic trees represent theoretical evolutionary relationships
among various species. Mathematically they can be described as
weighted binary trees and the leaves represent the taxa being
compared. Reconstruction of such trees from biological data is currently
one of the big questions in mathematical biology. If biologists are able
to separate the taxa into two distinct groups, then we call such a partition
a *split*. In 1971, Peter
Buneman published a paper showing that each split can be thought of
as an edge of some tree and proved that a labeled tree *T* is
uniquely defined by compatible splits. Since the leaves of phylogenetic
trees are taxa, splits reveal which taxa are more closely related;
those in the same subset share a more recent common ancestor than with the
taxa in the other subset. If biologists can only get partial splits, called
*subsplits* from their data, can a tree still be constructed?

This question was considered in a paper published in 2002 by Charles Semple and Mike Steel. They were able to provide criteria necessary to reconstruct a unique tree from partial split information. One of the conditions required that the subsplits be consistent with a tree. However, determining which trees satisfied this condition was not apparent in their paper. So my senior thesis question is how do we know when a set of subsplits consistent with a tree and what does that tree (or set of trees) look like.

I was able to develop a reconstruction algorithm that would allow us to
determine when a set of compatible *(n-1)*-leaf subsplits could be
realized by an *n*-leaf tree. The idea of the algorithm is to add the
missing leaf to each subsplit while maintaining the compatibility of the
splits. Thus using Buneman's result we can reconstruct the tree that is
consistent with our given set of subsplits. My main theorem proves that my
algorithm will halt after transforming all of the subsplits into splits
*if and only if* the subsplits are consistent with an *n*-leaf tree.