Tuesday, March 2, 2010

Please don't use Clustal for tree construction!

[Image: A phylogenetic tree of life, showing the ... (via Wikipedia)]

There are reams of books, articles, and websites about the correct way to build a phylogenetic tree. This post is not meant to argue about which method is best, but rather to point out that most people do not consider Clustal (e.g. ClustalX or ClustalW) to be an optimal solution in almost any circumstance. Countless times I have asked people how they built a particular tree, and they give me the vague "Clustal" answer. Of course this answer is fine if it is the first tree you have ever constructed, but beware: you will be labelled a phylogenetic newbie.

Clustal is technically a multiple sequence alignment program, but it also includes methods for tree construction in the same interface. Most of these methods are not really considered "good" tree-building methods. If you do use Clustal, at least specify which tree-building method you used (e.g. "Clustal with neighbor joining"). Most people don't even use Clustal for multiple alignment anymore, because MUSCLE has been shown to be at least as accurate as Clustal and is much faster.
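To make "Clustal with neighbor joining" a bit more concrete: neighbor joining is a distance-based method, so it starts from a matrix of pairwise distances between the aligned sequences. Here is a minimal sketch of that first step only, assuming a made-up toy alignment and plain p-distances (the fraction of differing sites). The sequence names and function names are hypothetical, and Clustal's own distance calculation may differ in detail.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of aligned sites that differ, skipping columns with a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(alignment):
    """Symmetric pairwise distance matrix as a dict keyed by (name, name)."""
    names = sorted(alignment)
    dist = {(n, n): 0.0 for n in names}
    for a, b in combinations(names, 2):
        d = p_distance(alignment[a], alignment[b])
        dist[(a, b)] = dist[(b, a)] = d
    return dist


# Toy alignment, invented purely for illustration.
toy_alignment = {
    "seqA": "ACGTACGTAC",
    "seqB": "ACGTACGTTC",
    "seqC": "ACG-ACCTTC",
}

matrix = distance_matrix(toy_alignment)
for (a, b), d in sorted(matrix.items()):
    if a < b:
        print(f"{a} {b} {d:.3f}")
```

A distance-based tree builder only ever sees the numbers in this matrix, which is one reason it matters to say which tree-building method you actually ran.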

For tree construction, most people would agree that a Maximum Likelihood or Bayesian method is almost always a better solution; PhyML and MrBayes seem to be the most popular implementations of these methods, respectively. Advanced users might also want to look into BEAST.

I usually interact with most of these programs through a command-line interface, so I don't have extensive knowledge of the best graphical tools. However, I did come across "Robust Phylogenetic Analysis For The Non-Specialist" (Phylogeny.fr), which does a good job of allowing easy interaction between various methods for multiple sequence alignment, tree construction, and tree viewing.
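For those curious what the command-line route looks like, here is a rough sketch that only builds the command lines for an align-then-tree run, without executing anything. It assumes the MUSCLE 3.x syntax (`-in`/`-out`; MUSCLE 5 switched to `-align`/`-output`) and PhyML's `-i`, `-d`, `-m` and `-b` options. The file names, the GTR model and the 100 bootstrap replicates are placeholders I picked for illustration, and the FASTA-to-PHYLIP conversion that PhyML needs is left out.

```python
import shlex


def muscle_command(fasta_in, fasta_out):
    """Alignment step, assuming MUSCLE 3.x style options."""
    return ["muscle", "-in", fasta_in, "-out", fasta_out]


def phyml_command(phylip_in, datatype="nt", model="GTR", bootstraps=100):
    """Maximum likelihood tree step; PhyML expects a PHYLIP alignment."""
    return [
        "phyml",
        "-i", phylip_in,       # input alignment
        "-d", datatype,        # "nt" for nucleotides, "aa" for proteins
        "-m", model,           # substitution model (placeholder choice)
        "-b", str(bootstraps), # number of bootstrap replicates
    ]


# Hypothetical file names; converting aligned.fasta to aligned.phy is not shown.
steps = [
    muscle_command("sequences.fasta", "aligned.fasta"),
    phyml_command("aligned.phy"),
]

for step in steps:
    print(shlex.join(step))
```

To actually run a step, pass the list to `subprocess.run(step, check=True)` once the programs are installed, and check the options against the version you have.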

Whatever you use to build trees, just make sure it isn't Clustal!