PROTEIN PHYLOGENETICS


Exercise 1. Calculate a distance matrix with PROTDIST

Execute the program PROTDIST by typing:
protdist

You will be asked for the name of a file; type:
inf6.760

This allows protdist to read the sequence alignment. (Note that if a file called infile exists, all PHYLIP programs read it automatically, whether or not it contains the appropriate data.)

You will see a menu of options; type
p

to change the distance model to be used. This will select the Kimura formula, which approximates the PAM matrix and has the advantage of being much faster than the PAM estimates. See the PHYLIP WWW page for more information on this.

The distances are written to a file called outfile (note that _all_ PHYLIP programs will output _all_ results to a file called "outfile" which will overwrite any existing "outfile").

You can have a look at the distances by typing:
more outfile

and scroll through the results by pressing either Enter or the space bar. To allow later inspection of these distances, save the outfile under a different name using either the UNIX command "mv" or the command "cp":
mv  This command means 'move'
cp  This command means 'copy'

Here are two examples of their usage:
mv outfile newfile  This command renames 'outfile' to 'newfile', so 'outfile' no longer exists
cp outfile newfile  This command creates 'newfile' as a copy and keeps 'outfile'
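
The difference between the two commands can be checked safely in a scratch directory. The following is a minimal sketch: the file names protdist_copy.txt and protdist_moved.txt are made up for illustration, and the 'outfile' used here is a placeholder, not real PROTDIST output.

```shell
# Work in a scratch directory so no real PHYLIP output is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for a PHYLIP result file (placeholder text, not real distances).
printf 'placeholder distances\n' > outfile

# cp keeps the original: both 'outfile' and the copy exist afterwards.
cp outfile protdist_copy.txt
ls

# mv renames the file: 'outfile' is gone afterwards.
mv outfile protdist_moved.txt
ls
```

After the last command, only the two renamed files remain, so a later PHYLIP run in that directory would have no 'outfile' to overwrite.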

To estimate a tree from these distances, first copy the outfile to a file called infile, so that FITCH reads the distances automatically. Then execute the FITCH program by typing:
fitch

As above you will have several options. Typically the user selects:

(1)  An outgroup, type
o

and select taxon number 6 (note that all trees inferred by FITCH are formally unrooted).

(2)  The jumble option, to perform the tree search with a random addition order of taxa, type
j

and enter an odd number as the random number seed, say 67, then enter the number of jumbles to be performed, say 10.

Type
y
to start the analysis.
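
The menu answers above can also be saved in a file and fed to the program through input redirection, which makes a run easy to repeat. This is only a sketch under assumptions: the file name fitch.keys is hypothetical, the order of prompts may differ between PHYLIP versions, and some versions ask an extra question when an 'outfile' or 'treefile' already exists, which would shift the answers.

```shell
# Hypothetical file name for the saved menu answers.
keys=fitch.keys

# One answer per line, in the order described above:
# o = outgroup, 6 = taxon number, j = jumble, 67 = seed, 10 = jumbles, y = start
printf '%s\n' o 6 j 67 10 y > "$keys"

# Run FITCH only if it is installed and an infile is present
# (assumption: no stale outfile/treefile triggers an extra prompt).
if command -v fitch >/dev/null 2>&1 && [ -f infile ]; then
    fitch < "$keys"
else
    echo "fitch or infile not found; answers saved in $keys"
fi
```

Checking the menu interactively once before relying on the saved answers is advisable, since a single unexpected prompt changes the meaning of every later line.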

The results of a PHYLIP tree search are found in two files. The outfile is a text file containing a diagram of the tree topology, the number of trees searched and the branch lengths. The treefile contains the tree topology and the branch lengths (if a distance method is used) in Newick (New Hampshire) format, which allows the trees to be viewed and manipulated in several programs, such as:

TREEVIEW http://evolution.genetics.washington.edu/phylip/software.etc2.html#TreeView
NJPLOT http://evolution.genetics.washington.edu/phylip/software.etc2.html#NJplot
TREETOOL
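
To give an idea of what the Newick format looks like, here is a small sketch that lists the taxon labels found in a Newick string. The string is hypothetical: its topology and branch lengths are invented for illustration and are not the result of this exercise. Nested parentheses group taxa, and the number after each colon is a branch length.

```shell
# Hypothetical Newick string (invented topology and branch lengths).
newick='(Tri.vag:0.1,(Hom.sap:0.2,Sac.cer:0.3):0.05,Ara.tha:0.4);'

# Split on the structural characters, drop the branch lengths,
# and keep only the non-empty taxon labels.
taxa=$(printf '%s\n' "$newick" | tr '(),;' '\n\n\n\n' | sed 's/:.*//' | grep -v '^$')

# Count the labels that were found.
count=$(printf '%s\n' "$taxa" | wc -l | tr -d ' ')

echo "$taxa"
echo "number of taxa: $count"
```

The same pipeline can be pointed at a real treefile by replacing the printf with "cat treefile", assuming the labels contain none of the characters used for splitting.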

For a quick look at the result type:
more outfile

What is the phylogenetic position of the microsporidian Vairimorpha necatrix among the other eukaryotes?

Abbreviations: Tri.vag: Trichomonas vaginalis, Ara.tha: Arabidopsis thaliana, Hom.sap: Homo sapiens, Sac.cer: Saccharomyces cerevisiae, Pla.fal: Plasmodium falciparum.
