version 3.5c
MIX - Mixed method parsimony
(c) Copyright 1986-1993 by Joseph Felsenstein and by the University of
Washington. Written by Joseph Felsenstein. Permission is granted to copy this
document provided that no fee is charged for it and that this copyright notice
is not removed.
MIX is a general parsimony program which carries out the Wagner and
Camin-Sokal parsimony methods in mixture, where each character can have its
method specified separately. The program defaults to carrying out Wagner
parsimony.
The Camin-Sokal parsimony method explains the data by assuming that
changes 0 --> 1 are allowed but not changes 1 --> 0. Wagner parsimony allows
both kinds of changes. (This under the assumption that 0 is the ancestral
state, though the program allows reassignment of the ancestral state, in which
case we must reverse the state numbers 0 and 1 throughout this discussion).
The criterion is to find the tree which requires the minimum number of changes.
The Camin-Sokal method is due to Camin and Sokal (1965) and the Wagner method
to Eck and Dayhoff (1966) and to Kluge and Farris (1969).
Here are the assumptions of these two methods:
1. Ancestral states are known (Camin-Sokal) or unknown (Wagner).
2. Different characters evolve independently.
3. Different lineages evolve independently.
4. Changes 0 --> 1 are much more probable than changes 1 --> 0 (Camin-
Sokal) or equally probable (Wagner).
5. Both of these kinds of changes are a priori improbable over the
evolutionary time spans involved in the differentiation of the group in
question.
6. Other kinds of evolutionary event such as retention of polymorphism are
far less probable than 0 --> 1 changes.
7. Rates of evolution in different lineages are sufficiently low that two
changes in a long segment of the tree are far less probable than one change in
a short segment.
That these are the assumptions of parsimony methods has been documented in
a series of papers of mine: (1973a, 1978b, 1979, 1981b, 1983b, 1988b). For an
opposing view arguing that the parsimony methods make no substantive
assumptions such as these, see the papers by Farris (1983) and Sober (1983a,
1983b), but also read the exchange between Felsenstein and Sober (1986).
INPUT FORMAT
The input for MIX is the standard input for discrete characters programs,
described above in the documentation file for the discrete-characters programs.
States "?", "P", and "B" are allowed.
The options are selected using a menu:
Mixed parsimony algorithm, version 3.5c
Settings for this run:
U Search for best tree? Yes
X Use Mixed method? No
P Parsimony method? Wagner
J Randomize input order of species? No. Use input order
O Outgroup root? No, use as outgroup species 1
T Use Threshold parsimony? No, use ordinary parsimony
A Use ancestral states in input file? No
M Analyze multiple data sets? No
0 Terminal type (IBM PC, VT52, ANSI)? ANSI
1 Print out the data at start of run No
2 Print indications of progress of run Yes
3 Print out tree Yes
4 Print out steps in each character No
5 Print states at all nodes of tree No
6 Write out trees onto tree file? Yes
Are these settings correct? (type Y or the letter for one to change)
The options U, J, O, T, A, and M are the usual User Tree, Jumble,
Outgroup, Ancestral States, and Multiple Data Sets options, described either in
the main documentation file or in the Discrete Characters Programs
documentation file. The user-defined trees supplied if you use the U option
must be given as rooted trees with two-way splits (bifurcations). The O option
is acted upon only if the final tree is unrooted and is not a user-defined
tree. One of the important uses of the the O option is to root the tree so
that if there are any characters in which the ancestral states have not been
specified, the program will print out a table showing which ancestral states
require the fewest steps. Note that when any of the characters has Camin-Sokal
parsimony assumed for it, the tree is rooted and the O option will have no
effect.
The option P toggles between the Camin-Sokal parsimony criterion and the
default Wagner parsimony criterion. Option X invokes mixed-method parsimony.
If the A option is invoked, the ancestor is not to be counted as one of the
species.
In the input file the W (Weights) option is available, as usual. The A
(Ancestral states) and X (Mixed methods) also require the options to be
declared on the first line of the input file and other information to be
present in the input file. The Ancestral States option also requires that you
choose the A option in the input menu. Note that the X option requires that
the option M (Mixture) be declared and the mixture information provided in the
input file. This is admittedly confusing. The F (Factors) option is not
available in this program, as it would have no effect on the result even if
that information were provided in the input file.
OUTPUT FORMAT
Output is standard: a list of equally parsimonious trees, which will be
printed as rooted or unrooted depending on which is appropriate, and, if the
user chooses, a table of the number of changes of state required in each
character. If the Wagner option is in force for a character, it may not be
possible to unambiguously locate the places on the tree where the changes
occur, as there may be multiple possibilities. If the user selects menu option
5, a table is printed out after each tree, showing for each branch whether
there are known to be changes in the branch, and what the states are inferred
to have been at the top end of the branch. If the inferred state is a "?"
there will be multiple equally-parsimonious assignments of states; the user
must work these out for themselves by hand.
If the Camin-Sokal parsimony method is invoked and the Ancestors option is
also used, then the program will infer, for any character whose ancestral state
is unknown ("?") whether the ancestral state 0 or 1 will give the fewest state
changes. If these are tied, then it may not be possible for the program to
infer the state in the internal nodes, and these will all be printed as ".".
If this has happened and you want to know more about the states at the internal
nodes, you will find helpful to use MOVE to display the tree and examine its
interior states, as the algorithm in MOVE shows all that can be known in this
case about the interior states, including where there is and is not amibiguity.
The algorithm in MIX gives up more easily on displaying these states.
If the A option is not used, then the program will assume 0 as the
ancestral state for those characters following the Camin-Sokal method, and will
assume that the ancestral state is unknown for those characters following
Wagner parsimony. If any characters have unknown ancestral states, and if the
resulting tree is rooted (even by outgroup), a table will also be printed out
showing the best guesses of which are the ancestral states in each character.
You will find it useful to understand the difference between the Camin-Sokal
parsimony criterion with unknown ancestral state and the Wagner parsimony
criterion.
If the U (User Tree) option is used and more than one tree is supplied,
the program also performs a statistical test of each of these trees against the
best tree. This test, which is a version of the test proposed by Alan
Templeton (1983) and evaluated in a test case by me (1985a). It is closely
parallel to a test using log likelihood differences invented by Kishino and
Hasegawa (1989), and uses the mean and variance of step differences between
trees, taken across characters. If the mean is more than 1.96 standard
deviations different then the trees are declared significantly different. The
program prints out a table of the steps for each tree, the differences of each
from the highest one, the variance of that quantity as determined by the step
differences at individual sites, and a conclusion as to whether that tree is or
is not significantly worse than the best one. It is important to understand
that the test assumes that all the binary characters are evolving
independently, which is unlikely to be true for many suites of morphological
characters.
At the beginning of the program are a series of constants, which can be
changed to help adapt the program to different computer systems. The constant
"nmlngth" is the length of the species names. "maxtrees" is the maximum number
of trees which the program will store for output. "maxuser" is the maximum
number of user trees which can be input to the program. You may want to change
these constants to reduce memory requirements.
The program is descended from earlier programs SOKAL and WAGNER which have
since been removed from the PHYLIP package, since MIX has all their capabilites
and more.
-------------------------------TEST DATA SET----------------------------
5 6
Alpha 110110
Beta 110000
Gamma 100110
Delta 001001
Epsilon 001110
---------- TEST SET OUTPUT (with all numerical options on) -------------
Mixed parsimony algorithm, version 3.5c
Wagner parsimony method
Name Characters
---- ----------
Alpha 11011 0
Beta 11000 0
Gamma 10011 0
Delta 00100 1
Epsilon 00111 0
4 trees in all found
+--Epsilon
+-----4
! +--Gamma
+--2
! ! +--Delta
--1 +-----3
! +--Beta
!
+-----------Alpha
remember: this is an unrooted tree!
requires a total of 9.000
steps in each character:
0 1 2 3 4 5 6 7 8 9
*-----------------------------------------
0! 2 2 2 1 1 1
From To Any Steps? State at upper node
( . means same as in the node below it on tree)
1 1?011 0
1 2 no .?... .
2 4 maybe .0... .
4 Epsilon yes 0.1.. .
4 Gamma no ..... .
2 3 yes .?.00 .
3 Delta yes 001.. 1
3 Beta maybe .1... .
1 Alpha maybe .1... .
+--------Gamma
!
+--2 +--Epsilon
! ! +--4
! +--3 +--Delta
--1 !
! +-----Beta
!
+-----------Alpha
remember: this is an unrooted tree!
requires a total of 9.000
steps in each character:
0 1 2 3 4 5 6 7 8 9
*-----------------------------------------
0! 1 2 1 2 2 1
From To Any Steps? State at upper node
( . means same as in the node below it on tree)
1 1?011 0
1 2 no .?... .
2 Gamma maybe .0... .
2 3 maybe .?.?? .
3 4 yes 001?? .
4 Epsilon maybe ...11 .
4 Delta yes ...00 1
3 Beta maybe .1.00 .
1 Alpha maybe .1... .
+--------Epsilon
+--4
! ! +-----Gamma
! +--2
--1 ! +--Delta
! +--3
! +--Beta
!
+-----------Alpha
remember: this is an unrooted tree!
requires a total of 9.000
steps in each character:
0 1 2 3 4 5 6 7 8 9
*-----------------------------------------
0! 2 2 2 1 1 1
From To Any Steps? State at upper node
( . means same as in the node below it on tree)
1 1?011 0
1 4 maybe .0... .
4 Epsilon yes 0.1.. .
4 2 no ..... .
2 Gamma no ..... .
2 3 yes ...00 .
3 Delta yes 0.1.. 1
3 Beta yes .1... .
1 Alpha maybe .1... .
+--------Gamma
+--2
! ! +-----Epsilon
! +--4
--1 ! +--Delta
! +--3
! +--Beta
!
+-----------Alpha
remember: this is an unrooted tree!
requires a total of 9.000
steps in each character:
0 1 2 3 4 5 6 7 8 9
*-----------------------------------------
0! 2 2 2 1 1 1
From To Any Steps? State at upper node
( . means same as in the node below it on tree)
1 1?011 0
1 2 maybe .0... .
2 Gamma no ..... .
2 4 maybe ?.?.. .
4 Epsilon maybe 0.1.. .
4 3 yes ?.?00 .
3 Delta yes 0.1.. 1
3 Beta yes 110.. .
1 Alpha maybe .1... .