COMPTABLE

Table of Contents
FUNCTION
DESCRIPTION
EXAMPLE
OUTPUT
INPUT FILES
RELATED PROGRAMS
CONSIDERATIONS
COMMAND-LINE SUMMARY
LOCAL DATA FILES
OPTIONAL PARAMETERS

FUNCTION

CompTable creates a scoring matrix using equivalences defined in a simplification scheme such as the one used for Simplify. (See Chapter 4, Using Data Files, in the User's Guide for more information.)

DESCRIPTION

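Scientists comparing protein sequences sometimes want to consider similar amino acids as equivalent. Sequence simplification can be done either by changing the symbols in the sequences being compared (see Simplify) or, for programs that use scoring matrices, by creating a table that scores matches between the symbols you consider to be equivalent. CompTable takes the second approach: it writes a scoring matrix in which symbols from the same equivalence group receive the comparison match value and all other pairs receive the comparison mismatch value.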
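
As a rough illustration of the table-based approach, the following Python sketch (not part of the Wisconsin Package) scores two symbols as a match whenever they belong to the same equivalence group. The function name build_matrix and the rule that an ungrouped symbol matches only itself are assumptions made for illustration; CompTable's own handling of such symbols may differ.

```python
from string import ascii_uppercase


def build_matrix(groups, match=10, mismatch=-2, symbols=ascii_uppercase):
    """Return {(a, b): score} for every pair of symbols.

    Hypothetical helper: a pair scores `match` when both symbols share
    an equivalence group, otherwise `mismatch`.
    """
    # Map each symbol to the index of the group that contains it.
    group_of = {}
    for index, members in enumerate(groups):
        for symbol in members:
            group_of[symbol] = index

    matrix = {}
    for a in symbols:
        for b in symbols:
            same_group = (
                a in group_of and b in group_of and group_of[a] == group_of[b]
            )
            # Assumption: an ungrouped symbol still matches itself.
            matrix[a, b] = match if (a == b or same_group) else mismatch
    return matrix


# Equivalence groups taken from the standard simplification file.
groups = ["PAGST", "QNEDBZ", "HKR", "LIVM", "FYW", "C"]
scores = build_matrix(groups, match=10, mismatch=0)
print(scores["A", "G"], scores["A", "B"], scores["B", "D"])
```

With a match value of 10 and a mismatch value of 0, A and G (both in the PAGST group) score 10, while A and B score 0.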

EXAMPLE

Here is a session using CompTable to make a scoring matrix with the standard simplification file used by Simplify (you can use Fetch to make a copy of simplify.txt and modify it to create the input file for CompTable):


% comptable

 COMPTABLE from what simplification file ?  simplify.txt

 What is the comparison match value (* 10 *) ?

 What is the comparison mismatch value (* -2 *) ?  0

 Are you creating a protein scoring matrix (* Yes *) ?

 What should I call the output file (* simplify.cmp *) ?

%

OUTPUT

Here is part of the output scoring matrix file:


!!AA_SCORING_MATRIX_RECT 1.0
 COMPTABLE of: simplify.txt  FileCheck: 327

A standard simplification used by SIMPLIFY and WORDSEARCH to simplify
peptide sequences.  The first line below means "for all of the P, A, G,
S, or T characters in the sequence, substitute A." The program COMPTABLE
can construct a symbol comparison table with the equivalences from this
file.

                     October 18, 1996 12:19   ..

{
GAP_CREATE 20
GAP_EXTEND 1
}

      A    B    C    D    E     F    G    H    I    J     K    L  ...  ..
A    10    0    0    0    0     0   10    0    0    0     0    0  ...
B     0   10    0   10   10     0    0    0    0    0     0    0  ...
C     0    0   10    0    0     0    0    0    0    0     0    0  ...
D     0   10    0   10   10     0    0    0    0    0     0    0  ...
E     0   10    0   10   10     0    0    0    0    0     0    0  ...

See Appendix VII for more information about scoring matrices.

INPUT FILES

CompTable accepts a simplification table file as input. Here is the input file for the example above:


!!SIMPLIFY 1.0
A standard simplification used by SIMPLIFY and WORDSEARCH to simplify
peptide sequences.  The first line below means "for all of the P, A, G,
S, or T characters in the sequence, substitute A." The program COMPTABLE
can construct a symbol comparison table with the equivalences from this
file.

10/7/84 ..

A PAGST
D QNEDBZ
H HKR
I LIVM
F FYW
C C
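
To make the layout of this file concrete, here is a minimal Python sketch of a reader for it. It assumes that everything up to the first line ending in ".." is descriptive text and that each later line holds a replacement symbol followed by the symbols it stands for; the function name parse_simplification and the shortened sample text are hypothetical and are not part of the Wisconsin Package.

```python
SAMPLE = """!!SIMPLIFY 1.0
A standard simplification used by SIMPLIFY and WORDSEARCH to simplify
peptide sequences.

10/7/84 ..

A PAGST
D QNEDBZ
H HKR
I LIVM
F FYW
C C
"""


def parse_simplification(text):
    """Return {replacement_symbol: member_symbols} from a simplification file.

    Assumption: the first line ending in ".." separates the descriptive
    heading from the equivalence lines.
    """
    lines = text.splitlines()
    # Locate the divider; the equivalences start on the line after it.
    start = next(
        i for i, line in enumerate(lines) if line.rstrip().endswith("..")
    ) + 1

    table = {}
    for line in lines[start:]:
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            replacement, members = parts
            table[replacement] = members
    return table


table = parse_simplification(SAMPLE)
print(table["A"])  # members of the first group
```

Each value in the resulting dictionary is one equivalence group, for example the symbols P, A, G, S, and T under the replacement symbol A.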

RELATED PROGRAMS

Simplify rewrites the symbols in a sequence file according to the equivalences in a simplification table.

CONSIDERATIONS

CompTable writes default gap creation and gap extension penalties in the auxiliary data block of the output scoring matrix file. It calculates values that are appropriate for the type of scoring matrix you are creating (protein or nucleotide) and for the comparison match and mismatch values that you specify. If you don't want to accept the default values, you can use -GAPweight and -LENgthweight to specify alternative gap penalties.

COMMAND-LINE SUMMARY

Complete command-line control is not available for this program.

LOCAL DATA FILES

None.

OPTIONAL PARAMETERS

The parameters listed below can be set from the command line. For more information, see "Using Program Parameters" in Chapter 3, Using Programs, in the User's Guide.

-GAPweight

specifies the default gap creation penalty associated with the scoring matrix. This penalty is written in the auxiliary data block in the output scoring matrix file. If you don't specify a default gap creation penalty with -GAPweight, the program calculates a reasonable default and writes it in the auxiliary data block. (See Appendix VII for information about the auxiliary data block in scoring matrix files.)

-LENgthweight

specifies the default gap extension penalty associated with the scoring matrix. This penalty is written in the auxiliary data block in the output scoring matrix file. If you don't specify a default gap extension penalty with -LENgthweight, the program calculates a reasonable default and writes it in the auxiliary data block. (See Appendix VII for information about the auxiliary data block in scoring matrix files.)

Printed: November 18, 1996 13:08 (1162)

Documentation Comments: doc-comments@gcg.com
Technical Support: help@gcg.com

Copyright (c) 1982, 1983, 1985, 1986, 1987, 1989, 1991, 1994, 1995, 1996, 1997 Genetics Computer Group, Inc. a wholly owned subsidiary of Oxford Molecular Group, Inc. All rights reserved.

Licenses and Trademarks Wisconsin Package is a trademark of Genetics Computer Group, Inc. GCG and the GCG logo are registered trademarks of Genetics Computer Group, Inc.

All other product names mentioned in this documentation may be trademarks, and if so, are trademarks or registered trademarks of their respective holders and are used in this documentation for identification purposes only.

Genetics Computer Group

www.gcg.com