TOSTADEN

Table of Contents
FUNCTION
DESCRIPTION
EXAMPLE
OUTPUT
INPUT FILES
RELATED PROGRAMS
CONSIDERATIONS
COMMAND-LINE SUMMARY
LOCAL DATA FILES
OPTIONAL PARAMETERS

FUNCTION

ToStaden writes a GCG sequence into a file in Staden format. If the file contains a nucleotide sequence, the ambiguity codes are converted as shown in Appendix III of the Program Manual.

DESCRIPTION

Any sequence file in GCG format can be converted with ToStaden into a format suitable for use in the Staden programs. GCG sequence symbols that aren't recognized by the Staden programs are converted to hyphens (-).

EXAMPLE

Here is a session using ToStaden to convert the sequence file test.seq into a Staden-format file:


% tostaden

 TOSTADEN of what GCG sequence ?  test.seq

                  Begin (* 1 *) ?
                End (*   389 *) ?

 What should I call the output file (* test.sdn *) ?

%

OUTPUT

Here is the output file test.sdn:


GCTGCCGCAGCGGCNGATGACAATAACRAYTGTTGCTGYGATGACGAYGA
AGAGGARTTTTTCTTYGGTGGCGGAGGGGGNCATCACCAYATTATCATAA
THAAAAAGAARTTGTTACTTCTCCTACTGTTRCTNYTAYTGYTRYTNATG
AATAACAAYCCTCCCCCACCGCCNCAACAGCARCGTCGCCGACGGCGGAG
AAGGCGNAGRMGAMGGMGRMGNTCTTCCTCATCGAGTAGCTCNAGYWSNA
CTACCACAACGACNGTTGTCGTAGTGGTNTGGNNNTATTACTAYGAAGAG
CAACAGSARTAATAGTGATARTRATRRABCD--GH--K-MN---RST-VW
NY------abcd--gh--k-mn---rst-vwny------
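
The symbol handling visible in this output can be sketched in Python. The mapping below is inferred only from the nucleotide example above: X becomes N, the symbols A, B, C, D, G, H, K, M, N, R, S, T, V, W, and Y pass through with their case preserved, every other symbol becomes a hyphen, and the result is written 50 symbols per line. It is an assumption for illustration, not the table from Appendix III, and the names `to_staden_symbol` and `to_staden_lines` are hypothetical.

```python
# Symbols that pass through unchanged in the example output (assumption:
# inferred from the example only, not taken from Appendix III).
KEPT_SYMBOLS = set("ABCDGHKMNRSTVWY")


def to_staden_symbol(symbol):
    """Convert one GCG nucleotide symbol the way the example output suggests."""
    upper = symbol.upper()
    if upper == "X":
        # X appears as N (x as n) in the example output.
        return "N" if symbol.isupper() else "n"
    if upper in KEPT_SYMBOLS:
        return symbol
    # Anything else is treated as unrecognized and replaced by a hyphen.
    return "-"


def to_staden_lines(sequence, width=50):
    """Convert a sequence string and split it into lines of `width` symbols."""
    converted = "".join(to_staden_symbol(s) for s in sequence)
    return [converted[i:i + width] for i in range(0, len(converted), width)]


if __name__ == "__main__":
    for line in to_staden_lines("ABCDEFGHIJKLMNOPQRSTUVWXYZ.~@&*"):
        print(line)
```

The sketch takes a bare sequence string; stripping the heading, numbering, and comments from a GCG file is not shown.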

INPUT FILES

ToStaden accepts a single nucleotide or protein sequence as input. Here is the input file for the example above:


!!NA_SEQUENCE 1.0
This sequence contains every symbol in the alphabet of
legitimate GCG sequence characters (Appendix III).

 Test.Seq  Length: 389  January 7, 1996 21:57  Type: N  Check: 3365  ..

       1
          >starts with the codons from appendix iii>
          GCTGCCGCAG CGGCXGATGA CAATAACRAY TGTTGCTGYG ATGACGAYGA

      51  AGAGGARTTT TTCTTYGGTG GCGGAGGGGG XCATCACCAY ATTATCATAA

     101  THAAAAAGAA RTTGTTACTT CTCCTACTGT TRCTXYTAYT GYTRYTXATG

     151  AATAACAAYC CTCCCCCACC GCCXCAACAG CARCGTCGCC GACGGCGGAG

     201  AAGGCGXAGR MGAMGGMGRM GXTCTTCCTC ATCGAGTAGC TCXAGYWSXA

     251  CTACCACAAC GACXGTTGTC GTAGTGGTXT GGXXXTATTA CTAYGAAGAG

     301  CAACAGSART AATAGTGATA RTRATRR
                                       >continues with all
          uppercase sequence characters>
                                       ABC DEFGHIJKLM NOPQRSTUVW

     351  XYZ.~@&*ab cdefghijkl mnopqrstuv wxyz*@&~.

                                                    <ends with
          all lowercase sequence characters<

ToStaden behaves differently depending on whether the input sequence is protein or nucleotide. GCG programs determine the type of a sequence from the presence of either Type: N or Type: P on the last line of the text heading just above the sequence. If your sequence is not marked with the correct type, turn to Appendix VI for information on how to change or set the type of a sequence.
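
As a rough illustration of that check, the following Python sketch reads the type from a GCG sequence file's heading. It assumes that the last heading line is the first line ending in two periods (as in the example input file above); the function name `gcg_sequence_type` is hypothetical and is not part of the GCG package.

```python
import re


def gcg_sequence_type(text):
    """Return 'N', 'P', or None for the text of a GCG sequence file.

    Assumption: the last line of the heading is the first line that
    ends with two periods, and it carries a 'Type: N' or 'Type: P' field.
    """
    for line in text.splitlines():
        if line.rstrip().endswith(".."):
            match = re.search(r"\bType:\s*([NP])\b", line)
            return match.group(1) if match else None
    return None


if __name__ == "__main__":
    heading = (
        "!!NA_SEQUENCE 1.0\n"
        "\n"
        " Test.Seq  Length: 389  January 7, 1996 21:57  Type: N  Check: 3365  ..\n"
    )
    print(gcg_sequence_type(heading))
```

A return value of None means the sketch found either no heading line or a heading line with no type field.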

RELATED PROGRAMS

The following programs convert sequences between other formats and GCG format: FromEMBL, FromGenBank, FromIG, FromPIR, FromStaden, FromFastA, ToIG, ToPIR, ToStaden and ToFastA.

DataSet creates a GCG data library from any set of sequences in GCG format. GCGToBLAST creates a database that can be searched by the BLAST program from any set of sequences in GCG format.

CONSIDERATIONS

All documentation and numbering are lost in the Staden-format output file. Make sure that the Staden program you intend to use is compatible with any ambiguity codes used in your sequence.

COMMAND-LINE SUMMARY

All parameters for this program may be put on the command line. Use the parameter -CHEck to see the summary below and to have a chance to add parameters to the command line before the program executes. In the summary below, the capitalized letters in the parameter names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, "Using Programs," of the User's Guide.


Minimal Syntax: % tostaden [-INfile1=]test.seq -Default

Prompted Parameters:

-BEGin=1 -END=389         range of interest
[-OUTfile1=]test.sdn      output file name

Local Data Files: None

Optional Parameters: None
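
For scripted use, the summary above can be turned into an argument list. The Python sketch below only assembles the command line from the parameter names shown in the summary; the helper name `tostaden_command` is hypothetical, and it assumes that a `tostaden` executable from a GCG installation is on the search path and that -Default accepts the default answer for any prompt not covered on the command line.

```python
def tostaden_command(infile, outfile, begin=None, end=None):
    """Build an argument list for a non-interactive ToStaden run.

    Parameter names are copied from the command-line summary; whether
    the program is installed is not checked here.
    """
    args = ["tostaden", f"-INfile1={infile}"]
    if begin is not None:
        args.append(f"-BEGin={begin}")
    if end is not None:
        args.append(f"-END={end}")
    args.append(f"-OUTfile1={outfile}")
    # Assumption: -Default suppresses the remaining prompts.
    args.append("-Default")
    return args


if __name__ == "__main__":
    print(" ".join(tostaden_command("test.seq", "test.sdn", begin=1, end=389)))
```

The resulting list could be handed to `subprocess.run(..., check=True)` on a system where the GCG programs are available.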

LOCAL DATA FILES

None.

OPTIONAL PARAMETERS

None.

Printed: November 18, 1996 13:07 (1162)


Documentation Comments: doc-comments@gcg.com
Technical Support: help@gcg.com

Copyright (c) 1982, 1983, 1985, 1986, 1987, 1989, 1991, 1994, 1995, 1996, 1997 Genetics Computer Group, Inc. a wholly owned subsidiary of Oxford Molecular Group, Inc. All rights reserved.

Licenses and Trademarks: Wisconsin Package is a trademark of Genetics Computer Group, Inc. GCG and the GCG logo are registered trademarks of Genetics Computer Group, Inc.

All other product names mentioned in this documentation may be trademarks, and if so, are trademarks or registered trademarks of their respective holders and are used in this documentation for identification purposes only.

Genetics Computer Group

www.gcg.com