TRANSLATE

Table of Contents

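FUNCTION

DESCRIPTION

EXAMPLE

OUTPUT

INPUT FILES

RESTRICTIONS

CONSIDERATIONS

COMMAND-LINE SUMMARY

LOCAL DATA FILES

OPTIONAL PARAMETERS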

FUNCTION

Translate translates nucleotide sequences into peptide sequences.

DESCRIPTION

Translate creates a peptide sequence by translating nucleic acid sequences that you specify. In addition to translating a single range of a given nucleotide sequence, it can concatenate exons into a single assembly for translation. The exons can be of any length, can come from either strand, and can come from more than one sequence file. Unlike most Wisconsin Package™ programs, Translate lets you specify ranges that extend across the end and into the beginning of a sequence. The terminal bell rings when a circular range is chosen.
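The assembly step described above can be pictured with a short Python sketch. This is not Translate's own code. The function names extract_range and assemble are hypothetical, and the sketch assumes that a range whose beginning is greater than its end is the circular case, running off the end of the sequence and continuing from position 1.

```python
def extract_range(seq, begin, end):
    """Return the 1-based, inclusive range begin..end of seq.

    Assumption: begin > end means a circular range that wraps from the
    end of the sequence back to its beginning.
    """
    if not (1 <= begin <= len(seq) and 1 <= end <= len(seq)):
        raise ValueError("range lies outside the sequence")
    if begin <= end:
        return seq[begin - 1:end]
    return seq[begin - 1:] + seq[:end]


def assemble(exons):
    """Concatenate (sequence, begin, end) ranges in the order given."""
    return "".join(extract_range(seq, begin, end) for seq, begin, end in exons)


plasmid = "AAACCCGGGTTT"
print(extract_range(plasmid, 4, 6))    # CCC
print(extract_range(plasmid, 10, 3))   # TTTAAA (wraps past the end)
print(assemble([(plasmid, 1, 3), (plasmid, 10, 3)]))  # AAATTTAAA
```

Each exon tuple carries its own sequence, so ranges from more than one sequence file can be mixed in a single assembly.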

Translate can be run either interactively or noninteractively. When you specify a single sequence to translate and -Default is not on the command line, it works interactively, prompting you for each segment to translate. To run Translate noninteractively, either use -Default on the command line or supply a multiple file specification for the input file by means of a wild card file specification or a list file. (See the INPUT FILES topic below for more detailed information.)

Translate supports the IUB-IUPAC character set for the representation of nucleotide ambiguity. See Appendix III for a list of the IUB codes and their meanings.
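As an illustration of what ambiguity support can mean for translation, the Python sketch below expands each IUB-IUPAC symbol in a codon into the bases it stands for and reports an amino acid only when every expansion agrees. This is an assumption about one reasonable behavior, not a description of Translate's internals. The function name translate_codon and the use of X for unresolved codons are hypothetical choices. The table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUB-IUPAC nucleotide symbols and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate_codon(codon):
    """Translate one codon; return 'X' if the ambiguity changes the residue."""
    choices = [IUPAC[base] for base in codon.upper()]
    residues = {CODE["".join(c)] for c in product(*choices)}
    return residues.pop() if len(residues) == 1 else "X"


print(translate_codon("GGN"))  # G  (all four GGx codons encode glycine)
print(translate_codon("YTA"))  # L  (CTA and TTA both encode leucine)
print(translate_codon("RAT"))  # X  (AAT is Asn, GAT is Asp)
```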

EXAMPLE

Here is a session using Translate to translate the G-gamma gene in gamma.seq into the peptide sequence for the human fetal beta globin G gamma:


% translate

 TRANSLATE from what sequence ?  gamma.seq

                  Begin (* 1 *) ?  2179
                End (* 11375 *) ?  2270
               Reverse (* No *) ?

 Range begins ATGGG and ends GGAAG.  Is this correct (* Yes *) ?

 That is done, now would you like to:

   A) Add another exon from this sequence
   B) Add another exon from a new sequence

   C) Translate and then add more genes from this sequence
   D) Translate and then add more genes from a new sequence

   W) Translate assembly and write everything into a file

 Please choose one (* W *):  a

                  Begin (* 1 *) ?  2393
                End (* 11375 *) ?  2615
               Reverse (* No *) ?

 Range begins GCTCC and ends TCAAG.  Is this correct (* Yes *) ?

 That is done, now would you like to:

   A) Add another exon from this sequence
   B) Add another exon from a new sequence

   C) Translate and then add more genes from this sequence
   D) Translate and then add more genes from a new sequence

   W) Translate assembly and write everything into a file

 Please choose one (* W *):  a

                  Begin (* 1 *) ?  3502
                End (* 11375 *) ?  3630
               Reverse (* No *) ?

 Range begins CTCCT and ends ACTGA.  Is this correct (* Yes *) ?

 That is done, now would you like to:

   A) Add another exon from this sequence
   B) Add another exon from a new sequence

   C) Translate and then add more genes from this sequence
   D) Translate and then add more genes from a new sequence

   W) Translate assembly and write everything into a file

 Please choose one (* W *):

 What should I call the output file (* gamma.pep *) ?  ggamma.pep

%

OUTPUT

Here is the output file ggamma.pep:


!!AA_SEQUENCE 1.0
TRANSLATE of: gamma.seq check: 6474 from: 2179 to: 2270
      and of: gamma.seq check: 6474 from: 2393 to: 2615
      and of: gamma.seq check: 6474 from: 3502 to: 3630
generated symbols 1 to: 148.

Human fetal beta globins G and A gamma
from Shen, Slightom and Smithies,  Cell 26; 191-203.
Analyzed by Smithies et al. Cell 26; 345-353.

ggamma.pep  Length: 148  October 17, 1996 14:36  Type: P  Check: 6924  ..

       1  MGHFTEEDKA TITSLWGKVN VEDAGGETLG RLLVVYPWTQ RFFDSFGNLS

      51  SASAIMGNPK VKAHGKKVLT SLGDAIKHLD DLKGTFAQLS ELHCDKLHVD

     101  PENFKLLGNV LVTVLAIHFG KEFTPEVQAS WQKMVTGVAS ALSSRYH*

INPUT FILES

Translate accepts multiple (one or more) nucleotide sequences as input. You can specify multiple sequences in a number of ways: by using a list file, for example @project.list; by using an MSF or RSF file, for example project.msf{*}; or by using a sequence specification with an asterisk (*) wildcard, for example GenEMBL:*. If Translate rejects your nucleotide sequence, turn to Appendix VI to see how to change or set the type of a sequence.

Single Sequence Input

If you specify a single sequence on the command line or in response to the first program prompt, and -Default is not on the command line, Translate prompts you for the sequence range and strand. After reading that sequence fragment, the program prompts you to specify other sequence fragments either before or after translating this fragment. You can continue to choose single sequences as input until you decide to write out the entire translated sequence.

If -Default is on the command line, Translate translates the sequence without prompting you, in accordance with any command-line parameters that are present.

Multiple Sequence Input

When you specify multiple sequences, Translate runs noninteractively. By default, Translate will translate each sequence separately and write out each translation to a separate sequence file without prompting you for the range and strand of each sequence. If you use the -ONEPEPtide command-line parameter, all of the input sequences are assembled together first and then translated into a single peptide sequence.

If you use a list file to specify multiple sequences as input, you can add begin, end, and strand attributes for each sequence. You can use the join sequence attribute to selectively assemble some of the sequence entries in a list file together before translation. All sequences listed contiguously in the list file that share the same join attribute (i.e., share the same sequence name following the join token) are assembled together before translation, and the translated sequence is given the name of the join attribute. All other sequences in the list file are translated separately. Here is an example of an input list file, hsp70dna.list, for Translate.


!!SEQUENCE_LIST 1.0
Example list file of 70kD heat shock coding sequences used as input
for TRANSLATE
  ..
gb_in:tchsp70   Begin:  302 End: 2341
gb_pl:phhsp70g  Begin:  240 End:  453 Join: hsp70_petunia
gb_pl:phhsp70g  Begin: 1076 End: 2817 Join: hsp70_petunia
gb_pr:humhsp70  Begin:  228 End:  968

Using this file as input, Translate writes three output files. The first output file contains a translation of the first sequence entry in this list file. The second output file, hsp70_petunia.pep, is a translation of an assembly of the next two sequence entries. The last output file contains a translation of the last sequence entry in this list file. For more information about list files, see "Using List Files" in Chapter 2, Using Sequence Files and Databases in the User's Guide.
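The grouping rule for join attributes can be sketched in Python as follows. This is only an illustration, not the program's parser. The helper names parse_entries and group_for_translation are hypothetical, and the sketch assumes that entries begin after the line ending in ".." and that attributes appear as Name: value pairs, as in the example list file.

```python
import re

LIST_FILE = """\
!!SEQUENCE_LIST 1.0
Example list file of 70kD heat shock coding sequences used as input
for TRANSLATE
  ..
gb_in:tchsp70   Begin:  302 End: 2341
gb_pl:phhsp70g  Begin:  240 End:  453 Join: hsp70_petunia
gb_pl:phhsp70g  Begin: 1076 End: 2817 Join: hsp70_petunia
gb_pr:humhsp70  Begin:  228 End:  968
"""


def parse_entries(text):
    """Return one dict per sequence entry listed after the '..' divider."""
    entries, in_body = [], False
    for line in text.splitlines():
        if not in_body:
            in_body = line.strip().endswith("..")
            continue
        if not line.strip():
            continue
        name, _, rest = line.strip().partition(" ")
        attrs = {k.lower(): v for k, v in re.findall(r"(\w+):\s*(\S+)", rest)}
        entries.append({"name": name, **attrs})
    return entries


def group_for_translation(entries):
    """Merge contiguous entries that share a join name; keep the rest apart."""
    groups = []
    for entry in entries:
        join = entry.get("join")
        if join and groups and groups[-1]["join"] == join:
            groups[-1]["members"].append(entry)
        else:
            groups.append({"join": join, "members": [entry]})
    return groups


for group in group_for_translation(parse_entries(LIST_FILE)):
    ranges = [f"{m['name']} {m['begin']}-{m['end']}" for m in group["members"]]
    print(group["join"] or "(translated on its own)", ranges)
```

For the example list file this yields three groups, matching the three output files described above. The middle group holds both phhsp70g ranges under the name hsp70_petunia.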

ExtractPeptide can write one or more of the translation frames from the Map program output into peptide sequence files. The PepData program translates sequences in all six frames.

RESTRICTIONS

Unknown.

CONSIDERATIONS

Translate allows you to translate sequences where the reading frame is interrupted. This frame interruption commonly occurs across intervening sequences, as in the example above, where a single codon is divided by the first intervening sequence. To accommodate frame interruption, Translate allows you to specify ranges (exons) whose lengths are not exact multiples of three. Translate concatenates the nucleotide ranges (exons) that you define and translates the concatenated nucleotide sequence assembly (gene) only at the moment you choose a menu item that starts with the word Translate.
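The arithmetic behind the example session can be checked with a few lines of Python. The ranges are the three exons entered in the EXAMPLE section. Only the variable names are introduced here for illustration.

```python
# Exon ranges entered in the example session (1-based, inclusive).
exons = [(2179, 2270), (2393, 2615), (3502, 3630)]

lengths = [end - begin + 1 for begin, end in exons]
print(lengths)                   # [92, 223, 129]
print([n % 3 for n in lengths])  # [2, 1, 0]

total = sum(lengths)
print(total, total // 3)         # 444 148
```

The first exon leaves two bases of an unfinished codon, which the first base of the second exon completes. Only the assembly as a whole is a multiple of three, and its 148 codons match the 148 symbols reported in ggamma.pep (147 residues plus the terminating *).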

If you continue after translation, you are in effect building a new assembly (gene) and concatenating the peptide sequence from the new gene onto the peptide sequence you have already created.

COMMAND-LINE SUMMARY

All parameters for this program may be put on the command line. Use the parameter -CHEck to see the summary below and to have a chance to add things to the command line before the program executes. In the summary below, the capitalized letters in the parameter names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.


Minimal Syntax: % translate [-INfile=]@Hsp70DNA.List -Default

Prompted Parameters:

[-OUTfile=]hsp70.pep       output file name (single output sequence only)

Local Data Files:

-TRANSlate=translate.txt   contains the genetic code

Optional Parameters:

-BEGin=1 -END=100          range of interest for each sequence
-REVerse                   strand for each sequence
-ONEPEPtide                translate all concatenated DNA fragments into
                             a single peptide
-NOJOIN                    ignore all "join" sequence attributes
                             specified in a list file
-LIStfile[=translate.list] writes a list file of output sequence names
-EXTension=.pep            sets the file name extension for output
                             sequence files
-NOMONitor                 suppresses the screen monitor

LOCAL DATA FILES

The files described below supply auxiliary data to this program. The program automatically reads them from a public data directory unless you either 1) have a data file with exactly the same name in your current working directory; or 2) name a file on the command line with an expression like -DATa1=myfile.dat. For more information see Chapter 4, Using Data Files in the User's Guide.

The translation of codons to amino acids, the identification of potential start codons and stop codons, and the mappings of one-letter to three-letter amino acid codes are all defined in a translation table in the file translate.txt. If the standard genetic code does not apply to your sequence, you can provide a modified version of this file in your working directory or name an alternative file on the command line with an expression like -TRANSlate=mycode.txt. Translation tables are discussed in more detail in Appendix VII.
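To show why an alternative translation table matters, the Python sketch below translates the same DNA with the standard genetic code and with the vertebrate mitochondrial code. It does not read or reproduce the translate.txt format. The dictionaries and the translate function are hypothetical stand-ins used only for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Vertebrate mitochondrial code: four codons differ from the standard code.
VERTEBRATE_MITO = {**STANDARD, "TGA": "W", "ATA": "M", "AGA": "*", "AGG": "*"}


def translate(dna, table=STANDARD):
    """Translate complete codons from position 1; ignore a trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(table[dna[i:i + 3]] for i in range(0, usable, 3))


dna = "ATGATATGAAGA"
print(translate(dna))                   # MI*R
print(translate(dna, VERTEBRATE_MITO))  # MMW*
```

The same twelve bases give a premature stop under the standard code and a different peptide under the mitochondrial code, which is the situation the -TRANSlate parameter is meant to handle.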

OPTIONAL PARAMETERS

The parameters listed below can be set from the command line. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.

-BEGin=1

sets the beginning position for all input sequences. When the beginning position is set from the command line, Translate ignores beginning positions specified for individual sequences in a list file.

-END=100

sets the ending position for all input sequences. When the ending position is set from the command line, Translate ignores ending positions specified for sequences in a list file.

-REVerse

sets the program to use the reverse strand for each input sequence. When -REVerse or -NOREVerse is on the command line, Translate ignores any strand designation for individual sequences in a list file.
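The precedence described for -BEGin, -END, and -REVerse (a command-line value overrides the list-file attribute, which overrides the default) can be sketched in Python. The function names pick, reverse_complement, and fragment are hypothetical. The sketch also assumes that begin and end always refer to positions on the forward strand and that the range is reverse-complemented afterwards. The documentation above does not state this.

```python
# Complements for the IUB-IUPAC nucleotide symbols.
COMPLEMENT = str.maketrans("ACGTRYKMSWBDHVN", "TGCAYRMKSWVHDBN")


def reverse_complement(seq):
    """Return the reverse complement, honoring ambiguity symbols."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def pick(command_line, list_file, default):
    """Command-line value wins, then the list-file attribute, then the default."""
    if command_line is not None:
        return command_line
    if list_file is not None:
        return list_file
    return default


def fragment(seq, entry, options):
    """Cut the range to translate from seq using the resolved settings."""
    begin = pick(options.get("begin"), entry.get("begin"), 1)
    end = pick(options.get("end"), entry.get("end"), len(seq))
    reverse = pick(options.get("reverse"), entry.get("reverse"), False)
    piece = seq[begin - 1:end]  # assumption: forward-strand coordinates
    return reverse_complement(piece) if reverse else piece


seq = "ATGGCCTTAGGA"
entry = {"begin": 1, "end": 6, "reverse": True}   # list-file attributes
print(fragment(seq, entry, {}))                             # GGCCAT
print(fragment(seq, entry, {"begin": 4, "reverse": False})) # GCC
```

In the second call the command-line begin and strand replace the list-file values, while the end position still comes from the list file because nothing on the command line overrides it.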

-NOJOIN

sets Translate to ignore all join sequence attributes specified in the input list file. All nucleotide sequences specified in the list file are translated into separate output sequence files.

-ONEPEPtide

concatenates all input sequences together and then translates them all into a single peptide sequence.

-TRANSlate=filename.txt

Usually, translation is based on the translation table in a default or local data file called translate.txt. This parameter allows you to use a translation table in a different file. (See Appendix VII for information about translation tables.)

-LIStfile=translate.list

writes a list file with the names of the output sequence files. This list file is suitable for input to other Wisconsin Package programs that support list files (see Chapter 2, Using Sequence Files and Databases in the User's Guide). If you don't specify a file name, Translate creates one, using translate for the file name and .list for the file name extension.

-EXTension=.pep

This program normally creates output file names by using the original input file name for the base name and the program name for the name extension. Use this parameter to specify some other file name extension.

-MONitor

This program normally monitors its progress on your screen. However, when you use -Default to suppress all program interaction, you also suppress the monitor. You can turn it back on with this parameter. If you are running the program in batch, the monitor will appear in the log file.

Printed: November 18, 1996 13:08 (1162)

Documentation Comments: doc-comments@gcg.com
Technical Support: help@gcg.com

Licenses and Trademarks: Wisconsin Package is a trademark of Genetics Computer Group, Inc. GCG and the GCG logo are registered trademarks of Genetics Computer Group, Inc.

All other product names mentioned in this documentation may be trademarks, and if so, are trademarks or registered trademarks of their respective holders and are used in this documentation for identification purposes only.