Translate translates nucleotide sequences into peptide sequences.
Translate creates a peptide sequence by translating nucleic acid sequences that you specify. In addition to translating a single range of a given nucleotide sequence, it can concatenate exons into a single assembly for translation. The exons can be of any length, can come from either strand, and can come from more than one sequence file. Unlike most Wisconsin Package(TM) programs, Translate lets you specify ranges that extend across the end and into the beginning of a sequence. The terminal bell rings when a circular range is chosen.
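The range rules above can be illustrated with a minimal Python sketch. It is not Translate's implementation: the helper names (`extract_range`, `reverse_complement`, `assemble`) and the two short sequences are hypothetical, and the sketch assumes that a circular range is one whose begin position is greater than its end position and that a reversed exon is extracted first and then reverse-complemented.

```python
# Hypothetical helpers illustrating the range rules described above.
# Positions are 1-based and inclusive, as in the Translate prompts.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def extract_range(seq, begin, end):
    """Return seq[begin..end]; if begin > end, wrap around the end of seq."""
    if begin <= end:
        return seq[begin - 1:end]
    # Assumed circular range: run to the end, then continue from position 1.
    return seq[begin - 1:] + seq[:end]


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def assemble(exons):
    """Concatenate (sequence, begin, end, reverse) exon specs into one assembly."""
    parts = []
    for seq, begin, end, reverse in exons:
        fragment = extract_range(seq, begin, end)
        parts.append(reverse_complement(fragment) if reverse else fragment)
    return "".join(parts)


plasmid = "ATGCCGTAAG"   # made-up 10-base circular sequence
other = "TTTGGGCCC"      # made-up second sequence

print(extract_range(plasmid, 9, 3))   # positions 9-10, then 1-3
print(assemble([(plasmid, 1, 3, False), (other, 4, 9, True)]))
```

The assembly step only joins nucleotides; translation would happen afterwards on the joined string, which is why exons of any length can be combined.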
Translate can be run either interactively or noninteractively. When you specify a single sequence to translate and -Default is not on the command line, it works interactively, prompting you for each segment to translate. To run Translate noninteractively, either use -Default on the command line or supply a multiple file specification for the input file by means of a wild card file specification or a list file. (See the INPUT FILES topic below for more detailed information.)
Translate supports the IUB-IUPAC character set for the representation of nucleotide ambiguity. See Appendix III for a list of the IUB codes and their meanings.
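To show what nucleotide ambiguity means for translation, here is a small Python sketch that uses the standard genetic code and the IUB-IUPAC codes. The policy it follows is an assumption made for illustration, not a description of Translate's behavior: a codon is translated only if every base combination it allows gives the same amino acid, and is otherwise reported as `X`. The function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUB-IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate_codon(codon):
    """Translate one codon; return 'X' if the ambiguity changes the residue."""
    choices = [IUPAC[base] for base in codon.upper()]
    residues = {CODON_TABLE["".join(c)] for c in product(*choices)}
    return residues.pop() if len(residues) == 1 else "X"


def translate(seq):
    """Translate complete codons only; trailing 1 or 2 bases are ignored."""
    whole = len(seq) - len(seq) % 3
    return "".join(translate_codon(seq[i:i + 3]) for i in range(0, whole, 3))


print(translate("ATGGGNCAYTARNNN"))
```

In this sketch `GGN`, `CAY` and `TAR` still translate unambiguously (glycine, histidine and stop), while `NNN` does not.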
Here is a session using Translate to translate the G-gamma gene in gamma.seq into the peptide sequence for the human fetal beta globin G gamma:
% translate

 TRANSLATE from what sequence ?  gamma.seq

                  Begin (*     1 *) ?  2179
                    End (* 11375 *) ?  2270
                Reverse (*    No *) ?

 Range begins ATGGG and ends GGAAG.  Is this correct (* Yes *) ?

 That is done, now would you like to:

   A) Add another exon from this sequence
   B) Add another exon from a new sequence
   C) Translate and then add more genes from this sequence
   D) Translate and then add more genes from a new sequence
   W) Translate assembly and write everything into a file

 Please choose one (* W *):  a

                  Begin (*     1 *) ?  2393
                    End (* 11375 *) ?  2615
                Reverse (*    No *) ?

 Range begins GCTCC and ends TCAAG.  Is this correct (* Yes *) ?

 That is done, now would you like to:

   A) Add another exon from this sequence
   B) Add another exon from a new sequence
   C) Translate and then add more genes from this sequence
   D) Translate and then add more genes from a new sequence
   W) Translate assembly and write everything into a file

 Please choose one (* W *):  a

                  Begin (*     1 *) ?  3502
                    End (* 11375 *) ?  3630
                Reverse (*    No *) ?

 Range begins CTCCT and ends ACTGA.  Is this correct (* Yes *) ?

 That is done, now would you like to:

   A) Add another exon from this sequence
   B) Add another exon from a new sequence
   C) Translate and then add more genes from this sequence
   D) Translate and then add more genes from a new sequence
   W) Translate assembly and write everything into a file

 Please choose one (* W *):

 What should I call the output file (* gamma.pep *) ?  ggamma.pep

%
!!AA_SEQUENCE 1.0

 TRANSLATE of: gamma.seq check: 6474 from: 2179 to: 2270
 and of: gamma.seq check: 6474 from: 2393 to: 2615
 and of: gamma.seq check: 6474 from: 3502 to: 3630
 generated symbols 1 to: 148.

 Human fetal beta globins G and A gamma
 from Shen, Slightom and Smithies, Cell 26; 191-203.
 Analyzed by Smithies et al. Cell 26; 345-353.

 ggamma.pep  Length: 148  October 17, 1996 14:36  Type: P  Check: 6924  ..

       1  MGHFTEEDKA TITSLWGKVN VEDAGGETLG RLLVVYPWTQ RFFDSFGNLS
      51  SASAIMGNPK VKAHGKKVLT SLGDAIKHLD DLKGTFAQLS ELHCDKLHVD
     101  PENFKLLGNV LVTVLAIHFG KEFTPEVQAS WQKMVTGVAS ALSSRYH*
Translate accepts multiple (one or more) nucleotide sequences as input. You can specify multiple sequences in a number of ways: by using a list file, for example @project.list; by using an MSF or RSF file, for example project.msf{*}; or by using a sequence specification with an asterisk (*) wildcard, for example GenEMBL:*. If Translate rejects your nucleotide sequence, turn to Appendix VI to see how to change or set the type of a sequence.
If you specify a single sequence on the command line or in response to the first program prompt, and -Default is not on the command line, Translate prompts you for the sequence range and strand. After reading that sequence fragment, the program prompts you to specify other sequence fragments either before or after translating this fragment. You can continue to choose single sequences as input until you decide to write out the entire translated sequence.
If -Default is on the command line, Translate translates the sequence without prompting you, in accordance with any command-line parameters that are present.
When you specify multiple sequences, Translate runs noninteractively. By default, Translate will translate each sequence separately and write out each translation to a separate sequence file without prompting you for the range and strand of each sequence. If you use the -ONEPEPtide command-line parameter, all of the input sequences are assembled together first and then translated into a single peptide sequence.
If you use a list file to specify multiple sequences as input, you can add begin, end, and strand attributes for each sequence. You can use the join sequence attribute to selectively assemble some of the sequence entries in a list file together before translation. All sequences listed contiguously in the list file that share the same join attribute (i.e. share the same sequence name following the join token) are assembled together before translation and the translated sequence is given the name of the join attribute. All other sequences in the list file are translated separately. Here is an example of an input list file, hsp70dna.list, for Translate.
!!SEQUENCE_LIST 1.0

 Example list file of 70kD heat shock coding sequences
 used as input for TRANSLATE

..

gb_in:tchsp70    Begin: 302    End: 2341
gb_pl:phhsp70g   Begin: 240    End: 453    Join: hsp70_petunia
gb_pl:phhsp70g   Begin: 1076   End: 2817   Join: hsp70_petunia
gb_pr:humhsp70   Begin: 228    End: 968
Using this file as input, Translate writes three output files. The first output file contains a translation of the first sequence entry in this list file. The second output file, hsp70_petunia.pep, is a translation of an assembly of the next two sequence entries. The last output file contains a translation of the last sequence entry in this list file. For more information about list files, see "Using List Files" in Chapter 2, Using Sequence Files and Databases in the User's Guide.
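The grouping rule for the join attribute can be sketched in Python as follows. This is a simplified illustration, not the Wisconsin Package list-file parser: it assumes one entry per line after the `..` divider, attributes written as `Name: value` pairs, and the hypothetical function names `parse_entries` and `group_for_translation`.

```python
LIST_TEXT = """\
!!SEQUENCE_LIST 1.0
Example list file of 70kD heat shock coding sequences
..
gb_in:tchsp70   Begin: 302   End: 2341
gb_pl:phhsp70g  Begin: 240   End: 453   Join: hsp70_petunia
gb_pl:phhsp70g  Begin: 1076  End: 2817  Join: hsp70_petunia
gb_pr:humhsp70  Begin: 228   End: 968
"""


def parse_entries(text):
    """Return one dict per entry found after the '..' divider line."""
    _, _, body = text.partition("\n..\n")
    entries = []
    for line in body.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        name, rest = tokens[0], tokens[1:]
        # Attributes come as alternating "Key:" and value tokens.
        attrs = {key.rstrip(":").lower(): value
                 for key, value in zip(rest[0::2], rest[1::2])}
        entries.append({"name": name, **attrs})
    return entries


def group_for_translation(entries):
    """Group contiguous entries that share a join name; others stand alone."""
    groups = []
    for entry in entries:
        join = entry.get("join")
        if join and groups and groups[-1]["join"] == join:
            groups[-1]["members"].append(entry)
        else:
            groups.append({"join": join, "members": [entry]})
    return groups


for group in group_for_translation(parse_entries(LIST_TEXT)):
    label = group["join"] or group["members"][0]["name"]
    ranges = [(m["begin"], m["end"]) for m in group["members"]]
    print(label, ranges)
```

For the example list file, the sketch yields three groups, matching the three output files described above; only the middle group, `hsp70_petunia`, has more than one member.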
ExtractPeptide can write one or more of the translation frames from the Map program output into peptide sequence files. The PepData program translates sequences in all six frames.
Unknown.
Translate allows you to translate sequences where the reading frame is interrupted. Frame interruption commonly occurs across intervening sequences, as in the example above, where a single codon is divided by the first intervening sequence. To accommodate frame interruption, Translate allows you to specify ranges (exons) whose lengths are not exact multiples of three. Translate concatenates the nucleotide ranges (exons) that you define and translates the concatenated nucleotide sequence assembly (gene) only when you choose a menu item that starts with the word Translate.
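The frame arithmetic for the example session can be checked with a few lines of Python. The exon coordinates are the ones entered in the session; everything else is plain arithmetic, written here only as an illustration.

```python
# Exon ranges entered in the example session (1-based, inclusive).
exons = [(2179, 2270), (2393, 2615), (3502, 3630)]

lengths = [end - begin + 1 for begin, end in exons]
total = sum(lengths)

running = 0
for (begin, end), length in zip(exons, lengths):
    running += length
    # A non-zero remainder means the next exon starts in mid-codon.
    print(f"{begin}-{end}: {length} bases, "
          f"{running % 3} base(s) left over after this exon")

print(f"assembly: {total} bases = {total // 3} codons")
```

The first exon is 92 bases long, which leaves two bases of an incomplete codon; the 223-base second exon supplies the third base and ends in frame. The full assembly is 444 bases, or 148 codons, which agrees with the length of 148 reported in the output file (147 residues plus the terminating `*`).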
If you continue after translation, you are in effect building a new assembly (gene) and concatenating the peptide sequence from the new gene onto the peptide sequence you have already created.
All parameters for this program may be put on the command line. Use the parameter -CHEck to see the summary below and to have a chance to add things to the command line before the program executes. In the summary below, the capitalized letters in the parameter names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.
Minimal Syntax: % translate [-INfile=]@Hsp70DNA.List -Default

Prompted Parameters:

[-OUTfile=]hsp70.pep         output file name (single output sequence only)

Local Data Files:

-TRANSlate=translate.txt     contains the genetic code

Optional Parameters:

-BEGin=1 -END=100            range of interest for each sequence
-REVerse                     strand for each sequence
-ONEPEPtide                  translate all concatenated DNA fragments into
                               a single peptide
-NOJOIN                      ignore all "join" sequence attributes specified
                               in a list file
-LIStfile[=translate.list]   writes a list file of output sequence names
-EXTension=.pep              sets the file name extension for output
                               sequence files
-NOMONitor                   suppresses the screen monitor
The files described below supply auxiliary data to this program. The program automatically reads them from a public data directory unless you either 1) have a data file with exactly the same name in your current working directory; or 2) name a file on the command line with an expression like -DATa1=myfile.dat. For more information see Chapter 4, Using Data Files in the User's Guide.
The translation of codons to amino acids, the identification of potential start codons and stop codons, and the mappings of one-letter to three-letter amino acid codes are all defined in a translation table in the file translate.txt. If the standard genetic code does not apply to your sequence, you can provide a modified version of this file in your working directory or name an alternative file on the command line with an expression like -TRANSlate=mycode.txt. Translation tables are discussed in more detail in Appendix VII.
The parameters listed below can be set from the command line. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.
-BEGin=1
sets the beginning position for all input sequences. When the beginning position is set from the command line, Translate ignores beginning positions specified for individual sequences in a list file.

-END=100
sets the ending position for all input sequences. When the ending position is set from the command line, Translate ignores ending positions specified for individual sequences in a list file.

-REVerse
sets the program to use the reverse strand for each input sequence. When -REVerse or -NOREVerse is on the command line, Translate ignores any strand designation for individual sequences in a list file.

-NOJOIN
sets Translate to ignore all join sequence attributes specified in the input list file. All nucleotide sequences specified in the list file are translated into separate output sequence files.

-ONEPEPtide
concatenates all input sequences and then translates the assembly into a single peptide sequence.

-TRANSlate=translate.txt
Usually, translation is based on the translation table in a default or local data file called translate.txt. This parameter allows you to use a translation table in a different file. (See Appendix VII for information about translation tables.)

-LIStfile=translate.list
writes a list file with the names of the output sequence files. This list file is suitable for input to other Wisconsin Package programs that support list files (see Chapter 2, Using Sequence Files and Databases in the User's Guide). If you don't specify a file name, Translate makes one up using translate for the file name and .list for the file name extension.

-EXTension=.pep
This program normally creates output file names by using the original input file name for the base name and the program name for the name extension. Use this parameter to specify some other file name extension.

-MONitor
This program normally monitors its progress on your screen. However, when you use -Default to suppress all program interaction, you also suppress the monitor. You can turn it back on with this parameter. If you are running the program in batch, the monitor will appear in the log file.
Documentation Comments: doc-comments@gcg.com
Technical Support: help@gcg.com
Copyright (c) 1982, 1983, 1985, 1986, 1987, 1989, 1991, 1994, 1995, 1996, 1997 Genetics Computer Group, Inc. a wholly owned subsidiary of Oxford Molecular Group, Inc. All rights reserved.
Licenses and Trademarks: Wisconsin Package is a trademark of Genetics Computer Group, Inc. GCG and the GCG logo are registered trademarks of Genetics Computer Group, Inc.
All other product names mentioned in this documentation may be trademarks, and if so, are trademarks or registered trademarks of their respective holders and are used in this documentation for identification purposes only.