What's New in Version 9.1

Table of Contents

New Programs
New Documentation
SeqLab Enhancements and Bug Fixes
Program Bug Fixes
Package-Wide Bug Fixes


New Programs

The programs listed below are new to Version 9.1 of the Wisconsin Package.

Database Searching

NetBLAST

Searches for sequences similar to a query sequence. The query and the database searched can be either peptide or nucleic acid in any combination. NetBLAST can search only databases maintained at the National Center for Biotechnology Information (NCBI) in Bethesda, Maryland, USA.

NetBLAST is very similar to BLAST, but it differs in two respects. First, whereas BLAST can either search a local database, or request remote service from NCBI, NetBLAST can only perform remote searches. Second, BLAST obtains its remote service from NCBI's "experimental" BLAST server, which is destined to be taken out of service in the near future. NetBLAST uses a new "official" BLAST server, which we expect to be stable. Currently, the two programs are algorithmically identical, but the official service used by NetBLAST may evolve from the experimental service in the future.

Evolution

PAUPSearch

Provides a GCG interface to the tree-searching options in PAUP (Phylogenetic Analysis Using Parsimony). Starting with a set of aligned sequences, you can search for phylogenetic trees that are optimal according to parsimony, distance, or maximum likelihood criteria; reconstruct a neighbor-joining tree; or perform a bootstrap analysis. The program PAUPDisplay can produce a graphical version of a PAUPSearch trees file.

PAUP is the copyrighted property of the Smithsonian Institution. Use the program Fetch to obtain a copy of paup-license.txt to read about rights and limitations for using PAUP.

PAUPDisplay

Provides a GCG interface to tree manipulation, diagnosis, and display options in PAUP (Phylogenetic Analysis Using Parsimony). Starting with a trees file that contains a sequence alignment and one or more trees reconstructed from this alignment (such as the output from PAUPSearch), you can plot the tree(s); compute the score of the tree(s) according to the criteria of parsimony, distance, or maximum likelihood; or calculate a consensus tree (from two or more input trees). PAUPDisplay can also plot the trees from a GrowTree trees file.

PAUP is the copyrighted property of the Smithsonian Institution. Use the program Fetch to obtain a copy of paup-license.txt to read about rights and limitations for using PAUP.

Protein Analysis

CoilScan

Locates coiled-coil segments in protein sequences.

SPScan

Scans protein sequences for the presence of secretory signal peptides (SPs).

HTHScan

Scans protein sequences for the presence of helix-turn-helix motifs, indicative of sequence-specific DNA-binding structures often associated with gene regulation.

New Documentation

Command-Line Summary

New to Version 9.1 is the Command-Line Summary. This quick-reference guide provides a condensed summary of the required parameters, local data files, and optional parameters you can use with each program in the Wisconsin Package.

SeqLab Tutorial

The SeqLab Tutorial was released prior to Version 9.1. (Each site received one copy of the SeqLab Tutorial in July 1997.) This Tutorial introduces you to the major features of SeqLab, including loading sequences, running programs, aligning sequences, working with program output, and viewing and annotating sequence features.

SeqLab Enhancements and Bug Fixes

SeqLab: Editor Mode

Enhancement: You can now set the colors of the residues used in the SeqLab Editor by editing the SeqLabConfig:SeqLab resource file. This gives you the option of creating your own set of colors for DNA and protein characters. For example, it is now possible to assign one color to hydrophilic amino acids and another to hydrophobic amino acids.

To change these colors, copy the file SeqLabConfig:SeqLab to your home directory and edit the resources SeqLab*EditorColorNA and SeqLab*EditorAA. Instructions are included in this file. If you run SeqLab with the -SMAll or -LARge parameters, you will need to make a similar change in the SeqLabConfig:SeqLabSmall or SeqLabConfig:SeqLabLarge files.

Problem: If you loaded certain sequence files into the SeqLab Editor, it crashed. This was a rare occurrence which arose when you edited the sequence comments by using a text editor.

Update: Now SeqLab can accept all correctly formatted GCG sequence files.

Problem: If you saved a file from the SeqLab Editor whose name differed only in case from one already in the SeqLab Main List (for example, MyAlignment.rsf vs. myalignment.rsf), the new file was not added to the Main List.

Update: Now SeqLab allows files with the same name but different case to be added to the Main List or Editor.

Problem: In the SeqLab Editor, when you set the horizontal scale to greater than 64:1, sequences sometimes disappeared or flashed.

Update: Now sequences are properly displayed regardless of the horizontal scale setting.

Problem: Sometimes windows "froze" in the SeqLab Editor when you had multiple windows and dialog boxes open. For example, if you opened the Sequence Information window, and then brought another window to the front of the screen, SeqLab sometimes froze until the Sequence Information window was brought forward and closed.

Update: This behavior has been fixed.

Problem: The SeqLab Editor did not allow two sequences to have the same name. If you attempted to create or load a sequence with the same name as one already loaded, SeqLab renamed the original sequence to sequence_1. This was a problem if you did not want the name of the original sequence to be changed.

Update: Now the SeqLab Editor renames the newer sequence to sequence_1 and leaves the original sequence name unchanged.
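The new renaming behavior can be sketched roughly as follows. This is an illustration only, not SeqLab's code: the function name add_sequence is hypothetical, and continuing with _2, _3, and so on after _1 is an assumption that the text above does not confirm.

```python
def add_sequence(loaded_names, new_name):
    """Add new_name to loaded_names, renaming only the newcomer on a clash.

    Names already in the list are never changed. Suffixes beyond _1 are an
    assumption made for this sketch.
    """
    candidate = new_name
    suffix = 0
    while candidate in loaded_names:
        suffix += 1
        candidate = f"{new_name}_{suffix}"
    loaded_names.append(candidate)
    return candidate


names = ["sequence"]
print(add_sequence(names, "sequence"))  # the newer copy is renamed
print(names)                            # the original name is untouched
```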

Problem: In the SeqLab Editor, if you set the Editor Properties Preferences (Options menu -> Preferences -> Editor Properties) to Keyboard: Check (which allows you to proofread a sequence instead of editing it), SeqLab beeped every time you typed a character that did not match the one under the cursor. Unfortunately, if you forgot that you had switched to Keyboard:Check mode, it was not clear why your edits caused SeqLab to beep.

Update: Now SeqLab displays a message on the right side of the status line at the bottom of the SeqLab window that explains the beep as a "Checking mismatch."

Problem: In the SeqLab Editor, if you used the Consensus operation (Edit menu -> Consensus) to create a protein consensus sequence and used a scoring matrix other than Identity, the "Minimum score that represents a match" option in the Consensus dialog box sometimes did not keep the value you set, and its maximum setting changed on subsequent runs.

Update: Now the "Minimum score that represents a match" value and its maximum allowed value are correctly maintained when running the program multiple times.

Problem: Sometimes when you loaded sequences into the SeqLab Editor, the order of previously loaded sequences changed. This usually happened after using the Cut and Paste options to reorder sequences in the Editor.

Update: Now the order of loaded sequences doesn't change, even when newer sequences are loaded.

Problem: In the SeqLab Editor, if you set the General Preferences (Options menu -> Preferences -> General) to "Keep the program window open" after the Run button is pressed, and you ran a program, the program reported an error such as *** ERROR: PROGRAM requires at least one input sequence! *** in the Job Manager.

Update: When you run a program from the Editor, SeqLab now closes the program window after you press the Run button, regardless of how you have set the General Preferences.

Programs run through SeqLab

Comparison

Problem: If you ran PileUp from SeqLab, and you tried to allow more total gap characters by setting the "Maximum number of gap characters (. and ~) added to any sequence" option, SeqLab limited you to a maximum of 2,000 gap characters.

Update: SeqLab now allows you to set the "Maximum number of gap characters (. and ~) added to any sequence" option as high as 7,000 gap characters.

Problem: If you ran PileUp on a set of sequences that were a mixture of DNA and RNA, and loaded the resulting MSF file into the SeqLab Editor, overwriting the original sequences, the new RNA sequences did not properly overwrite the existing RNA sequences.

Update: SeqLab now correctly overwrites existing DNA and RNA sequences in the Editor.

Database Searching

Problem: If you ran StringSearch from SeqLab, it crashed sometimes when many matches were found.

Update: StringSearch now runs correctly in SeqLab, regardless of the number of matches found.

Fragment Assembly

Problem: (SGI IRIX only) If you attempted to run the fragment assembly programs from SeqLab, they sometimes failed to run.

Update: The fragment assembly programs now run correctly from within SeqLab.

Importing and Exporting

Problem: If you used the Reformat program to reformat an RSF file into a single sequence file or into an MSF file, and then loaded that file into the SeqLab Editor, SeqLab misinterpreted its features table. This misinterpretation resulted in misplaced features, especially for aligned sequences in an MSF file.

Update: When you use the Reformat program to reformat an RSF file to a single sequence or MSF file, all features information is lost. Only RSF files can contain features information.

Problem: If you exported files in GenBank format from the SeqLab Editor (File menu -> Export), the output file could have more than one line with the keyword ORIGIN. This made the output file unusable by the program FromGenBank.

Update: Now when exporting to GenBank format, only one ORIGIN line is written to the output file.

Primer Selection

Problem: If you ran Prime from SeqLab, the option "Entropies for DNA melting temperature determination" mistakenly specified a file of enthalpies, and the option "Enthalpies for DNA melting temperature determination" mistakenly specified a file of entropies.

Update: Now when you run Prime from SeqLab, the option "Entropies for DNA melting temperature determination" correctly specifies a file of entropies, and the option "Enthalpies for DNA melting temperature determination" correctly specifies a file of enthalpies.

Problem: If you ran Prime from SeqLab, the default minimum and maximum values for "Product melting temperature" were incorrectly displayed as 50.0 and 65.0 (Celsius), respectively. The program itself, however, used the appropriate defaults.

Update: Now Prime displays the correct default minimum and maximum values for "Product melting temperature" as 70.0 and 95.0 (Celsius), respectively.

Protein Analysis

Problem: If you ran ProfileScan from SeqLab with multiple input sequences, the output files did not show up in the Output Manager.

Update: Now you can run ProfileScan with multiple sequences from SeqLab, and all output files will appear in the Output Manager.

RNA Secondary Structure

Problem: If you ran MFold from SeqLab, and you wanted to save the energy matrix output file for later use with PlotFold, the energy matrix output file was deleted.

Update: Now SeqLab treats the MFold energy matrix as a standard output file, and this file appears in the Output Manager. Note that the energy matrix is a binary file, and thus cannot be displayed from the Output Manager. Also, if you try to rename this file from the Output Manager using Save As..., the file will be corrupted.

Output Manager

Problem: If you selected an output file in the Output Manager and clicked the Save As... button, and the file name you chose already existed on disk, the Output Manager overwrote the existing file without warning you.

Update: Now SeqLab always notifies you when a file with an identical name exists and asks your permission before overwriting the file.

Problem: If you loaded an output file created during a previous session with SeqLab into the Output Manager, and then tried to delete it by pressing the Delete from Disk button, it was not deleted. SeqLab deleted only those output files generated during the current session.

Update: Now the Output Manager will delete any output file when you press the Delete from Disk button, whether it was created during the current session or a previous one.

Printing and Graphics

Problem: Within SeqLab, if you printed sequences from the Sequence Information window or from the Output Manager, the printout did not mention the sequence name.

Update: SeqLab now prints the sequence name as well as the sequence itself.

Problem: If you used the GIF graphics driver from SeqLab, and in the Edit Graphics Devices dialog box (Options menu -> Graphics Devices) you specified $program$.gif in the "Port:" text box, the output was not written to a GIF file or printed. (Note: The GIF graphics driver is sold separately from the Wisconsin Package.)

Update: Now the $program$.gif syntax works correctly from SeqLab when sending output to the GIF graphics driver. (For example, if you ran the PileUp program, this file name specification would create a file called "pileup.gif" and send its plot to the GIF driver.)
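The placeholder expansion described above amounts to a simple string substitution. The sketch below is illustrative only: expand_port is a hypothetical name, and lowercasing the program name is inferred from the "pileup.gif" example rather than from any documented rule.

```python
def expand_port(template, program_name):
    """Replace the $program$ placeholder with the lowercase program name."""
    return template.replace("$program$", program_name.lower())


print(expand_port("$program$.gif", "PileUp"))
```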

Other

Problem: If you set the Output Preferences (Options menu -> Preferences -> Output) to delete output files (by turning off the option "Automatically save all output files"), and then added an output file from the Output Manager to the Editor, this output file was not deleted as expected upon exiting SeqLab.

Update: Now when you turn off the option "Automatically save all output files" in SeqLab Output Preferences, all output files created during your current session are deleted.

Problem: GDE compatibility sometimes caused SeqLab to exit if the Genetic Data Environment (GDE) was not properly installed. If you set the GDE_HELP_DIR environment variable, but you did not have a proper installation of GDE on the system, SeqLab reported that it could not locate the .GDEmenus file and exited when you switched to the Editor.

Update: Now SeqLab will disable GDE compatibility if it does not find a complete installation of GDE on the system.

Problem: Some PC X servers (for example eXceed) simulated the middle mouse button by clicking both the right and left mouse buttons at the same time. In some instances, the SeqLab Editor interpreted this as a double click instead of a middle mouse click.

Update: SeqLab now correctly interprets pressing the right and left mouse buttons as the middle mouse button. SeqLab will accept a double click only when your primary mouse button (for example, the left mouse button if configured for a right-handed person, or the right mouse button if configured for a left-handed person) is pressed twice.

Problem: (Solaris only) If you ran SeqLab using a non-English locale setting (observed in the French version), and if you ran a program in batch, the Job Manager reported an error in the job submission. The problem came from a spelling difference in the output generated from the "batch" program.

Update: Now SeqLab correctly operates with non-English locale settings.

Program Bug Fixes

Comparison

ProfileMake

Problem: With the new integer matrices introduced in Version 9.0, ProfileMake sometimes created profiles with values that were too large to be written in the five spaces provided for each score in the output file. This often occurred if you used the -MATrix=oldpep.cmp parameter. A profile with this problem could not be used by ProfileSearch.

Update: Now ProfileMake provides more spacing to allow for larger values within a profile.
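To see why a too-narrow field makes a profile row unreadable, consider the sketch below. It only illustrates the general fixed-width problem: the helper format_row, the sample scores, and the wider width of 8 are assumptions, and the exact output that ProfileMake produced in this situation is not stated here.

```python
def format_row(scores, width):
    """Write each integer score right-justified in a field of the given width."""
    return "".join(f"{score:{width}d}" for score in scores)


scores = [12, -3, 100000]

narrow = format_row(scores, 5)   # the last value needs six characters
wide = format_row(scores, 8)

# With five-character fields the last two values run together, so a reader
# that splits the row on white space sees the wrong number of fields.
print(repr(narrow), len(narrow.split()))
print(repr(wide), len(wide.split()))
```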

Database Searching

LookUp

Problem: If you ran LookUp with the -FRAGments parameter to retrieve sequence fragments based on the beginning and ending coordinates of a desired feature, LookUp sometimes crashed.

Update: Now LookUp runs correctly with the -FRAGments parameter.

Problem: LookUp had several problems with searching its databases. For example, LookUp sometimes identified sequences that did not match your query, and sometimes LookUp did not identify all the appropriate sequences.

Update: LookUp's searching problems have been fixed, and LookUp now correctly identifies database entries.

BLAST

Problem: (SGI IRIX only) When you ran the BLAST program, you sometimes saw the following error:
FATAL: WordFinderComplete failed: out of memory.
This error was caused by a bug in the SGI operating system.

Update: This problem has been fixed as recommended by NCBI.

Problem: If you set the "Ignore hits expected to occur by chance more than n times" program prompt to a value smaller than 0.01 to try to limit the number of database hits reported by BLAST, the program incorrectly reported the value as being 0.00, and no matches were found. This error also occurred if you used the -EXPect command-line parameter.

Update: You can now specify values smaller than 0.01 in response to the "Ignore hits expected to occur by chance more than n times" program prompt or when using the -EXPect parameter.

GCGToBLAST

Problem: If you ran GCGToBLAST on any sequence containing non-IUPAC characters, GCGToBLAST created an empty database file. GCGToBLAST did not report an error, only a warning about which sequence had illegal characters. You discovered that the BLAST database was invalid only when you ran BLAST on this database.

Update: Now GCGToBLAST maps all non-IUPAC characters to N for nucleotides or to X for proteins.
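The character mapping described in this update can be sketched as follows. This is a minimal illustration, not GCGToBLAST's implementation: the function name map_non_iupac and the two alphabets are assumptions chosen for the example, and a real program may treat gap or stop characters differently.

```python
# Assumed alphabets for illustration only.
IUPAC_NUCLEOTIDE = set("ACGTURYSWKMBDHVN")
IUPAC_PROTEIN = set("ABCDEFGHIKLMNPQRSTVWXYZ")


def map_non_iupac(sequence, is_protein):
    """Replace characters outside the chosen alphabet with X (protein) or N (nucleotide).

    Case is preserved for valid characters; replacements are uppercase.
    """
    alphabet = IUPAC_PROTEIN if is_protein else IUPAC_NUCLEOTIDE
    fallback = "X" if is_protein else "N"
    return "".join(ch if ch.upper() in alphabet else fallback for ch in sequence)


print(map_non_iupac("ACGT?J", is_protein=False))
print(map_non_iupac("MKT*J", is_protein=True))
```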

FastA

Problem: If you used the Smith-Waterman alignment parameter -SWalign in a nucleotide search, and you found a match on the reverse strand, the alignment displayed the forward strand of that sequence instead.

Update: Now FastA properly displays the correct forward or reverse strand when using the -SWalign parameter.

FastA and TFastA

Problem: If you ran FastA or TFastA with a query sequence containing non-IUPAC characters, it crashed.

Update: Now FastA and TFastA can correctly search a query using non-IUPAC characters. These characters are mapped to N for nucleotides and to X for proteins.

ProfileSearch

Problem: If you ran ProfileSearch on a profile of exactly 1,000 positions in length (the maximum allowed for ProfileSearch), you received an error message
ERROR in READPROFILE, Record number 1001 has the wrong number of fields. ***

Update: Now you can use the maximum length profile (1,000 positions) with ProfileSearch.

FindPatterns

Problem: If you specified that only the top strand of a DNA sequence be searched for a pattern (by placing a "." in the fourth field of a pattern data file, as documented), FindPatterns searched both strands.

Update: Now FindPatterns correctly searches only the top strand for a pattern when instructed to do so.
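The difference between a top-strand-only search and a two-strand search can be sketched as below. This is a simplified illustration with exact matching on unambiguous bases only; the names find_pattern and reverse_complement are hypothetical, and FindPatterns itself supports far richer patterns.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_pattern(seq, pattern, top_strand_only):
    """Return (strand, start) pairs; starts are 0-based top-strand coordinates."""
    last_start = len(seq) - len(pattern) + 1
    hits = [("top", i) for i in range(last_start) if seq.startswith(pattern, i)]
    if not top_strand_only:
        # A bottom-strand match appears on the top strand as the
        # reverse complement of the pattern.
        rc = reverse_complement(pattern)
        hits += [("bottom", i) for i in range(last_start) if seq.startswith(rc, i)]
    return hits


print(find_pattern("TTAAGCTT", "AAGC", top_strand_only=True))
print(find_pattern("TTAAGCTT", "AAGC", top_strand_only=False))
```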

Editing and Publication

LineUp and Pretty

Problem: If you saved aligned sequences with the default FOSN "Command Mode" option, the list file created contained sequence weights of 0.0 if the original sequences came from single sequences, an MSF file, or an RSF file. This caused problems when the list file was used with Pretty to create a consensus. Pretty gave all sequences a weight of 0.0, and so the consensus was undefined in all positions.

Update: LineUp now uses the weight value specified for each sequence in a list file or RSF file, or if a weight value is not specified, LineUp gives each sequence a default weight of 1.0.
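A small sketch shows why weights of 0.0 leave a consensus undefined while the default of 1.0 does not. The voting rule here is a deliberately simple weighted plurality chosen for illustration; it is not Pretty's actual consensus algorithm, and consensus_column is a hypothetical name.

```python
def consensus_column(residues, weights):
    """Return the residue with the largest total weight, or None if no weight was cast."""
    totals = {}
    for residue, weight in zip(residues, weights):
        totals[residue] = totals.get(residue, 0.0) + weight
    best = max(totals, key=totals.get)
    return best if totals[best] > 0 else None


column = "AAG"
print(consensus_column(column, [0.0, 0.0, 0.0]))  # every weight zero: undefined
print(consensus_column(column, [1.0, 1.0, 1.0]))  # default weights of 1.0
```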

Evolution

PAUPSearch and PAUPDisplay

Known Bug: When you use distance as the optimality criterion for a tree search, or when you use neighbor joining to construct a phylogenetic tree, PAUPSearch and PAUPDisplay may incorrectly compute the pairwise distances for protein sequence data. In cases where this has occurred, the groupings in the resulting tree(s) were so unbelievable that it was obvious that something had gone wrong.

We recommend that you double check your results when analyzing protein sequence data, either by comparing the distance tree(s) with trees obtained using a different optimality criterion, or by using the Distances and GrowTree programs to verify a neighbor-joining tree.

Distances

Problem: If you ran Distances on protein sequences that contained mixed-case amino acids (that is, some residues appear in lowercase and some in uppercase), the distances calculated were incorrect. Nucleic acid sequences were not affected.

Update: Now Distances handles mixed-case protein sequences and comparisons correctly.
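The cause of this error is not described above. The sketch below only illustrates why case has to be normalized before residues are compared, using a plain uncorrected distance (the fraction of differing positions); the function p_distance is hypothetical and is not one of the correction methods offered by Distances.

```python
def p_distance(seq_a, seq_b):
    """Fraction of aligned positions that differ, ignoring letter case."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, non-empty, and of equal length")
    mismatches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)
    return mismatches / len(seq_a)


# Only the last position really differs; a case-sensitive comparison
# would have counted three mismatches instead of one.
print(p_distance("MKtl", "mKTV"))
```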

Diverge

Problem: According to Wen-Hsiung Li (Ch. 4, p. 89, Molecular Evolution, Sinauer Associates, Inc., 1997), the equations for the variances in Ka and Ks as published by Li and by Pamilo and Bianchi were incorrect.

Update: Diverge now uses the corrected equations.

Fragment Assembly

GelEnter

Problem: If you ran GelEnter, sometimes a relations file was mistakenly labeled as a sequence file. This problem did not affect the operation of the fragment assembly programs.

Update: GelEnter now correctly labels relations files to distinguish between them and sequence files.

Mapping

PlasmidMap

Problem: If you ran PlasmidMap with an input file that had a path name longer than 50 characters, the program crashed.

Update: Now PlasmidMap can accept input files with path names up to 256 characters in length.

Primer Selection

Prime

Problem: If you supplied Prime with a file of primers using the parameter -PRImers=filename, Prime sometimes did not consider all of the primers in the file. This occurred when the primer length range that you specified in response to the "Minimum primer length" and "Maximum primer length" program prompts (or the -MINPRImer and -MAXPRImer command-line parameters) rejected at least one of the primers in the file of primers you supplied. For example, if you were searching for primers from 18 to 22 bases long, and there was a primer in your file of primers that was 23 bases long, Prime stopped searching the list of primers at the point that it read the long primer.

Update: Now when you use the -PRImers=filename parameter with Prime, the program uses all primers in the file of primers that meet the minimum and maximum primer lengths specified.
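The difference between the old and new behavior can be sketched as a loop that stops at the first out-of-range primer versus a filter that simply skips it. Both functions are hypothetical illustrations of the behavior described above, not Prime's code.

```python
def primers_in_range_old(primers, min_len, max_len):
    """Old behavior: reading stops at the first primer outside the length range."""
    accepted = []
    for primer in primers:
        if not (min_len <= len(primer) <= max_len):
            break  # every later primer in the file is ignored
        accepted.append(primer)
    return accepted


def primers_in_range(primers, min_len, max_len):
    """New behavior: out-of-range primers are skipped, the rest are kept."""
    return [p for p in primers if min_len <= len(p) <= max_len]


primers = ["A" * 18, "C" * 23, "G" * 20]  # lengths 18, 23, 20
print(len(primers_in_range_old(primers, 18, 22)))
print(len(primers_in_range(primers, 18, 22)))
```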

Problem: If you ran Prime with the -CHEck parameter, the optional parameter -DATa2=filename, which specifies a file of enthalpies for DNA melting determination, mistakenly specified entropies, and -DATa3=filename, which specifies a file of entropies for DNA melting determination, mistakenly specified enthalpies.

Update: When you run Prime with the -CHEck parameter, it correctly lists -DATa2=filename as specifying entropies for DNA melting determination, and -DATa3=filename as specifying enthalpies for DNA melting determination.

Problem: If you ran Prime with the -CHEck parameter, the default values for the -TMMINPROduct and -TMMAXPROduct parameters were incorrectly displayed as 50.0 and 65.0 (Celsius), respectively.

Update: When you run Prime with the -CHEck parameter, it now displays the correct default values for the -TMMINPROduct and -TMMAXPROduct parameters as 70.0 and 95.0 (Celsius), respectively.

Protein Analysis

ProfileScan

Problem: If you ran ProfileScan with multiple input sequences (for example from an MSF file, RSF file, list file, or ambiguous database specification), matches to one query sequence were also reported as matches to later query sequences, whether the matches were real or not. If the later matches were not real, then you found a score of 0 in the .sum file, and only the profile abstract appeared in the .scan file. If the matches were relevant, then you saw the profile twice for that query.

Update: Now ProfileScan correctly reports only relevant matches between a query and a profile, regardless of the number of queries run.

Problem: ProfileScan determined the pair display thresholds for the placement of "|", ":", and "." from the first profile, and then used these thresholds on all subsequent profiles.

Update: Now ProfileScan determines the pair display thresholds separately for each profile.

Problem: If you searched a profile library with ProfileScan, and your sequence matched the last profile in the library, that match was not reported.

Update: Now ProfileScan correctly reports all profile matches.

Isoelectric

Problem: If you ran Isoelectric on a sequence whose net charge was positive across all pH values 1 through 13, the text output obtained with -OUTfile=filename reported that the net charge for all those values was negative.

Update: When the net charge is positive across all pH values, the output file correctly reports that the net charge is positive for all values.

Problem: In some rare instances, Isoelectric would hang without crashing or exiting.

Update: Now Isoelectric will always exit properly.

Isoelectric, PepPlot, and HelicalWheel

Problem: If you ran Isoelectric, PepPlot, or HelicalWheel with a sequence containing tilde (~) characters, the programs sometimes crashed.

Update: Now these programs correctly report that ~ is not a valid amino acid character.

Moment

Problem: If you ran Moment with very long sequences, it sometimes crashed.

Update: Moment now accepts sequences of any length.

Package-Wide Bug Fixes

XWindows Graphics Driver

Problem: If you used the XWindows graphics driver on a machine with a display deeper than 8 bits (TrueColor 16-, 24-, or 32-bit displays), a plot appeared as a monochrome display.

Update: Now XWindows works in color on all TrueColor 16-, 24-, or 32-bit displays.

GIF Graphics Driver

Problem: Sometimes a program creating graphics in GIF format did not generate all of its output, even though the GIF file was successfully generated.

Update: Programs writing GIF format files generate all program output correctly.

Figure files

Problem: If you included one Figure file within another (for example, to place multiple graphs on a single page), the Figure file "magic number" identifier, !!FIGURE 1.0, was plotted on the subsequent graphs.

Update: Figure files now plot correctly.



Documentation Comments: doc-comments@gcg.com
Technical Support: help@gcg.com

Copyright (c) 1982, 1983, 1985, 1986, 1987, 1989, 1991, 1994, 1995, 1996, 1997 Genetics Computer Group Inc., a wholly owned subsidiary of Oxford Molecular Group, Inc. All rights reserved.

Licenses and Trademarks

Wisconsin Package is a trademark of Genetics Computer Group, Inc. GCG and the GCG logo are registered trademarks of Genetics Computer Group, Inc.

All other product names mentioned in this documentation may be trademarks, and if so, are trademarks or registered trademarks of their respective holders and are used in this documentation for identification purposes only.

Genetics Computer Group

www.gcg.com