stackPACK is available for download from www.sanbi.ac.za/CODES. Also
available at the same site is SANBI-BABY, a set of tools developed for
easy installation and loading by Linux users. This toolset is precompiled
for Linux and includes installation instructions. It is designed for
folks who want to set up small servers in developing countries.

To obtain stackPACK, register at www.sanbi.ac.za/CODES and download it
for free. Install instructions come with the package, but a directory
for stackPACK on the CD should include the install guide, to give folks
an idea of what is involved if they wish to go ahead with an install.

Here is the text of the README:

README for stackPACK 2.1
30 March 2001

This version of stackPACK replaces stackPACK v2.0.

StackPACK v2.1 contains significant improvements over stackPACK v2.0,
and concentrates mainly on improving the ability to handle large data
sets and large clusters. New viewing and reporting functions have been
included which highlight potential areas of interest, and which enable
data exchange with alignment editing programs. Enhancements in terms of
user flexibility and control of the pipeline processes and applications
have also been implemented in this release.

Several documentation files exist for this release:

* README            this file
* ReleaseNotes.txt  details about the release including updates, new
                    features and known problems
* INSTALL           installation instructions for a new stackPACK
* INSTALL_UPDATE    instructions for updating an existing stackPACK
                    installation

We hope that the stackPACK transcript reconstruction and variation
analysis tools prove valuable to your research efforts.
StackPACK 2.1 Installation Instructions
New User Installation

---------------------------------------------------------------------
StackPACK Update:  StackPACK version 2.1
Platforms:         Irix 6.5.x, Tru 64 UNIX 4.0 and 5.0, Solaris 7,
                   Linux Red Hat 6.2
Date:              30 March 2001

This document describes the procedures for installing stackPACK v2.1
---------------------------------------------------------------------

TABLE OF CONTENTS
-----------------
I.    SYSTEM REQUIREMENTS
II.   PRE-INSTALLATION PROCEDURES
III.  UNPACKING THE DISTRIBUTION
IV.   INSTALLING THE DISTRIBUTION
V.    POST-INSTALLATION PROCEDURES
VI.   TESTING THE stackPACK INSTALLATION
VII.  CUSTOMIZING YOUR stackPACK CONFIGURATION
VIII. TROUBLESHOOTING
---------------------------------------------------------------------

I. SYSTEM REQUIREMENTS

1. Supported platforms

   Hardware                                    OS Version
   ------------------------------------------  -----------------
   Compaq                                      Tru 64 UNIX 4.0
   Compaq                                      Tru 64 UNIX 5.0
   Intel-based PC                              Linux Red Hat 6.2
   Silicon Graphics - 64bit architecture only  Irix 6.5.x
   Sun Microsystems                            Solaris 7

2. RAM

   Minimum Requirements: 256K

   Recommended Requirements: The user requirements depend on the size
   of the datasets and clusters to be processed. Increased RAM
   significantly increases stackPACK performance.

3. Disk space

   Minimum Requirements: stackPACK requires 150MB to install and 75MB
   to run.

   Recommended Requirements: The user requirements depend on the size
   of the datasets and clusters to be processed. In addition to the
   minimum requirements, stackPACK requires disk space equal to 5x the
   size of the data in order to run.

4. Required third-party software

   Software      Version              Location
   ------------  -------------------  --------------------------------
   Phrap and     1996 or 1999         http://www.phrap.org
   Cross_Match
   RepBase       user's choice        http://www.girinst.org/Repbase_Update.html
   RepeatMasker  April 1999 or newer  http://repeatmasker.genome.washington.edu/cgi-bin/RepeatMasker
   Apache        1.3 or newer         http://www.apache.org
   MySQL         3.23.27 or newer     http://www.mysql.com
   Python        1.5.2                http://www.python.org

II. PRE-INSTALLATION PROCEDURES

This installation assumes a working knowledge of Unix and MySQL.

1. Please ensure that you have installed, configured and tested the
   web server and MySQL.

2. You will need the following information in order to proceed with
   your stackPACK installation:

   * The stackPACK hostname
   * The stackPACK installation target, usually:
     /usr/local/stackpack
   * The location of your web server html root, usually:
     /home/httpd/html or /var/www/htdocs
   * The location of your web server cgi-bin root, usually:
     /home/httpd/cgi-bin or /var/www/cgi-bin
   * The MySQL hostname
   * The username and password of a MySQL account with privileges to
     create new MySQL users

III. UNPACKING THE DISTRIBUTION

1. Copy the stackPACK distribution file to a temporary directory with
   at least 75MB of free disk space.

2. Unpack it with:

   > gzip -d stackpack-2.1- .tar.gz
   > tar xf stackpack-2.1- .tar

3. Type ls in order to display the directory contents. The directory
   should contain the following:

   * cgi-bin/             - cgi scripts required for the web interface
   * etc/                 - The stackPACK configuration files
   * html/                - html files required for the web interface
   * setup.sh             - The stackPACK installation script
   * stackpack/           - The main stackPACK installation tree
   * stackpack-2.1- .tar  - The tarred up stackPACK distribution file

IV. INSTALLING THE DISTRIBUTION

1. Log in as root.
   If you do not have access to the root password, please contact
   Electric Genetics at:

      e-mail : support@egenetics.com
      tel    : +27 (21) 959 3964
      fax    : +27 (21) 959 2512

2. Ensure that you are in the same directory as the untarred stackPACK
   distribution file.

3. Execute the setup script by typing the following command:

   > sh setup.sh

   This script will configure your stackPACK installation. A number of
   questions need to be answered in order for configuration to occur.
   The default answers to the questions are shown in square brackets.
   If these answers are appropriate, simply hit return. Otherwise,
   type the correct information at the prompt, and hit return.

   * stackPACK hostname: [local.hostname]
     The host on which the stackPACK is installed. This host is also
     used by the web interface as a mail server for sending status
     updates to users.

   * stackPACK installation target: [/usr/local/stackpack]
     The directory in which stackPACK will be installed. For the
     remainder of this document the stackPACK installation target will
     be referred to as the .

   * Location of stackPACK config files: [/etc]
     The directory in which stackPACK will install its configuration
     files. The configuration files are stackpack, odbc.ini and
     odbcinst.ini. For the remainder of this document the directory
     containing the stackPACK configuration files will be referred to
     as the .

     NOTE: If you do not wish to install the stackPACK configuration
     files in the /etc directory, you may install them in the
     stackPACK installation directory by specifying /etc. You must
     then follow the instructions in Section V.1. and set specific
     environment variables before stackPACK will run properly.

   * Install WWW interface? (y/n)
     Answer yes if you want the stackPACK web interface to be
     installed. If you answer no at this point, only the command-line
     stackPACK tools will be installed.

   * Web server html directory [/home/httpd/html]
     The directory where the HTML files for the stackPACK web
     interface will be installed.
     The system creates a stackpack sub-directory in the HTML
     directory you specify.

   * Web server cgi directory [/home/httpd/cgi-bin]
     The directory where the CGI-BIN scripts for the stackPACK web
     interface will be installed. The system creates a stackpack
     sub-directory within this directory.

   * MySQL server hostname: [local.hostname]
     The host on which the MySQL database is installed.

   * Please enter a MySQL account that has user creation
     privileges: [root]
     The name of a MySQL account that has privileges to create new
     users.

   * Please enter a password for root:
     The password for the above-mentioned account.

   * Please enter stackPACK MySQL account name: [stackpack]
     The name of the MySQL account that will be created to give
     stackPACK access to the MySQL database.

   * Please enter stackPACK MySQL account password: [stackpack]
     The password of the MySQL account through which stackPACK gains
     access to the MySQL database.

   Upon completion of all the questions, the script will display all
   the answers you have given. Once these answers are confirmed, the
   script will go ahead with the stackPACK installation.

V. POST-INSTALLATION PROCEDURES

After completion of the installation script, the following tasks must
be completed before attempting to run stackPACK.

1. If the stackPACK configuration files have not been installed in
   /etc, you must set two environment variables, STACKPACK_ROOT and
   ODBCSYSINI, for both the webserver and each user.

   To set STACKPACK_ROOT, specify the directory _above_ the location
   of the stackPACK configuration files. For example, if the stackPACK
   configuration files are installed in /usr/local/stackpack/etc, set
   STACKPACK_ROOT to /usr/local/stackpack.

   To set ODBCSYSINI, specify the location of these configuration
   files, e.g., /usr/local/stackpack/etc.
   As an example, for Apache servers, edit the httpd.conf file to
   contain the following:

      SetEnv STACKPACK_ROOT /usr/local/stackpack
      SetEnv ODBCSYSINI /usr/local/stackpack/etc

   This will ensure that the web-based stackPACK works properly. To
   ensure that the command-line stackPACK works properly, ensure that
   the variables STACKPACK_ROOT and ODBCSYSINI are set for each user
   as well.

2. Ensure that each user's PATH environment variable has the following
   added to it:

      /bin

   For C-shell users, for example:

      setenv PATH ${PATH}:/usr/local/stackpack/bin

   For Bash-shell users, for example:

      export PATH=$PATH:/usr/local/stackpack/bin

3. Ensure that each user's LD_LIBRARY_PATH environment variable has
   the following added to it:

      /lib: /lib.ext

   For C-shell users, for example:

      setenv LD_LIBRARY_PATH /usr/local/stackpack/lib:/usr/local/stackpack/lib.ext

   For Bash-shell users, for example:

      export LD_LIBRARY_PATH=/usr/local/stackpack/lib:/usr/local/stackpack/lib.ext

4. Obtain phrap and cross_match (http://www.phrap.org) and install
   them into:

      /bin.ext

   NOTE: The standard configuration on stackPACK is for the latest
   (1999) phrap and cross_match. If you would like to use the older
   version of phrap (1996), some parameter settings need to be
   changed. Please see Section VII.7. for further details.

5. If you do not wish to use the masking database supplied with
   stackPACK, replace the following file (the default location of the
   masking database) with your own database:

      /supporting/repeat.seq

   The masking database supplied with stackPACK only contains the
   following:

   * Common vector sequences, distributed by NCBI
   * Other potential contaminants such as rodent, mitochondrial and
     ribosomal DNA.

6. If necessary, customize the configuration of your stackPACK
   installation by verifying whether the /stackpack, /odbc.ini and
   /odbcinst.ini files contain the desired settings. The default
   location for the stackPACK configuration directory is /etc.
   See section VII, Customizing your stackPACK Configuration, for more
   details on these files.

StackPACK should now be installed on your system. You can access it
either via the command line or through the web interface at
http:// /stackpack/. Please continue to the following sections for
details on testing and customization of your stackPACK installation.

VI. TESTING THE stackPACK INSTALLATION

1. StackPACK processing via the web interface:

   * To test the database connectivity, go to http:// /stackpack, and
     click on WebProjectManager. An empty project table should be
     displayed if the database connectivity is properly set up.

   * To test whether the programs are working correctly via the web
     interface, go to http:// /stackpack, and click on WebPipe.
     Complete the WebPipe data submission form in order to submit a
     clustering job. Please fill in all the details:

     o Project owner:       Enter your full e-mail address.
     o Project name:        Enter a brief one-word alphanumeric
                            project name.
     o Project description: Enter a one-line project description for
                            your reference.
     o Input Data File:     Enter the name of the input data file. Use
                            the browse option to specify the location
                            of the input data file. A test datafile is
                            in /doc and is called alt_splice.seq.
     o Data Format:         Select "Mixed or unknown FASTA Format"
     o Submit:              Click on "Go Ahead" to initialize the
                            clustering process.

   This dataset will be processed quickly and you should receive an
   e-mail upon project submission as well as upon project completion.
   Verify that all programs executed properly by reviewing the log
   sent upon project completion. The log contains details of each step
   in the pipeline, including parameter settings and any errors which
   may have occurred. If the entire pipeline and thus the project
   fails to complete, the log report is still e-mailed to the user.

   NOTE: MySQL has a default database and thus a default project
   called test. You may not have two projects with the same name.
   Hence you may not use "test" as a project name when running the
   test data.

2. StackPACK processing via command line:

   To test whether the programs are working correctly via the command
   line, please carry out the following instructions:

   * Create a stackPACK project with the following command:
        stack_ProjectManager -create altdata testing
   * Import the sequences from the input data file into stackPACK's
     database with the following command:
        stack_ImportFasta altdata /doc/alt_splice.seq GUESS
   * Mask the input data with the following command:
        stack_Mask altdata
   * Cluster the data with the following command:
        stack_Cluster altdata
   * Assemble the data with the following command:
        stack_Assemble altdata
   * Analyze the data with the following command:
        stack_Analysis altdata
   * Link the data with the following command:
        stack_Link altdata

   Verify that all programs executed properly by inspecting the output
   produced by each of the pipeline steps. The output contains details
   such as parameter settings and any errors which may have occurred.

3. If stackPACK fails to process the data, either via the web
   interface or command line, please confirm that:

   * all path locations in the /stackpack file are valid
   * all the programs in /bin and /bin.ext are executable for all
     users
   * the temp directory specified in STACKPACK_TMP under the
     [STACKPACK] heading in the directory is writable for all users.

4. If stackPACK still fails to process the data, please read through
   section VIII, Troubleshooting.

5. If you are unable to troubleshoot with these suggestions, please
   contact Electric Genetics at:

      e-mail : support@egenetics.com
      tel    : +27 (21) 959 3964
      fax    : +27 (21) 959 2512

6. If you have managed to test successfully, you may delete the
   stackpack, odbc.ini and odbcinst.ini files, which you had moved to
   a temporary location in step 2 of section III, Uninstalling
   Existing stackPACK Copies.

7. If you have managed to test successfully, you may delete the
   contents of the temporary directory that was created in step 1 of
   section III, Unpacking The Distribution.

VII. CUSTOMIZING YOUR stackPACK CONFIGURATION

1. The stackPACK software has a system-wide configuration file,
   located at /stackpack, from where most of the stackPACK
   configuration is done. This configuration file first lists the
   locations of the stackPACK system components, then the parameters
   of the stackPACK pipeline steps, and lastly the algorithm
   parameters. The system administrator should take a critical look at
   these variables when first installing stackPACK.

2. Moving the configuration files out of /etc:

   If you do not wish to install stackPACK configuration files in the
   /etc directory, you can install them in the stackPACK installation
   directory. Please carry out the following instructions:

   * Create a /etc directory
   * Move /etc/stackpack, /etc/odbc.ini and /etc/odbcinst.ini to this
     directory.
   * Set the environment variables STACKPACK_ROOT and ODBCSYSINI for
     both the webserver and each user. See Section V.1. for details on
     how this can be done.

3. Configuring the temp directory:

   StackPACK writes out temporary files to the location specified for
   STACKPACK_TMP under the [STACKPACK] heading in the stackpack
   configuration file. It is important to ensure that enough disk
   space is allocated to this directory, especially if there are a lot
   of users or if large datasets are processed. The temp directory can
   be set in /stackpack.

      STACKPACK_TMP=/server/stackpack/tmp

4. Changing the database settings:

   To change the username and password used by stackPACK to connect to
   the database, go to the [DATABASE] heading in /stackpack and edit
   the DSN_LOGIN and DSN_PASSWORD variables.

      [DATABASE]
      ODBCSYSINI=/etc
      DSN_NAME=stacksys
      DSN_LOGIN=stackpack        <--
      DSN_PASSWORD=stackpack     <--

5. Changing the SMTP server:

   To change the server used to send the e-mail, for example if
   stackPACK is not delivering a notification e-mail upon project
   completion, change the SMTP_SERVER value under the [WEBPIPE]
   heading in /stackpack to point to the correct machine.

      SMTP_SERVER=smtp.server.com

6. Configuring stackPACK for multi-processor machines:

   In /stackpack, under the [stack_Mask], [stack_Cluster] and
   [stack_Assemble] headings there is a variable "num_cpus". Set this
   variable to the number of processors you want to allocate to each
   task. Although the value of num_cpus can exceed the number of
   actual cpus on the system, this is not generally recommended.
   Furthermore, we recommend that the nature of typical projects and
   the available RAM are taken into consideration when setting this
   variable. Projects with long sequences and/or large clusters will
   require more RAM, and the system may run out of memory if too many
   cpus are set.

      [stack_Mask]
      ............
      num_cpus=8
      batch_size=250

      [stack_Cluster]
      num_cpus=8

      [stack_Assemble]
      num_cpus=8

7. Configuring stackPACK for the old (1996) version of phrap:

   In /stackpack under the [phrap] heading there is a variable called
   old_ace. Set this variable to 0 in order to use the older version
   of phrap.

      old_ace=0

8. Changing the machine used as a database server:

   To change the machine used as a database server, edit the Server
   variable in /odbc.ini

9. User configuration:

   StackPACK system configuration can be adjusted by the user with the
   creation of an individual configuration file placed in their home
   directory named ".stackpackrc". The easiest way to create the
   .stackpackrc file is to copy /stackpack to the user's home
   directory as .stackpackrc and further edit it.
   Each user can do this as follows:

      cp /stackpack ~/.stackpackrc
      vi ~/.stackpackrc

   Typical parameters that are adjusted by the user using .stackpackrc
   include: the repeat masking file, the number of processors for the
   masking, clustering and assembly steps and, for expert users,
   parameters for each of the programs called externally by stackPACK.
   Even though the most commonly used parameters for each of these
   programs are listed in the configuration file, any parameter can be
   set as a flag.

VIII. TROUBLESHOOTING
_____________________________________________________________________

1. Symptom: stack_Assemble suddenly stops generating contigs.

   When running from the command line, the log is as follows:

      No contig generated for cluster: 78
      No contig generated for cluster: 79
      No contig generated for cluster: 80

   When running from the web interface, the log is as follows:

      stack_Assemble finished
      Processed 120 clusters
      Total contigs generated: 53
      Total clusters that had multiple contigs: 3
      Total clusters that did not have a contig: 70

   NOTE: Only 50 out of 120 clusters generated contigs.

   Diagnosis: This is typically due to lack of disk space.

   Solution:

   * StackPACK writes out temporary files to the location specified
     for STACKPACK_TMP under the [STACKPACK] heading in the stackpack
     configuration file. These temporary files are deleted as the data
     is processed. If the processing pipeline is interrupted for some
     reason, the temporary files may fail to delete and must be
     deleted manually. It is important to inspect this location
     periodically for accumulated files, and to ensure that enough
     disk space is allocated.

   * If a very large project is being processed, the temporary files
     for that project may fill up the temporary directory. Using the
     .stackpackrc user configuration file, the user can select another
     temporary space for that particular project, or the system
     administrator may change the temporary location for the whole
     system.
     The temp directory can be set in STACKPACK_TMP under the
     [STACKPACK] heading in the /stackpack.

   * StackPACK interacts with the MySQL database regularly during data
     processing. The MySQL database may be configured to write out a
     log file to the default location of /usr/local/mysql/var/ . This
     log file records every interaction with MySQL and can grow
     extremely large. We recommend that this function is turned off to
     avoid disk space problems. The user may retain it for
     troubleshooting purposes, but must ensure that ample space is
     available.

   NOTE: Irregular data processing during other steps in the pipeline
   may also be attributed to lack of disk space.
_____________________________________________________________________

2. Symptom: The following message is displayed when attempting to use
   the web interface:

      The CORBA server (stackCORBAd) is failing to start up.

   Diagnosis: The CORBA server (stackCORBAd) fails to start up.
   StackCORBAd provides a language and platform independent interface
   for querying the stackPACK database.

   Solution:

   * The contains a log file. Please confirm that this file can be
     read and written by everyone on the system.

   * Please confirm that the ILUbinding directory is present in the
     location specified for ILU_BINDING under the [ILU] heading in the
     stackPACK configuration file, and that it can be read and written
     by everyone on the system.
_____________________________________________________________________

3. Symptom: Extraneous text is appended to WebPipe upon data
   submission.

   Diagnosis: This occurs when Netscape FastTrack is used as the web
   server.

   Solution: The integrity of the data and the stackPACK results are
   not affected. To avoid this, use Apache as the web server.
_____________________________________________________________________

4. Symptom: The following message may be displayed when requesting a
   project summary report:

      The document contained no data.
      Try again later, or contact the server's administrator

   Diagnosis: This may occur when summary reports for large datasets
   are requested. It is generally due to the connection between the
   browser and the server timing out.

   Solution: It can be prevented by increasing the Time Out Value on
   the web server.
_____________________________________________________________________

5. Symptom: The hierarchical navigational icons which represent the
   various cluster consensus and alignment views within WebProbe are
   misaligned.

   Diagnosis: This is due to incompatible font settings when running
   Netscape under Linux.

   Solution: It can be avoided by setting the Netscape variable width
   font to 14 in Edit: Preferences: Font.
_____________________________________________________________________
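
SUPPLEMENTARY EXAMPLE: CHECKING THE TEMP DIRECTORY

Several of the checks above (Section VI.3 and Troubleshooting item 1)
come down to the same questions: is the STACKPACK_TMP directory
writable, and does it have enough free disk space? The sketch below
shows one way these checks might be automated. It is not part of the
stackPACK distribution. It assumes that the stackpack configuration
file can be read as an INI-style file with a [STACKPACK] heading, as
the excerpts in Section VII suggest, and it uses a modern Python 3
interpreter rather than the Python 1.5.2 that stackPACK itself
requires. The function name check_tmp_space and its arguments are
hypothetical, and the factor of 5 is taken from the disk space
requirement in Section I.3.

```python
import configparser
import os
import shutil


def check_tmp_space(config_path, dataset_bytes, factor=5):
    """Check the STACKPACK_TMP directory named in a stackpack config file.

    Returns (tmp_dir, writable, free_bytes, enough), where `enough` is
    True when free space is at least `factor` times the dataset size.
    Assumption: the config file parses as INI with a [STACKPACK] section.
    """
    parser = configparser.ConfigParser(
        strict=False,          # tolerate repeated sections or keys
        interpolation=None,    # treat '%' in values literally
        allow_no_value=True,   # tolerate bare lines without '='
    )
    parser.optionxform = str   # keep key case, e.g. STACKPACK_TMP
    with open(config_path) as handle:
        parser.read_file(handle)

    tmp_dir = parser["STACKPACK"]["STACKPACK_TMP"]
    is_dir = os.path.isdir(tmp_dir)
    # Writable for the current user only; "all users" needs a
    # permission check on the directory mode instead.
    writable = is_dir and os.access(tmp_dir, os.W_OK)
    free_bytes = shutil.disk_usage(tmp_dir).free if is_dir else 0
    enough = free_bytes >= factor * dataset_bytes
    return tmp_dir, writable, free_bytes, enough
```

A caller would pass the location of the stackpack configuration file
(or of a user's .stackpackrc) and the size of the input data file in
bytes, for example os.path.getsize() of the sequence file, and then
inspect the returned values before starting a large project.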