PEPPLOT(+)

Table of Contents
FUNCTION
DESCRIPTION
EXAMPLE
OUTPUT
GARNIER OUTPUT FILE
CHOU AND FASMAN OUTPUT FILE
HYDROPHOBIC MOMENT OUTPUT FILE
INPUT FILES
RELATED PROGRAMS
RESTRICTIONS
CONSIDERATIONS
GRAPHICS
<CTRL>C
COLOR
COMMAND-LINE SUMMARY
ACKNOWLEDGEMENT
LOCAL DATA FILES
OPTIONAL PARAMETERS

FUNCTION

PepPlot plots measures of protein secondary structure and hydrophobicity in parallel panels of the same plot.

DESCRIPTION

PepPlot shows several common measures of protein secondary structure together on one coordinated plot. Most of the curves are the average, sum, or product of some residue-specific attribute within a window. In a few cases, the attribute is both specific to the residue and dependent on its position in the window. Throughout the plot, the blue curves are for beta-sheets and the red curves are for alpha-helices; black is used for turns and hydropathy. If your plotter does not have four colors, then dashed lines are for alpha-helix and solid lines are for beta-structures.

This document is only a description of what PepPlot does. You may want to read some of the articles cited below to help you interpret what the curves really mean.

There are ten different panels that can be plotted in any combination and in any order. In the descriptions below they are referred to from top to bottom as if you had plotted them all in the default order as in the example session and figure.

The Sequence

The first part of the plot shows the sequence itself. This panel is extremely crowded if you use a density of more than 100 residues per page.

The Residue Schematic

The second part of the plot shows a schematic representation of the sequence. Each residue is represented by a line at the position where it occurs in the sequence. The lengths and colors of the lines are used to indicate chemically similar groups of amino acids as follows.

Color        Category

Green        hydrophilic, charged
                 down = acidic
                 up   = basic

Red          hydrophilic, uncharged
                 short = amides
                 long  = alcohols

Blue         hydrophobic
                 short = aliphatic
                 long  = aromatic

Black        Proline

Unmarked     Alanine, Glycine, Cysteine

Chou and Fasman Beta-Sheet Forming and Breaking Residues

The third panel is a display of the residues that are beta-sheet forming and breaking as defined by Chou and Fasman (Adv. Enz. 47; 45-147 (1978)). To nucleate beta-structures, there should be at least three beta-forming residues and not more than one breaking residue within a window of five.

Chou and Fasman Alpha and Beta Propensities

The fourth panel of the plot shows the Chou and Fasman (1978 cited above) propensity measures for alpha-helix and beta-sheet. As each curve rises past the threshold for its color, it satisfies one criterion for propagation of an alpha-helix or beta-sheet structure. If the curves for alpha and beta propagation drop below the black threshold (the 1.00 level) and if there is at least one breaking residue in four, then the structure may terminate. Both curves are the average of a residue-specific attribute over a window of four.
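
As a rough illustration of the window average described above, here is a short Python sketch. The alpha values for M, E, V, and Q are the "Alpha Stat" values that appear in the sample Chou and Fasman output file in this entry; the peptide, the function name, and the choice to report one value per window start are assumptions made for illustration and are not taken from PepPlot's own code.

```python
# Alpha-helix values copied from the "Alpha Stat" column of the sample
# Chou and Fasman output file (only four residues are listed here).
ALPHA_STAT = {"M": 1.45, "E": 1.51, "V": 1.06, "Q": 1.11}


def window_average(seq, table, window=4):
    """Return the mean of table[residue] over each full window of seq."""
    values = [table[res] for res in seq]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical peptide built only from the residues tabulated above.
averages = window_average("MEEVQ", ALPHA_STAT)

# Compare each window average with the 1.00 breaking level.
above_breaking_level = [avg > 1.00 for avg in averages]
```

A five-residue peptide gives two full windows of four, so averages holds two values.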

Chou and Fasman Alpha-Helix Forming and Breaking Residues

The fifth panel shows the residues that are alpha-helix forming and breaking, as defined by Chou and Fasman (1978 cited above). For alpha-helices to nucleate, there should be four or more alpha-forming residues and not more than one breaking residue within six residues.
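
The two nucleation rules (alpha: four or more formers and at most one breaker within six; beta: three or more formers and at most one breaker within five) can be sketched in Python as follows. The one-letter encoding of formers and breakers, the function name, and the example string are assumptions for illustration; which residues count as formers or breakers must come from the Chou and Fasman tables and is not reproduced here.

```python
def nucleation_windows(labels, window=6, min_formers=4, max_breakers=1):
    """Return start indexes of windows that satisfy a nucleation rule.

    labels holds one character per residue (assumed encoding):
    'f' = former, 'b' = breaker, '-' = neither.
    """
    hits = []
    for start in range(len(labels) - window + 1):
        chunk = labels[start:start + window]
        enough_formers = chunk.count("f") >= min_formers
        few_breakers = chunk.count("b") <= max_breakers
        if enough_formers and few_breakers:
            hits.append(start)
    return hits


# Hypothetical former/breaker pattern for a ten-residue stretch.
pattern = "ff-fbf--b-"

# Alpha rule: window of six, at least four formers, at most one breaker.
alpha_hits = nucleation_windows(pattern)

# Beta rule: window of five, at least three formers, at most one breaker.
beta_hits = nucleation_windows(pattern, window=5, min_formers=3)
```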

Chou and Fasman Amino Ends

The sixth panel shows regions of the sequence that resemble sequences typically found at the amino end of alpha-helices and beta-structures (Chou and Fasman, 1978 cited above). The curves plot the probabilities for a window of six that the first three residues in the window precede the end of the structure and the last three residues are within the structure. There are two different residue-specific attributes used, one for each half of the product.

Chou and Fasman Carboxyl Ends

The seventh panel shows regions of the sequence typically found at the carboxyl end of alpha-helices and beta-structures (Chou and Fasman, 1978 cited above). The two curves show the probability for a window of six that the first three residues in the window are within the structure and the last three residues are outside the structure. Two different residue-specific attributes are used, one for each half of the product.

Chou and Fasman Turns

The eighth panel shows regions of the sequence typically found in turns (Chou and Fasman, 1978 cited above). The curve is the product of a residue-specific, position-dependent attribute (probability) multiplied across a window of four. The calculated values are multiplied by 10,000 for plotting.
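
The turn calculation can be sketched in Python as a product of one position-specific probability per residue in a window of four, scaled by 10,000. The table values below are placeholders chosen only to make the arithmetic easy to follow; they are not the published Chou and Fasman probabilities, and the function name is an assumption.

```python
from math import prod


def turn_score(window_residues, position_tables, scale=10_000):
    """Multiply one position-specific probability per residue, then scale."""
    if len(window_residues) != len(position_tables):
        raise ValueError("need one probability table per window position")
    return scale * prod(
        table[res] for res, table in zip(window_residues, position_tables)
    )


# Placeholder probabilities, NOT the published Chou and Fasman values.
# The same table is reused for all four positions only to keep this short;
# the real attribute differs from position to position.
placeholder_tables = [{"P": 0.1, "G": 0.2}] * 4

score = turn_score("PGGP", placeholder_tables)  # 0.1 * 0.2 * 0.2 * 0.1 * 10,000
```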

Hydrophobic Moment

The ninth panel shows the helical hydrophobic moment at each position of the sequence. These curves rise when the molecule forms either an alpha-helix or a beta-sheet at the interface between the solvent and the interior of the molecule. Said another way, the moment statistic is the probability that the sequence at each position is amphiphilic, that is, it appears to have hydrophobic residues on one side and hydrophilic residues on the other. The hydrophobic moment is calculated as described by Eisenberg et al. (Proc. Natl. Acad. Sci. USA 81; 140-144 (1984)), except that we have normalized the hydrophobic moment for the local hydrophobicity of the amino acids in the window where the moment is being determined. This makes the method equivalent to that described by Finer-Moore and Stroud (Proc. Natl. Acad. Sci. USA, 81; 155-159 (1984)).

In a typical alpha-helix, each residue is oriented about 100 degrees from the preceding residue. The alpha moment that we plot in this panel is the maximum for all inter-residue angles between 95 and 105 degrees. The alpha moment curve is calculated for a window of eight residues.

Typical beta-strands have 160 degrees of rotation between adjacent residues. The beta hydrophobic moment curve is the maximum for all inter-residue angles between 159 and 161 degrees, calculated over a window of six residues.
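
Here is a minimal Python sketch of the moment calculation for one window, using the form given by Eisenberg et al.: the length of the vector sum of the residue hydrophobicities, each rotated by the inter-residue angle. The whole-degree angle steps, the function names, and the window values are assumptions for illustration, and the normalization for local hydrophobicity mentioned above is deliberately left out because its exact form is not given in this entry.

```python
import math


def hydrophobic_moment(values, angle_deg):
    """Unnormalized moment of one window at one inter-residue angle."""
    delta = math.radians(angle_deg)
    sin_sum = sum(h * math.sin(n * delta) for n, h in enumerate(values))
    cos_sum = sum(h * math.cos(n * delta) for n, h in enumerate(values))
    return math.hypot(sin_sum, cos_sum)


def best_moment(values, low_deg, high_deg):
    """Return (angle, moment) for the largest moment over whole-degree angles."""
    return max(
        ((angle, hydrophobic_moment(values, angle))
         for angle in range(low_deg, high_deg + 1)),
        key=lambda pair: pair[1],
    )


# Hypothetical hydrophobicity values for a window of eight residues.
window = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0]

# Alpha moment: maximum over inter-residue angles from 95 to 105 degrees.
alpha_angle, alpha_moment = best_moment(window, 95, 105)
```

For the beta moment, the same function would be called with a window of six values and the angles 159 to 161.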

Moment is a tool that makes a continuous contour plot of the helical hydrophobic moment for rotation angles between 0 and 180 degrees per residue.

Kyte and Doolittle Hydropathy

The tenth panel has two curves based on the average hydrophobicity. The black curve is the Kyte and Doolittle hydropathy measure (J. Mol. Biol. 157; 105-132 (1982)). This curve is the average of a residue-specific hydrophobicity index over a window of nine residues. When the line is in the upper half of the frame, it indicates a hydrophobic region, and when it is in the lower half, a hydrophilic region. You can set the Kyte-Doolittle window to a number other than nine from the command line using the -HWINdow=n parameter.
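
The hydropathy average can be sketched in Python as shown here. The index values are those of the published Kyte and Doolittle scale; PepPlot reads its own values from pepplot.dat, which may differ. The function name, the example peptide, and the choice to return one value per window start are assumptions for illustration.

```python
# Kyte and Doolittle hydropathy index (J. Mol. Biol. 157; 105-132 (1982)).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy(seq, window=9):
    """Return the mean hydropathy index of each full window of seq."""
    values = [KD[res] for res in seq]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical 15-residue peptide; positive means are hydrophobic windows,
# negative means are hydrophilic windows.
profile = hydropathy("MKTLLVIAGLLDEEK")
```

Passing a different window argument corresponds to setting -HWINdow=n on the command line.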

Goldman, Engelman, and Steitz Transbilayer Helices

The green curve in the tenth panel is the Goldman, Engelman, and Steitz (GES) curve for identifying nonpolar transbilayer helices (reviewed in Ann. Rev. Biophys. Biophys. Chem. 15; 321-353 (1986)). The curve is the average of a residue-specific hydrophobicity scale (the GES scale) over a window of 20 residues. When the line is in the upper half of the frame, it indicates a hydrophobic region and when it is in the lower half, a hydrophilic region. You can suppress the GES curve in this panel with the command-line parameter -NOGES. You can set the GES window to a number other than 20 with the command-line parameter -GESWindow=n.

Garnier Predictions Can Be Written Into a File

Secondary structure prediction using the method of Garnier et al. (J. Mol. Biol. 120; 97-120 (1978)) can also be calculated by PepPlot and written into a file (see the command-line parameter -GARnierfile).

EXAMPLE

Here is a session using PepPlot to plot the secondary structure measures for adenylate kinase (PIR:Kihua). This session with PepPlot also writes the Garnier predictions, Chou and Fasman values, and helical hydrophobic moment values to separate output files.


% pepplot -GARnier -CFFile -MOMentfile

  PEPPLOT of what protein sequence ?  PIR:Kihua

                      Begin (* 1 *) ?
                    End (*   194 *) ?  100

  The minimum density for a one-page plot is 87.0 residues/100 platen units.
  What density do you want (* 87 *) ?

 What Panels do you want to plot?

     a) Sequence
     b) Charged-polar-hydrophobic residue schematic
     c) Beta forming-breaking symbols
     d) Chou-Fasman Alpha-Beta prediction curves
     e) Alpha forming-breaking symbols
     f) Chou-Fasman NH2-end prediction curves
     g) Chou-Fasman CO2-end prediction curves
     h) Chou-Fasman Turn    prediction curve
     i) Helical Hydrophobic Moment for Alpha and Beta
     j) Hydropathy and Hydrophilicity

  Please choose one or more (* ABCDEFGHIJ *):

  When your LaserWriter attached to tty07 is ready, press <Return>.

%

OUTPUT

Here are parts of the text output files. If you are reading the Program Manual, you can see the plot from this session in the figure at the end of this program entry.

GARNIER OUTPUT FILE

Secondary structure prediction using the method of Garnier et al. (J. Mol. Biol. 120; 97-120 (1978)) is also performed by PepPlot when the program is run with the command-line parameter -GARnierfile. The Garnier method calculates a statistic for alpha-helix, beta-sheet, turns, and random coil structures using position-dependent, residue-specific information within a window of 17. The predicted structure for the residue in the center of the window is whichever statistic is largest.

Output File Structure

The results of the Garnier prediction are written into a file. The file shows different predictions for several different combinations of decision constants. The predicted structure is represented with an A for alpha, B for beta, C for random coil, and T for turn. Question marks indicate that two or more structures are equally probable.
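
The final decision step can be sketched in Python: pick the structure whose statistic is largest, and report a question mark when two or more are tied. The scores below are hypothetical, the function name is an assumption, and the calculation of the four statistics themselves (and any decision constants) is not shown.

```python
def predict_state(scores):
    """Return the letter with the largest score, or '?' if the top is tied."""
    best = max(scores.values())
    winners = [state for state, value in scores.items() if value == best]
    return winners[0] if len(winners) == 1 else "?"


# Hypothetical statistics for one residue: A = alpha, B = beta,
# C = random coil, T = turn.
clear_call = predict_state({"A": 120, "B": 45, "C": -10, "T": 5})
tied_call = predict_state({"A": 60, "B": 60, "C": -10, "T": 5})
```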

Decision Constants From Physical Measurements

If you have physical data on the proportion of the protein's secondary structure that is alpha-helix and beta-strand, decision constants ("fudge factors") can be used to bias the Garnier predictions.

Decision Constants Without Physical Measurements

If you have no physical measurement of the percentage of alpha-helix and beta-strand in your protein, Garnier recommends using the percent alpha and beta with no decision constants. These two percentages appear at the top of the file.

Here is part of the Garnier output file, kihua.gar:


PEPPLOT (Garnier prediction) of: Kihua check: 1665 from: 1 to: 100

                           October 2, 1996 16:11

adenylate kinase (EC 2.7.4.3) 1 - human
N;Alternate names: myokinase
C;Species: Homo sapiens (man)
C;Date: #sequence_revision 23-Oct-1981 #text_change 16-Feb-1996
C;Accession: A33508; A00679
R;Matsuura, S.; Igarashi, M.; Tanizawa, Y.; Yamada, M.; Kishi, F.; Kajii, T.;
 Fujii, H.; Miwa, S.; Sakurai, M.; Nakazawa, A

Structural composition for no decision constant: alpha = 32.0%  beta = 21.0%

%Alpha  No DC    <20    <20    <20  20-50  20-50  20-50    >50    >50
% Beta  No DC    <20  20-50    >50    <20  20-50    >50    <20  20-50
   Pos ----------------------------------------------------------- ..
     9      A      B      B      B      A      A      A      A      A
    10      B      B      B      B      B      B      B      B      B
    11      B      B      B      B      B      B      B      B      B

/////////////////////////////////////////////////////////////////////

    90      B      B      B      B      B      B      B      B      B
    91      B      B      B      B      B      B      B      B      B
    92      B      B      B      B      B      B      B      B      B

CHOU AND FASMAN OUTPUT FILE

With the command-line parameter -CFFile, PepPlot writes a file with the Chou and Fasman (1978, cited above) values for every position in the sequence written out as a table of numbers. Here is part of the Chou and Fasman output file, kihua.cho:


PEPPLOT (Chou/Fasman) of: Kihua check: 1665 from: 1 to: 100

                           October 2, 1996 16:11

adenylate kinase (EC 2.7.4.3) 1 - human
N;Alternate names: myokinase
C;Species: Homo sapiens (man)
C;Date: #sequence_revision 23-Oct-1981 #text_change 16-Feb-1996
C;Accession: A33508; A00679
R;Matsuura, S.; Igarashi, M.; Tanizawa, Y.; Yamada, M.; Kishi, F.; Kajii, T.;
 Fujii, H.; Miwa, S.; Sakurai, M.; Nakazawa, A

          Alpha  Alpha   Beta   Beta      Alpha         Beta
 Pos Res   Stat   Ave    Stat    Ave   NH2    COOH   NH2    COOH   Turn  HPhob
   ------------------------------------------------------------------------ ..
   1   M   1.45   1.41   1.05   0.63   0.23   4.50   0.57   0.11   0.30  -1.96
   2   E   1.51   1.35   0.37   0.69   0.28   5.25   1.20   0.09   0.17  -1.67
   3   E   1.51   1.26   0.37   0.79   0.42   4.90   0.54   0.40   0.22  -0.78

//////////////////////////////////////////////////////////////////////////////

  98   E   1.51   1.20   0.37   1.07   1.32   1.28   0.15   0.70   0.10   0.00
  99   V   1.06   0.96   1.70   1.16   1.91   1.06   0.04   0.92   0.34   0.00
 100   Q   1.11   1.08   1.10   0.83   0.00   0.00   0.00   0.00   0.88   0.00

HYDROPHOBIC MOMENT OUTPUT FILE

With the command-line parameter -MOMentfile, PepPlot writes a file with the helical hydrophobic moment values for every position in the sequence written out as a table of numbers. Here is part of the moment output file, kihua.mom:


PEPPLOT (Hydrophobic Moment) of: Kihua check: 1665 from: 1 to: 100

                           October 2, 1996 16:11

adenylate kinase (EC 2.7.4.3) 1 - human
N;Alternate names: myokinase
C;Species: Homo sapiens (man)
C;Date: #sequence_revision 23-Oct-1981 #text_change 16-Feb-1996
C;Accession: A33508; A00679
R;Matsuura, S.; Igarashi, M.; Tanizawa, Y.; Yamada, M.; Kishi, F.; Kajii, T.;
 Fujii, H.; Miwa, S.; Sakurai, M.; Nakazawa, A

  Pos Res    Alpha    Alpha     Beta     Beta
             Angle    Value    Angle    Value
    ----------------------------------------- ..
    1   M     95.0     0.52    161.0     0.46
    2   E     95.0     0.32    161.0     0.29
    3   E     95.0     0.28    159.0     0.21

/////////////////////////////////////////////

   98   E    105.0     0.42    159.0     0.37
   99   V      0.0     0.00    161.0     0.24
  100   Q      0.0     0.00      0.0     0.00

INPUT FILES

PepPlot accepts a single protein sequence as input. If PepPlot rejects your protein sequence, turn to Appendix VI to see how to change or set the type of a sequence.

RELATED PROGRAMS

PeptideStructure and PlotStructure were sent to us by Dr. Berthold Foertsch of the Max Planck Institute of Munich. Used together, these two programs let you see a graphics representation of the best choice Chou-Fasman or Garnier prediction with hydrophobicity or antigenic index superimposed.

Moment makes a contour plot of the helical hydrophobic moment for all rotation angles between 0 and 180 degrees per residue (Eisenberg, 1984, and Finer-Moore and Stroud, 1984, cited above). HelicalWheel plots a peptide sequence as a helical wheel to help you recognize amphiphilic regions.

RESTRICTIONS

The residue-specific attributes for all of the measurements in PepPlot are only defined for the standard alphabet of protein sequence characters, including B, X, Z, and * (see Appendix III). Sequences containing any other symbols, such as the gap symbols period (.) and tilde (~), are not suitable as input for PepPlot.

CONSIDERATIONS

You should realize that secondary structure predictions are not very reliable, especially for proteins that are not soluble or globular.

Plots with more than about 250 residues per 100 platen units may be too compressed to be useful for structure prediction, although they may be useful for comparing two protein sequences for structurally similar regions. When multiple-page plots are made, successive pages are overlapped by one residue so that the plots can be spliced together. The curves stop one-half window width from the ends of the sequence.

GRAPHICS

The Wisconsin Package must be configured for graphics before you run any program with graphics output! If the % setplot command is available in your installation, this is the easiest way to establish your graphics configuration, but you can also use commands like % postscript that correspond to the graphics languages the Wisconsin Package supports. See Chapter 5, Using Graphics in the User's Guide for more information about configuring your process for graphics.

<CTRL>C

If you need to stop this program, use <Ctrl>C to reset your terminal and session as gracefully as possible. Searches and comparisons write out the results from the part of the search that is complete when you use <Ctrl>C. The graphics device should stop plotting the current page and start plotting the next page. If the current page is the last page, plotters should put the pen away and graphic terminals should return to interactive mode.

COLOR

PepPlot uses dashed lines when four-color plotting is not available. Alpha curves are red when color is available and dashed in black and white. Beta curves are blue in color and solid in black and white. In the hydrophilicity panel, the GES curve is green if color is available and dashed otherwise. In the residue schematic, hydrophilic and charged residues are red and green in the color plot and dashed in black and white. Hydrophobic residues are blue in color and solid in black and white.

There are three threshold lines across the Chou-Fasman panel (panel D). From top to bottom, these lines are as follows: the blue line is the threshold for the beginning of a beta-sheet; the red line is the threshold for the beginning of an alpha-helix; and the black line is the breaking line below which either kind of structure is no longer predicted. In black and white these lines are solid, short dashed, and long dashed, respectively.

COMMAND-LINE SUMMARY

All parameters for this program may be put on the command line. Use the parameter -CHEck to see the summary below and to have a chance to add things to the command line before the program executes. In the summary below, the capitalized letters in the parameter names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.
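
The abbreviation rule can be sketched in Python. This is only one reading of the rule stated above, assuming that the required letters are the leading capitalized part of the name, that anything typed beyond them must still spell out the full name, and that case is ignored; the function names are hypothetical and parameter values such as =kihua.gar are not handled.

```python
def required_prefix(spec):
    """Return the leading part of spec before its first lowercase letter."""
    for index, char in enumerate(spec):
        if char.islower():
            return spec[:index]
    return spec


def matches(typed, spec):
    """True if typed is an acceptable abbreviation of the name in spec."""
    typed_up = typed.upper()
    spec_up = spec.upper()
    has_required = typed_up.startswith(required_prefix(spec).upper())
    spells_name = spec_up.startswith(typed_up)
    return has_required and spells_name


# Names as they are written in the summary for this program.
shortest_ok = matches("-GAR", "-GARnierfile")
longer_ok = matches("-GARnier", "-GARnierfile")
too_short = matches("-GA", "-GARnierfile")
```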


Minimal Syntax: % pepplot PIR:Kihua -Default

Prompted Parameters:

-BEGin=1 -END=100 the range of interest
-DENsity=87      density in residues per 100 platen units
-MENu=a          sequence display
      b          charged-polar-hydrophobic residue cartoon
      c          beta forming-breaking symbols
      d          Chou-Fasman alpha-beta prediction curves
      e          alpha forming-breaking symbols
      f          Chou-Fasman NH2-Ends prediction curves
      g          Chou-Fasman CO2-Ends prediction curves
      h          Chou-Fasman turn     prediction curve
      i          helical hydrophobic moment for alpha and beta
      j          hydropathy and hydrophilicity

Local Data Files:

-DATa1=pepplot.dat        amino acid attributes except for Garnier
-DATa2=ges.dat            hydrophobicities for the GES curve
-DATa3=garnier.dat        amino acid attributes for Garnier

Optional Parameters:

-CFFile[=kihua.cho]       writes out the Chou and Fasman predictions
-GARnierfile[=kihua.gar]  writes out the Garnier predictions
-MOMentfile[=kihua.mom]   writes out the Hydrophobic moment values
-NOPLOt                   suppresses the whole plot
-HWINdow=9                sets the window for hydropathy averaging
-NOGES                    suppresses the GES curve (default)
-GESWindow=20             sets the window for GES scale averaging
-SHOwseq                  insists on showing the sequence in panel 1
-BOXES                    draws a box around each quantitative panel
-NOTITle                  suppresses the plot's title

All GCG graphics programs accept these and other switches. See the Using
Graphics chapter of the USERS GUIDE for descriptions.

-FIGure[=FileName]  stores plot in a file for later input to FIGURE
-FONT=3             draws all text on the plot using font 3
-COLor=1            draws entire plot with pen in stall 1
-SCAle=1.2          enlarges the plot by 20 percent (zoom in)
-XPAN=10.0          moves plot to the right 10 platen units (pan right)
-YPAN=10.0          moves plot up 10 platen units (pan up)
-PORtrait           rotates plot 90 degrees

ACKNOWLEDGEMENT

PepPlot was written by Drs. Michael Gribskov and John Devereux of the Genetics Computer Group. It was first described in Nucl. Acids Res. 14(1); 327-334 (1986). The original code was revised by John Devereux to support command-line control for Version 5 and to support plotting the panels independently for Version 6.

LOCAL DATA FILES

The files described below supply auxiliary data to this program. The program automatically reads them from a public data directory unless you either 1) have a data file with exactly the same name in your current working directory; or 2) name a file on the command line with an expression like -DATa1=myfile.dat. For more information see Chapter 4, Using Data Files in the User's Guide.

PepPlot reads three different data files to find the residue-specific attributes: pepplot.dat, which contains Chou-Fasman and hydropathy values; ges.dat, which contains the GES scale; and garnier.dat, which contains the Garnier measures.
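
The search order described above can be sketched in Python: a file named on the command line wins, then a file of the same name in the current working directory, then the public data directory. The function name, its arguments, and the directory paths are hypothetical; the real location of the public data directory depends on your installation.

```python
from pathlib import Path


def resolve_data_file(name, public_dir, override=None, cwd=None):
    """Return the path of the data file that would be read."""
    # 1) A file named on the command line, e.g. -DATa1=myfile.dat.
    if override is not None:
        return Path(override)
    # 2) A file with exactly the same name in the working directory.
    local = Path(cwd) / name if cwd is not None else Path.cwd() / name
    if local.is_file():
        return local
    # 3) Otherwise fall back to the public data directory.
    return Path(public_dir) / name


# Hypothetical directories, used only to show the fallback order.
named = resolve_data_file("pepplot.dat", "/public/data", override="myfile.dat")
fallback = resolve_data_file(
    "pepplot.dat", "/public/data", cwd="/no/such/working/directory"
)
```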

OPTIONAL PARAMETERS

The parameters listed below can be set from the command line. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.

-CFFile=kihua.cho

causes PepPlot to write an output file with the Chou and Fasman predictions for the sequence. The filename is the sequence name plus the filename extension .cho, unless you set it to something else.

-GARnierfile=kihua.gar

causes PepPlot to write an output file with the Garnier (1978, cited above) predictions for the sequence. The filename is the sequence name plus the filename extension .gar, unless you set it to something else.

-MOMentfile=kihua.mom

causes PepPlot to write an output file with the helical hydrophobic moments for the sequence. The filename is the sequence name plus the filename extension .mom, unless you set it to something else.

-HWINdow=9

sets the window for the Kyte and Doolittle hydropathy curve to some number other than nine. The hydropathy window must be between 1 and 50.

-GESWindow=20

sets the window for the Goldman, Engelman, and Steitz hydropathy curve to some number other than 20. The GES window must be between 1 and 50.

-NOPLOt

suppresses the plot.

-NOTITle

suppresses the plot's title.

-BOXES

draws a box around each quantitative panel (the ones with the tick marks).

-NOGES

suppresses the Goldman, Engelman, and Steitz curve in the hydropathy (tenth) panel.

-SHOwseq

The sequence in the top panel is normally suppressed if it seems too crowded. Use this parameter to insist that it be plotted no matter how crowded it seems.

The parameters below apply to all GCG graphics programs. These and many others are described in detail in Chapter 5, Using Graphics of the User's Guide.

-FIGure=programname.figure

writes the plot as a text file of plotting instructions suitable for input to the Figure program instead of drawing the plot on your plotter.

-FONT=3

draws all text characters on the plot using Font 3 (see Appendix I).

-COLor=1

draws the entire plot with the pen in stall 1.

The parameters below let you expand or reduce the plot (zoom), move it in either direction (pan), or rotate it 90 degrees (rotate).

-SCAle=1.2

expands the plot by 20 percent by resetting the scaling factor (normally 1.0) to 1.2 (zoom in). You can expand the axes independently with -XSCAle and -YSCAle. Numbers less than 1.0 contract the plot (zoom out).

-XPAN=30.0

moves the plot to the right by 30 platen units (pan right).

-YPAN=30.0

moves the plot up by 30 platen units (pan up).

-PORtrait

rotates the plot 90 degrees. Usually, plots are displayed with the horizontal axis longer than the vertical (landscape). Note that plots are reduced or enlarged, depending on the platen size, to fill the page.

-NOCLIpping

If the data points on a line fall outside of the window in which the data are supposed to be represented, most programs will clip the graph at the edge of the window. This switch disables that clipping.

Printed: November 18, 1996 13:07 (1162)


Documentation Comments: doc-comments@gcg.com
Technical Support: help@gcg.com

Copyright (c) 1982, 1983, 1985, 1986, 1987, 1989, 1991, 1994, 1995, 1996, 1997 Genetics Computer Group, Inc. a wholly owned subsidiary of Oxford Molecular Group, Inc. All rights reserved.

Licenses and Trademarks Wisconsin Package is a trademark of Genetics Computer Group, Inc. GCG and the GCG logo are registered trademarks of Genetics Computer Group, Inc.

All other product names mentioned in this documentation may be trademarks, and if so, are trademarks or registered trademarks of their respective holders and are used in this documentation for identification purposes only.

Genetics Computer Group

www.gcg.com