Reading: Deitel 13.1-13.4; 13.6, 13.7
Advanced reading: man perlref (not the easiest of guides)
So far we have dealt with different kinds of variables: scalars, which are single-valued, and arrays and hashes, which are multiple-valued.
    @sequences = ( "GTTGATTGC", "CGCTTGNNNN", "ATATAGGATTCC" );
    push(@sequences, "ATGGCTGTTGCTAAT");

    %codons = (
        ATG => 'M',
        TCA => 'S', TCG => 'S', TCC => 'S', TCT => 'S',
        TTT => 'F', TTC => 'F',
        TTA => 'L', TTG => 'L'
    );
    print $codons{"ATG"};
Scalars can be numeric or they can be strings; scalars can also hold a reference. Think of a reference as a pointer to another variable.
Creating a reference to another variable
We use the backslash symbol (\) to create a reference to another variable.
program 4
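A minimal sketch of the idea, not the original listing: the variable names are hypothetical and chosen only for illustration. It takes a reference to a scalar with the backslash and then reads and writes the original variable through that reference.

```perl
use strict;
use warnings;

# Hypothetical variable names, used for illustration only.
my $sequence = "GTTGATTGC";
my $seq_ref  = \$sequence;      # the backslash creates a reference to $sequence

my $value = $$seq_ref;          # an extra $ dereferences the reference
print "$value\n";               # prints GTTGATTGC

# Assigning through the reference changes the original variable.
$$seq_ref = "ATGGCTGTT";
print "$sequence\n";            # prints ATGGCTGTT
```

Note that $seq_ref is itself an ordinary scalar; it just happens to hold a pointer rather than a number or a string.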
We can have references to any variable type:
program 5
    # ref_to_hash.pl
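Only the first line of this listing is shown, so the following is a sketch under assumptions rather than the original program. It takes references to a hash and to an array with the same backslash, and uses the built-in ref function to show what kind of thing each reference points to. The single lookup at the end uses the arrow notation covered later in these notes.

```perl
use strict;
use warnings;

my %codons     = ( ATG => 'M', TCA => 'S', TTT => 'F' );
my $codons_ref = \%codons;           # reference to the whole hash

my @sequences     = ( "GTTGATTGC", "CGCTTGNNNN" );
my $sequences_ref = \@sequences;     # reference to the whole array

my $hash_kind  = ref($codons_ref);
my $array_kind = ref($sequences_ref);
print "$hash_kind\n";                # prints HASH
print "$array_kind\n";               # prints ARRAY

my $amino_acid = $codons_ref->{ATG}; # look up a key through the reference
print "$amino_acid\n";               # prints M
```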
Initializing references to arrays
We use square brackets to compose a reference to an array.
program 6
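As a rough illustration (the gene names are borrowed from the later programs; the variable names are assumptions), square brackets build an array that has no name of its own and hand back a reference to it. The two-step version with a named array and a backslash gives a reference to an array with the same contents.

```perl
use strict;
use warnings;

# Square brackets: build an anonymous array and get a reference to it.
my $genes = [ "CDC1", "ACT1", "ORC1" ];

# The longer route: a named array, then a backslash.
my @gene_list = ( "CDC1", "ACT1", "ORC1" );
my $genes_too = \@gene_list;

my $kind  = ref($genes);
my $count = scalar( @{$genes} );     # number of elements in the referenced array
print "$kind\n";                     # prints ARRAY
print "$count\n";                    # prints 3
```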
We use curly brackets to compose a reference to a hash.
program 7
    $re_hash = {
        'Eco47III' => 'AGCGCT',
        'EcoNI'    => 'CCTNNNNNAGG',
        'EcoRI'    => 'GAATTC',
        'EcoRII'   => 'CCWGG',
        'HincII'   => 'GTYRAC',
        'HindII'   => 'GTYRAC',
        'HindIII'  => 'AAGCTT',
        'HinfI'    => 'GANTC'
    };
Indexing elements in an array reference
We use the arrow operator -> to index an element in an array reference. Think of it as an arrow pointing to the array that the reference variable refers to.
program 8
    $genes = [ "CDC1", "ACT1", "ORC1" ];
This will output:
    CDC1ACT1ORC1ORC1
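The print statements of program 8 are not shown above. One plausible body that produces the same output, assuming the last element is fetched with the index -1, might look like this:

```perl
use strict;
use warnings;

my $genes = [ "CDC1", "ACT1", "ORC1" ];

# Assumed body: arrow notation for each index, then -1 for the last element.
my $output = $genes->[0] . $genes->[1] . $genes->[2] . $genes->[-1];
print "$output\n";    # prints CDC1ACT1ORC1ORC1
```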
We can also dereference the array and access the scalar element like this:
program 9
    $genes = [ "CDC1", "ACT1", "ORC1" ];
This will output:
    CDC1ACT1ORC1ORC1
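Again the body of the listing is missing, so this is only a sketch of what dereferencing with a leading $ could look like. It assumes the last element is reached through $#{$genes}, the index of the final element of the referenced array.

```perl
use strict;
use warnings;

my $genes = [ "CDC1", "ACT1", "ORC1" ];

# $$genes[0] and ${$genes}[0] mean the same thing: element 0 of the
# array that $genes points to.
my $last_index = $#{$genes};    # 2 for a three-element array
my $output = $$genes[0] . ${$genes}[1] . $$genes[2] . $$genes[$last_index];
print "$output\n";              # prints CDC1ACT1ORC1ORC1
```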
Alternatively, we can turn an array reference into an array like this:
program 10
    $genes = [ "CDC1", "ACT1", "ORC1" ];
This will output:
    CDC1ACT1ORC1ORC1
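A sketch of the third approach, with an assumed array name @gene_array: the @{...} form dereferences the whole array at once. The result is a copy, so later changes to the copy do not touch the array behind the reference.

```perl
use strict;
use warnings;

my $genes = [ "CDC1", "ACT1", "ORC1" ];

# @{$genes} is the whole array behind the reference; assigning it copies it.
my @gene_array = @{$genes};

my $output = join( "", @gene_array ) . $gene_array[-1];
print "$output\n";                       # prints CDC1ACT1ORC1ORC1

# The copy is independent of the original.
push( @gene_array, "ACT1" );
my $original_count = scalar( @{$genes} );
my $copy_count     = scalar(@gene_array);
print "$original_count $copy_count\n";   # prints 3 4
```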
Similar rules apply to hash references too.
program 11
    # gene_hash_refence.pl
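The listing itself is cut off after its first line. As a hedged illustration of the "similar rules", here are the same three access styles applied to a shortened version of the restriction enzyme hash from program 7; the variable names other than $re_hash are assumptions.

```perl
use strict;
use warnings;

my $re_hash = { 'EcoRI' => 'GAATTC', 'HindIII' => 'AAGCTT', 'HinfI' => 'GANTC' };

# 1. Arrow notation, with curly brackets instead of square ones.
my $site1 = $re_hash->{'EcoRI'};
# 2. Dereference with a leading $.
my $site2 = $$re_hash{'HindIII'};
# 3. Turn the hash reference into a hash (this makes a copy).
my %enzymes = %{$re_hash};

print "$site1\n";    # prints GAATTC
print "$site2\n";    # prints AAGCTT

# Sorting the keys gives a predictable order; hashes have none of their own.
foreach my $name ( sort keys %enzymes ) {
    print "$name => $enzymes{$name}\n";
}
```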
Let's say we want to write a subroutine that multiplies each element of one array a by the corresponding element of another array b; i.e.
    ( a1 x b1, a2 x b2, ..., an x bn )
program 12
    #!/usr/local/bin/perl
    # test.pl
What is wrong with the program above? It turns out that it is impossible to write a subroutine that will work in the desired way if it is called in the manner above.
This is because a subroutine takes a single list of arguments; the subroutine call above is equivalent to
    @c = arraymult((1, 2, 3), (4, 5, 6));
which is equivalent to
    @c = arraymult(1, 2, 3, 4, 5, 6);
because the comma operator collapses the lists together.
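The flattening is easy to check for yourself. In this small sketch, count_args is a hypothetical helper that simply reports how many arguments arrived in @_:

```perl
use strict;
use warnings;

# Hypothetical helper: report how many arguments the subroutine received.
sub count_args {
    return scalar(@_);
}

my @a = ( 1, 2, 3 );
my @b = ( 4, 5, 6 );

my $from_arrays = count_args( @a, @b );
my $from_lists  = count_args( ( 1, 2, 3 ), ( 4, 5, 6 ) );

print "$from_arrays\n";    # prints 6
print "$from_lists\n";     # prints 6
```

Either way the subroutine sees one list of six scalars, with nothing to say where the first array ended and the second began.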
We can instead pass in references to the arrays. We create a reference to a variable using the backslash operator (just when you thought Perl couldn't get any more full of typographical symbols...)
    @c = arraymult(\@a, \@b);
Remember, references are just scalar variables that hold pointers. We are still passing the arraymult subroutine a list of two scalars.
That's how we call the subroutine. How do we actually write the subroutine? Well, we have to do the opposite: we have to dereference the references to get at the arrays they point to.
program 13
    sub arraymult {
        my ($listref1, $listref2) = @_;
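The rest of the listing is cut off here. One way the subroutine could be completed is sketched below; it keeps the opening lines shown above, assumes both arrays have the same length, and uses a loop variable and a @products array that are not in the original.

```perl
use strict;
use warnings;

sub arraymult {
    my ( $listref1, $listref2 ) = @_;
    my @products;

    # Assumption: both arrays are the same length, so the indices of the
    # first array are valid for the second as well.
    for my $i ( 0 .. $#{$listref1} ) {
        push( @products, $listref1->[$i] * $listref2->[$i] );
    }
    return @products;
}

my @a = ( 1, 2, 3 );
my @b = ( 4, 5, 6 );
my @c = arraymult( \@a, \@b );
print "@c\n";    # prints 4 10 18
```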
Altering the value of variables passed to a subroutine
program 14
    # swapvars.pl
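Only the file name of this program is shown. Going by the name and the heading, a sketch of the idea might be a subroutine that swaps two variables through references to them; the subroutine name swap and the variables $x and $y are assumptions.

```perl
use strict;
use warnings;

# Assumed subroutine: swap the values of the two scalars whose
# references are passed in.
sub swap {
    my ( $ref1, $ref2 ) = @_;
    my $tmp = $$ref1;
    $$ref1 = $$ref2;    # writes to the caller's first variable
    $$ref2 = $tmp;      # writes to the caller's second variable
    return;
}

my $x = "CDC1";
my $y = "ACT1";
swap( \$x, \$y );
print "$x $y\n";    # prints ACT1 CDC1
```

Because the subroutine holds pointers to the caller's variables rather than copies of their values, the assignments inside it are visible after it returns.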
program 15
    # annotations.pl
Let's try running this program:
    Please enter a gene symbol
    CDC1
    Annotations for CDC1 =
    metal ion homeostasis
    DNA replication
    DNA recombination
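The program itself is not reproduced here. A data structure that would give output like this is a hash whose values are array references, one list of annotations per gene symbol. The sketch below assumes that structure, assumes the output above lists three separate annotations, and uses a fixed symbol instead of prompting; annotations_for is a hypothetical helper.

```perl
use strict;
use warnings;

# Assumed structure: gene symbol => reference to an array of annotations.
my %annotations = (
    CDC1 => [ "metal ion homeostasis", "DNA replication", "DNA recombination" ],
);

# Hypothetical helper: return the list of annotations for a symbol,
# or an empty list if the symbol is unknown.
sub annotations_for {
    my ( $annot_ref, $symbol ) = @_;
    my $list_ref = $annot_ref->{$symbol};
    return () unless defined $list_ref;
    return @{$list_ref};
}

my $symbol = "CDC1";    # fixed here; the original prompts the user
my @found  = annotations_for( \%annotations, $symbol );

print "Annotations for $symbol =\n";
foreach my $annotation (@found) {
    print "$annotation\n";
}
```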
Rewrite the re_hash.pl program from Lecture 3 to use a hash reference.
Deitel exercise 13.5
OR
Create a program that reads in hexamer output, or the output from some other prediction program. Turn each line into a hash reference, with keys such as start, end, strand, score. Store the hash references in an array. After the program has done this, it should prompt the user for a range. The program will then loop through the array, finding all predictions in that range, and put these into another array. It should then iterate through this array, printing the results in GFF format.
Chris Mungall
cjm@fruitfly.org
Berkeley Drosophila Genome Project