1.2 Why Perl Modules?
Building a medium- to large-sized program usually requires you to divide the work into several
smaller, more manageable, interacting pieces. (A rule of thumb is that each "piece"
should be about one or two printed pages in length, but this is just a general guideline.) An
analogy can be made to building a microarray machine, which requires that you construct
separate interacting pieces such as housing, temperature sensors and controls, robot arms to
position the pipettes, hydraulic injection devices, and computer guidance for all these
systems.
1.2.1 Subroutines and Software Engineering
Subroutines divide a large programming job into more manageable pieces. Modern
programming languages all provide subroutines, which other languages may call functions,
procedures, or methods (coroutines and macros are related but distinct constructs).
A subroutine lets you write a piece of code that performs some part of a desired computation
(e.g., determining the length of a DNA sequence). This code is written once and can then be
called frequently throughout the main program. Using subroutines shortens the time it takes to
write the main program, makes it more reliable by avoiding duplicated sections (which can get
out of sync and make the program longer), and makes the entire program easier to test. A
useful subroutine can be used by other programs as well, saving you development time in the
future. As long as the inputs and outputs to the subroutine remain the same, its internal
workings can be altered and improved without worrying about how the changes will affect the
rest of the program. This is known as encapsulation.
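To make encapsulation concrete, here is a minimal sketch. The subroutine names dna_length_v1 and dna_length_v2 are hypothetical, chosen only for this illustration. Both take a DNA string and return its length, so a caller that depends only on that input and output cannot tell which internal approach is in use:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the text): returns the number of bases
# in a DNA string by splitting it into individual characters.
sub dna_length_v1 {
    my ($dna) = @_;
    my @bases = split //, $dna;
    return scalar @bases;
}

# Same input and output as above, but the internals now rely on
# Perl's built-in length function.
sub dna_length_v2 {
    my ($dna) = @_;
    return length $dna;
}

my $dna = 'ACCGGAGTTGACTCTCCGAATA';

# The calling code is identical apart from the name, and both calls
# print the same value.
print dna_length_v1($dna), "\n";
print dna_length_v2($dna), "\n";
```

In a real program you would keep a single name and simply replace the body; the two names appear here only so that both versions can sit side by side.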
The benefits of subroutines that I've just outlined also apply to other approaches in software
engineering. Perl modules are one technique under a larger umbrella of techniques known as
software encapsulation and reuse. Software encapsulation and reuse are fundamental to
object-oriented programming.
A related design principle is abstraction, which involves writing code that is usable in many
different situations. Let's say you write a subroutine that adds the fragment TTTTT to the end
of a string of DNA. If you then want to add the fragment AAAAA to the end of a string of DNA,
you have to write another subroutine. To avoid writing two subroutines, you can write one
that's more abstract and adds to the end of a string of DNA whatever fragment you give it as
an argument. Using the principle of abstraction, you've saved yourself half the work.
Here is an example of a Perl subroutine that takes two strings of DNA as inputs and returns
the second one appended to the end of the first:
sub DNAappend {
    my ($dna, $tail) = @_;
    return $dna . $tail;
}
This subroutine can be used as follows:
my $dna = 'ACCGGAGTTGACTCTCCGAATA';
my $polyT = 'TTTTTTTT';
print DNAappend($dna, $polyT);
If you wish, you can also define subroutines polyT and polyA like so:
sub polyT {
    my ($dna) = @_;
    return DNAappend($dna, 'TTTTTTTT');
}