Perl Basics and Biostatistics

Simple Comparisons

In this lesson the student will learn how to:
  1. Use the greater than and less than operators to perform simple comparisons
  2. Write comments in a Perl script
  3. Differentiate between the bang line and a comment line
  4. Understand the importance of comparison in science
By the end of this lesson the student will be able to:

  Write a simple Perl script which determines 
  the largest of five numbers.

Science and Comparing Things

You do a lot of comparing in your daily life. At the store, for instance, you might compare the price and quality of two very similar products. Scientists also do a lot of comparing. One of the most basic types of scientific inquiry is to compare two groups. For instance, scientists many decades ago noticed that yellow mice tended to be more obese than white mice. To document this they took a large group of yellow mice and calculated their mean weight, and then did the same for a large group of white mice.

This basic observation is interesting, but it raises the question: Why do the yellow mice tend to be more obese than other mice? As it turns out, mice carry a gene called agouti on their second chromosome. This gene is most observably associated with coat color. There are many different forms of this gene (and many corresponding coat colors, which get expressed depending on which form of the gene an individual mouse happens to have inherited from its parents). Interestingly, one form of the gene, called viable yellow, is also associated with obesity in mice.

(NOTE: Many genes influence obesity and coat color, but the agouti gene was one of the first to be associated with both. Also, genes come in pairs: one inherited from the mother and the other from the father. We will discuss more about this in later units.)

Determining the Greater of Two Values

In Perl we can perform comparisons between two values. Consider the following short script:

#!/usr/bin/perl
$a = 10;
$b = 12;
$biggest = ( $a > $b ? $a : $b);
print "Input: $a, $b\n";
print "Largest Value: $biggest\n\n";
exit;

Remember to make this script executable (otherwise you won't be able to try it out).

Adding Comments

Programmers often add comments to a script to help them remember what the script is for and to explain tricky parts of a program. Comments are readable by humans, but the computer ignores them. Even though this is a really simple program, we will add some comments to it anyway, just so you can see what comments look like. You should notice that all comments begin with a # sign. (The bang line also begins with a #, but the computer does read this line to know what to do with the script. The bang line is always the first line of a script, whereas comments can be inserted anywhere except the first line.)

#!/usr/bin/perl
# script to compare two numbers

#data:
$a = 10;
$b = 12;

#determine largest:
$biggest = ( $a > $b ? $a : $b);

#report findings:
print "Input: $a, $b\n";
print "Largest Value: $biggest\n\n";
exit;

In a short and simple script like this one, the comments don't do a lot for you, but in a really long and complicated script they might be more important and useful. Comments are also useful in making it easier for someone else to make sense of your code.

NOTE: The construct used to decide which value is larger may look slightly confusing:

$biggest = ( $a > $b ? $a : $b);

If $a is larger than $b (that is, if the inequality evaluates to TRUE), then the first option after the question mark (in this case the value stored in the variable $a) gets assigned to the variable $biggest. If the inequality evaluates to FALSE (that is, if $b is greater than or equal to $a), then the second option, the one after the colon (in this case the value stored in $b), gets stored in $biggest.
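
If the question mark and colon are hard to read, it may help to see the same decision spelled out as an if/else statement. The following is only a sketch written in the style of the sample script (the if/else form is not something this lesson requires), and it should leave the same value in $biggest as the one-line version:

```perl
#!/usr/bin/perl
# sketch: the same comparison written out as an if/else statement

#data:
$a = 10;
$b = 12;

#determine largest:
if ( $a > $b ) {
    # the inequality is TRUE, so keep the value stored in $a
    $biggest = $a;
} else {
    # the inequality is FALSE, so keep the value stored in $b
    $biggest = $b;
}

#report findings:
print "Input: $a, $b\n";
print "Largest Value: $biggest\n";
```

The one-line form does the same work in less space, which is why the sample script uses it.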

You should think carefully about how a sequence of lines like this could be used to find the largest of several numbers. Consider three values ($a, $b, and $c). Once you have stored the larger of $a and $b in $biggest, then all you have to do is compare the values in $biggest and $c to identify the largest value from the original three values.
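
The idea for three values can be sketched as follows. The value given to $c is made up for illustration, and the script otherwise follows the sample script; treat it as one possible way to arrange the comparisons, not the only one.

```perl
#!/usr/bin/perl
# sketch: find the largest of three numbers

#data (the value in $c is an example chosen for illustration):
$a = 10;
$b = 12;
$c = 7;

#determine the larger of the first two values:
$biggest = ( $a > $b ? $a : $b );

#compare the largest so far against the third value:
$biggest = ( $biggest > $c ? $biggest : $c );

#report findings:
print "Input: $a, $b, $c\n";
print "Largest Value: $biggest\n";
```

Notice that $biggest appears on both sides of the second comparison line: the right-hand side is worked out first using the old value, and the result then replaces it.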

ASSIGNMENT:

Write a script similar to the sample script which finds the largest of five values. You will need to use four comparison lines similar to the one shown in the sample script.