Perl Basics and Biostatistics

Slope and Intercept

In this lesson the student will learn how to:
  1. calculate the slope of the best-fit line given a set of data points
  2. calculate the y-intercept of the best-fit line
  3. use redo, last, and next to modify loop behavior
By the end of this lesson the student will be able to:

  Write scripts that find the slope and y-intercept of a best-fit line,
  and demonstrate the ability to modify loop behavior with redo, last, and next.


The following equations can be used to calculate the slope and y-intercept of the best-fit line for a data set.

            summation((xi - mean of x)*(yi - mean of y))
   slope = -----------------------------------------------
                  summation((xi - mean of x)^2)


   y-intercept = mean of y - slope * mean of x

Example:

Data Set: (1,1), (2,3), (3,1), (4,5), (5,3), (5,7), (6,6)
mean of x: 3.7
mean of y: 3.7
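xi - mean of x: -2.7, -1.7, -.7, .3, 1.3, 1.3, 2.3
yi - mean of y: -2.7, -.7, -2.7, 1.3, -.7, 3.3, 2.3
summation of (xi - mean of x)*(yi - mean of y) = 7.29 + 1.19 + 1.89 + .39 + -.91 + 4.29 + 5.29 = 19.43
summation of (xi - mean of x)^2 = 7.29 + 2.89 + .49 + .09 + 1.69 + 1.69 + 5.29 = 19.43
slope = 19.43 / 19.43 = 1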
y-intercept = 3.7 - 1 * 3.7 = 0
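
As a check on the arithmetic above, here is a minimal sketch of the two calculations in Perl. The subroutine name best_fit and the use of two parallel arrays are assumptions made for illustration, not part of the lesson. The sketch divides the sum of the deviation products by the sum of the squared x deviations.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the lesson): returns (slope, y-intercept)
# of the best-fit line through parallel arrays of x and y values.
sub best_fit {
    my ($xs, $ys) = @_;
    my $n = scalar @$xs;

    # Means of x and y
    my ($sum_x, $sum_y) = (0, 0);
    for my $i (0 .. $n - 1) {
        $sum_x += $xs->[$i];
        $sum_y += $ys->[$i];
    }
    my $mean_x = $sum_x / $n;
    my $mean_y = $sum_y / $n;

    # $sxy: sum of (xi - mean of x)*(yi - mean of y)
    # $sxx: sum of (xi - mean of x) squared
    my ($sxy, $sxx) = (0, 0);
    for my $i (0 .. $n - 1) {
        my $dx = $xs->[$i] - $mean_x;
        my $dy = $ys->[$i] - $mean_y;
        $sxy += $dx * $dy;
        $sxx += $dx * $dx;
    }

    # Division fails if every x value is the same ($sxx is 0)
    my $slope     = $sxy / $sxx;
    my $intercept = $mean_y - $slope * $mean_x;
    return ($slope, $intercept);
}

# Data set from the example above
my @x = (1, 2, 3, 4, 5, 5, 6);
my @y = (1, 3, 1, 5, 3, 7, 6);
my ($slope, $intercept) = best_fit(\@x, \@y);
print "slope = $slope\n";
print "y-intercept = $intercept\n";
```

For the example data set the results should come out very close to 1 and 0. Floating-point rounding can leave a tiny difference in the last digits.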

PERL: Loop Behavior Modification with Redo, Next, and Last

Consider the following script:

#!/usr/bin/perl -w
$x = 1;
while ($x < 10) {
    print "$x. Do I have to ask you again? (yes/no)\n";
    $ans = <STDIN>;
    chomp($ans);
    if ($ans eq "no") {
        last;
    }
    $x++;
}
exit;

Run this program many times and discover exactly what it does and under what circumstances it does it.

Now consider this slight modification.

#!/usr/bin/perl -w
$x = 1;
while ($x < 10) {
    print "$x. Do I have to ask you again? (yes/no)\n";
    $ans = <STDIN>;
    chomp($ans);
    if ($ans eq "no") {
        redo;
    }
    $x++;
}
exit;

This script is a little annoying since there is no quick way out of it. But it does illustrate the behavior of redo quite nicely.

Now consider this version of the script:

#!/usr/bin/perl -w
$x = 1;
while ($x < 10) {
    print "$x. Do I have to ask you again? (yes/no)\n";
    $ans = <STDIN>;
    chomp($ans);
    if ($ans eq "no") {
        next;
    }
    $x++;
}
exit;

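The behavior of next and redo in this context seems very similar. redo restarts the loop body without testing the loop condition again, while next skips the rest of the code in the loop and goes back to the condition test. Because $x++ is skipped either way, both have the effect in this case of repeating the same question. Now take a look at next in a for loop, where the increment is part of the loop statement itself: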

#!/usr/bin/perl -w
for ($x = 1; $x < 10; $x++) {
    print "$x. Do I have to ask you again? (yes/no)\n";
    $ans = <STDIN>;
    chomp($ans);
    if ($ans eq "no") {
        next;
        print "HELLO NUMBER: $x\n";
    }
}
exit;

Now consider redo:

#!/usr/bin/perl -w
for ($x = 1; $x < 10; $x++) {
    print "$x. Do I have to ask you again? (yes/no)\n";
    $ans = <STDIN>;
    chomp($ans);
    if ($ans eq "no") {
        redo;
        print "HELLO NUMBER: $x\n";
    }
}
exit;
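
To compare the two for loops without typing answers by hand, the sketch below replaces <STDIN> with a canned list of answers and records which numbers get asked. The subroutine names, the shorter loop limit, and the answer list are assumptions made for illustration. With next, the loop's $x++ still runs after a "no", so the script moves on to the next number. With redo, the body restarts without running $x++, so the same number is asked again.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for keyboard input: answers come from a list.
# Once the list is used up, every further answer counts as "yes".

# Returns the numbers asked when a "no" answer triggers next.
sub ask_with_next {
    my @answers = @_;
    my @asked;
    for (my $x = 1; $x < 5; $x++) {
        push @asked, $x;
        my $ans = @answers ? shift @answers : "yes";
        if ($ans eq "no") {
            next;    # skip the rest of the body; $x++ still runs
        }
    }
    return @asked;
}

# Returns the numbers asked when a "no" answer triggers redo.
sub ask_with_redo {
    my @answers = @_;
    my @asked;
    for (my $x = 1; $x < 5; $x++) {
        push @asked, $x;
        my $ans = @answers ? shift @answers : "yes";
        if ($ans eq "no") {
            redo;    # restart the body; $x++ does not run
        }
    }
    return @asked;
}

my @canned = ("yes", "no", "no", "yes");
print "next: ", join(" ", ask_with_next(@canned)), "\n";   # next: 1 2 3 4
print "redo: ", join(" ", ask_with_redo(@canned)), "\n";   # redo: 1 2 2 2 3 4
```

In the redo run the number 2 appears three times, once for each "no" and once more for the "yes" that finally lets the loop advance.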

ASSIGNMENT:

  1. Write a Perl script that calculates the slope and y-intercept for a given set of data points. Make at least three data sets of your own to test your script, and make sure that at least one of them has a negative slope.
  2. Write an amino acid quiz script that uses last (when the user accumulates five wrong answers) and redo (whenever the user supplies a wrong answer). Make two arrays, one for amino acid names and the other for amino acid symbols, and use a loop to step through them one element at a time. If the user finishes the quiz successfully (without accumulating five wrong answers), print a congratulations message. Here's sample output for your script:
    
    What is the symbol for glutamate
    G
    Try again!
    What is the symbol for glutamate
    E
    CORRECT
    
    What is the symbol for glycine
    G
    CORRECT
    
    GREAT JOB!