Python Basics and Biostatistics

Variability and Range

In this lesson the student will learn how to:
  1. Create and populate lists
  2. Find the range of a group of numbers
  3. Use the for loop
  4. Use parentheses to spread a print statement over multiple lines
By the end of this lesson the student will be able to:

   Write a Python script which will calculate the mean 
   and range for sets of numbers of any size.

Let's say that we conduct a study of 100 subjects with high blood pressure. Fifty of these subjects get an experimental treatment and the other 50 get a traditional treatment. For our purposes we are interested only in the systolic pressure (the higher of the two values in a blood pressure reading). How do we analyze the data we've collected? One thing we can do is calculate the mean of each group. Another is to determine the amount of variability within each group. The crudest and least informative way of doing this is to determine the range for each set of data points. The range is simply the difference between the highest and lowest values in a data set. We have two data sets, and just as we can figure out the mean for each one, we can also figure out the range for each one.
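To make the two measures concrete, here is a minimal sketch on a handful of made-up systolic readings. The values and the names steady and outlier are illustrative assumptions, not data from the study. It uses Python's built-in sum(), len(), max(), and min() functions rather than the loops developed in this lesson, and it hints at why the range is a crude measure: two groups can share the same range even though their values are spread very differently.

```python
# Hypothetical systolic readings, invented for illustration only
steady = [130, 140, 150, 160, 180]    # values spread fairly evenly
outlier = [130, 131, 132, 133, 180]   # one extreme value

# mean: total of the values divided by how many there are
steady_mean = float(sum(steady)) / len(steady)
outlier_mean = float(sum(outlier)) / len(outlier)

# range: highest value minus lowest value
steady_range = max(steady) - min(steady)
outlier_range = max(outlier) - min(outlier)

print("steady:  mean " + str(steady_mean) + ", range " + str(steady_range))
print("outlier: mean " + str(outlier_mean) + ", range " + str(outlier_range))
```

Both groups come out with a range of 50, yet four of the five values in the second group lie within 3 points of each other. The range looks only at the two extreme values and ignores everything in between.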

Here's a simple script which calculates the mean and range for two sets of ten numbers:

#!/usr/bin/python

# data points
g1 = [120, 135, 140, 150, 133, 141, 146, 155, 137, 144]
g2 = [133, 136, 139, 152, 139, 140, 147, 152, 150, 149]

# calculate means
t1 = 0
for v in g1:
    t1 = t1 + v
n = len(g1)
m1 = float(t1) / n

t2 = 0
for v in g2:
    t2 = t2 + v
n = len(g2)
m2 = float(t2) / n

# find highest
h1 = 0
for v in g1:
    h1 = (h1 if h1 > v else v)

h2 = 0
for v in g2:
    h2 = (h2 if h2 > v else v)

# find lowest
low1 = 1000  # set to value higher than any of the data points
for v in g1:
    low1 = (low1 if low1 < v else v)

low2 = 1000
for v in g2:
    low2 = (low2 if low2 < v else v)

# calculate ranges
r1 = h1 - low1
r2 = h2 - low2

# generate report
print("GROUP ONE: ")
print(g1)
print("MEAN: " + str(m1) +
      ", HIGH: " + str(h1) +
      ", LOW: " + str(low1) +
      ", RANGE: " + str(r1) + "\n")

print("GROUP TWO: ")
print(g2)
print("MEAN: " + str(m2) +
      ", HIGH: " + str(h2) +
      ", LOW: " + str(low2) +
      ", RANGE: " + str(r2))

Save this script to a file and run it.

There are several new constructs in this script which we need to discuss.
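Going by the lesson objectives, these constructs include the list, the for loop, and parentheses around a long print expression. The script also uses a conditional expression of the form (a if test else b). The following is a minimal sketch of each on a small made-up list. The names weights, total, and highest are placeholders chosen for illustration and are not part of the script above.

```python
# a list is created with square brackets and can be extended with append()
weights = [12, 15, 11]
weights.append(14)

# a for loop visits each element in turn; the indented line is the loop body
total = 0
for w in weights:
    total = total + w

# a conditional expression picks one of two values:
# "highest if highest > w else w" keeps whichever of the two is larger
highest = 0
for w in weights:
    highest = (highest if highest > w else w)

# parentheses let one print expression continue over several lines
print("TOTAL: " + str(total) +
      ", HIGH: " + str(highest) +
      ", COUNT: " + str(len(weights)))
```

Starting highest at 0 works here only because every value in the list is positive. The same reasoning is behind starting the "lowest" variables at 1000 in the script.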

ASSIGNMENT:

You will alter the example script in the following ways:

  1. You will populate your lists with 20 values each. Your list values will represent the weights of individuals in a population. You should specify the age group, gender, and species of your population and then produce a list of reasonable weights (in pounds, kilograms, ounces, or grams) for that population.
  2. You may use only two for loops, that is, one for loop for each group. The sample script uses six for loops, which is redundant: a single for loop per group can perform all of that group's calculations. So each of your loops will find the highest value, the lowest value, and the total for its group. (Remember that any indented lines following a for statement will be included in the loop.)
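The reminder about indentation can be illustrated with a loop that does two jobs at once. This is only a sketch on made-up numbers and it deliberately does not solve the assignment: it keeps a running total and counts the values above a cutoff, both inside one loop. The names readings, cutoff, and above are hypothetical.

```python
readings = [5, 8, 3, 9]   # made-up values
cutoff = 6                # hypothetical threshold

total = 0
above = 0
for r in readings:
    # both indented statements run once for every element
    total = total + r
    above = (above + 1 if r > cutoff else above)

# this line is not indented, so it runs once, after the loop has finished
print("TOTAL: " + str(total) + ", ABOVE CUTOFF: " + str(above))
```

The same pattern extends to any number of statements: every line indented under the for statement is repeated for each value, and the first line back at the left margin ends the loop.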