Python Basics and Biostatistics

Paired Groups

In this lesson the student will learn how to:

manipulate the contents of a list
iterate through the contents of a list in several different ways
calculate the 95% CI of the difference between means for paired groups

By the end of this lesson the student will be able to:


    Write a script which calculates the 95% CI of the difference
    between the means for paired groups based on individual scores
    stored in a pair of lists.

Let's redesign our fat camp experiment. This time instead of just randomly assigning our chubby campers to either the experimental or control group, we are going to only include pairs of chubby twins or siblings in our study. One member from each pair will be given the experimental treatment and the other will receive the placebo. Otherwise, everything will be as it was in the first version of the fat camp experiment. When analyzing our data we will calculate the difference between EACH PAIR and then find the mean difference for all pairs before doing further statistical analysis on the data. (Remember in the first experiment we found the average weight loss for each group and THEN found the difference before proceeding on to more statistical analysis.)
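As a minimal sketch of the per-pair calculation described above, the snippet below works through two short made-up lists. The names experimental and control, and all of the weight-loss values, are hypothetical and are not data from the study. It uses the parenthesized form of print so that it runs on either Python 2 or Python 3.

```python
# Hypothetical weight loss (pounds) for five sibling pairs.
# experimental[i] and control[i] belong to the same pair.
experimental = [12.0, 9.5, 14.0, 11.0, 10.5]
control = [8.0, 6.0, 9.0, 7.5, 5.5]

# Difference for EACH PAIR: experimental minus control.
differences = []
for i in range(len(experimental)):
    differences.append(experimental[i] - control[i])

# Mean of the per-pair differences.
mean_difference = sum(differences) / len(differences)

print(differences)
print(mean_difference)
```

The mean of the per-pair differences is the number that the rest of the paired analysis is built on.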

In this type of experiment we use the number of pairs to determine df:


   df = N_pairs - 1

Our experiment has 42 pairs of fat campers, so our df is 41. The t value for a 95% CI at a df of 41 is about 2.02; we will use 2.021, the value many t tables list for a df of 40, which is usually the nearest row. Our average difference between the two groups comes out to 4.4 pounds (a positive number indicates that our experimental group lost more than our control group; a negative number would indicate the reverse). The SD of the paired differences is 1.5 pounds. To calculate the 95% CI we plug our values into these equations:


   95% CI of mean difference = MEAN +/- t * SE of paired differences

   SE of paired differences = SD / sqrt(N), where N is the number of pairs

   SE = 1.5 / sqrt(42)
      = .23

   95% CI of mean difference = 4.4 +/- 2.021 * .23
   high: 4.86
   low: 3.94
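
The arithmetic above can be checked with a few lines of Python. This is only a sketch using the summary numbers from the text (42 pairs, mean difference 4.4, SD 1.5, t of 2.021); the variable names are our own. Because it keeps full precision instead of rounding SE to .23 first, the limits it prints can differ from the hand calculation in the last digit.

```python
import math

n_pairs = 42          # number of pairs in the study
mean_diff = 4.4       # mean of the paired differences, in pounds
sd_diff = 1.5         # SD of the paired differences, in pounds
t_value = 2.021       # t value used in the text for the 95% CI

# Standard error of the paired differences.
se = sd_diff / math.sqrt(n_pairs)

# Half-width of the confidence interval.
margin = t_value * se

low = mean_diff - margin
high = mean_diff + margin

print("SE:   %.2f" % se)
print("low:  %.2f" % low)
print("high: %.2f" % high)
```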

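Because the entire interval (3.94 to 4.86 pounds) lies above zero, we can be 95% confident that our treatment really did produce more weight loss than the placebo, and that the true average difference is somewhere between about 4 and 5 pounds.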

List Manipulation and Iteration

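Consider this sample Python script:

    #!/usr/bin/python
    # print(...) with a single argument works in both Python 2 and Python 3

    a = [100, 55, 33, 77, 44, 33, 22, 99, 13, 16, 18, 123, 17, 24, 11, 33, 15]

    print("regular order:")
    print(a)

    print("sorted order:")
    print(sorted(a))          # sorted() returns a new list; a itself is unchanged

    print("count of 33 in list:")
    print(a.count(33))

    a.append(101)             # add one item to the end
    print("after append:")
    print(a)

    print("range from inside list:")
    print(a[3:8])             # slice: items at index 3 through 7

    a.insert(4, 39)           # put 39 at index 4, shifting later items right
    print("after insert:")
    print(a)

    print(a.pop())            # pop() removes and returns the last item
    print(a.pop())
    print("after 2 pops:")
    print(a)

    a.remove(33)              # remove the first 33 found
    del a[2]                  # delete whatever is at index 2
    print("after remove and delete:")
    print(a)

    # extra credit if you understand this one
    a = [x for x in a if x < 20 and x > 10]
    print("after list comprehension")
    print(a)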

This script demonstrates the use of several operations that can be performed on lists. Now consider this next script.

    #!/usr/bin/python
    a = [1, 3, 4, 2, 6, 5, 7, 9, 8, 11, 21, 6, 17, 15]

    print("----REVERSED----")
    for i in reversed(a):                   # items from last to first
        print(i)

    print("----i in a----")
    for i in a:                             # items from first to last
        print(i)

    print("----i in range backwards----")
    for i in range(len(a) - 1, -1, -1):     # indexes from last down to 0
        print(a[i])

    print("----i in range----")
    for i in range(0, len(a)):              # indexes from 0 up to last
        print(a[i])
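
For paired data, two more iteration patterns are often handy: enumerate, which yields an index along with each item, and zip, which walks two lists in step. The sketch below uses made-up lists a and b; it is an illustration only and is not part of the original scripts.

```python
a = [1, 3, 4, 2]
b = [10, 30, 40, 20]

# enumerate gives (index, value) pairs.
indexed = []
for i, value in enumerate(a):
    indexed.append((i, value))
print(indexed)

# zip pairs up items that share the same position in both lists.
sums = []
for x, y in zip(a, b):
    sums.append(x + y)
print(sums)
```

zip stops at the end of the shorter list, so it is worth checking that paired lists have the same length before relying on it.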

You should play around a bit with lists. Try a few things out. For instance:

Can you substitute a string value for an integer value in a list?
Can you add more than one item to an existing list using a single command?
Can you use list comprehension to print only odd numbers?
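
If you get stuck, here is one possible way to explore these three questions. It is a sketch on a small made-up list, not the only answer; try your own variations first.

```python
a = [5, 8, 11, 14, 17]

# Replace an integer with a string by assigning to its index.
a[1] = "eight"
print(a)

# extend() adds every item from another list in one call.
a.extend([20, 23])
print(a)

# Keep only the odd integers; the isinstance check skips the string.
odds = [x for x in a if isinstance(x, int) and x % 2 == 1]
print(odds)
```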

ASSIGNMENT:

Write a Python script which contains two lists (arrays) of the same length (25 subjects each). One list will contain data for group one (experimental group) and the other will contain data for group two (control group). You will find the difference for each pair, then the mean of those differences, and then you will calculate the 95% CI for that mean difference. Report the mean difference and the high and low of the confidence interval.

The values in your arrays will represent paired groups. Thus, array1[0] and array2[0] will contain values for each member of a pair.
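
As a starting point, here is a rough sketch of how the pieces might fit together. The function name paired_ci and the five-pair data are hypothetical. It assumes the SD is the sample standard deviation of the differences (dividing by n - 1), and it expects you to look up the t value for your own df in a t table. For the made-up five pairs below, df is 4 and the 95% t value is 2.776.

```python
import math

def paired_ci(group1, group2, t_value):
    """Return (mean, low, high) for the 95% CI of the paired differences.

    t_value must be looked up in a t table for df = number of pairs - 1.
    """
    if len(group1) != len(group2):
        raise ValueError("paired lists must have the same length")

    # One difference per pair.
    diffs = [x - y for x, y in zip(group1, group2)]
    n = len(diffs)
    mean = sum(diffs) / float(n)

    # Sample SD of the differences (assumption: divide by n - 1).
    variance = sum((d - mean) ** 2 for d in diffs) / float(n - 1)
    sd = math.sqrt(variance)

    se = sd / math.sqrt(n)
    margin = t_value * se
    return mean, mean - margin, mean + margin

# Hypothetical data: 5 pairs, so df = 4 and the 95% t value is 2.776.
array1 = [12.0, 9.5, 14.0, 11.0, 10.5]
array2 = [8.0, 6.0, 9.0, 7.5, 5.5]
mean, low, high = paired_ci(array1, array2, 2.776)
print("mean: %.2f  low: %.2f  high: %.2f" % (mean, low, high))
```

For the assignment you would replace the two short lists with your own 25 values each and pass in the t value for a df of 24.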