Python Basics and Biostatistics

Conversion Between SEM and SD

In this lesson the student will learn how to:
  1. convert SEM to SD
  2. use an if/else construct
  3. calculate the t distribution
  4. write a dual function script
By the end of this lesson the student will be able to:

         Write a dual function script which gives the user the 
         choice between converting SEM to SD or vice versa

Central Limit Theorem

In discussing the central limit theorem we will consider two other topics: the standard error of the mean (SEM) and the t distribution. We will also have to employ a little algebra to see how the whole thing works. The basic equation we start with is:


              t = (sample mean - population mean) / SEM


which can be rearranged like this:

              population mean = sample mean - t * SEM

You can verify this yourself with basic algebra: multiply both sides of the first equation by SEM, then solve for the population mean.
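
The relationship between the two forms of the equation can also be checked numerically. The following is a minimal sketch, not part of the original lesson: the function names and the numbers are made up for illustration.

```python
# Python 3 syntax assumed. Function names and values are illustrative only.

def t_statistic(sample_mean, population_mean, sem):
    """t = (sample mean - population mean) / SEM"""
    return (sample_mean - population_mean) / sem


def population_mean_from_t(sample_mean, t, sem):
    """population mean = sample mean - t * SEM"""
    return sample_mean - t * sem


# Hypothetical values: sample mean 52, population mean 50, SEM 4.
t = t_statistic(52.0, 50.0, 4.0)
recovered = population_mean_from_t(52.0, t, 4.0)

print(t)          # 0.5
print(recovered)  # 50.0
```

Feeding the computed t back into the rearranged equation returns the population mean we started with, which is all the rearrangement claims.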

Standard Error of the Mean

The standard error of the mean can be calculated from the SD and from the sample size N (the number of observations in the sample). In practice the SD is estimated from the sample itself:


              SEM = SD/sqrt(N)

The SEM indicates the precision of the sample mean. The smaller the SEM, the closer the sample mean is likely to be to the true population mean. On the other hand, a large SEM means that the sample mean may lie far from the true population mean. A small SEM indicates a large sample size, tight data (low scatter), or both.
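
To see how sample size affects the SEM, here is a short sketch that holds the SD fixed and varies N. The function name sem_from_sd and the SD of 10 are assumptions chosen for illustration.

```python
# Python 3 syntax assumed. The name sem_from_sd and the values are illustrative.
import math


def sem_from_sd(sd, n):
    """SEM = SD / sqrt(N)"""
    return sd / math.sqrt(n)


# Same scatter (SD = 10), increasingly large samples.
for n in (4, 25, 100):
    print("N = %d  SEM = %.1f" % (n, sem_from_sd(10.0, n)))

# N = 4  SEM = 5.0
# N = 25  SEM = 2.0
# N = 100  SEM = 1.0
```

Because N sits under a square root, quadrupling the sample size only halves the SEM.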

The t Distribution

Statisticians derived the t distribution by starting with a population whose mean and SD were known. They drew many samples from this population and calculated t for each one, which gave the distribution of t. The t values in the table from the last lesson correspond to the sample size from which each sample mean and sample SD were calculated. It should be possible to replicate this work, but it would be very tedious to do so by hand. To make a long story short, t is calculated with the equation shown above:


              t = (sample mean - population mean) / SEM

By turning the equation around you can calculate the population mean like this:

              population mean = sample mean - t * SEM

This is the essence of the central limit theorem.
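
The sampling experiment described in this section is tedious by hand but short as a simulation. The sketch below is an assumption-laden illustration, not the statisticians' original procedure: it assumes a normally distributed population with a made-up mean of 100 and SD of 15, samples of size 10, and a fixed random seed. The value 2.262 is the standard tabulated two-tailed 95% t value for a sample of 10 (9 degrees of freedom); whether the table from the last lesson lists it in that form is an assumption.

```python
# Python 3 syntax assumed. Population values, names and seed are illustrative.
import math
import random


def sample_t(population_mean, population_sd, n, rng):
    """Draw one sample of size n and return its t value."""
    sample = [rng.gauss(population_mean, population_sd) for _ in range(n)]
    mean = sum(sample) / n
    # Sample SD uses the n - 1 denominator.
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    sem = math.sqrt(variance) / math.sqrt(n)
    return (mean - population_mean) / sem


def fraction_within(critical_t, n, trials=20000, seed=1):
    """Fraction of simulated samples whose |t| is at most critical_t."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if abs(sample_t(100.0, 15.0, n, rng)) <= critical_t:
            hits += 1
    return hits / float(trials)


# Should come out close to 0.95 for samples of size 10.
print(fraction_within(2.262, 10))
```

If the printed fraction is near 0.95, the simulated t values agree with the tabulated 95% value for this sample size.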

Conversion Between SEM and SD

Consider these basic relationships:


              SEM = SD / sqrt(N)

              SD = SEM * sqrt(N)

It can happen that you have just the values of SD and N or just SEM and N and you need the missing value. Plugging your two known values into the equations shown above will yield the missing value. For instance, if N = 100 and SD happens to equal 10, SEM will equal 1. On the other hand, if SEM equals 2 and N just happens to equal 225, then the SD will equal 30.
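
The two worked examples above can be checked with a pair of small functions. This is only a sketch: the names sd_to_sem and sem_to_sd are hypothetical, and the menu and user input that the assignment asks for are deliberately left out.

```python
# Python 3 syntax assumed. Function names are illustrative only.
import math


def sd_to_sem(sd, n):
    """SEM = SD / sqrt(N)"""
    return sd / math.sqrt(n)


def sem_to_sd(sem, n):
    """SD = SEM * sqrt(N)"""
    return sem * math.sqrt(n)


print(sd_to_sem(10.0, 100))  # 1.0
print(sem_to_sd(2.0, 225))   # 30.0
```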

Error Intervals

Often the results of a study are summarized as a mean plus or minus either the SD or the SEM. Report the mean plus or minus the SD to show the scatter of your data. Report the mean plus or minus the SEM to show how well you know the population mean (how reliable your calculated mean really is). For instance, if you want to get across the idea that the average weight of a certain species of animal is 40 pounds and that most adult specimens will weigh within 8 pounds of that value, report a mean plus or minus SD. If, on the other hand, you are basing your mean on a limited sample and you want to express how close your calculated mean is likely to be to the population mean, report a mean plus or minus SEM. Keep these ideas in mind when reviewing the statistics in reports you read.
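
As an illustration of the two ways of reporting, the sketch below summarizes the same data both ways. The five weights are invented for the example, and the function name summarize is hypothetical. It uses the statistics module from the Python 3 standard library.

```python
# Python 3 syntax assumed. The weights and the name summarize are illustrative.
import math
import statistics


def summarize(values):
    """Return the 'mean +/- SD' and 'mean +/- SEM' summaries as strings."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)       # sample SD (n - 1 denominator)
    sem = sd / math.sqrt(len(values))   # SEM = SD / sqrt(N)
    return ("%.1f +/- %.1f (SD)" % (mean, sd),
            "%.1f +/- %.1f (SEM)" % (mean, sem))


weights = [32.0, 36.0, 40.0, 44.0, 48.0]  # hypothetical weights in pounds
sd_summary, sem_summary = summarize(weights)
print(sd_summary)   # 40.0 +/- 6.3 (SD)
print(sem_summary)  # 40.0 +/- 2.8 (SEM)
```

The SD line describes how much individual animals vary. The SEM line describes how precisely the mean of 40 is known from these five animals.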

Dual Function Script

Here's a basic script which gives the user two choices: 1) find the square of the input or 2) find the square root of the input. Follow the script logic to see how the user interacts with this script.

#!/usr/bin/python
import math

p = "Select: 1) find square, 2) find square root\nENTER: "
what = raw_input(p)
if(what == "1" or what == "2"):
    n = raw_input("Enter number: ")
    n = float(n)
    if(what=="1"):
        n = n*n
    else:
        n = math.sqrt(n)
    print n
else:
    print "Incorrect input. Read the directions."

Run this script several times and be able to account completely for its behavior.


Assignment:

  1. SEM/SD Conversion Script - Write a script which gives the user the choice between converting SEM to SD or vice versa. Remember that you will also have to take input for N. Otherwise your script will be very much like the sample script.