Python Basics and Biostatistics
Numeric Variables
In this lesson the student will learn how to:
- Store values in variables
- Calculate the arithmetic mean
- Use the four basic arithmetic operators in a PYTHON script
- Name and identify scalar variables
By the end of this lesson the student will be able to:
- Write a PYTHON script which calculates the mean of a group of numbers.
Most computer languages provide a way for the programmer to store values.
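Python provides a type of variable called a numeric variable as its simplest
type of variable. As just mentioned, variables are used to store values. You
can name variables almost anything you want, but there are certain limits: a
name may contain only letters, digits, and underscores, may not begin with a
digit, and may not be one of Python's reserved words (such as for or if).
Here are a few sample numeric variable assignments: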
val = 88
v2 = 99
abc123 = 444
These are all valid names for variables which can contain numeric values.
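The limits on variable names can be checked with a short script. This is only a sketch: is_valid_name is a hypothetical helper written for this illustration (it is not part of Python), and it covers plain ASCII names only. The keyword module it relies on ships with Python.

```python
import keyword
import re

# Hypothetical helper for illustration: checks the basic ASCII naming rules.
def is_valid_name(name):
    # Letters, digits, and underscores only; the first character may not be a digit.
    pattern = r"^[A-Za-z_][A-Za-z0-9_]*$"
    looks_ok = re.match(pattern, name) is not None
    # Reserved words such as "for" cannot be used as variable names.
    return looks_ok and not keyword.iskeyword(name)

# The first three names come from the lesson; the last three break a rule.
for candidate in ["val", "v2", "abc123", "2v", "my-val", "for"]:
    print(candidate + ": " + str(is_valid_name(candidate)))
```

The first three names report True and the last three report False: 2v begins with a digit, my-val contains a hyphen, and for is a reserved word.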
Actually, it is perfectly permissible for variables such as these to contain
values such as:
xval = 3.44 #floating-point value
hotdog = 3.14e2 #floating-point value multiplied by 10^2
cold = 0x9aa #hex value
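To see what these notations actually store, you can print the variables back out. The sketch below reuses the three variables above; print is written with parentheses here so that the script runs under either Python 2 or Python 3.

```python
xval = 3.44      # floating-point value
hotdog = 3.14e2  # 3.14 multiplied by 10^2
cold = 0x9aa     # hex value

# Python stores the number itself, not the notation used to write it.
print(str(xval))    # 3.44
print(str(hotdog))  # 314.0
print(str(cold))    # 2474 (9*256 + 10*16 + 10)
```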
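Numeric values can be integers, floating-point numbers, or numbers written
in another notation such as hexadecimal. Python itself does not tie a
variable name to one kind of value, but in this lesson our variables will
hold only numbers. We will take a look at other types of values in future
lessons. For now, inspect the following Python script: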
#!/usr/bin/python
v1 = 2
v2 = 1
answer = v1 + v2
print str(v1) + " + " + str(v2) + " = " + str(answer)
Run this script. (Remember to make it executable first. Assuming that you
named it nvals.py, type: chmod +x nvals.py)
For the most part this script should be easy to understand. However, you
might have a couple of questions about the last line. So, let's take a
closer look at it.
print str(v1) + " + " + str(v2) + " = " + str(answer)
First of all, the print statement simply tells Python to print the line in
question. In this script the output will appear at the command prompt. The
rest of the line is what will be printed, but the complication here is that
we want to include a plus sign and an equals sign along with the numbers.
The values inside quotation marks are known as string values. To print
string values and numeric values together on the same line, the numeric
values must first be converted to string values using the str() function.
If all went well, you observed the following output when you ran this
script:
2 + 1 = 3
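To see why str() is needed, try joining a number directly to a string. The sketch below catches the resulting error so that the script keeps running; the try/except construct has not been covered yet and is used here only to display the error message. print is written with parentheses so the script runs under either Python 2 or Python 3.

```python
v1 = 2
v2 = 1
answer = v1 + v2

# Joining a number directly to a string is not allowed.
try:
    line = v1 + " + "
except TypeError as err:
    print("TypeError: " + str(err))

# Converting each number with str() first makes the join work.
line = str(v1) + " + " + str(v2) + " = " + str(answer)
print(line)  # 2 + 1 = 3
```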
In PYTHON we can also subtract, divide, and multiply. (We can do quite a few
other things, but we'll save that stuff for later.) Here's a script which
adds, subtracts, multiplies, and divides.
#!/usr/bin/python
v1 = 22
v2 = 13
add = v1 + v2
sub = v1 - v2
mul = v1 * v2
div = v1 / v2
print str(v1) + " + " + str(v2) + " = " + str(add)
print str(v1) + " - " + str(v2) + " = " + str(sub)
print str(v1) + " * " + str(v2) + " = " + str(mul)
print str(v1) + " / " + str(v2) + " = " + str(div)
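For comparison, here is a sketch of the same arithmetic written so that it gives identical results under Python 2 and Python 3. Two things differ from the script above, and both are choices made for this illustration: print is written with parentheses, and the division uses the // operator, which always performs integer division.

```python
v1 = 22
v2 = 13

add = v1 + v2
sub = v1 - v2
mul = v1 * v2
div = v1 // v2  # integer division: the fractional part is discarded

print(str(v1) + " + " + str(v2) + " = " + str(add))   # 22 + 13 = 35
print(str(v1) + " - " + str(v2) + " = " + str(sub))   # 22 - 13 = 9
print(str(v1) + " * " + str(v2) + " = " + str(mul))   # 22 * 13 = 286
print(str(v1) + " // " + str(v2) + " = " + str(div))  # 22 // 13 = 1
```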
Also make sure that you understand the difference between this line:
add = v1 + v2
And one that looks like this:
print str(v1) + " + " + str(v2) + " = " + str(add)
The first line actually stores a value (in this case, the sum of the values
stored in v1 and v2 is stored in add). The second line merely prints out
the contents of the variables v1, v2, and add in a formatted manner. When
plus signs are used to join together strings, they no longer perform an
arithmetic operation. Instead they perform an operation known as
concatenation. For instance, the concatenation of "thunder" and "struck" is
"thunderstruck".
The Arithmetic Mean
Hopefully you already know how to calculate the arithmetic mean, but if you
don't we'll go through the steps in just a moment. First, we should discuss
the importance of an arithmetic mean. Often we refer to the arithmetic mean
as the average of a group of values. For instance, we could weigh all the
students in the seventh grade class and from this set of data calculate a
mean weight for the class. Alternately, we could be dealing with people
suffering from some physical condition which we treat with some new
medication. In our study to determine the effectiveness of our new treatment
we could collect data on how long it took each person to recover from the
physical condition after beginning treatment with our new medication. We
could then calculate an average time to recovery for this group of people
using our medication. The mean is a handy way of summarizing data.
Remember that the weight of living organisms and the time it takes a
chemical or biological compound to have an effect on an organism are both
examples of biological data. Keep in mind that we are discussing how to
analyze biological data. One of the most basic ways to begin such analysis
is to calculate the mean for the data you've gathered.
If someone asks, "How much do seventh graders weigh?" the most meaningful
answer is probably the mean weight of seventh graders. If we weigh forty
seventh graders there is a good chance that the heaviest one will weigh
somewhere around 140 pounds and the lightest one will weigh around 70
pounds. As you can see, the heaviest one is likely to be twice as heavy as
the lightest one. The average weight of seventh graders is probably going to
be somewhere around 98 pounds (which is not exactly halfway between 70 and
140). (NOTE: In the case of seventh graders it would probably be appropriate
to report separate means for each gender.)
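The difference between the mean and the halfway point can be checked with a few lines of Python. The five weights below are hypothetical values invented for this illustration, not measured data. The float() call ensures that the division keeps its fractional part.

```python
# Hypothetical weights in pounds, invented for illustration only.
w1 = 70
w2 = 85
w3 = 90
w4 = 105
w5 = 140

total = w1 + w2 + w3 + w4 + w5
mean = float(total) / 5

# The point halfway between the lightest and the heaviest student.
midpoint = float(w1 + w5) / 2

print("MEAN: " + str(mean))          # MEAN: 98.0
print("MIDPOINT: " + str(midpoint))  # MIDPOINT: 105.0
```

Because most of these made-up weights sit closer to the light end of the range, the mean falls below the halfway point.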
Likewise, in the case of reporting the time to recovery after beginning
treatment with some therapeutic agent, it makes the most sense to report the
average time to recovery. For instance, you might have one patient recover
in 13 days after beginning treatment and another not recover until 31 days
after starting treatment (and some patients might not respond at all), but
the most meaningful single statistic is the mean time to recovery, which
might be something like 26 days (which is not halfway between 13 and 31).
We have limited this discussion to the arithmetic mean. Obviously, the mean
or average is not the only statistic we can generate from a data set. We
will discuss more statistics in upcoming lessons.
Calculating the Mean
To calculate the mean we simply add up all the values in our data set and
then divide this total by the total number of values in the set. This can be
summarized like this:
MEAN = sum_of_all_values / number_of_values
We can add up a bunch of values in PYTHON like this:
total = a + b + c + d + e + f
In this case we have only six values and so we could calculate the mean like
this:
mean = total / 6
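Put together as a complete script, the calculation might look like the sketch below. The six values are hypothetical, chosen only for illustration. The script also wraps total in float() so that the division keeps any fractional part; the rest of this lesson explains why that matters.

```python
# Six hypothetical values, invented for illustration only.
a = 82
b = 91
c = 77
d = 104
e = 68
f = 94

total = a + b + c + d + e + f
count = 6

# float() keeps any fractional part of the result.
mean = float(total) / count

print("TOTAL: " + str(total))  # TOTAL: 516
print("MEAN: " + str(mean))    # MEAN: 86.0
```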
One last thing you should know before getting to the assignment is how to
force integer values to behave as floating-point values. As a quick
experiment run this simple PYTHON script:
#!/usr/bin/python
v1 = 22
v2 = 13
div = v1 / v2
print str(v1) + " / " + str(v2) + " = " + str(div)
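You will notice that the answer produced is 1. This is normal behavior when
we are doing integer division in Python 2, the version these scripts are
written for: dividing one integer by another discards the fractional part.
(In Python 3 the / operator always produces a floating-point result, and //
performs integer division.) In order to get PYTHON to store the answer as a
floating-point value we need to make one small change: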
#!/usr/bin/python
v1 = 22
v2 = 13
div = float(v1) / v2
print str(v1) + " / " + str(v2) + " = " + str(div)
There are actually a couple of other ways to force floating-point division,
but this is the most explicit. Having employed the float() function, we now
get the following output:
22 / 13 = 1.69230769231
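As a sketch of those other ways, the script below gets the same result by multiplying by 1.0 and by writing the numbers with a decimal point. The variable names are chosen for this illustration. The "%.4f" formatting, which prints four decimal places, has not been covered in this lesson; it is used here only so that the output looks the same under Python 2 and Python 3.

```python
v1 = 22
v2 = 13

by_float_call = float(v1) / v2  # the approach shown above
by_multiplying = v1 * 1.0 / v2  # multiplying by 1.0 yields a float first
by_float_literal = 22.0 / 13    # writing a value with a decimal point

# The // operator asks for integer division explicitly.
whole_part = v1 // v2

print("%.4f" % by_float_call)     # 1.6923
print("%.4f" % by_multiplying)    # 1.6923
print("%.4f" % by_float_literal)  # 1.6923
print(str(whole_part))            # 1
```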
ASSIGNMENT:
Write a PYTHON script which calculates the average for twelve values. Your
values must range between 60 and 110 and be approximately evenly
distributed within this range. Format your output like this (the numbers
below only illustrate the layout; your own script will report twelve
values):
TOTAL: 857
NUMBER OF VALUES: 10
MEAN: 85.7
All values must be stored in variables and you must use a minimum of fifteen
variables in your script.