Python Basics and Biostatistics
Finding the Median Value
Back to Index
In this lesson the student will learn how to:
- find the median of a data set
- use the sort function
- implement numeric or alphabetic sorting
- use an if/else construct
- use the modulus operator
- use array indexes
By the end of this lesson the student will be able to:
Write a Python script to find the
median of a group of values.
The Median
The median of a group of numbers is the middle value. Half the data points
will be above the median and half will be below. For data sets containing an
odd number of values it is really easy to find the median. Consider this
list of values:
5, 11, 14, 20, 22, 23, 24, 25, 30.
There are nine values in this list. They are sorted in order of size and so
the middle value is the fifth value which is 22. There are four values less
than this number and four numbers greater than this number. Twenty-two is
the middle value and is therefore the median.
Now what do we do if we have an even number of values? Consider the
following list:
5, 11, 14, 20, 22, 23, 24, 25.
Here the middle two values are 20 and 22. To find the median we add these
two values together and divide by two like this:
20 + 22 = 42
42 / 2 = 21
So, 21 is the median for this group of eight numbers.
The following script will find the median for any list of numeric values
stored in the variable vals.
#!/usr/bin/python
#data points
vals = [33, 23, 55, 39, 41, 46, 38, 52, 34, 29, 27, 51, 33, 28]
output=0
print "UNSORTED: ", vals
#sort data points
vals.sort()
print "SORTED: ", vals
#test to see if length of vals is even or odd
if(len(vals)%2==0):
#if even then:
sum = vals[len(vals)/2-1] + vals[len(vals)/2]
output = float(sum)/2
else:
#if odd then:
output = vals[len(vals)/2]
print "MEDIAN VALUE: " + str(output)
There are a number of things we need to discuss in order to understand this
script:
- LIST INDEXING: The sample script contains fourteen list elements. The
first element is indexed like this: vals[0]. The last one is indexed like
this: vals[13].
- The sort function simply sorts the values in a list. It does the work
of sorting for you. Otherwise you'd have to do some real work!
- The MODULUS operator: The modulus operator looks like a percent sign
(%). It returns the remainder of a division problem. This makes it real
useful for differentiating between odd and even numbers. The double equal
signs checks for equality. So, the statement "if( len(vals) % 2 == 0 )" is just a
way of saying "if the length of the list is even."
- The if/else construct is used to decide what to do depending on whether
or not the length of the array is even. If it is even we do the stuff in the
block following the if and if it is odd we do the stuff in the block after
the else.
ASSIGNMENT:
Write a script which finds the mean and the median for a group of numbers.
Next your script will count the number of values over the mean and under the
mean and issue a report that looks about like this:
MEAN: 22.25
MEDIAN: 24
OVER MEAN: 8
UNDER MEAN: 12
Your array will contain 20 values.