Write a Python script to calculate the 95% CI of the Odds Ratio based on information presented in a contingency table
                        Disease
    Treatment    Progressed    No Progression    Total
    --------------------------------------------------
    AZT                  72               411      483
    Placebo             154               335      489
    Total               226               746      972
    --------------------------------------------------

A contingency table is a common way of presenting data. The rows and columns can have different meanings. Each cell in the table contains the number of subjects that are classified as part of one particular column and row.
Probabilities and Odds
Probabilities and odds are two different ways of expressing the same concept. Probabilities can be converted to odds, and odds can be converted to probabilities. The probability that an event will occur is the fraction of times you expect to see that event in many trials. The odds are defined as the probability that the event will occur divided by the probability that the event will not occur. Here are the conversion formulas for probability and odds:
    Odds = probability / (1 - probability)
    Probability = odds / (1 + odds)

For example, if we flip two coins, the chance of getting two tails is 25%, since there are four equally likely outcomes and only one of them is the one specified. The odds of getting two tails are then 25:75, which simplifies to 1:3 (one to three), or .33. Here are the relationships expressed as equations:

    Probability = target outcome / possible outcomes
    P of 2 tails = 1 / 4 = .25
    Odds = P / (1 - P)
    Odds of 2 tails = .25 / .75 = .33

Probability values always range from 0 to 1.0. Odds range from 0 to infinity. Interestingly, as probability values go from .5 to 1.0, odds go from 1 to infinity.
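The two conversion formulas can be checked with a few lines of Python. This is only a minimal sketch: the function names `probability_to_odds` and `odds_to_probability` are made up for illustration, and the range checks are an added assumption rather than part of the formulas.

```python
def probability_to_odds(p):
    """Convert a probability (0 <= p < 1) to odds."""
    if not 0 <= p < 1:
        raise ValueError("probability must be at least 0 and less than 1")
    return p / (1 - p)


def odds_to_probability(odds):
    """Convert odds (>= 0) back to a probability."""
    if odds < 0:
        raise ValueError("odds cannot be negative")
    return odds / (1 + odds)


# Two coin flips: one outcome in four is "two tails".
p_two_tails = 1 / 4
odds_two_tails = probability_to_odds(p_two_tails)
print(round(odds_two_tails, 2))                       # 0.33
print(round(odds_to_probability(odds_two_tails), 2))  # 0.25
```

A probability of exactly 1 is rejected because the odds would require dividing by zero, which matches the statement that odds grow without limit as the probability approaches 1.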
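Inspecting the contingency table above, we can make the following statements:

    Odds of disease progression, AZT group    = 72 / 411  = .18
    Odds of disease progression, no AZT group = 154 / 335 = .46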
From the odds information we can calculate what is known as the odds ratio:
    Odds ratio = (odds AZT group / odds no AZT group) = .18 / .46 = .39

So we can say that the odds of disease progression in AZT-treated subjects are .39 times the odds in control patients. In other words, the odds of disease progression in AZT-treated subjects are about two-fifths those of control patients. (The .39 comes from dividing the rounded odds. With unrounded odds the ratio is closer to .38.)
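The same arithmetic can be done directly from the four cell counts. The sketch below assumes the counts from the AZT table are typed in by hand. The variable names and the helper `odds` are illustrative only.

```python
# Cell counts from the AZT contingency table.
azt_progressed, azt_no_progression = 72, 411
placebo_progressed, placebo_no_progression = 154, 335


def odds(events, non_events):
    """Odds = number with the event / number without the event."""
    return events / non_events


odds_azt = odds(azt_progressed, azt_no_progression)
odds_placebo = odds(placebo_progressed, placebo_no_progression)
odds_ratio = odds_azt / odds_placebo

print(f"odds, AZT group:     {odds_azt:.2f}")      # 0.18
print(f"odds, placebo group: {odds_placebo:.2f}")  # 0.46
print(f"odds ratio:          {odds_ratio:.2f}")    # 0.38
```

The script prints 0.38 rather than .39 because it divides the unrounded odds. Rounding each odds value to two decimals first, as in the hand calculation, gives .18 / .46 = .39.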
As with the relative risk, the CI of the odds ratio is not symmetrical. This makes sense as the odds ratio cannot be negative but can be any positive number. The asymmetry is especially noticeable when the odds ratio is low. Several methods can be used to approximate the CI of the odds ratio. Here's the one we use:
    A = 72
    B = 411
    C = 154
    D = 335
    OR = .39

    95% CI of ln(OR) = ln(OR) +/- 1.96 * sqrt(1/A + 1/B + 1/C + 1/D)

Take the antilogarithm of both values to obtain the 95% CI of the OR.

    95% CI of ln(OR) = ln(.39) +/- 1.96 * sqrt(1/72 + 1/411 + 1/154 + 1/335)

    high = -0.942 + 1.96 * sqrt(.025)
         = -0.942 + 1.96 * .158
         = -0.942 + .31
         = -0.632
    low  = -0.942 - .31
         = -1.252

    95% CI of OR:
    high = antilog(-0.632) = .53
    low  = antilog(-1.252) = .29

So, this means that we can be 95% sure that the true odds ratio is somewhere between .29 and .53.
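The log-based approximation above translates almost line for line into Python. The following is a sketch under stated assumptions, not a finished solution: the function name `odds_ratio_ci` and its parameter `z` are invented here, and the check that rejects zero counts is an added safeguard (the formula divides by each count).

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Return (odds_ratio, low, high) for a 2x2 table.

    a, b: events and non-events in the first row (here, AZT)
    c, d: events and non-events in the second row (here, placebo)
    z:    1.96 gives a 95% interval
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a / b) / (c / d)
    log_or = math.log(odds_ratio)
    # Half-width of the interval on the log scale.
    margin = z * math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    # exp() is the antilogarithm of the natural log.
    return odds_ratio, math.exp(log_or - margin), math.exp(log_or + margin)


or_value, low, high = odds_ratio_ci(72, 411, 154, 335)
print(f"OR = {or_value:.2f}, 95% CI = {low:.2f} to {high:.2f}")
```

For the AZT table this prints `OR = 0.38, 95% CI = 0.28 to 0.52`. The hand calculation gives .29 to .53 because it rounds at each intermediate step (.39 for the odds ratio, .025 under the square root), whereas the script rounds only when printing.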
More Array Stuff in Python
Consider this script which takes input from the command line and then stores it in an array:
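A minimal sketch of a script along these lines, assuming the arguments are whole numbers. The function name `read_counts` and the file name `store_args.py` are made up for illustration, and this is not necessarily the course's own listing. Python's built-in list plays the role of the array.

```python
import sys


def read_counts(args):
    """Convert a list of command-line strings into a list of integers."""
    counts = []
    for text in args:
        counts.append(int(text))
    return counts


if __name__ == "__main__":
    # sys.argv[0] is the script name; the remaining items are the arguments.
    try:
        counts = read_counts(sys.argv[1:])
    except ValueError:
        print("every argument must be a whole number")
    else:
        print("stored", len(counts), "values:", counts)
```

Run as `python store_args.py 72 411 154 335`, it prints `stored 4 values: [72, 411, 154, 335]`. Command-line arguments always arrive as strings, which is why each one is passed through `int()` before being stored.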
Now inspect the behavior of the for-in loop shown in this script:
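As an illustration of for-in behavior, here is a short sketch that uses the four cell counts from the AZT table as sample data. The variable names are hypothetical and the example is not taken from the course's own script.

```python
cells = [72, 411, 154, 335]

# for-in visits each item in order; the loop variable takes each value
# in turn, and assigning to it would not modify the list.
total = 0
for count in cells:
    total += count
print(total)  # 972

# enumerate() supplies the position alongside the value when both are needed.
labels = []
for index, count in enumerate(cells):
    labels.append(f"cell {index}: {count}")
print(labels[0])  # cell 0: 72
```

Unlike a counting loop in C-style languages, for-in needs no index variable or length check. The loop ends on its own once the last item has been visited.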
ASSIGNMENT:
Write a Python script to calculate the 95% CI of the Odds Ratio based on information presented in a contingency table. Create at least three contingency tables with which to test your script. Your script should report the raw contingency ratio as well as the high and low of the 95% CI for it.