|
After a clinical trial is carried out there is a follow-up period, during which the scientists attempt to determine whether or not the proposed treatment was successful. One of the most common outcome measures in clinical trials is death, recorded as the length of time each patient survives.
However, patients involved in trials do not always complete the follow-up, which leaves the data incomplete. For example, if the follow-up period is 12 years, a patient may leave after 4 years. Such a patient complicates the analysis, because that patient's outcome within the prescribed timeframe cannot be determined.
The Kaplan-Meier estimate of the survival curve gives researchers a workable interpretation of such data: each patient contributes information for as long as that patient was actually observed.
|
|
Censored Data:
- Patient data that does not cover the entire follow-up period, i.e. data for a patient who left the study for a reason other than death.
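One common way to write the Kaplan-Meier estimate makes the role of censored data explicit. The symbols below are not from the original text and are introduced only for illustration: t_i are the distinct times at which deaths occur, d_i is the number of deaths at t_i, and n_i is the number of patients still at risk just before t_i (patients censored before t_i are no longer counted in n_i).

```latex
\hat{S}(t) \;=\; \prod_{i \,:\, t_i \le t} \left( 1 - \frac{d_i}{n_i} \right)
```

A censored patient therefore never counts as a death, but does reduce the number at risk for every later interval.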
|
|
Example:
Taken from Survival Curves: Accrual and The Kaplan-Meier Estimate at http://www.cancerguide.org/scurve_km.html. CancerGuide (May 28, 2003).
Dataset:
This example involves seven patients. The numbers below represent, in order, how many years each patient lived following the trial.
The total desired follow-up period was 12 years.
It should be noted that a plus (+) sign after a number indicates that the patient left the study after the specified number of years and was, at that point, alive.
From this data, the following table could be constructed:
Interval (Start-End) | # at Risk at Start of Interval | # Censored During Interval | # at Risk at End of Interval | # Who Died at End of Interval | Proportion Surviving This Interval | Cumulative Survival at End of Interval |
0-1 | 7 | 0 | 7 | 1 | 6/7 = 0.86 | 0.86 |
1-4 | 6 | 2 | 4 | 1 | 3/4 = 0.75 | 0.86*0.75 = 0.64 |
4-10 | 3 | 1 | 2 | 1 | 1/2 = 0.5 | 0.86*0.75*0.5 = 0.32 |
10-12 | 1 | 0 | 1 | 0 | 1/1 = 1 | 0.86*0.75*0.5*1 = 0.32 |
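The arithmetic in the table can be checked with a short script. This is only a minimal sketch: the counts are copied from the table, while the names `ROWS` and `cumulative_survival` are made up for this illustration.

```python
from fractions import Fraction

# (interval, # at risk at start, # censored during interval, # who died at end)
# Counts are copied from the table above.
ROWS = [
    ("0-1", 7, 0, 1),
    ("1-4", 6, 2, 1),
    ("4-10", 3, 1, 1),
    ("10-12", 1, 0, 0),
]


def cumulative_survival(rows):
    """Return (interval, proportion surviving, cumulative survival) per row."""
    results = []
    cumulative = Fraction(1)
    for label, at_risk_start, censored, deaths in rows:
        # Censored patients leave the risk set before the deaths are counted.
        at_risk_end = at_risk_start - censored
        proportion = Fraction(at_risk_end - deaths, at_risk_end)
        cumulative *= proportion
        results.append((label, float(proportion), float(cumulative)))
    return results


if __name__ == "__main__":
    for label, proportion, cumulative in cumulative_survival(ROWS):
        print(f"{label:>5}  {proportion:.2f}  {cumulative:.2f}")
```

Working in exact fractions avoids compounding the rounding error that comes from multiplying already-rounded proportions; the product 6/7 × 3/4 × 1/2 is 9/28, roughly 0.32.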
Using this information, a graph was constructed.
The Kaplan-Meier analysis program that I used was written by Wenting Zhou, a member of my research group. The program takes as input a file containing each survival time and whether or not that observation was censored. It outputs a file readable by Microsoft Excel, which was then used to plot the graph. Each group was entered into the program separately, so that the program created an independent curve for each group.
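For readers who want to try this kind of workflow, here is a rough sketch of what such a program might look like. It is not Wenting Zhou's program. The input format (one `time,censored` pair per line, with 1 meaning the patient left alive and 0 meaning the patient died), the function names, and the `SAMPLE` values are all assumptions made for this illustration; the sample values were chosen only to be consistent with the counts in the table above and are not the original dataset.

```python
import csv
import io

# Hypothetical input: "time,censored" per line (1 = left alive, 0 = died).
# Illustrative values only, chosen to match the counts in the table.
SAMPLE = "1,0\n2,1\n3,1\n4,0\n5,1\n10,0\n12,1\n"


def read_records(stream):
    """Parse rows of 'time,censored' into (time, censored) pairs."""
    return [(float(row[0]), row[1].strip() == "1")
            for row in csv.reader(stream) if row]


def kaplan_meier(records):
    """Return [(time, survival)] with one point per distinct death time."""
    records = sorted(records)
    at_risk = len(records)
    survival = 1.0
    curve = [(0.0, 1.0)]
    i = 0
    while i < len(records):
        t = records[i][0]
        deaths = leaving = 0
        # Group all patients who share the same time.
        while i < len(records) and records[i][0] == t:
            if not records[i][1]:
                deaths += 1
            leaving += 1
            i += 1
        if deaths:
            survival *= (at_risk - deaths) / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= leaving
    return curve


def write_curve(curve, stream):
    """Write the curve as CSV, which Microsoft Excel can open."""
    writer = csv.writer(stream)
    writer.writerow(["time", "survival"])
    for t, s in curve:
        writer.writerow([t, round(s, 4)])


if __name__ == "__main__":
    out = io.StringIO()
    write_curve(kaplan_meier(read_records(io.StringIO(SAMPLE))), out)
    print(out.getvalue())
```

To get an independent curve for each group, `kaplan_meier` would simply be called once per group's records.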
|
|
The Harvard Dataset can be found on the CAMDA website at www.camda.duke.edu/camda03/contest.asp. It contains five distinct groups: adenocarcinomas (AD), squamous (SQ), carcinoid (COID), SMLC, and normal lung (NL). There are approximately 200 samples in the dataset, and 12,600 genes were analyzed for each of them.
|
|
The clusters do not appear to be particularly related to survival. Please click on the link below to get to the Kaplan-Meier curves.
|
|
I will be working on a new clustering algorithm soon; when the new clusters are created, I will use this analysis again.
|