Cross-Validation

Basic Ideas Behind Cross-Validation

Methods Employed

My Results

Basic Ideas Behind Cross-Validation

When a clustering program is developed in a supervised setting, where the true group of every sample is known, it is necessary to check that it will also perform well in an unsupervised setting, where the groups are not known. Cross-validation is used to make this check.

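In cross-validation, a portion of the data is set aside as training data, leaving the remainder as testing data. The program is built using only the training data and is then applied to the testing data, whose known groups are withheld from it.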

The quality of the program's performance on the testing data indicates how well it would perform in an unsupervised setting.
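As a rough illustration of this idea, the sketch below splits sample indices into a training part and a testing part and scores predictions against the known groups. The names `holdout_split` and `success_rate`, the 25% test fraction, and the fixed random seed are assumptions made for this example; they are not taken from the original program.

```python
import random

def holdout_split(n_samples, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx): shuffled indices split into two disjoint parts.

    test_fraction and seed are illustrative defaults, not values from the original.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]

def success_rate(predicted, actual):
    """Fraction of test samples whose predicted group matches the known group."""
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)

# 20 samples: 15 are used for training, 5 are held back for testing.
train_idx, test_idx = holdout_split(n_samples=20, test_fraction=0.25)
```

The program would be built from the samples in `train_idx` only, and `success_rate` would then compare its predictions for the samples in `test_idx` with their known groups.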

Methods Employed

The user was first asked to input the desired level of cross-validation, CV.

Using this information, the data was partitioned into CV equal groups. CV-1 of these groups together formed the training set, and the remaining group was the test set.

The program was run and its success rate on the test set was recorded.

Then a different group was made the test set, with the other groups serving as training data. This occurred CV times, so that each group was the test group exactly once.

The success rates were averaged over all CV trials to arrive at the final success rate.
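The steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not the original program: the function names, the shuffling with a fixed seed, and the toy nearest-centroid classifier on one-dimensional values are all hypothetical stand-ins chosen so the example runs on its own.

```python
import random
from statistics import mean

def make_folds(n_samples, cv, seed=0):
    """Partition shuffled sample indices into cv groups whose sizes differ by at most one."""
    if not 2 <= cv <= n_samples:
        raise ValueError("cv must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::cv] for i in range(cv)]

def cross_validate(samples, labels, cv, fit, predict, seed=0):
    """Average success rate over cv trials; each group is the test group exactly once."""
    folds = make_folds(len(samples), cv, seed)
    rates = []
    for i, test_idx in enumerate(folds):
        # The other cv - 1 groups together form the training set.
        train_idx = [j for k, fold in enumerate(folds) if k != i for j in fold]
        model = fit([samples[j] for j in train_idx], [labels[j] for j in train_idx])
        predicted = [predict(model, samples[j]) for j in test_idx]
        correct = sum(p == labels[j] for p, j in zip(predicted, test_idx))
        rates.append(correct / len(test_idx))
    return mean(rates)

# Hypothetical stand-in for "the program": a nearest-centroid classifier.
def fit_centroids(xs, ys):
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {y: mean(values) for y, values in groups.items()}

def predict_nearest(centroids, x):
    return min(centroids, key=lambda y: abs(x - centroids[y]))

# Toy data: two well-separated groups of five samples each.
samples = [0.1, 0.2, 0.3, 0.4, 0.5, 5.1, 5.2, 5.3, 5.4, 5.5]
labels = ["A"] * 5 + ["B"] * 5
final_rate = cross_validate(samples, labels, cv=5,
                            fit=fit_centroids, predict=predict_nearest)
```

Passing `fit` and `predict` as arguments keeps the partitioning and averaging logic separate from whichever clustering program is being evaluated.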

My Results:
