Blaise Hanczar and Edward R. Dougherty, Pages 29-39 (11)
The aim of many microarray experiments is to build discriminatory models for diagnosis and prognosis. A large number of supervised methods have been proposed in the literature for microarray-based classification. Model comparison, which relies on estimating classification error, is therefore a critical issue. Previous studies have shown that error estimation is unreliable in high-dimensional, small-sample settings. This naturally calls into question the validity of the classification-rule comparison approaches used in the literature. In this paper, we present a brief review of the different comparison methods used in bioinformatics. We then test these methods on a set of simulations based on both synthetic and real data. These simulations cover different feature-label distributions, classification rules, error estimators and variance estimators. The results show that none of these methods provides a reliable comparison across a wide spectrum of feature-label distributions and classification rules.
Keywords: microarray classification, error estimation, classifier comparison, variance study
LIPADE, University Paris Descartes, 45 rue des Saint-Peres, 75006 Paris, France.