ID: 596802.0, MPI für Intelligente Systeme (ehemals Max-Planck-Institut für Metallforschung) / Abt. Schölkopf (Empirical Inference)
ccSVM: correcting Support Vector Machines for confounding factors in biological data classification
Authors: Li, L.; Rakitsch, B.; Borgwardt, K.
Date of Publication (YYYY-MM-DD): 2011-07-01
Title of Journal: Bioinformatics
Issue / Number: 13
Start Page: i342
End Page: i348
Sequence Number of Article: btr204
Title of Issue: ISMB/ECCB 2011
Review Status: not specified
Audience: Not Specified
Intended Educational Use: No
Abstract / Description: Motivation: Classifying biological data into different groups is a central task of bioinformatics: for instance, to predict the function of a gene or protein, the disease state of a patient or the phenotype of an individual based on its genotype. Support Vector Machines are a widespread approach for classifying biological data, due to their high accuracy, their ability to deal with structured data such as strings, and the ease of integrating various types of data. However, it is unclear how to correct for confounding factors such as population structure, age, gender or experimental conditions in Support Vector Machine classification.

Results: In this article, we present a Support Vector Machine classifier that can correct the prediction for observed confounding factors. This is achieved by minimizing the statistical dependence between the classifier and the confounding factors. We prove that this formulation can be transformed into a standard Support Vector Machine with rescaled input data. In our experiments, our confounder-correcting SVM (ccSVM) improves tumor diagnosis based on samples from different labs, tuberculosis diagnosis in patients of varying age, ethnicity and gender, and phenotype prediction in the presence of population structure, and it outperforms state-of-the-art methods in terms of prediction accuracy.
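
Illustration (not part of the original record): the abstract does not say which dependence measure is minimized. As a minimal sketch, assuming a kernel-based measure such as the biased empirical Hilbert-Schmidt Independence Criterion (HSIC), the Python snippet below shows how the statistical dependence between classifier scores and an observed confounder could be quantified. The function names, the RBF kernel, and the synthetic data are hypothetical choices made for this example; this is not the authors' ccSVM implementation.

```python
import numpy as np


def rbf_kernel(x, gamma=1.0):
    """RBF (Gaussian) kernel matrix for a 1-D or 2-D sample array (assumed kernel choice)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    sq = np.sum(x ** 2, axis=1)
    # Squared Euclidean distances, clipped at zero to absorb rounding error.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (x @ x.T), 0.0)
    return np.exp(-gamma * d2)


def empirical_hsic(K, L):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2 with centering matrix H."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2


# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
n = 200
confounder = rng.normal(size=n)                            # e.g. a standardized age
dependent_scores = confounder + 0.1 * rng.normal(size=n)   # scores that track the confounder
independent_scores = rng.normal(size=n)                    # scores unrelated to the confounder

L_conf = rbf_kernel(confounder)
hsic_dep = empirical_hsic(rbf_kernel(dependent_scores), L_conf)
hsic_ind = empirical_hsic(rbf_kernel(independent_scores), L_conf)

print("HSIC (scores follow confounder):", hsic_dep)
print("HSIC (scores independent):      ", hsic_ind)
```

Under these assumptions, scores that follow the confounder yield a clearly larger HSIC value than independent scores. A confounder-correcting classifier in the sense of the abstract would aim to push such a dependence measure down while retaining predictive accuracy.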
External Publication Status: published
Document Type: Article
Communicated by: Heide Klooz
Affiliations: MPI für Intelligente Systeme/Abt. Schölkopf