
ID: 548527.0, MPI für biologische Kybernetik / Biologische Kybernetik
Title: Kernel Choice and Classifiability for RKHS Embeddings of Probability Distributions
Authors: Sriperumbudur, B.K.; Fukumizu, K.; Gretton, A.; Lanckriet, G.R.G.; Schölkopf, B.
Editors: Bengio, Y.; Schuurmans, D.; Lafferty, J.; Williams, C.; Culotta, A.
Date of Publication (YYYY-MM-DD): 2010-04
Title of Proceedings: Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009
Start Page: 1750
End Page: 1758
Physical Description: 9 pages
Audience: Not Specified
Intended Educational Use: No
Abstract / Description: Embeddings of probability measures into reproducing kernel Hilbert spaces have been proposed as a straightforward and practical means of representing and comparing probabilities. In particular, the distance between embeddings (the maximum mean discrepancy, or MMD) has several key advantages over many classical metrics on distributions, namely easy computability, fast convergence, and low bias of finite-sample estimates. An important requirement of the embedding RKHS is that it be characteristic: in this case, the MMD between two distributions is zero if and only if the distributions coincide. Three new results on the MMD are introduced in the present study. First, it is established that the MMD corresponds to the optimal risk of a kernel classifier, thus forming a natural link between the distance between distributions and their ease of classification. An important consequence is that a kernel must be characteristic to guarantee classifiability between distributions in the RKHS. Second, the class of characteristic kernels is broadened to incorporate all strictly positive definite kernels: these include non-translation-invariant kernels and kernels on non-compact domains. Third, a generalization of the MMD is proposed for families of kernels, as the supremum over MMDs on a class of kernels (for instance, the Gaussian kernels with different bandwidths). This extension is necessary to obtain a single distance measure when a large selection or class of characteristic kernels is potentially appropriate. This generalization is reasonable, given that it corresponds to the problem of learning the kernel by minimizing the risk of the corresponding kernel classifier. The generalized MMD is shown to have consistent finite-sample estimates, and its performance is demonstrated on a homogeneity testing example.
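To make the quantities in the abstract concrete, the following is a minimal NumPy sketch (not part of the original record) of a finite-sample MMD estimate with a Gaussian kernel, plus the supremum-over-bandwidths variant that corresponds to the paper's generalized MMD. Function names, the biased (V-statistic) estimator, and the bandwidth grid are illustrative choices, not the authors' implementation.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian RBF kernel matrix k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    diff = x[:, None, :] - y[None, :, :]          # pairwise differences, shape (n, m, d)
    sq_dists = np.sum(diff ** 2, axis=-1)         # squared Euclidean distances
    return np.exp(-sq_dists / (2 * bandwidth ** 2))

def mmd2_biased(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples x and y."""
    k_xx = gaussian_kernel(x, x, bandwidth)
    k_yy = gaussian_kernel(y, y, bandwidth)
    k_xy = gaussian_kernel(x, y, bandwidth)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()

def generalized_mmd2(x, y, bandwidths):
    """Supremum of squared MMD over a family of Gaussian kernels (illustrative)."""
    return max(mmd2_biased(x, y, b) for b in bandwidths)

rng = np.random.default_rng(0)
# Two samples from the same distribution vs. two from shifted distributions.
same = mmd2_biased(rng.normal(0, 1, (200, 1)), rng.normal(0, 1, (200, 1)))
diff = mmd2_biased(rng.normal(0, 1, (200, 1)), rng.normal(2, 1, (200, 1)))
sup_diff = generalized_mmd2(rng.normal(0, 1, (200, 1)),
                            rng.normal(2, 1, (200, 1)),
                            bandwidths=[0.5, 1.0, 2.0])
```

For samples drawn from the same distribution the estimate is small, while a mean shift yields a clearly larger value, which is the behavior the homogeneity-testing example in the paper exploits.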
External Publication Status: published
Document Type: Conference-Paper
Communicated by: Holger Fischer
Affiliations: MPI für biologische Kybernetik / Empirical Inference (Dept. Schölkopf)