TY - JOUR
T1 - On convergence of kernel learning estimators
AU - Norkin, V.I.
AU - Keyzer, M.A.
PY - 2009
Y1 - 2009
N2 - The paper studies convex stochastic optimization problems in a reproducing kernel Hilbert space (RKHS). The objective (risk) functional depends on functions from this RKHS and takes the form of a mathematical expectation (integral) of a nonnegative integrand (loss function) over a probability measure. The problem is generally ill-posed, a difficulty that statistical learning addresses through Tikhonov regularization combined with Monte Carlo approximation of the integrals, which also makes it possible to solve the problem by finite-dimensional (convex) quadratic optimization. The approximate solutions are referred to as kernel learning estimators and are expressed as linear combinations of kernels evaluated at the sample points. They are functional random variables that depend on the full sample. The paper studies the probabilistic convergence of these approximate solutions as the regularization parameter is gradually eliminated with a growing number of observations. Its intended contribution is to derive novel nonasymptotic bounds on the minimization error and exponential bounds on the tail distribution of errors, and to establish novel sufficient conditions for uniform convergence of kernel estimators to the true (normal) solution with probability one, jointly with a rule for downward adjustment of the regularization factor with increasing sample size. Applications to least squares, median, and quantile regression estimation, as well as to binary classification, are discussed. © 2009 Society for Industrial and Applied Mathematics.
AB - The paper studies convex stochastic optimization problems in a reproducing kernel Hilbert space (RKHS). The objective (risk) functional depends on functions from this RKHS and takes the form of a mathematical expectation (integral) of a nonnegative integrand (loss function) over a probability measure. The problem is generally ill-posed, a difficulty that statistical learning addresses through Tikhonov regularization combined with Monte Carlo approximation of the integrals, which also makes it possible to solve the problem by finite-dimensional (convex) quadratic optimization. The approximate solutions are referred to as kernel learning estimators and are expressed as linear combinations of kernels evaluated at the sample points. They are functional random variables that depend on the full sample. The paper studies the probabilistic convergence of these approximate solutions as the regularization parameter is gradually eliminated with a growing number of observations. Its intended contribution is to derive novel nonasymptotic bounds on the minimization error and exponential bounds on the tail distribution of errors, and to establish novel sufficient conditions for uniform convergence of kernel estimators to the true (normal) solution with probability one, jointly with a rule for downward adjustment of the regularization factor with increasing sample size. Applications to least squares, median, and quantile regression estimation, as well as to binary classification, are discussed. © 2009 Society for Industrial and Applied Mathematics.
UR - https://www.scopus.com/pages/publications/73249151931
UR - https://www.scopus.com/inward/citedby.url?scp=73249151931&partnerID=8YFLogxK
U2 - 10.1137/070696817
DO - 10.1137/070696817
M3 - Article
SN - 1052-6234
VL - 20
SP - 1205
EP - 1223
JO - SIAM Journal on Optimization
JF - SIAM Journal on Optimization
IS - 3
ER -