VC dimension

A neural network $W$ can shatter a set of input vectors $\ve{x}_i \iff \forall$ training set $(\ve{x}_i,\ve{d}_i) \quad \exists \quad \ve{w_{opt}} \mid \hat{R}_{w_{opt}} = 0 $

The Vapnik-Chervonenkis dimension or VCdim for short is the cardinality2 of the largest set of input vectors that the network $W$ can shatter.

\begin{displaymath}VCdim = h\end{displaymath}

\includegraphics[clip,width=6cm]{vc_dim.eps}

With probability $1-\eta$

\begin{displaymath}R_w \leq \hat{R}_w + \sqrt{\frac{h log(\frac{2N}{h}+1)-log\frac{\eta}{4}}{N}}\end{displaymath}

So we can estimate the error with the training error and the Vapnik-Chervonenkis dimension.

If $\nu$ is the relative frequency of the classification error (measured) and $\zeta$ is the event of a classification error, so $P(\zeta)$ is the probability of a clasification error:

\begin{displaymath}P(sup_w\vert P(\zeta)-\nu\vert>\epsilon) < \frac{2eN}{h}^he^{-\epsilon^2N}\end{displaymath}

If $h$ is finite, then the empirical risk converges.

Pedro Larroy 2005-04-29