B.1 Polynomial Kernels

The construction that we will follow below is to first write feature vectors as products of subsets of input attributes, i.e. define feature vectors as follows,

$$\phi_I(\mathbf{x}) = x_1^{i_1}\, x_2^{i_2} \cdots x_n^{i_n} \tag{B.1}$$

where we can put various restrictions on the possible combinations of indices which are allowed. For instance, we could require that their sum is a constant, i.e. there are precisely $s$ terms in the product. Or we could require that each $i_j \in \{0, 1\}$, so that every attribute appears at most once. Generally speaking, the best choice depends on the problem you are modelling, but another important constraint is that the corresponding kernel must be easy to compute.
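To make the construction concrete, here is a minimal sketch (the function names are my own, not from the text) that enumerates the exponent vectors $I$ under the first restriction, $i_1 + \cdots + i_n = s$, and evaluates the corresponding monomial features of eqn. B.1:

```python
from itertools import product

def multi_indices(n, s):
    """All exponent vectors I = (i_1, ..., i_n) of non-negative
    integers summing to s, i.e. monomials of total degree s."""
    return [I for I in product(range(s + 1), repeat=n) if sum(I) == s]

def phi(x, I):
    """The monomial feature x_1^{i_1} * x_2^{i_2} * ... * x_n^{i_n} (eqn. B.1)."""
    out = 1.0
    for x_j, i_j in zip(x, I):
        out *= x_j ** i_j
    return out

# For n = 2 attributes and degree s = 2 the allowed features are
# x2^2, x1*x2 and x1^2:
print(multi_indices(2, 2))                                # [(0, 2), (1, 1), (2, 0)]
print([phi([3.0, 2.0], I) for I in multi_indices(2, 2)])  # [4.0, 6.0, 9.0]
```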

Let’s define the kernel as usual as,

$$K(\mathbf{x}, \mathbf{y}) = \sum_I \phi_I(\mathbf{x})\, \phi_I(\mathbf{y}) \tag{B.2}$$

where $I = [i_1, i_2, \ldots, i_n]$. We have already encountered the polynomial kernel as,

$$K(\mathbf{x}, \mathbf{y}) = (R + \mathbf{x}^T \mathbf{y})^d = \sum_{s=0}^{d} \binom{d}{s}\, R^{d-s}\, (\mathbf{x}^T \mathbf{y})^s \tag{B.3}$$

where the last equality follows from a binomial expansion.

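As a quick numerical sanity check (a sketch I am adding here, with arbitrary test values), both sides of eqn. B.3 can be evaluated and compared directly:

```python
import math

def poly_kernel(x, y, R, d):
    """Left-hand side of eqn. B.3: K(x, y) = (R + x^T y)^d."""
    return (R + sum(x_i * y_i for x_i, y_i in zip(x, y))) ** d

def poly_kernel_binomial(x, y, R, d):
    """Right-hand side of eqn. B.3: sum_s C(d, s) R^(d-s) (x^T y)^s."""
    xy = sum(x_i * y_i for x_i, y_i in zip(x, y))
    return sum(math.comb(d, s) * R ** (d - s) * xy ** s
               for s in range(d + 1))

x, y, R, d = [1.0, 2.0, -0.5], [0.3, -1.0, 2.0], 2.0, 3
assert abs(poly_kernel(x, y, R, d) - poly_kernel_binomial(x, y, R, d)) < 1e-9
```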

If we write out the $(\mathbf{x}^T \mathbf{y})^s$ term,

$$(\mathbf{x}^T \mathbf{y})^s = (x_1 y_1 + x_2 y_2 + \cdots + x_n y_n)^s = \sum_{i_1 + i_2 + \cdots + i_n = s} \frac{s!}{i_1!\, i_2! \cdots i_n!}\, (x_1 y_1)^{i_1} (x_2 y_2)^{i_2} \cdots (x_n y_n)^{i_n} \tag{B.4}$$

Taken together with eqn. B.3 we see that the features correspond to,

$$\phi_I(\mathbf{x}) = \sqrt{\binom{d}{s}\, R^{d-s}\, \frac{s!}{i_1!\, i_2! \cdots i_n!}}\; x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n}, \quad \text{with } i_1 + i_2 + \cdots + i_n = s \le d \tag{B.5}$$

The point is really that, in order to efficiently compute the total sum of $\frac{(n+d)!}{n!\, d!}$ terms, we have inserted very special coefficients. The only true freedom we have left is in choosing $R$: for larger $R$ we down-weight the higher order polynomials more.
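To see that these coefficients really do reproduce the kernel, here is a sketch (helper names again my own) that builds the explicit feature vector of eqn. B.5 over all $\binom{n+d}{d}$ multi-indices with $s \le d$ and checks that the inner product of two such feature vectors equals $(R + \mathbf{x}^T \mathbf{y})^d$, assuming $R \ge 0$ so the square roots are real:

```python
import math
from itertools import product

def features(x, R, d):
    """Explicit feature vector of eqn. B.5: one entry per multi-index
    I = (i_1, ..., i_n) with i_1 + ... + i_n = s <= d."""
    n = len(x)
    feats = []
    for I in product(range(d + 1), repeat=n):
        s = sum(I)
        if s > d:
            continue
        # Coefficient C(d, s) * R^(d-s) * s! / (i_1! * ... * i_n!)
        coeff = math.comb(d, s) * R ** (d - s) * math.factorial(s)
        for i_j in I:
            coeff /= math.factorial(i_j)
        monomial = 1.0
        for x_j, i_j in zip(x, I):
            monomial *= x_j ** i_j
        feats.append(math.sqrt(coeff) * monomial)
    return feats

x, y, R, d = [1.0, -0.7], [0.4, 2.0], 1.5, 3
fx, fy = features(x, R, d), features(y, R, d)
explicit = sum(a * b for a, b in zip(fx, fy))
kernel = (R + sum(x_i * y_i for x_i, y_i in zip(x, y))) ** d
assert abs(explicit - kernel) < 1e-9            # same value, both ways
assert len(fx) == math.comb(len(x) + d, d)      # (n+d)! / (n! d!) features
```

The direct evaluation costs $O(n)$ operations per kernel value, while the explicit expansion touches all $\frac{(n+d)!}{n!\, d!}$ features, which is exactly why these special coefficients matter.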

The question we want to answer is: how much freedom do we have in choosing different coefficients while still being able to compute the inner product efficiently?
