
We want to maximize this objective, because this would maximize the correlation between the univariates $u$ and $v$. Note that we divided by the standard deviation of the projections to remove scale dependence.

This exposition is very similar to the Fisher discriminant analysis story and I encourage you to reread that. For instance, there you can find how to generalize to cases where the data is not centered. We also introduced the following "trick". Since we can rescale $a$ and $b$ without changing the problem, we can constrain the variances $E[u^2]$ and $E[v^2]$ to be equal to $1$. This then allows us to write the problem as,

$$
\begin{aligned}
\underset{a,\,b}{\text{maximize}} \quad & \rho = E[uv] \\
\text{subject to} \quad & E[u^2] = 1 \\
& E[v^2] = 1
\end{aligned}
\qquad (14.2)
$$
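To see why fixing these variances loses no generality, note that the correlation is invariant under (positive) rescaling of $a$ and $b$:

$$
\rho(\alpha a, \beta b)
= \frac{E[(\alpha\, a^T x)(\beta\, b^T y)]}{\sqrt{E[(\alpha\, a^T x)^2]}\,\sqrt{E[(\beta\, b^T y)^2]}}
= \frac{\alpha\beta\, E[uv]}{\alpha\beta\,\sqrt{E[u^2]}\,\sqrt{E[v^2]}}
= \rho(a, b), \qquad \alpha, \beta > 0,
$$

so we can always rescale $a$ and $b$ until $E[u^2] = E[v^2] = 1$ without changing the optimum.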

Or, if we construct the Lagrangian and write out the expectations, we find,

$$
\mathcal{L} \;=\; \sum_i a^T x_i\, y_i^T b
\;-\; \frac{\lambda_1}{2}\Big(\sum_i a^T x_i\, x_i^T a - N\Big)
\;-\; \frac{\lambda_2}{2}\Big(\sum_i b^T y_i\, y_i^T b - N\Big)
\qquad (14.3)
$$

where we have multiplied by $N$. Let's take derivatives with respect to $a$ and $b$ to see what the KKT equations tell us,

$$
\frac{\partial \mathcal{L}}{\partial a} \;=\; \sum_i x_i\, y_i^T b \;-\; \lambda_1 \sum_i x_i\, x_i^T a \;=\; 0
\qquad (14.4)
$$

$$
\frac{\partial \mathcal{L}}{\partial b} \;=\; \sum_i y_i\, x_i^T a \;-\; \lambda_2 \sum_i y_i\, y_i^T b \;=\; 0
\qquad (14.5)
$$

First notice that if we multiply the first equation with $a^T$ and the second with $b^T$ and subtract the two, while using the constraints, we arrive at $\lambda_1 = \lambda_2 = \lambda$.
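Spelling this step out: multiplying (14.4) by $a^T$ and (14.5) by $b^T$, and using the constraints $\sum_i (a^T x_i)^2 = \sum_i (b^T y_i)^2 = N$, gives

$$
a^T \Big(\sum_i x_i\, y_i^T\Big) b \;=\; \lambda_1 N,
\qquad
b^T \Big(\sum_i y_i\, x_i^T\Big) a \;=\; \lambda_2 N .
$$

The two left-hand sides are transposes of the same scalar and hence equal, so subtracting the equations yields $\lambda_1 = \lambda_2$.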

Next, rename $S_{xy} = \sum_i x_i y_i^T$, $S_x = \sum_i x_i x_i^T$ and $S_y = \sum_i y_i y_i^T$. We define the following larger matrices: $S_D$ is the block diagonal matrix with $S_x$ and $S_y$ on the diagonal and zeros on the off-diagonal blocks. Also, we define $S_O$ to be the matrix with zeros on the diagonal blocks and $S_{xy}$ (respectively $S_{yx} = S_{xy}^T$) on the off-diagonal blocks. Finally we define $c = [a, b]$. The two equations can then be written jointly as,

$$
S_O\, c \;=\; \lambda\, S_D\, c
\quad\Longleftrightarrow\quad
S_D^{-1} S_O\, c \;=\; \lambda\, c
\qquad (14.6)
$$

which is again a regular eigenvalue equation for $c$ (assuming $S_D$ is invertible).
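As a concrete illustration, here is a minimal numerical sketch of this eigenvalue formulation. It is not from the text: the helper name `linear_cca` is my own, I use `scipy.linalg.eigh` for the generalized symmetric eigenproblem, and I assume centered data with full-rank scatter matrices so that $S_D$ is positive definite.

```python
import numpy as np
from scipy.linalg import eigh

def linear_cca(X, Y):
    """Top canonical pair via the generalized eigenproblem S_O c = lambda S_D c."""
    N, dx = X.shape
    _, dy = Y.shape

    # Scatter matrices as defined above (sums over data points).
    Sx, Sy, Sxy = X.T @ X, Y.T @ Y, X.T @ Y

    # S_D: block diagonal with S_x, S_y; S_O: S_xy on the off-diagonal blocks.
    S_D = np.block([[Sx, np.zeros((dx, dy))],
                    [np.zeros((dy, dx)), Sy]])
    S_O = np.block([[np.zeros((dx, dx)), Sxy],
                    [Sxy.T, np.zeros((dy, dy))]])

    # Solve S_O c = lam * S_D c; eigh returns eigenvalues in ascending order.
    lams, C = eigh(S_O, S_D)
    c = C[:, -1]               # eigenvector for the largest eigenvalue
    a, b = c[:dx], c[dx:]
    return a, b, lams[-1]      # the top eigenvalue is the canonical correlation

# Toy usage: two views sharing a one-dimensional latent signal.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
X = np.hstack([z, rng.normal(size=(500, 2))])
Y = np.hstack([-z, rng.normal(size=(500, 2))])
X -= X.mean(axis=0); Y -= Y.mean(axis=0)   # center, as the text assumes
a, b, rho = linear_cca(X, Y)               # rho should be close to 1
```

In practice one often adds a small ridge term to $S_x$ and $S_y$ to keep $S_D$ well conditioned before solving.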
