Nearest Neighbors Classification

Perhaps the simplest algorithm to perform classification is the "k nearest neighbors" (kNN) classifier. As usual we assume that we have data of the form $\{X_{in}, Y_n\}$, where $X_{in}$ is the value of attribute $i$ for data-case $n$ and $Y_n$ is the label for data-case $n$. We also need a measure of similarity between data-cases, which we will denote by $K(X_n, X_m)$, where larger values of $K$ denote more similar data-cases.

Given these preliminaries, classification is embarrassingly simple: when you are provided with the attributes $X_t$ for a new (unseen) test-case, you first find the $k$ most similar data-cases in the dataset by computing $K(X_t, X_n)$ for all $n$. Call this set $S$. Then each of these $k$ most similar neighbors in $S$ casts a vote on the label of the test-case, where each neighbor predicts that the test-case has the same label as itself. Assuming binary labels and an odd number of neighbors, this always results in a decision.
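A minimal sketch of this voting scheme is given below, assuming binary $0/1$ labels and using negative Euclidean distance as an illustrative stand-in for the similarity measure $K$; the function name `knn_predict` and these choices are not from the text.

```python
import numpy as np

def knn_predict(X_train, Y_train, x_test, k=3):
    """Majority-vote kNN for binary 0/1 labels.

    Negative Euclidean distance plays the role of the similarity
    measure K(X_n, X_m); any other similarity could be substituted.
    """
    # Similarity K(X_t, X_n) of the test-case to every data-case n.
    similarities = -np.linalg.norm(X_train - x_test, axis=1)
    # S: indices of the k most similar data-cases.
    S = np.argsort(similarities)[-k:]
    # Each neighbor in S votes for its own label; the majority wins.
    return int(np.sum(Y_train[S]) > k / 2)
```

With an odd `k`, the vote `np.sum(Y_train[S]) > k / 2` can never tie, matching the remark above.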

Although kNN algorithms are often associated with this simple voting scheme, more sophisticated ways of combining the information from these neighbors are allowed. For instance, one could weight each vote by the similarity to the test-case. This results in the following decision rule,

$$Y_t = 1 \quad \text{if} \quad \sum_{n \in S} K(X_t, X_n)\,(2Y_n - 1) > 0 \qquad (5.1)$$
$$Y_t = 0 \quad \text{if} \quad \sum_{n \in S} K(X_t, X_n)\,(2Y_n - 1) < 0 \qquad (5.2)$$

and flipping a coin if the sum is exactly $0$.
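A sketch of this weighted rule, under the same assumptions as before, might look as follows; the Gaussian kernel and the function name `knn_predict_weighted` are illustrative choices, not part of the text.

```python
import numpy as np

def knn_predict_weighted(X_train, Y_train, x_test, k=3, rng=None):
    """Similarity-weighted kNN vote in the spirit of Eqs. (5.1)-(5.2).

    A Gaussian kernel serves as an illustrative choice for K.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Similarity K(X_t, X_n) for every data-case n (Gaussian kernel).
    similarities = np.exp(-np.sum((X_train - x_test) ** 2, axis=1))
    # Restrict the vote to the k most similar neighbors (the set S).
    S = np.argsort(similarities)[-k:]
    # Signed, similarity-weighted vote: +K for label 1, -K for label 0.
    score = np.sum(similarities[S] * (2 * Y_train[S] - 1))
    if score > 0:
        return 1
    if score < 0:
        return 0
    return int(rng.integers(2))  # flip a coin when the sum is exactly 0
```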

Why do we intuitively expect this algorithm to work? The reason is that we expect data-cases with similar labels to cluster together in attribute space. So to predict the label of a new test-case, it is reasonable to look at the labels of the data-cases that lie close to it in that space.

results matching ""

    No results matching ""