K-Nearest Neighbour Algorithm

The K-Nearest Neighbour algorithm is similar to the Nearest Neighbour algorithm, except that it looks at the K instances closest to the unclassified instance rather than only the single closest one. The new instance is assigned the class that occurs most frequently among those K instances. This reduces the influence of anomalous instances, because a single unusual neighbour can be outvoted by the others.
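The voting step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code behind this page: the function names, the use of Hamming distance (counting the attributes that differ), the tie-breaking behaviour, and the toy data are all assumptions made for the example.

```python
from collections import Counter


def hamming(a, b):
    """Count the attribute positions where two instances differ (assumed distance measure)."""
    return sum(x != y for x, y in zip(a, b))


def knn_classify(training, query, k):
    """Return the most frequent class among the k training instances closest to query.

    training is a list of (attributes, label) pairs. Instances at equal
    distance keep their training order (sorted() is stable), and a tied
    vote goes to the class encountered first.
    """
    nearest = sorted(training, key=lambda inst: hamming(inst[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up toy data: one anomalous "B" instance sits among the "A" instances.
training = [
    ((0, 0, 0), "B"),  # anomaly, identical to the query below
    ((0, 0, 1), "A"),
    ((0, 1, 0), "A"),
    ((1, 1, 0), "B"),
    ((1, 1, 1), "B"),
]
query = (0, 0, 0)

print(knn_classify(training, query, 1))  # B  (the anomaly decides alone)
print(knn_classify(training, query, 3))  # A  (the anomaly is outvoted 2 to 1)
```

With K = 1 the single anomalous instance determines the class, while with K = 3 its two "A" neighbours outvote it.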

Try this out with the patient data below. If a new patient answers 'No' to all five symptoms, the diagnosis will be 'Strepthroat', compared with the diagnosis of 'Allergy' from the standard Nearest Neighbour algorithm.

Choosing K

K = 1 is the same as Nearest Neighbour, as it only looks at the single closest instance. K = N (where N is the number of training instances) would be a poor choice because it bases the classification on the class frequency of all the instances, not just the closest ones, so every new instance receives the most common class in the training data. The best value of K therefore lies somewhere between these two extremes. Try changing K to see what happens.
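The two extremes can be checked against the patient table on this page. The sketch below is an illustration under stated assumptions: symptoms are abbreviated to Y/N strings, distance is the number of differing symptoms, and ties are broken by table order. The page does not state its own distance measure, default K, or tie-breaking rule, so results for intermediate values of K need not match the interactive demo.

```python
from collections import Counter
from itertools import product

# Training table from this page. Each string lists, in order:
# sore throat, fever, swollen glands, congestion, headache (Y = Yes, N = No).
PATIENTS = [
    ("YYYYY", "Strepthroat"),
    ("NNNYY", "Allergy"),
    ("YYNYN", "Cold"),
    ("YNYNN", "Strepthroat"),
    ("NYNYN", "Cold"),
    ("NNNYN", "Allergy"),
    ("NNYNN", "Strepthroat"),
    ("YNNYY", "Allergy"),
    ("NYNYY", "Cold"),
    ("YYNYY", "Cold"),
]


def distance(a, b):
    """Number of symptoms on which two patients differ (assumed measure)."""
    return sum(x != y for x, y in zip(a, b))


def classify(query, k):
    """Majority diagnosis among the k nearest patients; ties follow table order."""
    nearest = sorted(PATIENTS, key=lambda p: distance(p[0], query))[:k]
    return Counter(diagnosis for _, diagnosis in nearest).most_common(1)[0][0]


n = len(PATIENTS)

# K = 1: every training patient is its own nearest neighbour (distance 0),
# so each one gets back its own diagnosis.
k1 = [classify(symptoms, 1) for symptoms, _ in PATIENTS]
print(k1 == [diagnosis for _, diagnosis in PATIENTS])  # True

# K = N: all ten patients vote, so every possible query receives the most
# common diagnosis in the table ('Cold', 4 of 10 patients).
kn = {classify(query, n) for query in product("YN", repeat=5)}
print(kn)  # {'Cold'}

# Sweep K for a patient with no symptoms to see how the diagnosis changes.
for k in range(1, n + 1):
    print(k, classify("NNNNN", k))
```

Several patients are equally distant from the no-symptom query, so the diagnoses printed by the sweep depend on the tie-breaking rule chosen here.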

Patient ID	Sore Throat	Fever	Swollen Glands	Congestion	Headache	Diagnosis
1	Yes	Yes	Yes	Yes	Yes	Strepthroat
2	No	No	No	Yes	Yes	Allergy
3	Yes	Yes	No	Yes	No	Cold
4	Yes	No	Yes	No	No	Strepthroat
5	No	Yes	No	Yes	No	Cold
6	No	No	No	Yes	No	Allergy
7	No	No	Yes	No	No	Strepthroat
8	Yes	No	No	Yes	Yes	Allergy
9	No	Yes	No	Yes	Yes	Cold
10	Yes	Yes	No	Yes	Yes	Cold
