The nearest neighbour algorithm classifies a new instance using a set of already classified instances (the training set). It calculates the distance from the new instance to each training case and assigns the new instance the class of the closest one, i.e. the training case that differs from it in the fewest attributes.
Try this out with the table below. There are 10 instances in the training set. To diagnose a new patient using the nearest neighbour algorithm, count for each training instance the number of symptoms on which it differs from the new patient (its distance). The diagnosis is then the diagnosis (class) of the training instance with the fewest differences (i.e. the smallest distance).
Patient ID | Sore Throat | Fever | Swollen Glands | Congestion | Headache | Diagnosis | Distance |
---|---|---|---|---|---|---|---|
1 | Yes | Yes | Yes | Yes | Yes | Strep throat | |
2 | No | No | No | Yes | Yes | Allergy | |
3 | Yes | Yes | No | Yes | No | Cold | |
4 | Yes | No | Yes | No | No | Strep throat | |
5 | No | Yes | No | Yes | No | Cold | |
6 | No | No | No | Yes | No | Allergy | |
7 | No | No | Yes | No | No | Strep throat | |
8 | Yes | No | No | Yes | Yes | Allergy | |
9 | No | Yes | No | Yes | Yes | Cold | |
10 | Yes | Yes | No | Yes | Yes | Cold | |
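The procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation. It assumes that the distance is simply the number of symptoms on which two patients differ, and that ties are broken in favour of the lowest patient ID (the text does not say how ties should be handled). The names `TRAINING_SET`, `distance` and `nearest_neighbour` are made up for this example.

```python
# Symptom order used in every tuple below:
# (Sore Throat, Fever, Swollen Glands, Congestion, Headache), True = Yes.
TRAINING_SET = [
    (1, (True, True, True, True, True), "Strep throat"),
    (2, (False, False, False, True, True), "Allergy"),
    (3, (True, True, False, True, False), "Cold"),
    (4, (True, False, True, False, False), "Strep throat"),
    (5, (False, True, False, True, False), "Cold"),
    (6, (False, False, False, True, False), "Allergy"),
    (7, (False, False, True, False, False), "Strep throat"),
    (8, (True, False, False, True, True), "Allergy"),
    (9, (False, True, False, True, True), "Cold"),
    (10, (True, True, False, True, True), "Cold"),
]


def distance(a, b):
    """Count the symptoms on which two patients differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest_neighbour(new_patient):
    """Return (patient ID, diagnosis, distance) of the closest training case.

    Assumption: on a tie, the training case listed first (lowest ID) wins,
    because min() returns the first minimal element it encounters.
    """
    patient_id, symptoms, diagnosis = min(
        TRAINING_SET, key=lambda row: distance(row[1], new_patient)
    )
    return patient_id, diagnosis, distance(symptoms, new_patient)


# New patient: sore throat, no fever, swollen glands, congestion, no headache.
print(nearest_neighbour((True, False, True, True, False)))
# → (4, 'Strep throat', 1)
```

In this example the new patient differs from patient 4 only in Congestion, so patient 4 is the nearest neighbour and the diagnosis is strep throat.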