K Nearest Neighbors

더기덕 · April 2, 2022



Basic Concept

Training Algorithm:
- Store all the data

Prediction Algorithm:
- Calculate the distance from x to every point in the data
- Sort the points by increasing distance from x
- Predict the majority label of the k closest points
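
The three prediction steps can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function name `knn_predict`, the toy points, and the choice of Euclidean distance (`math.dist`, Python 3.8+) are all assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, x, k=3):
    """Predict the label of x by majority vote among its k nearest training points."""
    # Step 1: distance from x to every stored point (Euclidean is an assumption here).
    distances = [(math.dist(p, x), label) for p, label in zip(train_points, train_labels)]
    # Step 2: sort by increasing distance from x.
    distances.sort(key=lambda pair: pair[0])
    # Step 3: majority label among the k closest points.
    k_labels = [label for _, label in distances[:k]]
    return Counter(k_labels).most_common(1)[0][0]

# "Training" is nothing more than keeping the data around.
train_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.5), (6.5, 8.0)]
train_labels = ["A", "A", "A", "B", "B", "B"]

prediction_near_a = knn_predict(train_points, train_labels, (2.0, 2.0), k=3)
prediction_near_b = knn_predict(train_points, train_labels, (6.0, 7.0), k=3)
print(prediction_near_a, prediction_near_b)
```

When the vote is tied, `Counter.most_common` returns the label that was counted first, which here means the label of the closer point. An odd k avoids ties in the two-class case.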

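Increasing k smooths the decision boundary and makes predictions less sensitive to noisy points, at the cost of mislabeling some points that sit in small pockets of their own class.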
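
A small sketch of this trade-off, using made-up points: one "B" point sits in the middle of an "A" cluster. The helper `knn_predict` and the data are hypothetical, and Euclidean distance is assumed.

```python
import math
from collections import Counter

def knn_predict(points, labels, x, k):
    # Rank stored points by distance to x, then vote among the first k labels.
    ranked = sorted(zip(points, labels), key=lambda pl: math.dist(pl[0], x))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]

# Five "A" points, one isolated "B" point among them, and a separate "B" cluster.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0), (3.0, 2.0),
          (1.5, 1.5),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
labels = ["A", "A", "A", "A", "A", "B", "B", "B", "B"]

query = (1.6, 1.6)
label_k1 = knn_predict(points, labels, query, k=1)  # follows the single closest point
label_k5 = knn_predict(points, labels, query, k=5)  # follows the surrounding cluster

# With k=5 even the isolated "B" point itself is outvoted by its neighbors.
isolated_point_k5 = knn_predict(points, labels, (1.5, 1.5), k=5)
print(label_k1, label_k5, isolated_point_k5)
```

With k=1 the query takes the label of the lone "B" point. With k=5 the four surrounding "A" points outvote it, so the boundary is smoother but the stored "B" point is now predicted as "A".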

Pros and Cons

Pros
- Very simple
- Few parameters (k and the distance metric)
- Easy to add new data
- Works with any number of classes
- Training is trivial

Cons
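- High prediction cost, which gets worse for large data sets because every prediction measures the distance to all stored points
- Not good with high-dimensional data (with many dimensions, distances become less informative, which throws off the notion of a nearest neighbor)
- Categorical features don't work well, since there is no natural distance between categories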
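
The high-dimensional issue can be illustrated with a rough experiment. The sketch below assumes uniformly random points, which is a simplification and not real data, and the helper name `nearest_to_farthest_ratio` is made up for this example. It compares the nearest and farthest distances from a random query point: when the ratio is close to 1, the nearest neighbor is barely closer than any other point.

```python
import math
import random

def nearest_to_farthest_ratio(dim, n_points=200, seed=0):
    """Ratio of the nearest to the farthest distance from a random query point."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    query = [rng.random() for _ in range(dim)]
    distances = [math.dist(p, query) for p in points]
    return min(distances) / max(distances)

ratio_low_dim = nearest_to_farthest_ratio(dim=2)
ratio_high_dim = nearest_to_farthest_ratio(dim=1000)
print(ratio_low_dim, ratio_high_dim)
```

In 2 dimensions the ratio is small, so the closest point really is much closer than the rest. In 1000 dimensions it moves toward 1, which is why nearest-neighbor votes become less meaningful as the number of features grows.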
