Bayes Optimal Classifier

김민재 · April 20, 2024


When a new instance arrives, we need to ask how to classify it optimally.

As before, let us first try simply applying $h_{MAP}$ to the example below.

Example. Classifying $\oplus$ and $\ominus$ by simply applying the MAP hypothesis

Given a new instance $x$, the three hypotheses predict:
$h_1(x) = \oplus, \quad h_2(x) = \ominus, \quad h_3(x) = \ominus$
$\\$
The posteriors of the three possible hypotheses are:
$\qquad P(h_1|D) = 0.4, \quad P(h_2|D) = 0.3, \quad P(h_3|D) = 0.3$
$\\$
If we simply apply $h_{MAP} =\underset{h\in H}{\text{argmax}}\;P(h|D),$ then the most probable classification of $x$ is taken to be
$\qquad h_{MAP} = h_1, \;\text{hence } h_{MAP}(x) = \oplus$

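However, simply taking $h_{MAP}$ like this should feel somewhat off: $h_1$ has a posterior of only $0.4$, so with probability $0.6$ the MAP hypothesis is not the correct one, yet its prediction is adopted outright.

So instead of deciding this way, let us reason as follows.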

If the classification of the new example can take any value $v_j$ from some set $V$, then the probability $P(v_j|D)$ that the correct classification of the new instance is $v_j$ is

$$P(v_j|D) = \sum_{h_i \in H}P(v_j|h_i)P(h_i|D)$$

Bayes Optimal Classification

$\underset{v_j\in V}{\text{argmax}}\sum_{h_i \in H}P(v_j|h_i)P(h_i|D)$

Let us revisit the example above with this rule.

Example. Classifying $\oplus$ and $\ominus$ (revisited)

The set of possible classifications of the new instance is $V = \left \{ \oplus,\ominus \right \},$ and

$$\begin{aligned} P(h_1|D)=0.4, \; P(\ominus|h_1)=0, \; P(\oplus|h_1)=1\\ P(h_2|D)=0.3, \; P(\ominus|h_2)=1, \; P(\oplus|h_2)=0\\ P(h_3|D)=0.3, \; P(\ominus|h_3)=1, \; P(\oplus|h_3)=0 \end{aligned}$$

therefore

$$\begin{aligned} \sum_{h_i\in H}P(\oplus|h_i)P(h_i|D)&=0.4\\ \sum_{h_i\in H}P(\ominus|h_i)P(h_i|D)&=0.6 \end{aligned}$$

consequently,

$$\underset{v_j\in\left \{ \oplus,\ominus \right \}}{\text{argmax}}\sum_{h_i\in H}P(v_j|h_i)P(h_i|D) = \ominus$$
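
The calculation can be checked with a few lines of Python. This is only a minimal sketch: the function name `bayes_optimal`, the dictionary layout, and the labels `"+"` / `"-"` (standing in for $\oplus$ / $\ominus$) are assumptions made for illustration. The posteriors are $0.4, 0.3, 0.3$ as in the first example, so they sum to 1.

```python
def bayes_optimal(posteriors, predictions, labels):
    """Return (best_label, scores), where
    scores[v] = sum over h of P(v | h) * P(h | D)."""
    scores = {v: 0.0 for v in labels}
    for h, p_h in posteriors.items():
        for v in labels:
            scores[v] += predictions[h].get(v, 0.0) * p_h
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical encoding of the example: P(h_i | D) and P(v_j | h_i).
posteriors = {"h1": 0.4, "h2": 0.3, "h3": 0.3}
predictions = {
    "h1": {"+": 1.0, "-": 0.0},
    "h2": {"+": 0.0, "-": 1.0},
    "h3": {"+": 0.0, "-": 1.0},
}
labels = ["+", "-"]

# MAP: pick the single most probable hypothesis and use its prediction.
h_map = max(posteriors, key=posteriors.get)
map_label = max(predictions[h_map], key=predictions[h_map].get)

# Bayes optimal: weight every hypothesis's prediction by its posterior.
best, scores = bayes_optimal(posteriors, predictions, labels)

print("MAP:", h_map, "->", map_label)
print("Bayes optimal:", best, scores)
```

Here the MAP hypothesis `h1` votes `"+"`, while the posterior-weighted vote favors `"-"`, which is the disagreement the example is meant to show.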

However, there are a few problems.

It is computationally quite costly to apply.
It computes the posterior probability of every hypothesis in $H$ and then combines the predictions of all hypotheses to classify each new instance.