Naive Bayes Classifier
$$P(Y_c \mid X_1, \dots, X_n) = \frac{P(Y_c)\prod_{i=1}^{n} P(X_i \mid Y_c)}{\prod_{i=1}^{n} P(X_i)}$$
where $Y_c$ is a class label.
Computing the likelihood
$$\prod_{i=1}^{n} P(X_i \mid Y_c)$$
$$P(X_i \mid Y_c) = \frac{\sum \mathrm{tf}(x_i, d \in Y_c) + \alpha}{\sum N_{d \in Y_c} + \alpha \cdot V}$$
- $X_i$: a word from the feature vector $X$ of a particular sample.
- $\sum \mathrm{tf}(x_i, d \in Y_c)$: the sum of raw term frequencies of word $x_i$ over all documents belonging to class $Y_c$.
- $\sum N_{d \in Y_c}$: the sum of all term frequencies in the training documents of class $Y_c$.
- $\alpha$: an additive smoothing parameter ($\alpha = 1$ for Laplace smoothing).
- $V$: the size of the vocabulary (number of distinct words in the training set).
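The smoothed likelihood above can be sketched in a few lines of Python (a minimal sketch; the function name and argument layout are my own):

```python
from collections import Counter

def smoothed_likelihood(word, class_docs, vocab, alpha=1.0):
    """P(word | class) with additive (Laplace) smoothing.

    class_docs: list of tokenized documents belonging to one class.
    vocab: set of all distinct words in the training set.
    """
    # term frequencies of every word within this class
    counts = Counter(w for doc in class_docs for w in doc)
    # sum of all term frequencies for the class
    total = sum(counts.values())
    return (counts[word] + alpha) / (total + alpha * len(vocab))
```

For example, with the class-c training documents from the table below, `smoothed_likelihood("Chinese", c_docs, vocab)` evaluates to $(5+1)/(8+6) = 6/14$.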
Multinomial Naive Bayes
Bag of words
- Represent every sample text as a single vector.
- Assign an index to each word and represent each sentence's (or document's) word counts as a vector (the collection of texts is also called a corpus).
Example
Data
| | Doc | Words | Class |
|---|---|---|---|
| Training | 1 | Chinese Beijing Chinese | c |
| | 2 | Chinese Chinese Shanghai | c |
| | 3 | Chinese Macao | c |
| | 4 | Tokyo Japan Chinese | j |
| Test | 5 | Chinese Chinese Chinese Tokyo Japan | ? |
Bag of words
| | Doc | Chinese | Beijing | Shanghai | Macao | Tokyo | Japan | Class |
|---|---|---|---|---|---|---|---|---|
| Training | 1 | 2 | 1 | 0 | 0 | 0 | 0 | c |
| | 2 | 2 | 0 | 1 | 0 | 0 | 0 | c |
| | 3 | 1 | 0 | 0 | 1 | 0 | 0 | c |
| | 4 | 1 | 0 | 0 | 0 | 1 | 1 | j |
| Test | 5 | 3 | 0 | 0 | 0 | 1 | 1 | ? |
Classification score: prior × likelihood
Probability that the test document is classified as 'c':
$$P(c \mid d_5) \propto P(c)\,P(\text{Chinese} \mid c)^3\,P(\text{Tokyo} \mid c)\,P(\text{Japan} \mid c) = \frac{3}{4} \cdot \left(\frac{5+1}{8+6}\right)^3 \cdot \frac{0+1}{8+6} \cdot \frac{0+1}{8+6} \approx 0.0003$$
Probability that the test document is classified as 'j':
$$P(j \mid d_5) \propto P(j)\,P(\text{Chinese} \mid j)^3\,P(\text{Tokyo} \mid j)\,P(\text{Japan} \mid j) = \frac{1}{4} \cdot \left(\frac{1+1}{3+6}\right)^3 \cdot \frac{1+1}{3+6} \cdot \frac{1+1}{3+6} \approx 0.0001$$

Since $P(c \mid d_5) > P(j \mid d_5)$, document 5 is classified as class c.
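The whole worked example can be reproduced end to end (a minimal sketch; variable and function names are my own, matching the training/test data in the tables above):

```python
from collections import Counter
from math import prod

train = [
    (["Chinese", "Beijing", "Chinese"], "c"),
    (["Chinese", "Chinese", "Shanghai"], "c"),
    (["Chinese", "Macao"], "c"),
    (["Tokyo", "Japan", "Chinese"], "j"),
]
test = ["Chinese", "Chinese", "Chinese", "Tokyo", "Japan"]

vocab = {w for doc, _ in train for w in doc}        # V = 6 distinct words
classes = {label for _, label in train}

def score(doc, cls, alpha=1.0):
    """prior * likelihood for one class, with Laplace smoothing."""
    class_docs = [d for d, label in train if label == cls]
    prior = len(class_docs) / len(train)             # P(Y_c)
    counts = Counter(w for d in class_docs for w in d)
    total = sum(counts.values())                     # all term frequencies in the class
    likelihood = prod((counts[w] + alpha) / (total + alpha * len(vocab))
                      for w in doc)
    return prior * likelihood

scores = {cls: score(test, cls) for cls in classes}
print(max(scores, key=scores.get))  # -> c
```

This reproduces the two hand computations above: 3/4 · (6/14)³ · (1/14)² for class c and 1/4 · (2/9)³ · (2/9)² for class j, so the test document goes to class c.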