LLM Day13 - BERT

Soyee Sung · February 15, 2025

📌 A detailed look at BERT-based document–question similarity evaluation

💡 BERT converts the question and the document text into numbers, then uses cosine similarity to compute their semantic similarity and find trustworthy documents. In the code, this BERT-based document–question similarity evaluation happens in the evaluate_with_bert() function.
This function uses a Sentence-BERT (SBERT) model to measure the semantic similarity between a question and a document.

1๏ธโƒฃ BERT๋ž€?

BERT (Bidirectional Encoder Representations from Transformers) is
a natural language processing (NLP) model developed by Google, notable for understanding sentence context in both directions.

๐Ÿ’ก ์ผ๋ฐ˜์ ์ธ NLP ๋ชจ๋ธ๊ณผ์˜ ์ฐจ์ด์ 

Earlier models read a sentence only left to right (or only right to left).
BERT reads a sentence in both directions, enabling more accurate semantic analysis!
📌 Example:

"I am going to the bank."
BERT can tell from the surrounding context whether "bank" means a financial institution or a riverbank!

2๏ธโƒฃ BERT ๊ธฐ๋ฐ˜ ๋ฌธ์„œ-์งˆ๋ฌธ ์œ ์‚ฌ๋„ ๋ถ„์„ ๊ณผ์ •

🔹 The evaluate_with_bert(question, context) function computes the semantic similarity between two texts using BERT.

from sentence_transformers import SentenceTransformer, util

bert_model = SentenceTransformer("all-MiniLM-L6-v2")  # any pretrained SBERT model works here

def evaluate_with_bert(question, context):
    """Evaluate document-question similarity with BERT."""
    question_embedding = bert_model.encode(question, convert_to_tensor=True)  # turn the question into numbers
    context_embedding = bert_model.encode(context, convert_to_tensor=True)  # turn the document text into numbers
    similarity_score = util.pytorch_cos_sim(question_embedding, context_embedding).item()  # cosine similarity
    return min(max(similarity_score, 0), 1)  # clamp to the 0-1 range

Let's walk through the code line by line.

(1) Converting sentences into vectors (Embedding)

question_embedding = bert_model.encode(question, convert_to_tensor=True)
context_embedding = bert_model.encode(context, convert_to_tensor=True)

📌 BERT performs "embedding": converting a sentence into numbers.

bert_model.encode(text): converts a sentence into a vector (a list of numbers)
convert_to_tensor=True: returns the vector as a PyTorch tensor (for efficient computation)

๐Ÿ“ ์˜ˆ์ œ:

"Puppies are cute."  →  [0.21, -0.67, 1.34, ..., 0.87]
"I have a cat."  →  [0.15, -0.71, 1.22, ..., 0.93]

๐Ÿ”น ์ด๋ ‡๊ฒŒ ์ˆซ์ž๋กœ ๋ณ€ํ™˜๋œ ๋ฌธ์žฅ์€ ๋ฒกํ„ฐ ๊ณต๊ฐ„์—์„œ ๋น„๊ต ๊ฐ€๋Šฅํ•ด์ง!

(2) Computing semantic similarity between sentences (Cosine Similarity)

similarity_score = util.pytorch_cos_sim(question_embedding, context_embedding).item()

📌 This step compares the two vectors produced by BERT

์ฝ”์‚ฌ์ธ ์œ ์‚ฌ๋„ (Cosine Similarity) ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋‘ ๋ฌธ์žฅ์˜ ์œ ์‚ฌ๋„๋ฅผ ์ธก์ •
util.pytorch_cos_sim(vec1, vec2): ๋‘ ๋ฒกํ„ฐ ๊ฐ„์˜ ์ฝ”์‚ฌ์ธ ์œ ์‚ฌ๋„๋ฅผ ๋ฐ˜ํ™˜

๐Ÿ“Œ ์ฝ”์‚ฌ์ธ ์œ ์‚ฌ๋„ (Cosine Similarity)

์ฝ”์‚ฌ์ธ ์œ ์‚ฌ๋„ ๊ด€๋ จ ์ƒ์„ธ ์„ค๋ช…

(3) Normalizing the similarity score

return min(max(similarity_score, 0), 1)

📌 Clamp the computed similarity score into the 0–1 range

min(max(score, 0), 1): since cosine similarity ranges from -1 to 1, this floors negative scores at 0 and caps scores at 1 (stabilizing the value)
This keeps scores on a consistent 0–1 scale!
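The clamping step behaves like this (toy values for illustration, not from the original post):

```python
def clamp01(score):
    """Clamp a raw cosine score into the 0-1 range, as evaluate_with_bert does."""
    return min(max(score, 0), 1)

print(clamp01(-0.23))  # → 0 (negative cosine scores are floored at 0)
print(clamp01(0.78))   # → 0.78 (in-range scores pass through unchanged)
print(clamp01(1.4))    # → 1 (anything above 1 is capped)
```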

4๏ธโƒฃ ์ „์ฒด ํ๋ฆ„ ์ •๋ฆฌ

✅ 1) Convert the sentences into numeric vectors with an SBERT (BERT-based) model
✅ 2) Measure the similarity between the vectors with cosine similarity
✅ 3) Clamp the score to the 0–1 range and return it
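The three steps above can be sketched end to end. Since downloading SBERT weights is beyond the scope of this snippet, a toy bag-of-words embedder stands in for bert_model.encode below (it is NOT a real SBERT embedding); the embed → cosine → clamp structure is the same:

```python
import math
from collections import Counter

def toy_encode(text):
    """Stand-in for bert_model.encode: a bag-of-words count vector, for illustration only."""
    return Counter(text.lower().split())

def evaluate_with_toy_model(question, context):
    """Same flow as evaluate_with_bert: embed both texts, take cosine similarity, clamp to 0-1."""
    q, c = toy_encode(question), toy_encode(context)
    dot = sum(q[w] * c[w] for w in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in c.values()))
    score = dot / norm if norm else 0.0
    return min(max(score, 0), 1)

# A context that shares meaning (here: words) with the question scores higher than an unrelated one.
related = evaluate_with_toy_model("what is BERT", "BERT is a bidirectional transformer model")
unrelated = evaluate_with_toy_model("what is BERT", "the weather today is sunny")
print(related > unrelated)  # → True
```

With a real SBERT model, paraphrases score high even with no word overlap, which is exactly what the bag-of-words stand-in cannot do.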

5๏ธโƒฃ BERT ์œ ์‚ฌ๋„ ํ‰๊ฐ€๊ฐ€ ์ค‘์š”ํ•œ ์ด์œ 

📌 Unlike GPT, BERT excels at precise semantic comparison!

GPT specializes in "generation" (producing text)
BERT specializes in "understanding" (comparing meaning between texts)
✅ That is why this code uses BERT for "retrieving trustworthy documents"!
✅ The retrieved result is then combined with a GPT model to generate a more precise answer
