⑤ 🤖 Machine Learning Day 2 - Logistic Regression

JItzel · December 11, 2025

๐Ÿก Machine_learning

๋ชฉ๋ก ๋ณด๊ธฐ
5/14

Logistic Regression and the Sigmoid

What if the thing we need to predict is not a number but a category like "pass/fail"? Can we still use linear regression? The answer is no. So what should we do instead?

1. Why Doesn't Linear Regression Work?

Suppose we want to predict pass/fail ($y$) from hours studied ($x$).

  • Pass = 1
  • Fail = 0

Fitting a straight line with linear regression causes problems.

  1. Out-of-range values: a straight line extends without bound, so it produces values below 0 or above 1.
     (A probability must lie between 0 and 1.)
  2. Fragile threshold: even if we split at $y = 0.5$, the line reacts sensitively to the data, so the threshold shifts easily.
     → "Let's predict a probability between 0 and 1 with an S-shaped curve instead of a straight line!"
     → Logistic regression
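To see problem 1 concretely, the sketch below fits an ordinary least-squares line to made-up pass/fail data (the hours and labels here are hypothetical) and evaluates it far from the training range; the "probability" it returns escapes the 0~1 range:

```python
import numpy as np

# Hypothetical toy data: hours studied vs. pass (1) / fail (0)
hours = np.array([1, 2, 3, 4, 5, 6, 7, 8])
passed = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Fit a straight line y = w*x + b by least squares
w, b = np.polyfit(hours, passed, deg=1)

# Evaluate the line at a large input
pred = w * 20 + b
print(pred)  # well above 1, so it cannot be a probability
```

The same line also dips below 0 for very small inputs, which is exactly why we need an S-shaped curve instead.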

2. The Key Function: The Sigmoid

์„ ํ˜• ํšŒ๊ท€์˜ ์˜ˆ์ธก๊ฐ’(z=wx+bz = wx + b)์„ 0๊ณผ 1 ์‚ฌ์ด์˜ ํ™•๋ฅ ๊ฐ’์œผ๋กœ ๋ณ€ํ™˜ํ•ด์ฃผ๋Š” ํ•จ์ˆ˜

์‹œ๊ทธ๋ชจ์ด๋“œ ํ•จ์ˆ˜ ์ •์˜

  • Sigmoid(z)=11+eโˆ’zSigmoid(z) = \frac{1}{1 + e^{-z}}

  • zz (์„ ํ˜•ํšŒ๊ท€ ๊ฐ’)๊ฐ€ ์•„๋ฌด๋ฆฌ ์ปค์ ธ๋„ 1์— ์ˆ˜๋ ด.

  • zz๊ฐ€ ์•„๋ฌด๋ฆฌ ์ž‘์•„์ ธ๋„ 0์— ์ˆ˜๋ ด.

  • z=0z = 0 ์ผ ๋•Œ ์ •ํ™•ํžˆ 0.5

์‹œ๊ทธ๋ชจ์ด๋“œ ์‹œ๊ฐํ™” ์‹ค์Šต

import numpy as np
import matplotlib.pyplot as plt

# Define the sigmoid function
def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Prepare the z values (from -10 to 10)
z = np.linspace(-10, 10, 50)
s = sigmoid(z)

# Visualize
plt.plot(z, s, 'r-')
plt.axvline(0, color='k', linestyle=':') # reference line at z = 0
plt.grid(True)
plt.title("Sigmoid Function")
plt.xlabel("z (wx + b)")
plt.ylabel("Probability")
plt.show()

# Check the key values
print(f"z=-10 : {sigmoid(-10)}") # close to 0
print(f"z=  0 : {sigmoid(0)}")   # exactly 0.5
print(f"z= 10 : {sigmoid(10)}")  # close to 1

3. The Math Behind Logistic Regression (Odds & Logit)

1) Odds (a statistical concept)

How many times more likely is success ($p$) than failure ($1-p$)?
$Odds = \frac{p}{1-p}$

2) Logit

The logit is the natural logarithm ($\ln$) of the odds.
$Logit = \ln\left(\frac{p}{1-p}\right) = wx + b$

  • The probability $p$ lies between 0 and 1, but the logit ranges over $-\infty \sim +\infty$ (both negative and positive values are possible).
  • In other words, the output ($z$) of the linear regression we already know ($wx + b$) can be viewed as a logit.
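As a quick numeric check (the probability 0.8 below is just an example value), we can compute the odds and logit by hand and watch the logit sweep across the whole real line as $p$ moves from 0 toward 1:

```python
import math

p = 0.8                 # success probability (example value)
odds = p / (1 - p)      # success is 4 times as likely as failure
logit = math.log(odds)  # ln(4), roughly 1.386

print(f"odds  = {odds:.3f}")
print(f"logit = {logit:.3f}")

# As p approaches 0 or 1, the logit heads toward -inf or +inf
for q in [0.01, 0.5, 0.99]:
    print(q, math.log(q / (1 - q)))
```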

3) Inverse relationship (Logit $\leftrightarrow$ Sigmoid)

  • Solving the logit equation for $p$ gives back the sigmoid function we saw above.

  • $p = \frac{1}{1 + e^{-(wx+b)}}$

  • In the end, logistic regression is just "plugging the linear-regression value into the sigmoid to get a probability".
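We can also confirm this inverse relationship numerically; the small sketch below applies the logit to a probability and then the sigmoid to the result, recovering the original value:

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def logit(p):
    return math.log(p / (1 - p))

# Round trip: sigmoid(logit(p)) recovers p
for p in [0.1, 0.5, 0.9]:
    z = logit(p)
    print(f"p={p}  logit={z:+.4f}  sigmoid(logit(p))={sigmoid(z):.4f}")
```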


4. Example: Pima Indians Diabetes Prediction (Scikit-learn)

Scikit-learn์—์„œ๋Š” LogisticRegression ํด๋ž˜์Šค๋ฅผ ์‚ฌ์šฉํ•œ๋‹ค.
์ด๋ฆ„์€ 'Regression(ํšŒ๊ท€)'์ด์ง€๋งŒ ์‹ค์ œ๋กœ๋Š” 'Classification(๋ถ„๋ฅ˜)' ๋ชจ๋ธ

1) ๋ฐ์ดํ„ฐ ์ค€๋น„ ๋ฐ ํ•™์Šต

import pandas as pd
import numpy as np
from sklearn.linear_model import LogisticRegression
import matplotlib.pyplot as plt

# Load the data
df = pd.read_csv('data/pima-indians-diabetes.data.csv')

# Split the features (X) from the target (y)
# The last column is the diabetes outcome (0 or 1)
x_data = df.iloc[:, :-1]
y_data = df.iloc[:, -1]

# Create and train the model
# max_iter: maximum number of solver iterations (increase it if the model does not converge)
model = LogisticRegression(max_iter=500, verbose=True)
model.fit(x_data, y_data)

2) Inspecting the trained model (weights and intercept)

print("Weights (w):", model.coef_)
# There are 8 features, so there are 8 weights (a 1x8 array)

print("Intercept (b):", model.intercept_)
# the bias term (a single value)

3) Making predictions: predict vs predict_proba

Classification models provide two prediction methods:
predict(X): predicts the final class, 0 or 1
predict_proba(X): predicts [probability of class 0, probability of class 1]

# ์˜ˆ์ œ ๋ฐ์ดํ„ฐ 2๊ฑด
# (์ž„์‹ ํšŸ์ˆ˜, ํฌ๋„๋‹น, ํ˜ˆ์••, ํ”ผ๋ถ€๋‘๊ป˜, ์ธ์А๋ฆฐ, BMI, ๋‹น๋‡จ๋‚ด๋ ฅ, ๋‚˜์ด)
sample_data = [
    [6, 148, 72, 35, 0, 33.6, 0.627, 50],
    [1, 93, 70, 31, 0, 30.4, 0.315, 23]
]

# 1. ์ตœ์ข… ๊ฒฐ๊ณผ ์˜ˆ์ธก (0 or 1)
print(model.predict(sample_data))
# ๊ฒฐ๊ณผ: array([1, 0]) -> ์ฒซ ๋ฒˆ์งธ๋Š” ๋‹น๋‡จ(1), ๋‘ ๋ฒˆ์งธ๋Š” ์ •์ƒ(0)

# 2. ํ™•๋ฅ  ์˜ˆ์ธก
print(model.predict_proba(sample_data))
# ๊ฒฐ๊ณผ ์˜ˆ์‹œ:
# [[0.28, 0.72],  -> 1์ผ ํ™•๋ฅ ์ด 0.72๋ผ 1๋กœ ์˜ˆ์ธก
#  [0.85, 0.15]]  -> 0์ผ ํ™•๋ฅ ์ด 0.85๋ผ 0์œผ๋กœ ์˜ˆ์ธก

5. ๊ฒ€์ฆ: ์ง์ ‘ ๊ณ„์‚ฐํ•ด๋ณด๊ธฐ (Logic Check)

  • predict_proba๊ฐ€ ๋‚ด๋ถ€์ ์œผ๋กœ ์–ด๋–ป๊ฒŒ ์ž‘๋™ํ•˜๋Š”์ง€, ์•ž์„œ ๋ฐฐ์šด ์ˆ˜์‹(z=XW+bz = XW + b ํ›„ ์‹œ๊ทธ๋ชจ์ด๋“œ)์œผ๋กœ ์ง์ ‘ ๊ฒ€์ฆํ•˜๊ธฐ
import numpy as np

# Sigmoid function
def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Take the first sample (keep it as a 2-D array)
xn = np.array([sample_data[0]])

# Step 1: compute the linear-regression value z
# z = X @ W.T + b (matrix product)
# model.coef_ has shape (1, 8), so transpose it (.T) so the shapes line up
z = np.matmul(xn, model.coef_.T) + model.intercept_

# Step 2: pass z through the sigmoid (probability p)
prob = sigmoid(z)

print(f"Hand-computed probability: {prob[0][0]}")
print(f"Library result: {model.predict_proba(xn)[0][1]}")

# Step 3: classify with a 0.5 threshold
prediction = 1 if prob[0][0] > 0.5 else 0
print(f"Final predicted class: {prediction}")
  • Conclusion: the hand-computed value matches scikit-learn's result exactly.

Summary

  1. Logistic regression is a model for binary classification (0 or 1).

  2. It feeds the linear-regression output ($z$) into the sigmoid function to convert it into a probability ($p$) between 0 and 1.

  3. The concepts of odds and logit are its foundation.

  4. Scikit-learn's predict_proba lets us inspect the predicted probabilities.
