1013 TIL

looggi · October 13, 2022

스파르타 내배캠 AI-3


🎀🎁⚙💾📌🗑📝🍀

🌞 Morning quiz

  1. Download the image above and save it
  2. Read the image with opencv and find its width and height in pixels
    1. (height, width)
  3. Find the people in the image, draw white boxes around them, and save the result as result1.png
  4. Crop the people out of the image and save them as people1.png, people2.png…
  5. Upload the code and images to git and share the repository
import torch
import cv2

model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)
img = cv2.imread('aa.jpeg')
print(img.shape)

results = model(img)
results.save()

# Keep only the detections whose class name is 'person'
result = results.pandas().xyxy[0].to_numpy()
result = [item for item in result if item[6] == 'person']
print(result)
print(len(result))
# 5
# Draw a white box around each detected person
tmp_img = cv2.imread('aa.jpeg')

for i in range(len(result)):
    cv2.rectangle(tmp_img,
                  (int(result[i][0]), int(result[i][1])), # (xmin, ymin)
                  (int(result[i][2]), int(result[i][3])), # (xmax, ymax)
                  (255, 255, 255))
cv2.imwrite('result1.png', tmp_img)

# Crop each person out of the image and save as people1.png, people2.png, ...
for i in range(len(result)):
    cropped = tmp_img[int(result[i][1]):int(result[i][3]), # ymin:ymax
                      int(result[i][0]):int(result[i][2])] # xmin:xmax
    filename = 'people%i.png' % (i + 1) # %i is replaced by the image number
    cv2.imwrite(filename, cropped)

Output

(837, 1024, 3) 
[
array([432.42706298828125, 155.5716552734375, 631.0242919921875, 804.8375854492188, 0.9132900238037109, 0, 'person'], dtype=object), 
array([620.9993896484375, 154.4466552734375, 790.0960693359375, 806.7845458984375, 0.9080604910850525, 0, 'person'], dtype=object), 
array([762.9304809570312, 127.33479309082031, 951.8087768554688, 813.9844970703125, 0.8943580985069275, 0, 'person'], dtype=object), 
array([274.86883544921875, 174.7275390625, 457.47015380859375, 805.7642822265625, 0.8802736401557922, 0, 'person'], dtype=object), 
array([98.14337158203125, 109.3960189819336, 310.7022399902344, 816.4161987304688, 0.8792229890823364, 0, 'person'], dtype=object)
]
5

git init
git remote add origin https://github.com/hyojine/1013test.git
git remote -v
➜
origin https://github.com/hyojine/1013test.git (fetch)
origin https://github.com/hyojine/1013test.git (push)
git add .
git commit -m '1013test'


Author identity unknown
*** Please tell me who you are.
Run

git config --global user.email "you@example.com"
git config --global user.name "Your Name"

to set your account's default identity.
โžœ ๋‘˜๋‹ค ๋ณต์‚ฌํ•ด์„œ ๋”ฐ์˜ดํ‘œ ์•ˆ์— ์ž‘์„ฑํ•ด์„œ ๋„ฃ์œผ๋ฉด ๊นƒํ—™ ์—ฐ๊ฒฐํ•˜๋Š” ์ฐฝ ๋œจ๊ณ  ๊นƒ ์—ฐ๊ฒฐ ์™„๋ฃŒ์˜ค์˜ค์˜ค์˜คใ…—์˜ค์˜ค์˜น์˜ค์˜น


⚙️ Machine Learning Week 4

🤖 What week 4 covers: Neural Networks & Transfer Learning

perceptron
Deep Feed Forward Network: RNN and LSTM were derived from it
Deep Convolutional Network (DCN): specialized for image processing

🤖 Convolutional Neural Networks (CNN)

  • Convolution: an image-processing operation widely used in computer vision (also used for face and object recognition)
    Each element of the filter (the weights) is multiplied with the matching element of the input data
    • element-wise: left to right, top to bottom
    • filter (= kernel): applied to the input to produce a feature map
    • stride: the interval by which the filter moves
    • padding (= margin): keeps the feature map produced by the convolution from shrinking

Using several filters raises the performance of a convolutional network.
An image is 3-dimensional (width, height, channels).
ex) input image size (10,10,3) - filter size (4,4,3) - 2 filters - output (feature map) size (10,10,2), assuming padding keeps the spatial size
➜ the number of filters equals the number of channels in the output (feature map)
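A quick way to check this relationship yourself (my own sketch, not lecture code, using the same tensorflow.keras stack as the practice below): with padding='same', a (10, 10, 3) input and 2 filters of size 4 give a (10, 10, 2) feature map.

from tensorflow.keras.layers import Input, Conv2D
from tensorflow.keras.models import Model

inp = Input(shape=(10, 10, 3))                                          # input image (10, 10, 3)
out = Conv2D(filters=2, kernel_size=4, strides=1, padding='same')(inp)  # 2 filters of size (4, 4, 3)
print(Model(inp, out).output_shape)  # (None, 10, 10, 2) -> channels = number of filters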

๐Ÿค– CNN์˜ ํŠน์„ฑ

  • Uses convolution layers together with fully connected (= dense) layers

  • A convolution layer (activation: ReLU) produces a feature map; repeating convolution + pooling (= subsampling) makes the maps smaller while extracting the key features

  • pooling: extracts and keeps the important parts of the feature map

    • max pooling: moves along at the stride interval and takes the largest value inside each pool-size window of the feature map
    • avg pooling: moves along at the stride interval and takes the average of the values inside each pool-size window
  • flatten layer: after the last pooling layer the network has to connect to the FC (Dense) layers, but the pooled feature map is still multi-dimensional while an FC layer expects a 1-D vector, so it is flattened to 1-D (see the shape trace below)

  • the flattened output is matrix-multiplied with the FC layer; fully connected layers and activation functions are then repeated to reduce the number of nodes, and a final softmax activation produces the result
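A small shape trace of conv ➜ pooling ➜ flatten (my own sketch, not lecture code): max pooling with pool_size=2, strides=2 halves the spatial size, and Flatten turns the pooled feature map into the 1-D vector the Dense layer needs.

from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.models import Model

inp = Input(shape=(28, 28, 1))
x = Conv2D(32, kernel_size=3, padding='same', activation='relu')(inp)  # (28, 28, 32)
x = MaxPooling2D(pool_size=2, strides=2)(x)                            # (14, 14, 32)
x = Flatten()(x)                                                       # (6272,) = 14*14*32
out = Dense(10, activation='softmax')(x)
Model(inp, out).summary()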

🤖 Examples of CNN applications

  • Object Detection: finds objects as bounding boxes ➜ YOLOv5 (what we used yesterday) is a representative computer vision algorithm
    more accurate than other real-time detection methods
  • Image Segmentation: separates the pixels belonging to each object, so it recognizes the shape of an object fairly precisely, as if it were cut out
    e.g. showing the size and location of a tumor in a CT scan with color
    e.g. blurring only the background in portrait mode

ex) object recognition for autonomous driving, pose estimation (Just Dance), super-resolution, style transfer (applying filters to photos)

๐Ÿค– CNN์˜ ์ข…๋ฅ˜

  • AlexNet (2012): the first convolutional network to deliver meaningful performance; contributed a lot to deep learning by applying Dropout and image augmentation effectively
  • VGGNet (2014): a deep model (many parameters and many layers)
    usually the first model tried, for example via transfer learning, when designing a new model
  • GoogLeNet (2014, the first Inception network; refined into Inception V3 in 2015): its complex structure kept it from being widely used, but the structure itself is worth noting!
    introduces several kinds of filters and pooling within a single layer, making each individual layer wider
    (previously, convolution layers with a single kind of filter were simply stacked deeper)
    1x1 convolution layers for dimensionality (channel) reduction
    the idea of splitting into several branches and merging them back lets the model find more diverse features and see in a way closer to how people see
    with this structure the network is deeper than VGGNet, yet it uses less than half the parameters
  • ResNet (2015): as layers get deeper, the backpropagated gradients gradually vanish and training stalls (gradient vanishing) ➜ proposed the Residual block, a kind of shortcut (= skip connection) that lets the gradient flow well; the block is designed to learn the difference between its input and output (see the sketch below)
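A minimal residual-block sketch in the Keras functional API (my illustration of the idea, not the actual ResNet code): the shortcut adds the input back onto the block's output, so the block only has to learn the residual F(x) and gradients can flow straight through the skip connection.

from tensorflow.keras.layers import Input, Conv2D, BatchNormalization, ReLU, Add
from tensorflow.keras.models import Model

def residual_block(x, filters):
    shortcut = x                               # the skip connection
    y = Conv2D(filters, 3, padding='same')(x)
    y = BatchNormalization()(y)
    y = ReLU()(y)
    y = Conv2D(filters, 3, padding='same')(y)
    y = BatchNormalization()(y)
    y = Add()([y, shortcut])                   # output = F(x) + x
    return ReLU()(y)

inp = Input(shape=(28, 28, 32))
out = residual_block(inp, 32)                  # channel counts must match for the Add
Model(inp, out).summary()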

🤖 Transfer Learning

  • Retraining pretrained models on a new dataset
    training performance improves meaningfully even when the dataset is completely different (a minimal sketch follows below)

🤖 Recurrent Neural Networks (RNN)

  • A model suited to data that arrives sequentially, such as speech or text
  • A structure that can accept inputs and outputs of any length
  • Can be used to build many kinds of models, such as stock or cryptocurrency price prediction, or chatbots that converse with people (a small sketch follows below)
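A minimal sequence-model sketch (my own example, a many-to-one LSTM for a toy next-value prediction task):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential([
    LSTM(32, input_shape=(None, 1)),  # accepts sequences of any length, 1 feature per step
    Dense(1),                         # predict the next value in the sequence
])
model.compile(loss='mse', optimizer='adam')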

🤖 Generative Adversarial Network (GAN)

  • A technique that uses two adversarial models at the same time: a generative model and a discriminative model
  • The counterfeiter (Generator) becomes more and more refined and the police (Discriminator) gets better and better at telling fakes apart; by developing through this rivalry, fake images are eventually produced that are hard to distinguish from the originals.
  • How a GAN works (a minimal sketch of the training loop follows below)
    • Generator (counterfeiter): its images should be classified as real (1) ➜ it tries to build ever more convincing output, with a target of 1. To make a fake look like the real label 1, backpropagation adjusts the weights to reduce the loss, the difference between the target (1) and the prediction.
    • Discriminator (police): must classify real images as 1 and fake images as 0. It is trained on both fake and real images to reduce the loss between its predictions and the targets.
      ➜ the two models develop by competing (adversarially), so with every epoch the random image turns into a more and more refined animal image
      ex) deep fake, BeautyGAN, Toonify Yourself
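A minimal GAN training-loop sketch (my own illustration with tiny Dense networks on flattened 28x28 images, not the lecture's code):

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LeakyReLU

latent_dim = 100

# Generator (counterfeiter): random noise vector -> fake flattened 28*28 image
generator = Sequential([
    Dense(128, input_dim=latent_dim),
    LeakyReLU(0.2),
    Dense(28 * 28, activation='sigmoid'),
])

# Discriminator (police): image -> probability that it is real (1) rather than fake (0)
discriminator = Sequential([
    Dense(128, input_dim=28 * 28),
    LeakyReLU(0.2),
    Dense(1, activation='sigmoid'),
])
discriminator.compile(loss='binary_crossentropy', optimizer='adam')

# Combined model: the discriminator is frozen inside it, so training this model
# only updates the generator, pushing it to make the discriminator answer 1 (real)
discriminator.trainable = False
gan = Sequential([generator, discriminator])
gan.compile(loss='binary_crossentropy', optimizer='adam')

def train_step(real_images, batch_size=128):
    noise = np.random.normal(size=(batch_size, latent_dim))
    fake_images = generator.predict(noise, verbose=0)
    # Police: real images -> target 1, fake images -> target 0
    discriminator.train_on_batch(real_images, np.ones((batch_size, 1)))
    discriminator.train_on_batch(fake_images, np.zeros((batch_size, 1)))
    # Counterfeiter: make fakes that the frozen discriminator labels as 1
    gan.train_on_batch(noise, np.ones((batch_size, 1)))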

🤖 Practice

๐Ÿฑโ€๐Ÿš€ CNN ์‹ค์Šต - ์ˆ˜ํ™” MNIST

(the dataset we previously tackled with a plain deep neural network / MLP)
https://colab.research.google.com/drive/1x2SRHEAdRqNHTMKvVn8oUSwF9KV9Vi4C?usp=sharing#scrollTo=I_strLH75R_x

Runtime - Change runtime type - select GPU (to speed up computation)

import os
os.environ['KAGGLE_USERNAME'] = 'username' # username
os.environ['KAGGLE_KEY'] = 'key' # key
!kaggle datasets download -d datamunge/sign-language-mnist
!unzip sign-language-mnist.zip
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense, Conv2D, MaxPooling2D, Flatten, Dropout
from tensorflow.keras.optimizers import Adam, SGD
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import OneHotEncoder

#๋ฐ์ดํ„ฐ์…‹ ๋กœ๋“œํ•˜๊ธฐ
train_df = pd.read_csv('sign_mnist_train.csv')
test_df = pd.read_csv('sign_mnist_test.csv')

# Check the label distribution (24 classes in total)
plt.figure(figsize=(16, 10))
sns.countplot(x='label', data=train_df)
plt.show()

# ์ „์ฒ˜๋ฆฌ
# ์ž…๋ ฅ๊ณผ ์ถœ๋ ฅ ๋‚˜๋ˆ„๊ธฐ
# ํ›ˆ๋ จ ๋ฐ์ดํ„ฐ์…‹
train_df = train_df.astype(np.float32)
#32๋น„ํŠธ๋กœ ๋ฐ”๊พธ๋Š”๊ฑด ํ•„์ˆ˜
x_train = train_df.drop(columns=['label'], axis=1).values
# ๋ผ๋ฒจ ๋นผ๊ณ ๋Š” ๋ชจ๋‘ x
x_train = x_train.reshape((-1, 28, 28, 1))
# (๋ฐฐ์น˜์‚ฌ์ด์ฆˆ, ์ด๋ฏธ์ง€ ๊ฐ€๋กœ, ์„ธ๋กœ, ๊ทธ๋ ˆ์ด์Šค์ผ€์ผ)
# ์ด๋ฏธ์ง€ ๊ฐ€๋กœ, ์„ธ๋กœ, ์ƒ‰์ƒ๊นŒ์ง€ํ•ด์„œ 3์ฐจ์›
# -> xdata๋ฅผ (28,28,1) ํฌ๊ธฐ์˜ ์ด๋ฏธ์ง€ ํ˜•ํƒœ๋กœ ๋ณ€ํ™˜
y_train = train_df[['label']].values
# ๊ฒ€์ฆ ๋ฐ์ดํ„ฐ์…‹
test_df = test_df.astype(np.float32)
x_test = test_df.drop(columns=['label'], axis=1).values
x_test = x_test.reshape((-1, 28, 28, 1))
y_test = test_df[['label']].values

print(x_train.shape, y_train.shape) # (27455, 28, 28, 1) (27455, 1)
print(x_test.shape, y_test.shape) # (7172, 28, 28, 1) (7172, 1)

# ๋ฐ์ดํ„ฐ ๋ฏธ๋ฆฌ๋ณด๊ธฐ
index = 1
plt.title(str(y_train[index]))
plt.imshow(x_train[index].reshape((28, 28)), cmap='gray')
plt.show()

# One-hot encode the labels
encoder = OneHotEncoder()
y_train = encoder.fit_transform(y_train).toarray()
y_test = encoder.transform(y_test).toarray() # reuse the encoder fitted on the training labels

print(y_train.shape) # (27455, 24)

# Normalization
# Divide the image data by 255 so the pixels become floating-point values (float32) between 0 and 1
# using ImageDataGenerator()
train_image_datagen = ImageDataGenerator(
  rescale=1./255, # normalization
)

train_datagen = train_image_datagen.flow(
    x=x_train,
    y=y_train,
    batch_size=256,
    shuffle=True
)
# flow() feeds the values in sequentially, in batches
test_image_datagen = ImageDataGenerator(
  rescale=1./255
)

test_datagen = test_image_datagen.flow(
    x=x_test,
    y=y_test,
    batch_size=256,
    shuffle=False # no shuffling, so evaluation stays deterministic
)

index = 1

preview_img = train_datagen.__getitem__(0)[0][index]
# __getitem__(0) returns the first batch as an (images, labels) tuple
preview_label = train_datagen.__getitem__(0)[1][index]

plt.imshow(preview_img.reshape((28, 28)))
plt.title(str(preview_label))
plt.show()

โญโญ ์ด์–ด์„œ ๋„คํŠธ์›Œํฌ ๊ตฌ์„ฑ โญโญ

input = Input(shape=(28, 28, 1))

# one conv layer and one pooling layer form a block
hidden = Conv2D(filters=32, kernel_size=3, strides=1, padding='same', activation='relu')(input)
hidden = MaxPooling2D(pool_size=2, strides=2)(hidden)

hidden = Conv2D(filters=64, kernel_size=3, strides=1, padding='same', activation='relu')(hidden)
hidden = MaxPooling2D(pool_size=2, strides=2)(hidden)

hidden = Conv2D(filters=32, kernel_size=3, strides=1, padding='same', activation='relu')(hidden)
hidden = MaxPooling2D(pool_size=2, strides=2)(hidden)
# after passing the last pooling layer...

# must be flattened to 1-D before it can be combined with Dense layers
hidden = Flatten()(hidden)

# from here on it is the same as a plain deep neural network (node counts, activation functions)
hidden = Dense(512, activation='relu')(hidden)

hidden = Dropout(rate=0.3)(hidden)
# randomly drops 30% of the nodes during training

output = Dense(24, activation='softmax')(hidden)

model = Model(inputs=input, outputs=output)

model.compile(loss='categorical_crossentropy', optimizer=Adam(lr=0.001), metrics=['acc'])

model.summary()


โžœ ํฌ๊ธฐ๋Š” ์ •์ˆ˜์—ฌ์•ผํ•˜๋ฏ€๋กœ 7->3 ์œผ๋กœ ์ค„์–ด๋“ค๋•Œ ๋‚˜๋จธ์ง€๋Š” ๋ฒ„๋ฆผ

# Train the model
history = model.fit(
    train_datagen,
    validation_data=test_datagen, # with validation data, it validates automatically at the end of every epoch
    epochs=20 # note the argument name is plural: epochs!
)

# Plot the training curves
fig, axes = plt.subplots(1, 2, figsize=(20, 6))
axes[0].plot(history.history['loss'])
axes[0].plot(history.history['val_loss'])
axes[1].plot(history.history['acc'])
axes[1].plot(history.history['val_acc'])


โญ์ด๋ฏธ์ง€ ์ฆ๊ฐ• ๊ธฐ๋ฒ• data augmentation ์ด์šฉํ•ด๋ณด๊ธฐโญ

# ImageDataGenerator makes image augmentation easy
train_image_datagen = ImageDataGenerator(
  rescale=1./255, # normalization (previously this was the only transform used)
  rotation_range=10,  # randomly rotate images (degrees, 0-180)
  zoom_range=0.1, # randomly zoom into images (%)
  width_shift_range=0.1,  # randomly shift images horizontally (%)
  height_shift_range=0.1,  # randomly shift images vertically (%)
)

train_datagen = train_image_datagen.flow(
    x=x_train,
    y=y_train,
    batch_size=256,
    shuffle=True
)

test_image_datagen = ImageDataGenerator(
  rescale=1./255
)
# validation/test data needs no augmentation; if it were augmented, the score would change on every evaluation and lose objectivity

test_datagen = test_image_datagen.flow(
    x=x_test,
    y=y_test,
    batch_size=256,
    shuffle=False
)

index = 1

preview_img = train_datagen.__getitem__(0)[0][index]
preview_label = train_datagen.__getitem__(0)[1][index]

plt.imshow(preview_img.reshape((28, 28)))
plt.title(str(preview_label))
plt.show()
input = Input(shape=(28, 28, 1))

hidden = Conv2D(filters=32, kernel_size=3, strides=1, padding='same', activation='relu')(input)
hidden = MaxPooling2D(pool_size=2, strides=2)(hidden)

hidden = Conv2D(filters=64, kernel_size=3, strides=1, padding='same', activation='relu')(hidden)
hidden = MaxPooling2D(pool_size=2, strides=2)(hidden)

hidden = Conv2D(filters=32, kernel_size=3, strides=1, padding='same', activation='relu')(hidden)
hidden = MaxPooling2D(pool_size=2, strides=2)(hidden)

hidden = Flatten()(hidden)

hidden = Dense(512, activation='relu')(hidden)

hidden = Dropout(rate=0.3)(hidden)

output = Dense(24, activation='softmax')(hidden)

model = Model(inputs=input, outputs=output)

model.compile(loss='categorical_crossentropy', optimizer=Adam(lr=0.001), metrics=['acc'])

model.summary()

history = model.fit(
    train_datagen,
    validation_data=test_datagen, 
    epochs=20
)

fig, axes = plt.subplots(1, 2, figsize=(20, 6))
axes[0].plot(history.history['loss'])
axes[0].plot(history.history['val_loss'])
axes[1].plot(history.history['acc'])
axes[1].plot(history.history['val_acc'])

๐Ÿฑโ€๐Ÿš€ ์ „์ดํ•™์Šต ์‹ค์Šต - ๋‚ด์ผํ• ๋ž˜
