🛫 Jeju Island Tourist Attraction Recommendation Model

파이 · May 31, 2022

Introduction

  • The Jeju Island tourist attraction recommendation model is meant to make choosing travel destinations easy.
  • We implemented a system that computes similarity from images and natural-language data (categories, keywords, etc.) and recommends destinations.
  • Implementation result (left: optimal route from the navigation API | right: optimal route by straight-line distance)

Contents

1. Project Introduction

Background

  • There are many places a traveler wants to visit, but checking each one individually to choose among them is difficult
  • Even after the places are chosen, travel time and distance change with the visiting order, so picking the best route is hard
  • Recommending both places and routes should ease this process

ํ”„๋กœ์ ํŠธ ๊ฐœ์š”

  • ๊ตฌ์„ฑ์ธ์› : ๊น€์ค€ํ˜•, ๊น€๋‚จ๊ทœ
  • ์ˆ˜ํ–‰๊ธฐ๊ฐ„ : 1๋‹ฌ (2202๋…„ 1์›”)
  • ๋ชฉํ‘œ : ํ‚ค์›Œ๋“œ์™€ ํ…Œ๋งˆ ์„ ์ •์œผ๋กœ ๊ด€๊ด‘์ง€์™€ ๊ฒฝ๋กœ ์ถ”์ฒœ
  • ๋ฐ์ดํ„ฐ : ์นด์นด์˜ค ํ‚ค์›Œ๋“œ ๊ฒ€์ƒ‰ API, selenium ํ™œ์šฉ
    • ๋Œ€ํ‘œ ์ปฌ๋Ÿผ : [id, place_name, keyword, category_group_name, x, y, base_url, rating, (image)]
    • ๊ด€๊ด‘์ง€ ๋ฐ์ดํ„ฐ(3008๊ฐœ)
    • ์ˆ™๋ฐ• ๋ฐ์ดํ„ฐ(3004๊ฐœ)

Development Environment

  • Language: Python
  • Libraries: Jupyter, Pandas, Numpy, Selenium, Scikit-Learn, Tensorflow
  • Algorithms: VGG16, CLIP

2. Data Collection and Preprocessing

Tourist Attraction Data

▪ Data collection

  • Search keywords (10): '맛집' (good restaurants), '분위기 좋은' (nice atmosphere), '테마파크' (theme park), '오션뷰' (ocean view), '감성' (aesthetic), '가족여행' (family trip), '체험' (hands-on experience), '휴식' (relaxation), '레포츠' (leisure sports), '가볼만한 곳' (places worth visiting)
  • Categories (4): CT1 (cultural facilities), AT4 (tourist attractions), FD6 (restaurants), CE7 (cafes)
  • Searched per legal dong (법정동) and ri (리) within Jeju-si and Seogwipo-si
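The collection step above can be sketched as building one request per (region, keyword, category) combination against Kakao's keyword search endpoint. This is a minimal sketch, not the authors' crawler: the helper names `build_search_url` and `build_headers` are hypothetical, and joining the region name and the keyword into a single `query` string is an assumption about how the searches were phrased.

```python
from urllib.parse import urlencode

# Kakao Local keyword search endpoint.
KAKAO_KEYWORD_URL = "https://dapi.kakao.com/v2/local/search/keyword.json"


def build_search_url(region, keyword, category_code, page=1, size=15):
    """Return the request URL for one (region, keyword, category) result page."""
    params = {
        # Assumption: region and keyword are combined into one query string.
        "query": f"{region} {keyword}",
        "category_group_code": category_code,  # CT1, AT4, FD6, or CE7
        "page": page,
        "size": size,
    }
    return f"{KAKAO_KEYWORD_URL}?{urlencode(params)}"


def build_headers(rest_api_key):
    """Kakao REST calls authenticate with a 'KakaoAK <key>' header."""
    return {"Authorization": f"KakaoAK {rest_api_key}"}


# Example: ocean-view tourist attractions in one ri (region name is illustrative).
url = build_search_url("협재리", "오션뷰", "AT4")
```

Each URL would then be fetched with `requests.get(url, headers=build_headers(rest_api_key))`, looping over pages until the response reports no further results.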


▪ Data preprocessing

  • Removed rows with duplicate id values
  • Cultural facilities were far fewer than the other categories, so they were relabeled as tourist attractions
  • Removed rows without an image
  • Resized images to 224 × 224, the input size used by the ImageNet-pretrained model
    => Total: 3,008 rows
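The tabular part of these preprocessing steps can be sketched as follows. This is a dependency-free illustration over a list of dicts rather than the authors' code; the function name `preprocess` and the sample rows are hypothetical, and keeping the first occurrence of a duplicated id is an assumption.

```python
def preprocess(rows):
    """Deduplicate by id, fold cultural facilities into tourist attractions,
    and drop rows that have no image."""
    seen_ids = set()
    cleaned = []
    for row in rows:
        if row["id"] in seen_ids:
            continue  # keep only the first occurrence of each id
        seen_ids.add(row["id"])
        if not row.get("image"):
            continue  # no image, so the row cannot be used for image features
        row = dict(row)  # copy so the input list is left untouched
        if row["category_group_name"] == "문화시설":
            row["category_group_name"] = "관광명소"
        cleaned.append(row)
    return cleaned


# Hypothetical sample rows using the column names from the project overview.
rows = [
    {"id": "1", "place_name": "A", "category_group_name": "문화시설", "image": "a.jpg"},
    {"id": "1", "place_name": "A", "category_group_name": "문화시설", "image": "a.jpg"},
    {"id": "2", "place_name": "B", "category_group_name": "카페", "image": None},
    {"id": "3", "place_name": "C", "category_group_name": "음식점", "image": "c.jpg"},
]
cleaned = preprocess(rows)
```

With Pandas, the same steps would typically be written with `drop_duplicates(subset="id")`, `replace`, and `dropna(subset=["image"])`.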

Lodging Data

▪ Data collection

  • Search keywords (6): '호텔' (hotel), '리조트' (resort), '콘도' (condo), '게스트하우스' (guesthouse), '민박' (homestay), '펜션' (pension)
  • Categories (1): AD5 (lodging)
  • Searched per legal dong (법정동) and ri (리) within Jeju-si and Seogwipo-si


▪ Data preprocessing

  • Removed rows with duplicate id values
  • Resorts and condos were relatively few, so they were merged into a single resort/condo group
  • Removed rows without an image
    => Total: 3,004 rows

3. Modeling

Image Feature Extraction and Similarity Measurement

  • Feature extraction with the VGG16 model:

    • Took the output of the first FC layer, giving a feature vector of shape (1, 4096)
  • Euclidean distance

  • Cosine similarity

  • Euclidean distance and cosine similarity showed no noticeable difference in results, so cosine similarity was used in the end

  • Feature extraction did not represent restaurants and cafes well, so it was applied to tourist attractions only

Food Labeling

  • CLIP: a model trained to predict which images and texts are paired
  • Selected 32 food categories
    • Written in English so they can be used with CLIP
    • ["chicken", "Grilled pork","chiness food","sea food","noodle","Grilled fish", "Grilled Cutlassfish","beef","Hanjeongsik", "sashimi","bolied pork","hamburger","shrimp sashimi","shrimp","melon prosciutto","pork cutlet", "bread","pizza","Waffle","Tteokbokki","pasta","ramen","sushi","corn dog","American breakfast","crab","curry","bread","soup","tuna", "doughnut","koreanstyle sushi(gimbab)"]
  • Labeled the food images against the food categories
    • The label with the highest predicted score becomes the food's category
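The final selection step can be sketched as a softmax over per-label scores followed by an argmax. This sketch assumes the image-text similarity scores have already been produced by CLIP; the `logits` values are made up for illustration, and `pick_label` is a hypothetical helper.

```python
import math


def softmax(logits):
    """Turn raw similarity scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def pick_label(labels, logits):
    """Return the label with the highest probability, plus that probability."""
    probs = softmax(logits)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs[best]


labels = ["pizza", "sashimi", "ramen"]
logits = [18.2, 24.7, 19.5]  # hypothetical image-text scores for one photo
label, prob = pick_label(labels, logits)
```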

Theme Natural-Language Similarity

  • content: a corpus built from keyword and category_group_name
    • Keywords: '맛집', '분위기 좋은', '테마파크', '오션뷰', '감성', '가족여행', '체험', '휴식', '레포츠', '가볼만한 곳'
    • Categories: '관광명소' (tourist attraction), '음식점' (restaurant), '카페' (cafe)
  • Vectorized with CountVectorizer
  • Cosine Similarity
    • Cosine similarity matrix
    • Example of applying cosine similarity

Route Recommendation

  • Kakao Navi API: takes place coordinates as parameters and returns a recommended route
import json
import requests

# rest_api_key is assumed to already hold your Kakao REST API key.
# Header
headers = {"Authorization": "KakaoAK {}".format(rest_api_key)}
# Request URLs (replace each {...} placeholder with real coordinates)
url1 = "https://apis-navi.kakaomobility.com/v1/directions?origin={lon1},{lat1}&destination={lon2},{lat2}&waypoints={lon,lat|...}"
url2 = "https://apis-navi.kakaomobility.com/v1/directions?origin={lon3},{lat3}&destination={lon4},{lat4}&waypoints={lon,lat|...}"

# Fetch the route with a GET request
res1 = requests.get(url1, headers=headers)
# Parse the JSON response body
doc1 = json.loads(res1.text)

res2 = requests.get(url2, headers=headers)
doc2 = json.loads(res2.text)
  • When route search succeeds

    - The route is visualized with Folium
  • When route search fails
  • A straight-line-distance route is also recommended in case route search fails
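One way to build a straight-line-distance route is sketched below. The post does not say how the visiting order was chosen, so the greedy nearest-neighbor ordering here is an assumption. It is simple but not guaranteed to be the shortest possible tour. Coordinates are approximate and the place names are hypothetical; tuples are (x, y) = (longitude, latitude), matching the x and y columns.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two (longitude, latitude) points."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def greedy_route(start_name, places):
    """Visit every place once, always moving to the nearest unvisited one."""
    route = [start_name]
    remaining = {name for name in places if name != start_name}
    while remaining:
        current = places[route[-1]]
        nearest = min(remaining,
                      key=lambda name: haversine_km(*current, *places[name]))
        route.append(nearest)
        remaining.remove(nearest)
    return route


# Approximate coordinates, for illustration only.
places = {
    "airport": (126.49, 33.51),
    "west_coast": (126.24, 33.39),
    "south": (126.56, 33.25),
    "east_coast": (126.94, 33.46),
}
route = greedy_route("airport", places)
```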

4. Final System

  • Implementation conditions
    • Starting point: Jeju Airport
    • Place selection: chosen from places within 15 km of the previous place (straight-line distance computed from latitude and longitude)
    • Tourist attractions: image similarity, natural-language similarity, and rating
    • Restaurants: CLIP labeling, natural-language similarity, and rating
    • Cafes: natural-language similarity and rating
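The 15 km place-selection rule can be sketched as a radius filter on straight-line distance. The helper name `candidates_within` and the sample places are hypothetical, and ordering the surviving candidates by rating alone is an assumption, since the post does not spell out how similarity and rating are combined.

```python
import math


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two (longitude, latitude) points."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def candidates_within(previous, places, radius_km=15.0):
    """Keep places within radius_km of the previous place, best rated first."""
    nearby = [
        place for place in places
        if haversine_km(previous["x"], previous["y"], place["x"], place["y"]) <= radius_km
    ]
    return sorted(nearby, key=lambda place: place["rating"], reverse=True)


# Hypothetical rows; x is longitude and y is latitude, as in the dataset columns.
previous = {"place_name": "start", "x": 126.49, "y": 33.51}
places = [
    {"place_name": "near", "x": 126.55, "y": 33.50, "rating": 4.5},
    {"place_name": "mid", "x": 126.40, "y": 33.45, "rating": 4.8},
    {"place_name": "far", "x": 126.94, "y": 33.46, "rating": 5.0},
]
names = [place["place_name"] for place in candidates_within(previous, places)]
```

The same filter with `radius_km=3.0` would cover the lodging lookup around each attraction.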

System Walkthrough

1) Select keywords

2) Select a category

3) Select a representative place (skipped for cafes)


4) Attraction recommendation

  • Recommendations are made based on the representative place
  • Category and representative-place selection repeat
    • To finish: enter 0 at the category prompt

5) Route search

  • Route search with the Kakao Navi API
  • Route search by straight-line distance

6) Route output

  • The routes found are visualized with folium
  • Lodging within 3 km of each attraction is shown
  • Kakao Navi API route
  • Straight-line-distance route

5. Limitations and Improvements

🛠 Insufficient data

  • The recommendations rely on three main kinds of data: images, natural language, and ratings.
    • Images
      • Restaurant and cafe images mostly show food and drinks, which makes them ill-suited to classification by feature vectors.
      • For restaurants we labeled the images with CLIP, but for cafes we could not do any image-based work.
    • Natural language
      • The corpus combines just two fields, keyword and category_group_name.
      • Most places have a single keyword and at most three, so the combinations are simple.
      • Because the combinations are so simple, many places end up with similar cosine similarity values.
    • Ratings
      • Ratings are taken as-is from each place's page (place_url).
      • A 5.0 given by one reviewer and a 5.0 given by 100 reviewers are treated as the same rating.
  • Because of these limits in the collected data, the recommendation algorithm could only take a simple form.
  • Using people's reviews in addition to ratings would likely improve it somewhat.

🛠 No evaluation metric for the results

  • This is not supervised learning with labels, so there is no evaluation metric for the recommendation results.

6. References

7. Team Members

  • 김준형: GitHub
  • 김남규: GitHub