๐Ÿ–ฅ๏ธ[Python] 7-2-2. ์›นํฌ๋กค๋ง (์œ ๊ฐ€ ๋ฐ์ดํ„ฐ ๊ฐ€์ ธ์˜ค๊ธฐ)

thisk336 · June 12, 2023

Source: Korea National Oil Corporation, Opinet

์œ ๊ฐ€ ๋ฐ์ดํ„ฐ ๊ฐ€์ ธ์˜ค๊ธฐ

  • First, because 'opinet.co.kr' is rendered as a dynamic page, the page has to be opened with selenium and chromedriver.
from selenium import webdriver
from selenium.webdriver.common.by import By
import time

# Start Chrome and keep the WebDriver handle in the driver variable.
# Selenium 4.6+ locates chromedriver on its own; recent Selenium 4 releases
# no longer accept the driver path as a positional argument.
driver = webdriver.Chrome()
driver.get("https://www.opinet.co.kr/searRgSelect.do")
# Open the URL in the browser controlled by driver.

time.sleep(2) # Acting on the page right after opening it can fail (slow network, etc.),
              # so pause for 2 seconds.
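
A fixed 2-second pause is either wasted time or too short on a slow connection. A common alternative is to poll for a condition until it holds or a timeout expires; Selenium ships WebDriverWait for this purpose. The helper below is a minimal standard-library sketch of the same idea, not Selenium's implementation. The name wait_until and its parameters are hypothetical and introduced only for illustration.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once timeout seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)
```

With the driver from above, wait_until(lambda: driver.find_elements(By.XPATH, '//*[@id="SIDO_NM0"]')) would return as soon as the dropdown exists, because find_elements returns an empty list while nothing matches.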

  • To get fuel prices from this page, first select a province/city (si/do), then a district (si/gun/gu), click Search ("조회"), and download the listed prices with the Save as Excel ("엑셀저장") button.
  • So the province/city list and the district list are extracted first, and an automated web crawler then loops over those lists, clicking Search and Save as Excel on each pass.

์‹œ/๋„ ๋ชฉ๋ก ์ˆ˜์ง‘

sido = driver.find_element(By.XPATH, '//*[@id="SIDO_NM0"]')
# Locate the province/city (si/do) dropdown by XPath and store the element.

sido_names = sido.find_elements(By.TAG_NAME,'option')
# Collect every option element under the dropdown into sido_names.

sido_list = [] # Start with an empty list named sido_list.

for sido_name in sido_names : # Loop over the stored option elements
    sido_list.append(sido_name.get_attribute('value')) # and append each value attribute to sido_list.
sido_list = sido_list[1:] # Drop the first entry, which is not needed.
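
The loop above can be folded into a small reusable function, since the same steps are needed again for the district dropdown. This is a sketch under assumptions: option_values and its skip parameter are hypothetical names, and the locator is passed as the plain string "tag name", which is the value of By.TAG_NAME in Selenium 4, so the function itself needs no Selenium import.

```python
def option_values(select_element, skip=1):
    """Return the value attribute of each <option> under select_element.

    skip drops that many leading options; the default of 1 matches the
    sido_list[1:] slice used above to discard the first entry.
    """
    options = select_element.find_elements("tag name", "option")
    return [option.get_attribute("value") for option in options][skip:]
```

With the driver from above, sido_list = option_values(driver.find_element(By.XPATH, '//*[@id="SIDO_NM0"]')) would build the same list as the explicit loop.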

Collecting the district (si/gun/gu) list

sigungu = driver.find_element(By.XPATH, '//*[@id="SIGUNGU_NM0"]')
# Locate the district (si/gun/gu) dropdown by XPath and store the element.

sigungu_names = sigungu.find_elements(By.TAG_NAME,'option')
# Collect every option element under the dropdown into sigungu_names.

sigungu_list = [] # Start with an empty list named sigungu_list.

for sigungu_name in sigungu_names : # Loop over the stored option elements
    sigungu_list.append(sigungu_name.get_attribute('value')) # and append each value attribute to sigungu_list.
sigungu_list = sigungu_list[1:] # Drop the first entry, which is not needed.

Clicking the Search button

# Find the Search ("조회") button by its XPath and click it.
driver.find_element(By.XPATH, '//*[@id="searRgSelect"]').click()

Clicking the Save as Excel button

# Find the Save as Excel ("엑셀저장") button by its XPath and click it.
driver.find_element(By.XPATH, '//*[@id="glopopd_excel"]').click()

Let's combine the pieces above into a single script.

from selenium import webdriver
from selenium.webdriver.common.by import By
import time

# Start Chrome and keep the WebDriver handle in the driver variable.
# Selenium 4.6+ locates chromedriver on its own; recent Selenium 4 releases
# no longer accept the driver path as a positional argument.
driver = webdriver.Chrome()
driver.get("https://www.opinet.co.kr/searRgSelect.do")
# Open the URL in the browser controlled by driver.

time.sleep(2) # Acting on the page right after opening it can fail (slow network, etc.),
              # so pause for 2 seconds.

sido = driver.find_element(By.XPATH, '//*[@id="SIDO_NM0"]')
# Locate the province/city (si/do) dropdown by XPath and store the element.
sido_names = sido.find_elements(By.TAG_NAME,'option')
# Collect every option element under the dropdown into sido_names.

sido_list = [] # Start with an empty list named sido_list.
for sido_name in sido_names : # Loop over the stored option elements
    sido_list.append(sido_name.get_attribute('value')) # and append each value attribute to sido_list.
sido_list = sido_list[1:] # Drop the first entry, which is not needed.

for sido_name in sido_list : # Feed each collected province/city value to the page.
    # Re-locate the dropdown by XPath each time the page loads;
    # otherwise the loop raises an error.
    sido = driver.find_element(By.XPATH, '//*[@id="SIDO_NM0"]')
    sido.send_keys(sido_name) # Send the province/city value to the page.
    time.sleep(2) # Pause for 2 seconds.

    # Locate the district (si/gun/gu) dropdown and store it in sigungu.
    sigungu = driver.find_element(By.XPATH, '//*[@id="SIGUNGU_NM0"]')
    # Collect its option elements into sigungu_names.
    sigungu_names = sigungu.find_elements(By.TAG_NAME,'option')

    sigungu_list = [] # Start with an empty list named sigungu_list.
    for sigungu_name in sigungu_names : # Loop over the option elements
        sigungu_list.append(sigungu_name.get_attribute('value')) # and append each value attribute to sigungu_list.
    sigungu_list = sigungu_list[1:] # Drop the first entry, which is not needed.
    for sigungu_name in sigungu_list :
        sigungu = driver.find_element(By.XPATH, '//*[@id="SIGUNGU_NM0"]')
        time.sleep(2)
        sigungu.send_keys(sigungu_name) # Send the district value to the page.
        time.sleep(2)
        driver.find_element(By.XPATH, '//*[@id="searRgSelect"]').click() # Find the Search button by XPath and click it.
        time.sleep(2)
        driver.find_element(By.XPATH, '//*[@id="glopopd_excel"]').click() # Find the Save as Excel button by XPath and click it.
        time.sleep(2)
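
The loop clicks Save as Excel once per district, but it never checks that the file actually arrived before moving on. One way to make that explicit is to compare the download folder before and after the click. The sketch below uses only the standard library. The name wait_for_new_file, the folder argument, and the assumption that unfinished Chrome downloads carry a .crdownload suffix are additions for illustration and are not part of the original script.

```python
import time
from pathlib import Path


def wait_for_new_file(folder, seen_before, timeout=30.0, poll_interval=0.5):
    """Return the first finished file in folder that is not in seen_before.

    seen_before is a set of file names captured before the export click.
    Names ending in .crdownload are skipped as downloads still in progress.
    """
    folder = Path(folder)
    deadline = time.monotonic() + timeout
    while True:
        for path in sorted(folder.iterdir()):
            if path.name not in seen_before and path.suffix != ".crdownload":
                return path
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no new file appeared in {folder}")
        time.sleep(poll_interval)
```

Before each export click, the set of existing names can be captured with {p.name for p in Path(folder).iterdir()}, and wait_for_new_file(folder, seen) can replace the final 2-second pause; folder is whatever directory the browser saves to.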

0๊ฐœ์˜ ๋Œ“๊ธ€