Python - Fetching the data you need with Selenium (cutting down on manual work)

프동프동 · July 6, 2022


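For the first time in a while I had some tedious manual work to do, so I used Python with Selenium.

Scraping data from the meshswap dashboard

Prerequisites

Install the chromedriver that matches your Chrome version and place it in the same directory as your source file
https://chromedriver.chromium.org/downloads
Install Python 3.x
Install selenium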

pip3 install selenium

Writing the code (old version)

from selenium import webdriver

url = 'https://meshswap.fi/exchange/pool'
if __name__ == '__main__':
    driver = webdriver.Chrome()
    driver.get(url)  # open this url in a Chrome browser

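Warning raised

Cause

As selenium was upgraded, the old way of pointing it at the driver (executable_path) was deprecated. The driver should now be passed in through a Service object.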

DeprecationWarning: executable_path has been deprecated, please pass in a Service object

Solution

Install webdriver-manager

pip install webdriver-manager

Writing the code (revised)

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager

url = 'https://meshswap.fi/exchange/pool'
if __name__ == '__main__':
    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
    driver.get(url)

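Calling driver.get(url) opens a Chrome browser window that the script can control.

What the next snippet does: once the browser is open, wait (up to 5 seconds) until the element at the given XPATH (the data we want) has loaded and is clickable.

If you visit the meshswap dashboard you will see why: it fetches the data and then renders it on screen, so there is a brief loading delay.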

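    # Requires: from selenium.webdriver.support.ui import WebDriverWait
    #           from selenium.webdriver.support import expected_conditions as EC
    wait = WebDriverWait(driver, 5)
    element = wait.until(EC.element_to_be_clickable(
        (By.XPATH, '//*[@id="app"]/main/section/section/section/article[2]/section[1]/div[2]/div[1]')))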
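To make the waiting behaviour concrete, here is a minimal sketch of the idea behind an explicit wait: keep checking a condition until it succeeds or the time limit runs out. It is a simplified stand-in written with the standard library only, not selenium's actual implementation. The names wait_until and fake_element_ready are made up for this illustration.

```python
import time


def wait_until(condition, timeout=5.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Simplified stand-in for an explicit wait; not selenium's actual code.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = condition()
        if value:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError('condition not met within {0} seconds'.format(timeout))
        time.sleep(poll)


# Hypothetical condition: the "element" only becomes available on the third
# check, imitating a dashboard that needs a moment to load its data.
attempts = []


def fake_element_ready():
    attempts.append(1)
    return 'pair name' if len(attempts) >= 3 else None


found = wait_until(fake_element_ready, timeout=5.0, poll=0.1)
print(found, len(attempts))
```

WebDriverWait(driver, 5).until(...) follows the same pattern: it re-evaluates the expected condition at a fixed interval and raises a timeout exception if the condition never becomes true. This is why it is usually preferable to a fixed time.sleep, which always waits the full duration.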

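How to get an XPATH

How to find the XPATH you want: in Chrome DevTools, right-click the element in the Elements panel and choose Copy > Copy XPath.

I will fetch the pair name.

Checking the result

Going further

Let's also grab the pair directly below and compare the two paths.

They are identical except for one part: the index that selects the row.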
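The comparison can also be done in code. The sketch below takes the pair-name CSS selector used in the full code at the end of this post, fills in row 1 and row 2, and reports which step of the path differs. The helper name differing_parts is made up for this illustration, and it assumes both selectors have the same number of steps.

```python
def differing_parts(selector_a, selector_b, sep=' > '):
    """Return (position, part_a, part_b) for every step where two selectors differ."""
    parts_a = selector_a.split(sep)
    parts_b = selector_b.split(sep)
    return [(i, a, b) for i, (a, b) in enumerate(zip(parts_a, parts_b)) if a != b]


# Pair-name selector from the pool table; {0} is the 1-based row number.
PAIR_TEMPLATE = ('#exchange-page > div > section > div.pool-main-body > section > article'
                 ' > div.pool-table__body > div:nth-child({0})'
                 ' > div.pool-table__col.pool-table__col--pair > div:nth-child(2) > p:nth-child(1)')

first_row = PAIR_TEMPLATE.format(1)
second_row = PAIR_TEMPLATE.format(2)

print(differing_parts(first_row, second_row))
# → [(7, 'div:nth-child(1)', 'div:nth-child(2)')]
```

Since only the nth-child index changes from row to row, one template with a placeholder is enough to reach every row in a loop.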

Full code

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from webdriver_manager.chrome import ChromeDriverManager
import time

url = 'https://meshswap.fi/exchange/pool'
# Selector for one row of the pool table; {0} is the 1-based row number
row = '#exchange-page > div > section > div.pool-main-body > section > article > div.pool-table__body > div:nth-child({0})'
# Selector for the "next page" button under the table
next_button = '#exchange-page > div > section > div.pool-main-body > section > section > button.common-pager-button.common-pager-button--next'

if __name__ == '__main__':
    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
    driver.get(url)
    # Wait up to 10 seconds for the first row's pair name to finish loading
    wait = WebDriverWait(driver, 10)
    first_pair = row.format(1) + ' > div.pool-table__col.pool-table__col--pair > div:nth-child(2) > p:nth-child(1) > strong'
    wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, first_pair)))

    temp = []
    result = driver.find_element(By.CSS_SELECTOR, first_pair).text
    print(result)
    try:
        for i in range(1, 10):  # pages 1 to 9
            if i > 1:
                # Go to the next page and give the table a moment to re-render
                driver.find_element(By.CSS_SELECTOR, next_button).click()
                time.sleep(1)
            for j in range(1, 11):  # rows 1 to 10 on each page
                pair = driver.find_element(By.CSS_SELECTOR, row.format(j) + ' > div.pool-table__col.pool-table__col--pair > div:nth-child(2) > p:nth-child(1)').text
                apy = driver.find_element(By.CSS_SELECTOR, row.format(j) + ' > div.pool-table__col.pool-table__col--estimated > p.pool-table__col.pool-table__col--main-rate').text
                tvl = driver.find_element(By.CSS_SELECTOR, row.format(j) + ' > div.pool-table__col.pool-table__col--liquidity > p').text
                temp.append(pair)
                print('{0};{1};{2}'.format(pair, tvl, apy))
                time.sleep(0.3)

    except NoSuchElementException:
        # A missing row or button means the table ran out; print how many rows were collected
        print(len(temp))
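The script prints each row as pair;tvl;apy. If you want to open the result in a spreadsheet instead of copying it by hand, one possible follow-up is to turn those lines into CSV. The sketch below is only an illustration under a few assumptions: the helper name lines_to_csv is made up, the sample values are invented rather than real dashboard data, and the scraped text is assumed not to contain a semicolon.

```python
import csv
import io


def lines_to_csv(lines):
    """Convert 'pair;tvl;apy' lines into CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator='\n')
    writer.writerow(['pair', 'tvl', 'apy'])
    for line in lines:
        pair, tvl, apy = line.split(';')
        # Collapse any line breaks or repeated spaces inside the pair text.
        writer.writerow([' '.join(pair.split()), tvl.strip(), apy.strip()])
    return buffer.getvalue()


# Hypothetical sample lines, not real dashboard values.
sample = ['AAA-BBB;$1,000;10%', 'CCC-DDD;$2,500;7.5%']
print(lines_to_csv(sample))
```

The csv module quotes fields that contain a comma, so an amount such as $1,000 stays in a single column. To write a file, pass the returned text to open('pools.csv', 'w').write(...), where pools.csv is a file name chosen for this example.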