These are my notes from JoCoding's lecture [The Basics of Work Automation, Learned Through Python Selenium Image Crawling].
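from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.keys import Keys
from webdriver_manager.chrome import ChromeDriverManager

# Download a matching ChromeDriver and start Chrome.
# Selenium 4 takes the driver path through a Service object, not as a positional argument.
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
# Open a page to confirm that the browser launches
driver.get("http://www.python.org")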
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from webdriver_manager.chrome import ChromeDriverManager
import time
import urllib.request

driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))

# Search images from Google Images
driver.get("https://www.google.com/imghp?hl=en")
elem = driver.find_element(By.NAME, "q")
elem.send_keys("cute cats")
elem.send_keys(Keys.RETURN)

SCROLL_PAUSE_TIME = 1

# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    # Scroll to the bottom and give the page time to load more results
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(SCROLL_PAUSE_TIME)
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        # The height stopped growing: click the "more results" button (.mye4qd)
        # if it can be clicked, otherwise we have reached the end
        try:
            driver.find_element(By.CSS_SELECTOR, ".mye4qd").click()
        except Exception:
            break
    last_height = new_height

# Send a browser-like User-Agent with every download (installed once, not per image)
opener = urllib.request.build_opener()
opener.addheaders = [('User-Agent', 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1941.0 Safari/537.36')]
urllib.request.install_opener(opener)

# Get each image URL and download it
count = 1
images = driver.find_elements(By.CSS_SELECTOR, ".rg_i.Q4LuWd")
for image in images:
    try:
        # Click the thumbnail and wait for the large preview to load
        image.click()
        time.sleep(3)
        img_url = driver.find_element(By.XPATH, "/html/body/div[2]/c-wiz/div[3]/div[2]/div[3]/div/div/div[3]/div[2]/c-wiz/div[1]/div[1]/div/div[2]/a/img").get_attribute("src")
        urllib.request.urlretrieve(img_url, str(count) + ".jpg")
        count = count + 1
    except Exception:
        # Skip images whose preview or download fails
        pass

# Close the browser and end the driver session
driver.quit()
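The scroll loop in the script above can also be pulled out into a small function that does not touch the driver directly. This is only a sketch, not code from the lecture: the names `scroll_until_stable`, `get_height`, `scroll_down`, `load_more` and `max_rounds` are my own, and the `max_rounds` cap is an added assumption so the loop cannot run forever on a page that keeps growing.

```python
import time
from typing import Callable


def scroll_until_stable(
    get_height: Callable[[], int],
    scroll_down: Callable[[], None],
    load_more: Callable[[], bool],
    pause: float = 1.0,
    max_rounds: int = 200,
) -> int:
    """Scroll until the page height stops growing and nothing more can be loaded.

    All three callables are hypothetical hooks supplied by the caller:
      get_height  -> current scroll height of the page
      scroll_down -> scroll to the bottom of the page
      load_more   -> try to click a "more results" button, return True on success
    Returns the number of scroll rounds performed.
    """
    last_height = get_height()
    rounds = 0
    while rounds < max_rounds:
        scroll_down()
        time.sleep(pause)  # give the page time to load new results
        rounds += 1
        new_height = get_height()
        # Stop only when the height is unchanged AND no more results can be loaded
        if new_height == last_height and not load_more():
            break
        last_height = new_height
    return rounds
```

With Selenium, `get_height` could be `lambda: driver.execute_script("return document.body.scrollHeight")`, `scroll_down` the `window.scrollTo` call from the script, and `load_more` a small function that tries the `.mye4qd` click and returns `False` when it raises. Keeping the loop separate from the driver means it can be tried out with fake functions before opening a browser.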
Automatic image download:
One of the downloaded images:
The download worked! Next I want to try using it together with BeautifulSoup.
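As a possible variation on the download step, the User-Agent header can be attached to each request instead of being installed globally with `install_opener`. The sketch below is my own and not from the lecture: `build_image_request`, `save_image` and the `timeout` default are hypothetical names and values, and the User-Agent string is simply the one used in the script.

```python
import urllib.request
from pathlib import Path

# Assumption: any browser-like User-Agent works; this one is copied from the script above.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/36.0.1941.0 Safari/537.36"
)


def build_image_request(url: str) -> urllib.request.Request:
    """Attach the User-Agent to a single request (no global opener needed)."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def save_image(url: str, target: Path, timeout: float = 10.0) -> int:
    """Download url to target and return the number of bytes written."""
    with urllib.request.urlopen(build_image_request(url), timeout=timeout) as response:
        data = response.read()
    target.write_bytes(data)
    return len(data)
```

Inside the loop this would be called as `save_image(img_url, Path(str(count) + ".jpg"))`. Because `urlopen` also understands `data:` URLs, the same helper should cope with a `src` attribute that holds an inline base64 image.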
Reference:
https://intellipaat.com/community/15101/selenium-chromedriver-executable-needs-to-be-in-path
https://www.youtube.com/watch?v=1b7pXC1-IbE&t=1498s