Crawling: Timing Matters

nara_lee · April 25, 2025

Let's compare Block 1 ✅ and Block 2 ❌

✅ 1

while True:
    try:
        load_more_button = wait.until(EC.element_to_be_clickable((By.XPATH, "//button[starts-with(text(), 'Load') and contains(text(), 'More Products')]")))
        driver.execute_script("arguments[0].scrollIntoView();", load_more_button)
        time.sleep(1)
        load_more_button.click()
        time.sleep(2)  # wait for products to load
    except Exception:  # a bare "except:" would also swallow KeyboardInterrupt
        print("No more 'Load More' button. All products loaded.")
        break

❌ 2

while True: 
    elem = driver.find_element(By.CSS_SELECTOR,"#idReactCategory > div.category-grid.category-grid--no-filters > div.category-grid__main > div > div.load-more > button")
    if elem is None:
        break
    elem.click()
    time.sleep(0.1)

✅ Why #1 works

load_more_button = wait.until(EC.element_to_be_clickable((By.XPATH, "//button[starts-with(text(), 'Load') and contains(text(), 'More Products')]")))
  • It waits for the button to become clickable using:
    WebDriverWait + expected_conditions.element_to_be_clickable
  • This solves most timing issues where elements exist in the DOM but aren’t yet visible or interactable.
  • It scrolls the button into view:
    driver.execute_script("arguments[0].scrollIntoView();", load_more_button)
    This is important if the button is below the fold and otherwise not interactable.

So this block is smart:

Wait → Scroll → Click → Wait again.
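
To make the "wait" step more concrete, here is a minimal, stdlib-only sketch of what an explicit wait does conceptually: call a condition over and over until it returns something truthy or a timeout expires. `wait_until` and `WaitTimeout` are hypothetical names used only for illustration, and `LookupError` stands in for "element not found"; this is not Selenium's implementation. Selenium's `WebDriverWait.until` follows the same idea, polling every 0.5 seconds by default and treating `NoSuchElementException` as "not ready yet".

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never became truthy (hypothetical name)."""


def wait_until(condition, timeout=10.0, poll=0.5, ignored=(LookupError,)):
    """Poll condition() until it returns a truthy value or `timeout` seconds pass.

    Exceptions listed in `ignored` are treated as "not ready yet".
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            value = condition()
            if value:
                return value
        except ignored:
            pass  # e.g. the element is not in the DOM yet
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} s")
        time.sleep(poll)


# Demo: a fake "button" that only becomes available on the third poll.
attempts = {"n": 0}


def button_ready():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise LookupError("button not rendered yet")
    return "button"


result = wait_until(button_ready, timeout=2.0, poll=0.01)
print(result, "after", attempts["n"], "attempts")
```

The point of the sketch is the difference from Block 2: a single immediate lookup fails on the first attempt, while a polling wait gives the page time to catch up.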


❌ Why #2 fails

elem = driver.find_element(By.CSS_SELECTOR,"#idReactCategory > div...button")
  • find_element does not wait (unless an implicit wait is configured): it looks for the element immediately.
    If the button is not in the DOM yet because the page is still loading, it raises NoSuchElementException right away; it never returns None.
    If the button is in the DOM but not yet visible, the lookup succeeds and the failure shows up later at click().
  • It doesn't check if the button is clickable or visible.
  • It doesn't scroll the button into view, which matters on many React/Vue SPAs where content is rendered lazily.

Also:

if elem is None:
    break

This condition is never triggered, because find_element() raises NoSuchElementException when nothing matches; it never returns None. The lookup should have been wrapped in a try-except instead.
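
If you do want an `is None` check like the one in #2 to mean something, one option is `find_elements` (plural), which returns an empty list instead of raising when nothing matches. A minimal sketch: `find_or_none` is a hypothetical helper, and it passes the locator strategy as the plain string "css selector", which is the value of Selenium's `By.CSS_SELECTOR`, so the snippet itself needs no imports.

```python
CSS_SELECTOR = "css selector"  # same string value as selenium's By.CSS_SELECTOR


def find_or_none(driver, selector):
    """Return the first element matching `selector`, or None if there is none.

    Relies on driver.find_elements(), which returns an empty list
    (rather than raising) when nothing matches.
    """
    matches = driver.find_elements(CSS_SELECTOR, selector)
    return matches[0] if matches else None
```

With this helper, `elem = find_or_none(driver, "#idReactCategory .load-more > button")` followed by `if elem is None: break` behaves the way Block 2 intended. It only fixes the loop exit, though: there is still no waiting and no clickability check, so the timing problem remains.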


✅ How to fix #2

Here's how you'd rewrite #2 in a reliable way using CSS_SELECTOR + smart waiting:

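import time

from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

# Assumes `driver` and `wait` (a WebDriverWait bound to it) were created earlier.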

while True:
    try:
        # Wait for the button to be clickable
        load_more_button = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "#idReactCategory .load-more > button")))
        driver.execute_script("arguments[0].scrollIntoView();", load_more_button)
        time.sleep(0.5)
        load_more_button.click()
        time.sleep(1)  # let products load
    except TimeoutException:
        print("No more 'Load More' button or not clickable.")
        break
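
The fixed sleeps in the loop are still a guess about how long the products take to load. One possible refinement, sketched here under assumptions, is to wait until the number of product cards actually grows after each click. `wait_for_more_products` and the `.product-card` selector are hypothetical (the real product selector is not shown in this post). The only Selenium behavior the sketch relies on is that `wait.until()` keeps calling the given function with the driver until it returns a truthy value, and that `find_elements` returns an empty list when nothing matches.

```python
def wait_for_more_products(driver, wait, product_selector, previous_count):
    """Block until more than `previous_count` products are in the DOM.

    `wait` is expected to behave like WebDriverWait: wait.until(fn) calls
    fn(driver) repeatedly and returns its first truthy result, or raises
    on timeout. Returns the new product count.
    """

    def grew(d):
        # "css selector" is the string value of selenium's By.CSS_SELECTOR
        count = len(d.find_elements("css selector", product_selector))
        return count if count > previous_count else False

    return wait.until(grew)
```

Inside the loop you would record `before = len(driver.find_elements(By.CSS_SELECTOR, ".product-card"))` just before clicking, then call `wait_for_more_products(driver, wait, ".product-card", before)` in place of `time.sleep(1)`. If no new products appear in time, `wait.until` raises TimeoutException and the existing except branch ends the loop.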

TL;DR

Reason                 | #1 ✅ Works               | #2 ❌ Doesn't work
Uses explicit wait     | WebDriverWait + clickable | ❌ Direct find_element()
Checks clickable       | ✅                        | ❌
Scrolls into view      | ✅                        | ❌
Handles timing issues  | ✅                        | ❌
Breaks on missing elem | ✅ via try-except         | if elem is None won't work

This post was written as a Hancom AI Academy (B-log) review [Hancom x Korea Productivity Center x Sniper Factory].
