[Python] Web Scraping-1

정현석 · September 25, 2020



import requests 
from bs4 import BeautifulSoup
# Import BeautifulSoup from the bs4 package

indeed_result = requests.get("https://www.indeed.com/jobs?q=python&limit=50")
# Request the Indeed search page and store the response in a variable
indeed_soup = BeautifulSoup(indeed_result.text, "html.parser")
# Parse the HTML text of the Indeed response with BeautifulSoup

pagination = indeed_soup.find("div", {"class": "pagination"})
# Use indeed_soup.find to locate the <div> whose class is "pagination"

links = pagination.find_all('a')
# Inside the pagination div, collect every 'a' (anchor) tag

pages = []  # start with an empty list
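for link in links[:-1]:
    pages.append(int(link.string))
    # Each anchor wraps its page number in a <span>. When an anchor has a
    # single child like this, link.string returns that text directly, so a
    # separate find("span") call is not needed.
    # The string is converted to int. The last anchor is the "Next" link,
    # which cannot be converted, so links[:-1] stops just before it.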
# pages = pages[0:-1]
# [:-1] keeps every item except the last one.
# [0:5] returns the first five items.
# [0:-1] runs from the first item up to, but not including, the last one.
# print(pages[-1])
# pages[-1] is the last page number in the list.
max_page = pages[-1]
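
The script above drops the "Next" link by position with links[:-1], which only works while "Next" is the last anchor. As a rough sketch of an alternative, the helper below keeps only the labels that are numeric and takes the largest one. The function name max_page_from_labels and the sample labels are made up for illustration and are not part of the original script.

```python
def max_page_from_labels(labels):
    """Return the largest page number among pagination link labels.

    Non-numeric labels such as "Next" (or None, for a link with no plain
    text) are skipped instead of being sliced off by position.
    """
    pages = []
    for label in labels:
        text = (label or "").strip()
        if text.isdecimal():
            pages.append(int(text))
    # Assumption: if no numbered link exists, treat the result as one page.
    return max(pages) if pages else 1


# Hypothetical labels, shaped like the anchors' .string values.
sample_labels = ["2", "3", "4", "5", "Next"]
print(max_page_from_labels(sample_labels))
```

With the variables from the script, the labels could be built as [link.string for link in links], so no slicing is needed before the call.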
