Basic crawling code
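import requests
from bs4 import BeautifulSoup
# Page to crawl: a Naver Movie detail page
url = 'https://movie.naver.com/movie/bi/mi/basic.nhn?code=171539'
# Send a browser-like User-Agent so the request looks like an ordinary browser visit
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36'}
data = requests.get(url, headers=headers)
# Parse the returned HTML with Python's built-in parser
soup = BeautifulSoup(data.text, 'html.parser')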
import requests
from bs4 import BeautifulSoup
url = 'https://movie.naver.com/movie/bi/mi/basic.nhn?code=171539'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36'}
data = requests.get(url, headers=headers)
soup = BeautifulSoup(data.text, 'html.parser')
# Select the <meta> tag whose property attribute is og:title
title = soup.select_one('meta[property="og:title"]')
print(title)
Output (the tag was retrieved successfully)
C:\Users\rokimo\Desktop\sparta\projects\alonememo\venv\Scripts\python.exe C:/Users/rokimo/Desktop/sparta/projects/alonememo/meta_prac.py
<meta content="그린 북" property="og:title"/>
Process finished with exit code 0
Writing print(title['content']) instead prints only the text stored in the tag's content attribute.
import requests
from bs4 import BeautifulSoup
url = 'https://movie.naver.com/movie/bi/mi/basic.nhn?code=171539'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36'}
data = requests.get(url, headers=headers)
soup = BeautifulSoup(data.text, 'html.parser')
# Index the tag with ['content'] to read only the attribute value
title = soup.select_one('meta[property="og:title"]')['content']
print(title)
Output
C:\Users\rokimo\Desktop\sparta\projects\alonememo\venv\Scripts\python.exe C:/Users/rokimo/Desktop/sparta/projects/alonememo/meta_prac.py
그린 북
Process finished with exit code 0
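One thing to watch for: select_one returns None when nothing matches the selector, so indexing the result with ['content'] raises a TypeError on a page that lacks the tag. Below is a minimal sketch of a guarded lookup. It parses an inline HTML string instead of sending a live request, and the helper name meta_content and the sample markup are assumptions made for illustration, not part of the original code.

```python
from bs4 import BeautifulSoup

# Sample markup standing in for a downloaded page (hypothetical).
SAMPLE_HTML = '<html><head><meta property="og:title" content="그린 북"/></head></html>'


def meta_content(soup, prop, default=None):
    """Return the content attribute of <meta property=prop>, or default if absent."""
    tag = soup.select_one(f'meta[property="{prop}"]')
    if tag is None:
        # No matching tag on the page: fall back instead of raising TypeError.
        return default
    # Tag.get returns default when the attribute itself is missing.
    return tag.get('content', default)


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
print(meta_content(soup, 'og:title'))              # tag present in the sample
print(meta_content(soup, 'og:image', default=''))  # tag absent in the sample
```

With this sample, the first call returns the title text and the second falls back to the empty string because the sample has no og:image tag.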
Continue crawling the other elements in the same way.
import requests
from bs4 import BeautifulSoup
url = 'https://movie.naver.com/movie/bi/mi/basic.nhn?code=171539'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36'}
data = requests.get(url, headers=headers)
soup = BeautifulSoup(data.text, 'html.parser')
# Read the title, image URL, and description from the Open Graph meta tags
title = soup.select_one('meta[property="og:title"]')['content']
image = soup.select_one('meta[property="og:image"]')['content']
desc = soup.select_one('meta[property="og:description"]')['content']
print(title, image, desc)
Output
C:\Users\rokimo\Desktop\sparta\projects\alonememo\venv\Scripts\python.exe C:/Users/rokimo/Desktop/sparta/projects/alonememo/meta_prac.py
그린 북 https://movie-phinf.pstatic.net/20190115_228/1547528180168jgEP7_JPEG/movie_image.jpg?type=m665_443_2 1962년 미국, 입담과 주먹만 믿고 살아가던 토니 발레롱가(비고 모텐슨)는 교양과 우아함 그 자체인천재...
Process finished with exit code 0
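Because the three lookups differ only in the property name, they can be folded into a loop. This is a rough sketch under stated assumptions: the function name extract_og, the sample markup, and the example.com URL are hypothetical, and the HTML is an inline string so the example runs without network access.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that mimics the Open Graph tags read above.
SAMPLE_HTML = """
<html><head>
<meta property="og:title" content="Sample title"/>
<meta property="og:image" content="https://example.com/poster.jpg"/>
<meta property="og:description" content="Sample description"/>
</head></html>
"""


def extract_og(html, props=('title', 'image', 'description')):
    """Collect og:<name> meta contents into a dict; missing tags map to None."""
    soup = BeautifulSoup(html, 'html.parser')
    result = {}
    for name in props:
        tag = soup.select_one(f'meta[property="og:{name}"]')
        # Guard against both a missing tag and a tag without a content attribute.
        if tag is not None and tag.has_attr('content'):
            result[name] = tag['content']
        else:
            result[name] = None
    return result


og = extract_og(SAMPLE_HTML)
print(og['title'], og['image'], og['description'])
```

For the live page, the same function could be called as extract_og(data.text) after the request shown earlier.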