id and class are locators: they are used to refer to specific tags.
- tagname: the name of the tag
- id: a label that points to a single, unique tag
- class: a label that groups multiple tags
<p>This element has only tagname</p>
<p id="target">This element has tagname and id</p>
<p class="targets">This element has tagname and class</p>
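The three kinds of locator can be tried directly on the sample elements above. The following is a minimal sketch, assuming the snippet is parsed from a string rather than fetched from a site; the variable names by_tag, by_id, and by_class are illustrative only.

```python
from bs4 import BeautifulSoup

# The three sample elements, parsed from a string (no network access needed).
html = """
<p>This element has only tagname</p>
<p id="target">This element has tagname and id</p>
<p class="targets">This element has tagname and class</p>
"""
soup = BeautifulSoup(html, "html.parser")

# Tag name only: find() returns the first <p> in document order.
by_tag = soup.find("p")

# Tag name plus id: an id is meant to be unique within a page.
by_id = soup.find("p", id="target")

# Tag name plus class: "class" is a reserved word in Python,
# so BeautifulSoup takes the keyword argument class_ instead.
by_class = soup.find("p", class_="targets")

print(by_tag.text)
print(by_id.text)
print(by_class.text)
```

Each call returns a single Tag object, so .text can be read from it directly.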
import requests
from bs4 import BeautifulSoup
# Fetch the example page.
res = requests.get("http://example.python-scraping.com/")
# Parse the response body with Python's built-in HTML parser.
soup = BeautifulSoup(res.text, "html.parser")
Find a div tag without using an id.
soup.find("div")
<div class="navbar navbar-inverse"> <div class="flash"></div> <div class="navbar-inner"> <div class="container"> <!-- the next tag is necessary for bootstrap menus, do not remove --> <button class="btn btn-navbar" data-target=".nav-collapse" data-toggle="collapse" style="display:none;" type="button"> <span class="icon-bar"></span> <span class="icon-bar"></span> <span class="icon-bar"></span> </button> <ul class="nav pull-right" id="navbar"><li class="dropdown"><a class="dropdown-toggle" data-toggle="dropdown" href="#" rel="nofollow">Log In</a><ul class="dropdown-menu"><li><a href="/places/default/user/register?_next=/places/default/index" rel="nofollow"><i class="icon icon-user glyphicon glyphicon-user"></i> Sign Up</a></li><li class="divider"></li><li><a href="/places/default/user/login?_next=/places/default/index" rel="nofollow"><i class="icon icon-off glyphicon glyphicon-off"></i> Log In</a></li></ul></li></ul> <div class="nav"> <ul class="nav"><li class="web2py-menu-first"><a href="/places/default/index">Home</a></li><li class="web2py-menu-last"><a href="/places/default/search">Search</a></li></ul> </div><!--/.nav-collapse --> </div> </div> </div>
Find the div tag whose id is results.
soup.find("div", id="results")
<div id="results"> <table><tr><td><div><a href="/places/default/view/Afghanistan-1"><img src="/places/static/images/flags/af.png"/> Afghanistan</a></div></td><td><div><a href="/places/default/view/Aland-Islands-2"><img src="/places/static/images/flags/ax.png"/> Aland Islands</a></div></td></tr><tr><td><div><a href="/places/default/view/Albania-3"><img src="/places/static/images/flags/al.png"/> Albania</a></div></td><td><div><a href="/places/default/view/Algeria-4"><img src="/places/static/images/flags/dz.png"/> Algeria</a></div></td></tr><tr><td><div><a href="/places/default/view/American-Samoa-5"><img src="/places/static/images/flags/as.png"/> American Samoa</a></div></td><td><div><a href="/places/default/view/Andorra-6"><img src="/places/static/images/flags/ad.png"/> Andorra</a></div></td></tr><tr><td><div><a href="/places/default/view/Angola-7"><img src="/places/static/images/flags/ao.png"/> Angola</a></div></td><td><div><a href="/places/default/view/Anguilla-8"><img src="/places/static/images/flags/ai.png"/> Anguilla</a></div></td></tr><tr><td><div><a href="/places/default/view/Antarctica-9"><img src="/places/static/images/flags/aq.png"/> Antarctica</a></div></td><td><div><a href="/places/default/view/Antigua-and-Barbuda-10"><img src="/places/static/images/flags/ag.png"/> Antigua and Barbuda</a></div></td></tr></table> </div>
Find a div tag using its class.
find_result = soup.find("div", "page-header")
find_result.text
'\n\n Example web scraping website\n \n\n'
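Because a class groups several tags, a class lookup often matches more than one element. The sketch below uses hypothetical markup (not taken from the example site) to show that a string passed as the second positional argument is treated as a class, that class_ is the equivalent keyword form, and that find_all() collects every match.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration; not taken from the example site.
html = """
<div class="card featured">first</div>
<div class="card">second</div>
<div id="footer">third</div>
"""
soup = BeautifulSoup(html, "html.parser")

# A string as the second positional argument is matched against the class.
positional = soup.find("div", "card")

# The keyword form does the same thing and reads more explicitly.
keyword = soup.find("div", class_="card")

# find_all() returns every matching tag; a tag with several classes
# matches if any one of them equals the given name.
cards = soup.find_all("div", class_="card")
texts = [card.text for card in cards]

# find() returns None when nothing matches, so check before using .text.
missing = soup.find("div", "no-such-class")

print(positional.text, keyword.text, texts, missing)
```

Checking for None before reading .text avoids an AttributeError when a page does not contain the expected class.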