bs4

차노 · August 10, 2023

Beautiful Soup is a Python web scraping library for pulling data out of HTML and XML documents, such as web pages.

It creates a parse tree from the page source code that can be used to extract data in a hierarchical and more readable manner.
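To make the idea of a parse tree concrete, here is a minimal sketch. The HTML snippet is made up for illustration, and "html.parser" (Python's built-in parser) is just one of the parsers Beautiful Soup can use.

```python
from bs4 import BeautifulSoup

# Hypothetical page source, used only for illustration.
html = """
<html>
  <head><title>Demo page</title></head>
  <body>
    <div class="card">
      <h2>First job</h2>
      <p>Remote</p>
    </div>
  </body>
</html>
"""

# "html.parser" ships with Python, so no extra parser has to be installed.
soup = BeautifulSoup(html, "html.parser")

title_text = soup.title.string      # text inside <title>
heading = soup.find("h2")           # first <h2> in the tree
parent_name = heading.parent.name   # name of the tag that contains the <h2>

# Direct child tags of that parent, skipping whitespace-only text nodes.
child_names = [child.name for child in heading.parent.find_all(recursive=False)]

print(title_text, parent_name, child_names)
```

Each tag knows its parent and its children, which is what makes it possible to walk the document hierarchically instead of searching raw text.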

pip is the package manager used to install and manage software packages written in Python. Beautiful Soup is installed with pip under the package name beautifulsoup4 and imported as bs4.

Reference

The Python libraries requests and Beautiful Soup are powerful tools for web scraping.

  • Use requests and Beautiful Soup for scraping and parsing data from the web.
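A rough sketch of how the two libraries might divide the work: requests downloads the page, and Beautiful Soup parses it. The function names fetch_html and extract_links are hypothetical helpers written for this note, not part of either library.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url: str) -> str:
    """Download a page and return its HTML source as text."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # raise an error on 4xx/5xx responses
    return response.text


def extract_links(html: str) -> list:
    """Return the href value of every <a> tag that has one."""
    soup = BeautifulSoup(html, "html.parser")
    return [a["href"] for a in soup.find_all("a", href=True)]
```

Calling extract_links(fetch_html(url)) with a URL of your choice would then return the links found on that page. Keeping the download and the parsing in separate functions also lets you test the parsing on a saved HTML string without touching the network.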

Scraping

Web scraping is the process of gathering information from the internet. Even copying and pasting the lyrics of your favourite song is a form of web scraping. In computer science, however, the term usually refers to automating that process.

Why is scraping difficult?

The first challenge is variety. Because every website is different, you might encounter unique problems on each one.
The second is durability. Websites change constantly, so a scraper that works today may break after a redesign.
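One way to soften the durability problem is to check whether an element was actually found before using it. The sketch below assumes a made-up page with a price in a span; the class names and the get_price helper are hypothetical.

```python
from typing import Optional

from bs4 import BeautifulSoup


def get_price(html: str) -> Optional[str]:
    """Return the price text, or None if the expected element is missing."""
    soup = BeautifulSoup(html, "html.parser")
    # find() returns None when nothing matches, e.g. after a site redesign.
    element = soup.find("span", class_="price")
    if element is None:
        return None
    return element.get_text(strip=True)


# Two hypothetical versions of the same page, before and after a redesign.
old_layout = '<div><span class="price"> $10 </span></div>'
new_layout = '<div><p class="cost">$10</p></div>'

print(get_price(old_layout))  # $10
print(get_price(new_layout))  # None
```

Returning None instead of crashing makes it easier to notice that the page structure changed and the scraper needs updating.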

Some websites make their data available directly in formats like JSON and XML, which are designed to be read by programs. HTML, by contrast, is primarily a way to present content to users visually.
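The difference can be seen by reading the same record from both formats. The job record below is invented for illustration; only json.loads and the Beautiful Soup calls are real library functions.

```python
import json

from bs4 import BeautifulSoup

# The same hypothetical record, once as JSON and once as HTML.
json_text = '{"title": "Python Developer", "location": "Seoul"}'
html_text = (
    '<div class="job"><h2>Python Developer</h2>'
    '<p class="location">Seoul</p></div>'
)

# JSON maps straight onto Python data structures.
record = json.loads(json_text)
json_title = record["title"]

# HTML has to be parsed and searched before the same value comes out.
soup = BeautifulSoup(html_text, "html.parser")
html_title = soup.find("h2").get_text()

print(json_title == html_title)  # True
```

With JSON the field name is the lookup key, while with HTML you depend on the page's tag structure, which is why scraping tends to be more fragile.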

Parse HTML Code With Beautiful Soup

Beautiful Soup is a Python library for parsing structured data.


It allows you to interact with HTML in a similar way to how you interact with a web page using developer tools.

In an HTML web page, every element can have an id attribute assigned.
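Because an id is meant to identify a single element on the page, it is a convenient starting point for a search. A minimal sketch, assuming a made-up page whose container has the id "results":

```python
from bs4 import BeautifulSoup

# Hypothetical page source; the id and class names are assumptions.
html = """
<div id="results">
  <p class="item">First</p>
  <p class="item">Second</p>
</div>
<div id="footer"><p class="item">Ignored</p></div>
"""

soup = BeautifulSoup(html, "html.parser")

# Pick the container by its id, much like selecting it in developer tools.
results = soup.find(id="results")

# Search only inside that container, so the footer item is not included.
items = [p.get_text() for p in results.find_all("p", class_="item")]

print(items)  # ['First', 'Second']
```

Narrowing the search to one element first keeps unrelated parts of the page out of the result.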

