
Beautiful Soup

2021-12-18


1 Overview

Beautiful Soup (Korean: 뷰티플 수프, 뷰터펄 숲; pronounced [bjúːtəfəl suːp])

A Python package for parsing HTML and XML documents.
It copes well with so-called "tag soup", that is, markup with malformed or unclosed tags.
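
As a rough illustration of the tag-soup handling described above, the following sketch parses a fragment whose `<b>` and `<i>` tags are never closed. The sample markup is made up for this example, and the exact tree that results can differ between parsers (`html.parser`, `lxml`, `html5lib`); this sketch assumes the built-in `html.parser`.

```python
from bs4 import BeautifulSoup

# Hypothetical broken markup: <b> and <i> are opened but never closed.
broken = "<div><b>unclosed bold<i>and italic</div>"

soup = BeautifulSoup(broken, "html.parser")

# The parse still yields a navigable tree; the open tags are closed
# implicitly when their enclosing <div> ends.
bold_text = soup.b.get_text()
italic_text = soup.i.get_text()
div_text = soup.div.get_text()

print(bold_text)
print(italic_text)
print(soup)
```

Because the tree is still navigable, the usual accessors such as `soup.b` and `get_text()` work on the repaired document just as they would on valid markup.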

pip install beautifulsoup4

2 Example 1

from bs4 import BeautifulSoup
print(BeautifulSoup("<html><head></head><body>Sacr&eacute; bleu!</body></html>", "html.parser"))

→ The HTML entity &eacute; has been converted to the Unicode character é.
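
To make the conversion easier to see, here is a small sketch that extracts the text of the same document and then serializes it back with entities restored. It assumes the `formatter="html"` output option of Beautiful Soup 4; the variable names are chosen only for this example.

```python
from bs4 import BeautifulSoup

html = "<html><head></head><body>Sacr&eacute; bleu!</body></html>"
soup = BeautifulSoup(html, "html.parser")

# After parsing, the text is held as ordinary Unicode.
text = soup.body.get_text()
print(text)

# The "html" formatter converts such characters back to named entities
# when the tree is serialized.
with_entities = soup.body.decode(formatter="html")
print(with_entities)
```

In other words, the entity is decoded once at parse time, and whether it reappears in the output is a choice made when the tree is written back out.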

3 Example 2

Fetch an HTML page from the web and parse it.
This example uses Beautiful Soup together with requests.

import requests
from bs4 import BeautifulSoup

r = requests.get('https://en.wikipedia.org/wiki/Main_Page')
soup = BeautifulSoup(r.text, 'html.parser')
for anchor in soup.find_all('a'):
    print(anchor.get('href', '/'))
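
The `href` values printed by the loop above are often relative paths or fragments. One possible follow-up, sketched here under a few assumptions, is to resolve them against the page URL with `urllib.parse.urljoin`. To keep the sketch runnable offline, it parses a small inline HTML string (made up for this example) instead of the downloaded page; `base_url` stands in for the URL that was requested.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Assumption: base_url is the address the page was fetched from.
base_url = "https://en.wikipedia.org/wiki/Main_Page"

# Hypothetical stand-in for r.text, so that no network access is needed.
html = (
    '<a href="/wiki/Python">Python</a>'
    '<a href="#top">Top</a>'
    '<a name="no-href">anchor without href</a>'
)
soup = BeautifulSoup(html, "html.parser")

# href=True keeps only the <a> tags that actually have an href attribute.
links = [urljoin(base_url, a["href"]) for a in soup.find_all("a", href=True)]

for link in links:
    print(link)
```

Filtering with `href=True` skips anchors that have no link target, which is an alternative to supplying a default value as in `anchor.get('href', '/')`.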

4 See also

5 References

Original URL "https://zetawiki.com/w/index.php?title=뷰티플_수프&oldid=797403"

Modified 2021-12-18 · Created 2017-06-04

CC-BY-SA 3.0 · Powered by MediaWiki
