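What is decoding?
Decoding converts encoded data (the raw bytes a computer stores or transmits) back into a form people can read, such as text. It is the reverse of encoding. It is not the same as decompiling, which turns compiled machine code back into source code.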
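As a minimal sketch of the idea, encoding turns a str into bytes and decoding reverses it. The sample string below is an assumption chosen for illustration; it is not taken from story.txt.

```python
# Hypothetical sample text, used only for illustration.
text = "Python 디코딩"

encoded = text.encode("utf-8")     # str -> bytes
decoded = encoded.decode("utf-8")  # bytes -> str

print(type(encoded).__name__)  # bytes
print(type(decoded).__name__)  # str
print(decoded == text)         # True
```

The round trip only works when both steps use the same character encoding.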
from urllib.request import urlopen, Request

url = "https://suwoni-codelab.com/assets/story.txt"
# Send browser-like headers (some servers reject urllib's default User-Agent).
header = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36",
          "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"}
request = Request(url, headers=header)

# Iterating over the response yields one line at a time, as bytes.
with urlopen(request) as story:
    for line in story:
        print(line)
Each line is printed as a bytes object. To work with it as a string (str), decode it with the matching character encoding.
Output:
b'\xeb\x82\x98\xeb\x8a\x94 Python\xec\x9d\x84 \xea\xb3
ec\x9c\xbc\xeb\xa1\x9c \xec\x97\xac\xeb\x9f\xac\xea\xb0\x80\xec\xa7\x80
\xec\x95\xb1 \xea\xb7\xb8\xeb\xa6\xac\xea\xb3\xa0 \xec\x9e\x90\xeb\x8f\x9
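To make the output above concrete, the sketch below decodes just the first few bytes shown there. It also tries the ASCII codec, picked here only as an example of a mismatched encoding, to show that the wrong character encoding fails.

```python
# The first bytes of the output above (only the part that is fully visible).
raw = b'\xeb\x82\x98\xeb\x8a\x94 Python\xec\x9d\x84'

print(type(raw).__name__)   # bytes
print(raw.decode("utf-8"))  # 나는 Python을

# Decoding with a codec that does not match the data raises an error.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
print(ascii_ok)  # False
```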
Trying out decoding
from urllib.request import urlopen, Request

url = "https://suwoni-codelab.com/assets/story.txt"
# Send browser-like headers (some servers reject urllib's default User-Agent).
header = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36",
          "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"}
request = Request(url, headers=header)

with urlopen(request) as story:
    story_words = []
    for line in story:
        # Decode each bytes line to str, then split it on whitespace.
        line_words = line.decode('utf-8').split()
        for word in line_words:
            story_words.append(word)

# Print the collected words, one per line.
for word in story_words:
    print(word)
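The same decode-and-split loop can be tried without a network connection. In the sketch below, `collect_words` is a hypothetical helper name, `io.BytesIO` stands in for the HTTP response, and the sample text is made up for illustration. All three are assumptions, not part of the original script.

```python
import io


def collect_words(byte_lines, encoding="utf-8"):
    """Decode each bytes line and return all whitespace-separated words."""
    words = []
    for line in byte_lines:
        words.extend(line.decode(encoding).split())
    return words


# io.BytesIO yields bytes lines when iterated, much like the urlopen response.
# The sample text is hypothetical and is not the content of story.txt.
fake_response = io.BytesIO("나는 Python을\n배우고 있다\n".encode("utf-8"))
words = collect_words(fake_response)
print(words)  # ['나는', 'Python을', '배우고', '있다']
```

Passing the encoding as a parameter makes it easy to switch when a server sends text in something other than UTF-8.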