[Before the fix]
On the 197 server,
in the VS Code terminal,
jeongsj@ubuntu:~$ pip install simpletransformers
jeongsj@ubuntu:~$ !wget https://raw.githubusercontent.com/korquad/korquad.github.io/master/dataset/KorQuAD_v1.0_train.json -O KorQuAD_v1.0_train.json
jeongsj@ubuntu:~$ !wget https://raw.githubusercontent.com/korquad/korquad.github.io/master/dataset/KorQuAD_v1.0_dev.json -O KorQuAD_v1.0_dev.json
# python_file.py
import json
with open('KorQuAD_v1.0_train.json', 'r') as f:
    train_data = json.load(f)
train_data = [item for topic in train_data['data'] for item in topic['paragraphs']]
print(train_data[0:10])
jeongsj@ubuntu:~$ python python_file.py
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe8 in position 107692: invalid continuation byte
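One way to check whether the file on disk is really the expected JSON is to look at the raw bytes around the position the error reports. The following is a minimal sketch, not part of the original steps; the helper name find_invalid_utf8 and the context parameter are hypothetical and chosen only for illustration.

```python
from pathlib import Path


def find_invalid_utf8(path, context=20):
    """Return (offset, surrounding bytes) for the first invalid UTF-8 byte.

    Returns None when the whole file decodes cleanly as UTF-8.
    Hypothetical helper for illustration only.
    """
    raw = Path(path).read_bytes()
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the byte offset where decoding failed.
        start = max(err.start - context, 0)
        return err.start, raw[start:err.start + context]
    return None
```

Calling find_invalid_utf8('KorQuAD_v1.0_train.json') would return the offset and a short slice of bytes, which makes it easier to see whether the file contains something other than the expected dataset.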
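[Cause]
The command syntax differs between Jupyter Notebook (commands run from a cell) and VS Code (commands run at the terminal prompt).
Jupyter Notebook: prefix shell commands with ! (exclamation mark).
VS Code terminal: do not prefix commands with !. In an interactive bash shell, a leading ! triggers history expansion, so '!wget ...' is not executed as the wget command written on that line.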
[Solution]
On the 197 server,
in the VS Code terminal,
jeongsj@ubuntu:~$ pip install simpletransformers
jeongsj@ubuntu:~$ wget https://raw.githubusercontent.com/korquad/korquad.github.io/master/dataset/KorQuAD_v1.0_train.json -O KorQuAD_v1.0_train.json
(the '-O KorQuAD_v1.0_train.json' option can be omitted)
jeongsj@ubuntu:~$ wget https://raw.githubusercontent.com/korquad/korquad.github.io/master/dataset/KorQuAD_v1.0_dev.json -O KorQuAD_v1.0_dev.json
(the '-O KorQuAD_v1.0_dev.json' option can be omitted)
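If the same steps need to work both in a notebook and in a terminal, one option is to download the files from Python itself, so that no shell-specific syntax is involved. This is a sketch using only the standard library, not the method used in this note; the URLs are the ones from the wget commands above, and the function names download and download_korquad are hypothetical.

```python
import urllib.request
from pathlib import Path

# Base URL and file names taken from the wget commands in this note.
BASE_URL = "https://raw.githubusercontent.com/korquad/korquad.github.io/master/dataset/"
FILES = ["KorQuAD_v1.0_train.json", "KorQuAD_v1.0_dev.json"]


def download(url, dest):
    """Fetch url and save it to dest; return the destination as a Path."""
    dest = Path(dest)
    urllib.request.urlretrieve(url, str(dest))
    return dest


def download_korquad(target_dir="."):
    """Download both KorQuAD v1.0 files into target_dir (hypothetical helper)."""
    return [download(BASE_URL + name, Path(target_dir) / name) for name in FILES]
```

Calling download_korquad() from a script or a notebook cell would save both files into the current directory, the same result as the two wget commands.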
# python_file.py
import json
with open('KorQuAD_v1.0_train.json', 'r') as f:
    train_data = json.load(f)
train_data = [item for topic in train_data['data'] for item in topic['paragraphs']]
print(train_data[0:10])
jeongsj@ubuntu:~$ python python_file.py
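To make clear what the list comprehension in python_file.py produces, here is a small sketch on a made-up sample. It assumes the SQuAD-style layout (a top-level 'data' list of topics, each holding a 'paragraphs' list); the sample contents and the function name flatten_paragraphs are invented for illustration and are not taken from the dataset.

```python
# Made-up miniature sample, assuming a SQuAD-style layout.
sample = {
    "version": "sample",
    "data": [
        {
            "title": "Topic A",
            "paragraphs": [
                {"context": "ctx 1", "qas": []},
                {"context": "ctx 2", "qas": []},
            ],
        },
        {
            "title": "Topic B",
            "paragraphs": [
                {"context": "ctx 3", "qas": []},
            ],
        },
    ],
}


def flatten_paragraphs(dataset):
    """Collect the paragraphs of every topic into one flat list."""
    return [item for topic in dataset["data"] for item in topic["paragraphs"]]


paragraphs = flatten_paragraphs(sample)
print(len(paragraphs))                      # 3
print([p["context"] for p in paragraphs])   # ['ctx 1', 'ctx 2', 'ctx 3']
```

The nested loop reads left to right: the outer 'for topic' walks the topics, and the inner 'for item' walks each topic's paragraphs, so the order of the flat list follows the order in the file.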
[After the fix]