[인턴 프로젝트] 3. paddleOCR을 써보자

코드짜는침팬지·2023년 7월 20일

1. Paddle 설치하기

Easyocr을 이용한 영상처리의 경우 일부 bounding box를 못찾는 문제도 있었고
인식률이 그리 좋다고 할 순 없었다. 커스텀 트레이닝도 진행 했었는데 별로 결과가 좋지 못했고
중국 baidu에서 개발한 paddle을 사용 해보기로 했다.

아나콘다 환경을 만들되 파이썬 버전은 3.7로 해주자

https://github.com/PaddlePaddle/Paddle

깃허브 주소는 다음과 같다

git clone https://github.com/PaddlePaddle/Paddle

CPU 버전은 다음을 설치해주자

conda install paddlepaddle==2.5.0 --channel https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/Paddle/

GPU 버전은 다음을 설치해주면 된다.

CUDA 10.2

conda install paddlepaddle-gpu==2.5.0 cudatoolkit=10.2 --channel https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/Paddle/

CUDA 11.2

conda install paddlepaddle-gpu==2.5.0 cudatoolkit=11.2 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/Paddle/ -c conda-forge

CUDA 11.6

conda install paddlepaddle-gpu==2.5.0 cudatoolkit=11.6 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/Paddle/ -c conda-forge

CUDA 11.7

conda install paddlepaddle-gpu==2.5.0 cudatoolkit=11.7 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/Paddle/ -c conda-forge

tuna.tsinghua 이건 칭화대에서 관리하는 라이브러리이다.
중국 특: 깃허브 안됨

설치 후 파이썬에서

import paddle
paddle.utils.run_check()

다음 명령어를 입력했을 때

PaddlePaddle is installed successfully!

다음 결과가 출력되면 성공적으로 설치된것이다.
이제 paddle이 설치된 경로에서

pip install -r requirements.txt

다음 명령어를 입력 해 라이브러리들을 다운 받아주자

2. paddle 사용해보기

import cv2 #opencv
print(cv2.__version__)
from paddleocr import PaddleOCR, draw_ocr # main OCR dependencies
from matplotlib import pyplot as plt # plot images
import os # folder directory navigation
import numpy as np
os.environ['KMP_DUPLICATE_LIB_OK']='True' #이거 안해주면 오류남

모델 및 탐지

# Setup model
ocr_model = PaddleOCR(lang='en')
img_path = os.path.join('../', 'your_path') #그냥 경로복사해서 통째로 넣어도 된다.
# Run the ocr method on the ocr model
result = ocr_model.ocr(img_path)

for res in result:
 print(res[1][0])

구조를 보면 박스의 좌표, text 내용, accuracy로 이루어져 있다.

시각화 하기

# Extracting detected components
res = result[0]
print(type(res))
boxes = [res[i][0] for i in range(len(result[0]))] # 
texts = [res[i][1][0] for i in range(len(result[0]))]
scores = [float(res[i][1][1]) for i in range(len(result[0]))]

#for res in result:
print(boxes[0])
print(texts[0])
print(scores[0])#, Text: {res[1]}

# Specifying font path for draw_ocr method
font_path = os.path.join('../PaddleOCR', 'doc', 'fonts', 'latin.ttf')

# imports image
img = cv2.imread(img_path) 

# reorders the color channelsa
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) 

# Visualize our image and detections
# resizing display area
plt.figure(figsize=(65,65))

# draw annotations on image
annotated = draw_ocr(img, boxes, texts, scores, font_path=font_path) 

# show the image using matplotlib
plt.imshow(annotated)