Building a Pose Recognition Model with MediaPipe

yun · November 11, 2023

DecisionTree Detection ML mediapipe

Introduction

Use cases: https://developers.google.com/mediapipe/solutions/guide
Installation: https://developers.google.com/mediapipe/solutions/setup_python

My environment is Ubuntu 22.04 / Python 3.10.12, so I went ahead and installed it right away.
```
pip install mediapipe
```

Finding the joint coordinates

Pose landmark detection: https://developers.google.com/mediapipe/solutions/vision/pose_landmarker/python
- import
```
from mediapipe.tasks.python import vision
```
- model: https://developers.google.com/mediapipe/solutions/vision/pose_landmarker/index#models
  
  The Pose landmarker model provided by MediaPipe can detect 33 landmarks.
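
As a rough illustration of how the import and the model fit together, here is a minimal sketch of running the Pose landmarker on a single image through the Tasks API. The function name `detect_pose_landmarks` and both file paths are placeholders I made up, and the model file has to be downloaded from the models page first. Note that the code later in this post reads `results.pose_landmarks.landmark[...]`, which is the result shape of the legacy solutions API rather than of this one.

```python
def detect_pose_landmarks(image_path, model_path="pose_landmarker_lite.task"):
    """Run the Pose landmarker on one image and return the first person's landmarks.

    Both paths are placeholders (assumptions); download the model file separately.
    """
    # Imported inside the function so this file still loads where mediapipe is missing.
    import mediapipe as mp
    from mediapipe.tasks import python
    from mediapipe.tasks.python import vision

    options = vision.PoseLandmarkerOptions(
        base_options=python.BaseOptions(model_asset_path=model_path),
        running_mode=vision.RunningMode.IMAGE,  # single still image
    )
    with vision.PoseLandmarker.create_from_options(options) as landmarker:
        image = mp.Image.create_from_file(image_path)
        result = landmarker.detect(image)

    # result.pose_landmarks holds one list of landmarks per detected person.
    return result.pose_landmarks[0] if result.pose_landmarks else None
```

Each returned landmark carries normalized `x`, `y`, `z` values plus a `visibility` score, so index 16 of the returned list would be the right wrist.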

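How can we recognize a pose?

Task: distinguish four poses: [right arm raised], [left arm raised], [both arms raised], [both arms lowered]
- Right shoulder, right elbow, right wrist (12, 14, 16)
- Left shoulder, left elbow, left wrist (11, 13, 15)
- If the wrist is above the shoulder, treat the arm as raised; if the wrist is below the shoulder, treat it as lowered.
- On a laptop webcam, lowered arms fall outside the frame. When the elbow or wrist is not detected, also treat the arm as lowered.
- Task input and output
  - input: still images, video frames, live video feed
  - output: pose landmarks (optional: segmentation mask)
- The Pose landmarker comes in Lite, Full, and Heavy variants. How do they differ?
  - PDJ is highest for Heavy, then Full, then Lite.
  - Q. PDJ?
    - Percentage of Detected Joints: the share of joints that were detected
    - Even Lite, the lowest, averages 87%, so it should be fine for a simple project.
- Does the detection rate differ by skin tone or gender?
  - Tested on 1,400 images, 100 from each of 14 regions
  - Skin tone was recorded on the Fitzpatrick scale, from 1 to 6
  - Q. Fitzpatrick scale?
    - The lightest skin is type 1 and the darkest is type 6
  - There is no big difference, but the detection rate is lowest for skin type 1. That type makes up a small share of the dataset, so a handful of failure cases may simply look large because there is so little data.
  - Men's poses are detected slightly better than women's. The difference is not large.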
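
The wrist-versus-shoulder rule and the "not detected means lowered" rule from the list above can be sketched in plain Python. This is only an illustration: `Joint`, `arm_raised`, `classify_pose`, and the 0.5 visibility cutoff are names and values I chose for the example, not part of MediaPipe, and the sketch checks only the wrist (not the elbow) for detection. It relies on the image y-axis growing downward, so a smaller `y` means a higher position.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Joint:
    """Hypothetical stand-in for one landmark: normalized y plus a visibility score."""
    y: float
    visibility: float = 1.0


def arm_raised(shoulder: Joint, wrist: Optional[Joint], min_visibility: float = 0.5) -> bool:
    """Return True when the wrist is above the shoulder.

    A missing or low-visibility wrist counts as a lowered arm.
    The 0.5 cutoff is an assumed value, not a MediaPipe default.
    """
    if wrist is None or wrist.visibility < min_visibility:
        return False
    return wrist.y < shoulder.y  # smaller y = higher in the frame


def classify_pose(r_shoulder, r_wrist, l_shoulder, l_wrist) -> str:
    """Combine the two per-arm decisions into one of the four pose labels."""
    right = arm_raised(r_shoulder, r_wrist)
    left = arm_raised(l_shoulder, l_wrist)
    if right and left:
        return "BOTH UP"
    if right:
        return "RIGHT UP"
    if left:
        return "LEFT UP"
    return "BOTH DOWN"


# Right wrist above the shoulder, left wrist below it.
print(classify_pose(Joint(0.5), Joint(0.2), Joint(0.5), Joint(0.8)))  # prints "RIGHT UP"
```

Deciding each arm separately and then combining the two booleans means every input falls into exactly one of the four labels.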

Coding

Reference: https://github.com/google/mediapipe/blob/master/docs/solutions/holistic.md

Print every value that is read

names = ["left shoulder", "right shoulder", "left elbow", "right elbow", "left wrist", "right wrist"]

for i, name in enumerate(names):
	print(name)
	print(results.pose_landmarks.landmark[i+11].y)
	print('-----------------------------')
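
The `i+11` offset works because the six arm joints sit at indices 11 through 16. As a variation, the sketch below spells the indices out in a dictionary so the mapping is visible at a glance. `ARM_LANDMARKS`, `arm_y_values`, and the fake landmark list are made up for this example; with real data you would pass `results.pose_landmarks.landmark` instead.

```python
from types import SimpleNamespace

# Pose landmark indices for the arm joints (the same indices used in this post).
ARM_LANDMARKS = {
    "left shoulder": 11,
    "right shoulder": 12,
    "left elbow": 13,
    "right elbow": 14,
    "left wrist": 15,
    "right wrist": 16,
}


def arm_y_values(landmarks):
    """Map each arm joint name to its y value.

    `landmarks` is any indexable sequence whose items have a `.y` attribute.
    """
    return {name: landmarks[index].y for name, index in ARM_LANDMARKS.items()}


# Stand-in data so the sketch runs without a camera: 33 fake landmarks.
fake_landmarks = [SimpleNamespace(y=i / 100) for i in range(33)]
for name, y in arm_y_values(fake_landmarks).items():
    print(f"{name}: {y:.2f}")
```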

Classify the pose using the conditions defined above

# y-coordinates of the arm joints (assumes results.pose_landmarks is not None).
# Image y grows downward, so a smaller y means the joint is higher in the frame.
RIGHT_SHOULDER = results.pose_landmarks.landmark[12].y
RIGHT_ELBOW = results.pose_landmarks.landmark[14].y
RIGHT_WRIST = results.pose_landmarks.landmark[16].y

LEFT_SHOULDER = results.pose_landmarks.landmark[11].y
LEFT_ELBOW = results.pose_landmarks.landmark[13].y
LEFT_WRIST = results.pose_landmarks.landmark[15].y

# right arm up, left arm down
if RIGHT_WRIST < RIGHT_ELBOW < RIGHT_SHOULDER and LEFT_WRIST > LEFT_SHOULDER:
	print("RIGHT UP")
# left arm up, right arm down
elif LEFT_WRIST < LEFT_ELBOW < LEFT_SHOULDER and RIGHT_WRIST > RIGHT_SHOULDER:
	print("LEFT UP")
# both arms up
elif RIGHT_WRIST < RIGHT_ELBOW < RIGHT_SHOULDER and LEFT_WRIST < LEFT_SHOULDER:
	print("BOTH UP")
# both arms down
elif RIGHT_WRIST > RIGHT_ELBOW > RIGHT_SHOULDER and LEFT_WRIST > LEFT_SHOULDER:
	print("BOTH DOWN")

Save the joint coordinates for each pose

import json  # needed for json.dump when saving

# Declare the dict and the per-pose lists before the while loop starts
pose_point_json = {}

right_up_list = []
left_up_list = []
both_up_list = []
both_down_list = []

...
# right arm up, left arm down
if (RIGHT_WRIST < RIGHT_ELBOW and RIGHT_ELBOW < RIGHT_SHOULDER) and LEFT_WRIST > LEFT_SHOULDER:
	print("RIGHT UP")
	right_up_json = {}
	right_up_json['RIGHT_WRIST'] = RIGHT_WRIST
	right_up_json['RIGHT_SHOULDER'] = RIGHT_SHOULDER
	right_up_json['LEFT_WRIST'] = LEFT_WRIST
	right_up_json['LEFT_SHOULDER'] = LEFT_SHOULDER
	right_up_list.append(right_up_json)
elif (LEFT_WRIST < LEFT_ELBOW and LEFT_ELBOW < LEFT_SHOULDER) and RIGHT_WRIST > RIGHT_SHOULDER:
	print("LEFT UP")
	left_up_json = {}
	left_up_json['RIGHT_WRIST'] = RIGHT_WRIST
	left_up_json['RIGHT_SHOULDER'] = RIGHT_SHOULDER
	left_up_json['LEFT_WRIST'] = LEFT_WRIST
	left_up_json['LEFT_SHOULDER'] = LEFT_SHOULDER
	left_up_list.append(left_up_json)
elif (RIGHT_WRIST < RIGHT_ELBOW and RIGHT_ELBOW < RIGHT_SHOULDER) and LEFT_WRIST < LEFT_SHOULDER:
	print("BOTH UP")
	both_up_json = {}
	both_up_json['RIGHT_WRIST'] = RIGHT_WRIST
	both_up_json['RIGHT_SHOULDER'] = RIGHT_SHOULDER
	both_up_json['LEFT_WRIST'] = LEFT_WRIST
	both_up_json['LEFT_SHOULDER'] = LEFT_SHOULDER
	both_up_list.append(both_up_json)
elif (RIGHT_WRIST > RIGHT_ELBOW and RIGHT_ELBOW > RIGHT_SHOULDER) and LEFT_WRIST > LEFT_SHOULDER:
	print("BOTH DOWN")
	both_down_json = {}
	both_down_json['RIGHT_WRIST'] = RIGHT_WRIST
	both_down_json['RIGHT_SHOULDER'] = RIGHT_SHOULDER
	both_down_json['LEFT_WRIST'] = LEFT_WRIST
	both_down_json['LEFT_SHOULDER'] = LEFT_SHOULDER
	both_down_list.append(both_down_json)
    
# Save the JSON file when the ESC key is pressed
if cv2.waitKey(5) & 0xFF == 27:
	pose_point_json['RIGHT_UP'] = right_up_list
	pose_point_json['LEFT_UP'] = left_up_list
	pose_point_json['BOTH_UP'] = both_up_list
	pose_point_json['BOTH_DOWN'] = both_down_list
	with open('../data/pose_point.json', 'w', encoding='utf-8') as f:
		json.dump(pose_point_json, f, ensure_ascii=False, indent=4)
    ...
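
To check what ends up in the file, the saved JSON can be read back and flattened into feature rows with one label per row, which is the shape a classifier would typically expect. This is a sketch under assumptions: `load_samples`, `FEATURES`, and the demo values are made up, and the demo writes to a temporary directory instead of the real `../data/pose_point.json`.

```python
import json
import os
import tempfile

# Keys stored for every sample in the saving code above.
FEATURES = ["RIGHT_WRIST", "RIGHT_SHOULDER", "LEFT_WRIST", "LEFT_SHOULDER"]


def load_samples(path):
    """Read the saved JSON and return (rows, labels), one row per recorded frame."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    rows, labels = [], []
    for label, samples in data.items():
        for sample in samples:
            rows.append([sample[name] for name in FEATURES])
            labels.append(label)
    return rows, labels


# Tiny made-up file in the same layout as pose_point.json.
demo = {
    "RIGHT_UP": [{"RIGHT_WRIST": 0.2, "RIGHT_SHOULDER": 0.5, "LEFT_WRIST": 0.8, "LEFT_SHOULDER": 0.5}],
    "LEFT_UP": [],
    "BOTH_UP": [],
    "BOTH_DOWN": [{"RIGHT_WRIST": 0.9, "RIGHT_SHOULDER": 0.5, "LEFT_WRIST": 0.9, "LEFT_SHOULDER": 0.5}],
}
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "pose_point.json")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump(demo, f, ensure_ascii=False, indent=4)
    rows, labels = load_samples(demo_path)

print(labels)   # ['RIGHT_UP', 'BOTH_DOWN']
print(rows[0])  # [0.2, 0.5, 0.8, 0.5]
```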