[React] MediaPipe Hands 웹에서 띄워보기

seungjun.dev·2023년 6월 29일

AI React mediapipe typescript

React

목록 보기

6/13

MediaPipe의 Hands를 이용한 핸드 트래커를 웹에서 구현하기 위한 정보를 적은 글입니다.
원글: https://qiita.com/nemutas/items/32ce13ae31360877baa5

MediaPipe란

Google에서 제공하는 비전 AI 라이브러리
AI 모델 개발과 머신러닝까지 마친 AI를 그대로 제공

MediaPipe - Hands (Hand landmarks detection)

손의 랜드마크를 감지하는 MediaPipe의 기능 중 하나

MediaPipe로 손의 모양을 인식할 때는 손가락 마디마다 랜드마크라는 빨간 점을 찍어 이 점 마다의 각도를 계산해 인식합니다.
사진을 보면 손의 각 마디마다 빨간 점과 명칭이 적혀있는 것을 볼 수 있습니다.

본격적으로 따라해보기

🚨 주의

TypeScript로 작성된 코드이며 JavaScript에서는 작동하지 않습니다.

React - 18.2.0
TypeScript - 5.2.0
react-webcam - 7.1.1
@emotion/css - 11.11.2
@mediapipe/hands - 0.4.1675469240
@mediapipe/camera_utils - 0.3.1675466862
@mediapipe/drawing_utils - 0.3.1675466124

패키지 설치

CRA(Create-React-App) 으로 프로젝트 생성

npx create-react-app [프로젝트명] --template typescript

React에서 웹캠을 처리하기 위한 설치

npm i react-webcam

emotion/css 를 사용한 스타일링

npm i @emotion/css

MediaPipe 관련 패키지 설치

npm i @mediapipe/hands @mediapipe/camera_utils @mediapipe/drawing_utils

구현

App.tsx

import { useCallback, useEffect, useRef } from "react";
import Webcam from "react-webcam";
import { css } from "@emotion/css";
import { Camera } from "@mediapipe/camera_utils";
import { Hands, Results } from "@mediapipe/hands";
import { drawCanvas } from "./utils/drawCanvas";

const App = () => {
  const webcamRef = useRef<Webcam>(null);
  const canvasRef = useRef<HTMLCanvasElement>(null);
  const resultsRef = useRef<Results>();

  /**
   * 검출결과（프레임마다 호출됨）
   * @param results
   */
  const onResults = useCallback((results: Results) => {
    resultsRef.current = results;

    const canvasCtx = canvasRef.current!.getContext("2d")!;
    drawCanvas(canvasCtx, results);
  }, []);

  // 초기 설정
  useEffect(() => {
    const hands = new Hands({
      locateFile: (file) => {
        return `https://cdn.jsdelivr.net/npm/@mediapipe/hands/${file}`;
      },
    });

    hands.setOptions({
      maxNumHands: 2,
      modelComplexity: 1,
      minDetectionConfidence: 0.5,
      minTrackingConfidence: 0.5,
    });

    hands.onResults(onResults);

    if (
      typeof webcamRef.current !== "undefined" &&
      webcamRef.current !== null
    ) {
      const camera = new Camera(webcamRef.current.video!, {
        onFrame: async () => {
          await hands.send({ image: webcamRef.current!.video! });
        },
        width: 1280,
        height: 720,
      });
      camera.start();
    }
  }, [onResults]);

  /*  랜드마크들의 좌표를 콘솔에 출력 */
  const OutputData = () => {
    const results = resultsRef.current!;
    console.log(results.multiHandLandmarks);
  };

  return (
    <div className={styles.container}>
      {/* 비디오 캡쳐 */}
      <Webcam
        audio={false}
        style={{ visibility: "hidden" }}
        width={1280}
        height={720}
        ref={webcamRef}
        screenshotFormat="image/jpeg"
        videoConstraints={{ width: 1280, height: 720, facingMode: "user" }}
      />
      {/* 랜드마크를 손에 표시 */}
      <canvas
        ref={canvasRef}
        className={styles.canvas}
        width={1280}
        height={720}
      />
      {/* 좌표 출력 */}
      <div className={styles.buttonContainer}>
        <button className={styles.button} onClick={OutputData}>
          Output Data
        </button>
      </div>
    </div>
  );
};

// ==============================================
// styles

const styles = {
  container: css`
    position: relative;
    width: 100vw;
    height: 100vh;
    overflow: hidden;
    display: flex;
    justify-content: center;
    align-items: center;
  `,
  canvas: css`
    position: absolute;
    width: 1280px;
    height: 720px;
    background-color: #fff;
  `,
  buttonContainer: css`
    position: absolute;
    top: 20px;
    left: 20px;
  `,
  button: css`
    color: #fff;
    background-color: #0082cf;
    font-size: 1rem;
    border: none;
    border-radius: 5px;
    padding: 10px 10px;
    cursor: pointer;
  `,
};

export default App;

utils/drawCanvas.ts

import { drawConnectors, drawLandmarks } from "@mediapipe/drawing_utils";
import { HAND_CONNECTIONS, Results } from "@mediapipe/hands";

export const drawCanvas = (ctx: CanvasRenderingContext2D, results: Results) => {
  const width = ctx.canvas.width;
  const height = ctx.canvas.height;

  ctx.save();
  ctx.clearRect(0, 0, width, height);
  // canvas의 좌우 반전
  ctx.scale(-1, 1);
  ctx.translate(-width, 0);
  // capture image 그리기
  ctx.drawImage(results.image, 0, 0, width, height);
  // 손의 묘사
  if (results.multiHandLandmarks) {
    // 골격 묘사
    for (const landmarks of results.multiHandLandmarks) {
      drawConnectors(ctx, landmarks, HAND_CONNECTIONS, {
        color: "#00FF00",
        lineWidth: 5,
      });
      drawLandmarks(ctx, landmarks, {
        color: "#FF0000",
        lineWidth: 1,
        radius: 5,
      });
    }
  }
  ctx.restore();
};

코드 살펴보기

MediaPipe 설정

hands.setOptions({
  maxNumHands: 2, // 손을 몇 개 까지 인식할지
  modelComplexity: 1, // 모델 복잡도 → 모델이 얼마나 다양하고 상세한 패턴을 학습하는지 
  minDetectionConfidence: 0.5, // 최소 신뢰도 → 모델이 인식한 결과 중 신뢰도가 0.5 이상인 결과만 출력
  minTrackingConfidence: 0.5, // 최대 신뢰도
});

hands.setOptions 에서는 손의 검출 설정을 합니다.

const camera = new Camera(webcamRef.current.video!, {
  onFrame: async () => {
    await hands.send({ image: webcamRef.current!.video! });
  },
  width: 1280,
  height: 720,
});
camera.start();

camera 설정에서는 랜드마크를 그리는 것의 범위를 설정합니다.
여기서는 1280x720 으로 설정했습니다.

검출 결과 데이터 구조

const onResults = useCallback((results: Results) => {
  resultsRef.current = results;

  const canvasCtx = canvasRef.current!.getContext("2d")!;
  drawCanvas(canvasCtx, results);
}, []);

/*  랜드마크들의 좌표를 콘솔에 출력 */
const OutputData = () => {
  const results = resultsRef.current!;
  console.log(results.multiHandLandmarks);
};

...

{/* 좌표 출력 */}
<div className={styles.buttonContainer}>
  <button className={styles.button} onClick={OutputData}>
    Output Data
  </button>
</div>