3DGS with Polycam

chaenyang · June 30, 2025

3D


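To run 3DGS (3D Gaussian Splatting), you need images and the camera parameters for each image as input. These camera parameters (intrinsics and extrinsics) are usually obtained by running SfM (Structure from Motion) with a tool such as COLMAP.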

Polycam records the camera parameters along with the images as you shoot, so you can skip the SfM step.

I used an iPhone 13 Pro. The Pro models have a LiDAR sensor, so you need to shoot with a Pro. (Also, Polycam is a paid app.)

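After capturing, download the scan as raw data. The export is saved as a folder of files, including the keyframes/images and keyframes/cameras directories that the script later in this post reads.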

Open the cameras folder and you will find one JSON file per image, holding that image's camera intrinsics and extrinsics.

1. Intrinsics

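cx: principal point, x coordinate (approximately the image center along x, in pixels)
cy: principal point, y coordinate (approximately the image center along y, in pixels)
fx: focal length in the x direction (pixels)
fy: focal length in the y direction (pixels)
width: image width (pixels)
height: image height (pixels)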

These values give the camera intrinsics matrix K.

\mathbf{K} = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}
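As a quick illustration of how the fields above map into K, here is a minimal sketch in plain Python. The numbers in `cam` are hypothetical placeholders, not values from a real capture, and `build_intrinsics` is a helper name I made up for this example.

```python
# Hypothetical values for illustration only; real ones come from a file
# in keyframes/cameras.
cam = {
    "fx": 1400.0, "fy": 1400.0,
    "cx": 960.0, "cy": 540.0,
    "width": 1920, "height": 1080,
}


def build_intrinsics(cam):
    """Return the 3x3 intrinsics matrix K as nested lists."""
    return [
        [cam["fx"], 0.0, cam["cx"]],
        [0.0, cam["fy"], cam["cy"]],
        [0.0, 0.0, 1.0],
    ]


K = build_intrinsics(cam)
for row in K:
    print(row)
```

Note that width and height do not appear in K itself; they are only needed later, for example when converting fx into a field of view.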

2. Extrinsics

t_00 ~ t_22: rotation matrix (R)
- the 3x3 world → camera rotation matrix
t_03 ~ t_23: translation vector (T)
- the translation part of the world → camera transform

[R|T] = \begin{bmatrix} t_{00} & t_{01} & t_{02} & t_{03} \\ t_{10} & t_{11} & t_{12} & t_{13} \\ t_{20} & t_{21} & t_{22} & t_{23} \end{bmatrix}

Combining the two gives the full projection matrix:

P = K \cdot [R|T]
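To make the projection concrete, here is a small sketch that applies P = K [R|T] to one 3D point. It assumes that [R|T] really is a world → camera matrix and that the camera looks along +Z (OpenCV-style axes); neither is something I have confirmed for Polycam's export. `project_point` and all the numbers are hypothetical.

```python
def project_point(K, Rt, point_world):
    """Project a 3D world point to pixel coordinates using P = K [R|T].

    Assumes Rt is a 3x4 world-to-camera matrix and the camera looks
    along +Z. Both are assumptions made for this sketch.
    """
    X = list(point_world) + [1.0]  # homogeneous world point
    # Camera-space point: [R|T] @ X
    cam_pt = [sum(Rt[i][j] * X[j] for j in range(4)) for i in range(3)]
    # Homogeneous pixel coordinates: K @ cam_pt
    uvw = [sum(K[i][j] * cam_pt[j] for j in range(3)) for i in range(3)]
    if uvw[2] == 0:
        raise ValueError("point has zero depth in the camera frame")
    return uvw[0] / uvw[2], uvw[1] / uvw[2]


# Hypothetical intrinsics and an identity pose (camera at the world origin).
K = [[1400.0, 0.0, 960.0], [0.0, 1400.0, 540.0], [0.0, 0.0, 1.0]]
Rt = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]

# A point straight ahead on the optical axis lands on the principal point.
u, v = project_point(K, Rt, (0.0, 0.0, 2.0))
print(u, v)
```

One thing to watch: the `transform_matrix` field in NeRF-style transforms files is a camera-to-world matrix. If the matrix you have is camera-to-world, invert it before using it as [R|T] here.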

However, right now each image has its own camera-parameter JSON file, and what we need is a single JSON file that holds the camera parameters for all images.

Merge them into one file and name it transforms_train.json.
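For reference, the merged file has roughly the layout sketched here. The field names (`camera_angle_x`, `frames`, `file_path`, `transform_matrix`) are the ones the conversion script in this post writes; the file name `0001.jpg` and every number are hypothetical placeholders.

```python
import json
import math

# Hypothetical single-frame example; the values are placeholders,
# not real Polycam output.
example = {
    # Horizontal field of view in radians: 2 * atan(0.5 * width / fx)
    "camera_angle_x": 2 * math.atan(0.5 * 1920 / 1400.0),
    "frames": [
        {
            "file_path": "images/0001.jpg",  # hypothetical file name
            "transform_matrix": [
                [1.0, 0.0, 0.0, 0.0],
                [0.0, 1.0, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0],
                [0.0, 0.0, 0.0, 1.0],
            ],
        }
    ],
}

print(json.dumps(example, indent=4))
```

There is one `frames` entry per image and a single shared `camera_angle_x` for the whole file.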

import json
import math
import os

image_dir = "keyframes/images"
camera_dir = "keyframes/cameras"
# Written directly under the file name that the training code expects.
output_path = "keyframes/transforms_train.json"


def load_camera_json(path):
    """Load one per-image camera JSON file."""
    with open(path, "r") as f:
        return json.load(f)


def build_transform_matrix(cam):
    """Assemble the 4x4 matrix from the t_00 ... t_23 fields."""
    return [
        [cam["t_00"], cam["t_01"], cam["t_02"], cam["t_03"]],
        [cam["t_10"], cam["t_11"], cam["t_12"], cam["t_13"]],
        [cam["t_20"], cam["t_21"], cam["t_22"], cam["t_23"]],
        [0.0, 0.0, 0.0, 1.0],
    ]


def compute_camera_angle_x(cam):
    """Horizontal field of view in radians, from fx and the image width."""
    return 2 * math.atan(0.5 * cam["width"] / cam["fx"])


frames = []
camera_angle_x = None

for fname in sorted(os.listdir(camera_dir)):
    if not fname.endswith(".json"):
        continue

    stem = os.path.splitext(fname)[0]
    cam_path = os.path.join(camera_dir, fname)
    img_path = os.path.join(image_dir, stem + ".jpg")

    # Skip camera files that have no matching image.
    if not os.path.exists(img_path):
        print(f"Image not found: {img_path}")
        continue

    cam_data = load_camera_json(cam_path)

    # The field of view is taken from the first valid frame and shared by all frames.
    if camera_angle_x is None:
        camera_angle_x = compute_camera_angle_x(cam_data)

    frames.append({
        "file_path": f"images/{stem}.jpg",
        "transform_matrix": build_transform_matrix(cam_data),
    })

output = {
    "camera_angle_x": camera_angle_x,
    "frames": frames,
}

with open(output_path, "w") as f:
    json.dump(output, f, indent=4)

print(f"Wrote {output_path} with {len(frames)} frames")
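Before training, it can be worth sanity-checking the generated file. The sketch below is my own addition, not part of any 3DGS tooling. `check_transforms` is a hypothetical helper that checks three things: that `camera_angle_x` is a plausible angle, that every `transform_matrix` is 4x4, and that its rotation part is close to orthonormal.

```python
import math


def check_transforms(data, tol=1e-3):
    """Return a list of human-readable problems found in a transforms dict."""
    problems = []

    fov = data.get("camera_angle_x")
    if not isinstance(fov, (int, float)) or not 0 < fov < math.pi:
        problems.append("camera_angle_x should be an angle in radians in (0, pi)")

    for i, frame in enumerate(data.get("frames", [])):
        m = frame.get("transform_matrix")
        is_4x4 = (
            isinstance(m, list)
            and len(m) == 4
            and all(isinstance(r, list) and len(r) == 4 for r in m)
        )
        if not is_4x4:
            problems.append(f"frame {i}: transform_matrix is not 4x4")
            continue
        if m[3] != [0.0, 0.0, 0.0, 1.0]:
            problems.append(f"frame {i}: last row is not [0, 0, 0, 1]")
        # Rows of the 3x3 rotation part should be unit length and orthogonal.
        for a in range(3):
            for b in range(a, 3):
                dot = sum(m[a][k] * m[b][k] for k in range(3))
                expected = 1.0 if a == b else 0.0
                if abs(dot - expected) > tol:
                    problems.append(
                        f"frame {i}: rotation rows {a} and {b} are not orthonormal"
                    )
    return problems


# Hypothetical in-memory examples: one valid frame, one with a scaled rotation row.
identity = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
scaled = [[2.0, 0.0, 0.0, 0.0]] + identity[1:]

good = {"camera_angle_x": 1.2, "frames": [{"transform_matrix": identity}]}
bad = {"camera_angle_x": 1.2, "frames": [{"transform_matrix": scaled}]}

print(check_transforms(good))
print(check_transforms(bad))
```

To check the real output, load the file the script wrote with `json.load` and pass the resulting dict to `check_transforms`.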

Now that all the data is ready, you can start training.
