[SLAM] optical flow and direct method

Sinaenjuni · June 1, 2024

SLAM


Throughout this post, the explanation of each figure or equation is written directly below it.

Optical flow and the direct method are both algorithms that compute how pixels move between images over time. They differ in what they track: optical flow tracks image pixels in 2D, while the direct method tracks the camera's motion in 3D.

! Optical flow: tracks image pixels in 2D
! Direct method: tracks camera motion in 3D

Key assumptions

Brightness constancy
- The brightness of a pixel on an object does not change from frame to frame.
Small motion
- An object does not move far between frames. Equivalently, the frames are sampled quickly relative to the object's motion.
Spatial coherence
- Points that are spatially adjacent are likely to belong to the same object.

1. Optical flow

I(x,y,t)=I(x+dx,y+dy,t+dt)

The equation above expresses optical flow. As stated in the assumptions, the brightness (intensity) of an object does not change as the frame (time) changes, which is why this equality can be written.

I(x+dx,y+dy,t+dt) \approx I(x,y,t)+ \frac{\partial I}{\partial x} dx + \frac{\partial I}{\partial y} dy + \frac{\partial I}{\partial t} dt

The term $I(x+dx,y+dy,t+dt)$ can be expanded with a first-order Taylor approximation. (The total differential is the sum of the partial-derivative terms.)

\frac{\partial I}{\partial x} dx + \frac{\partial I}{\partial y} dy + \frac{\partial I}{\partial t} dt = 0

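By the brightness constancy assumption, the intensity does not change over time, so $I(x+dx,y+dy,t+dt)$ equals $I(x,y,t)$ and the sum of the first-order terms above can be set to 0.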
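The constraint can be checked numerically. The sketch below is only an illustration under stated assumptions: the pattern `intensity` and the velocity `U`, `V` are hypothetical values chosen for this check, and the partial derivatives are approximated with central finite differences.

```python
import numpy as np

U, V = 0.3, -0.2   # assumed constant velocity in pixels per unit time

def intensity(x, y, t):
    """Hypothetical smooth pattern that translates with velocity (U, V)."""
    return np.sin(0.5 * (x - U * t)) + np.cos(0.3 * (y - V * t))

def constraint_residual(x, y, t, h=1e-4):
    """Evaluate I_x * u + I_y * v + I_t with central finite differences."""
    Ix = (intensity(x + h, y, t) - intensity(x - h, y, t)) / (2 * h)
    Iy = (intensity(x, y + h, t) - intensity(x, y - h, t)) / (2 * h)
    It = (intensity(x, y, t + h) - intensity(x, y, t - h)) / (2 * h)
    return Ix * U + Iy * V + It

x, y, t, dt = 4.0, 7.0, 1.0, 0.5

# Brightness constancy: the moved point keeps its intensity.
same = np.isclose(intensity(x, y, t), intensity(x + U * dt, y + V * dt, t + dt))

# The differential form of the same statement should be close to zero.
print(same, constraint_residual(x, y, t))
```

Because the pattern only translates, both checks hold up to floating-point and finite-difference error.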

Rearranging the equation

Divide both sides by $dt$, then move $\frac{\partial I}{\partial t}$ to the right-hand side.

\begin{aligned} \frac{\partial I}{\partial x} \frac{dx}{dt} + \frac{\partial I}{\partial y} \frac{dy}{dt} + \frac{\partial I}{\partial t} &= 0 \\ \frac{\partial I}{\partial x} \frac{dx}{dt} + \frac{\partial I}{\partial y} \frac{dy}{dt} &= - \frac{\partial I}{\partial t} \end{aligned}

The equation above is simplified with the following notation.

\begin{aligned} \frac{\partial I}{\partial x} = I_{x}, \quad \frac{dx}{dt} = u, \quad \frac{\partial I}{\partial y} = I_{y}, \quad \frac{dy}{dt} = v, \quad \frac{\partial I}{\partial t} = I_{t} \end{aligned}

I_{x}u+I_{y}v = -I_{t}

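In matrix form:
$\begin{bmatrix} I_{x}&I_{y} \end{bmatrix} \begin{bmatrix} u\\v \end{bmatrix} = -I_{t}$
However, there are $n=2$ unknowns and only $m=1$ equation. With more unknowns than equations, the system is underdetermined and $(u, v)$ cannot be determined uniquely.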
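A small numeric illustration of why one equation is not enough. The gradient values `Ix`, `Iy`, `It` are hypothetical numbers, not taken from a real image. `np.linalg.lstsq` returns the minimum-norm solution, which points along the image gradient, and adding any multiple of the direction perpendicular to the gradient gives another vector that satisfies the same equation.

```python
import numpy as np

# Hypothetical gradient values at a single pixel.
Ix, Iy, It = 0.8, 0.6, -0.5

A = np.array([[Ix, Iy]])   # 1 equation, 2 unknowns
b = np.array([-It])

# Minimum-norm solution: approximately [0.4, 0.3], parallel to (Ix, Iy).
normal_flow, *_ = np.linalg.lstsq(A, b, rcond=None)

# Moving along the direction perpendicular to the gradient
# does not change the left-hand side of the equation.
tangent = np.array([-Iy, Ix])
other_flow = normal_flow + 2.0 * tangent

print(normal_flow, A @ normal_flow)
print(other_flow, A @ other_flow)
```

Both vectors reproduce $-I_t$, so the single constraint only fixes the flow component along the gradient.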
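To obtain a unique solution, a sliding-window approach is applied: all pixels inside a window around the point are assumed to share the same motion. (LK optical flow)
$\begin{bmatrix} I_{x}&I_{y} \end{bmatrix}_k \begin{bmatrix} u\\v \end{bmatrix} = -I_{t,k}, \quad k=1,\dots,w^2$
For example, a 3x3 window contains 9 pixels, which gives 9 equations, so the problem becomes an overdetermined linear system.
Stacking these equations row by row, with row $k$ of $A$ equal to $\begin{bmatrix} I_{x}&I_{y} \end{bmatrix}_k$ and $b_k = I_{t,k}$, $u, v$ can be obtained from the normal equations of the linear system.
$A \begin{bmatrix} u\\v \end{bmatrix} = -b$, $\begin{bmatrix} u\\v \end{bmatrix}^* = -(A^TA)^{-1}A^Tb$
Because this is a linear optimization problem, the optimal solution is obtained directly, either as the least-squares solution or via the pseudo-inverse.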
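The two solution routes can be compared on a toy system. This is a minimal sketch under stated assumptions: the rows of `A` are random numbers standing in for the gradients of a 3x3 window, `uv_true` is a hypothetical ground-truth flow, and `b` is constructed without noise so that both solvers should recover it.

```python
import numpy as np

rng = np.random.default_rng(0)
uv_true = np.array([1.0, -0.5])   # hypothetical ground-truth flow (u, v)

# 3x3 window -> 9 rows of [I_x, I_y]; random values stand in for gradients.
A = rng.normal(size=(9, 2))
b = -(A @ uv_true)                # I_t consistent with I_x u + I_y v = -I_t

# Route 1: normal equations, (A^T A) uv = -A^T b
uv_normal = -np.linalg.solve(A.T @ A, A.T @ b)

# Route 2: pseudo-inverse, uv = -pinv(A) b
uv_pinv = -(np.linalg.pinv(A) @ b)

print(uv_normal, uv_pinv)
```

The normal equations require $A^TA$ to be invertible, which means the window must contain gradients in more than one direction. This is one reason the flow is usually evaluated at corner points.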

Python code

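import cv2
import numpy as np
import numpy.linalg as la


def get_corners(img, max_corners=1000, min_distance=0.1):
    """Detect corners and return them as an (N, 2) integer array of (x, y)."""
    # cv2.goodFeaturesToTrack(image, maxCorners, qualityLevel,
    #                         minDistance, corners=None, mask=None,
    #                         blockSize=None, useHarrisDetector=None, k=None) -> corners
    if len(img.shape) == 3:
        img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    corners = cv2.goodFeaturesToTrack(img, maxCorners=max_corners,
                                      qualityLevel=0.01, minDistance=min_distance)
    if corners is None:
        return None
    # reshape (instead of squeeze) keeps the (N, 2) shape when only one corner is found;
    # a signed integer type avoids wrap-around when computing x - w or y - w
    return np.round(corners).astype(np.int32).reshape(-1, 2)


def draw_points(img, corners):
    for x, y in corners:
        cv2.circle(img, (int(x), int(y)), 3, (255, 0, 0), 2, cv2.LINE_AA)


def optical_flow(img0, img1, corners, window_size, vis=None):
    """Sparse Lucas-Kanade flow from img0 to img1 at the given corners.

    If vis (a BGR image) is given, the flow vectors are drawn on it in place.
    """
    w = window_size // 2
    rows, cols = img0.shape

    # normalize images to [0, 1]
    img0 = img0.astype(np.float32) / 255
    img1 = img1.astype(np.float32) / 255

    # The spatial kernels average two finite differences and the temporal kernel
    # averages four pixels, so Ix, Iy and It are expressed on the same scale.
    x_kernel = np.array([[-1, 1], [-1, 1]], dtype=np.float32) * 0.5   # dI/dx
    y_kernel = np.array([[-1, -1], [1, 1]], dtype=np.float32) * 0.5   # dI/dy
    t_kernel = np.array([[1, 1], [1, 1]], dtype=np.float32) * 0.25    # dI/dt
    dx = cv2.filter2D(img0, -1, kernel=x_kernel)
    dy = cv2.filter2D(img0, -1, kernel=y_kernel)
    dt = cv2.filter2D(img1, -1, kernel=t_kernel) - cv2.filter2D(img0, -1, kernel=t_kernel)

    U = np.zeros(img0.shape)
    V = np.zeros(img0.shape)

    for x, y in corners:
        x, y = int(x), int(y)
        # skip corners whose window would extend past the image border
        if x - w < 0 or y - w < 0 or x + w >= cols or y + w >= rows:
            continue
        Ix = dx[y-w : y+w+1, x-w : x+w+1].flatten()
        Iy = dy[y-w : y+w+1, x-w : x+w+1].flatten()
        It = dt[y-w : y+w+1, x-w : x+w+1].flatten()

        A = np.vstack((Ix, Iy)).T   # one row [Ix, Iy] per pixel in the window
        b = It

        uv = -(la.pinv(A) @ b)      # least-squares solution of A [u, v]^T = -b

        U[y, x] = uv[0]
        V[y, x] = uv[1]
        if vis is not None:
            cv2.arrowedLine(vis, (x, y), (int(round(x + uv[0])), int(round(y + uv[1]))),
                            (0, 255, 0),
                            thickness=1,
                            line_type=cv2.LINE_AA, tipLength=0.3)
    return (U, V)


if __name__ == "__main__":
    # arguments
    window_size = 5
    max_corners = 100
    min_distance = 1

    # Load images
    path0 = "/img0.png"
    path1 = "/img1.png"
    img0 = cv2.imread(path0, cv2.IMREAD_GRAYSCALE)
    img1 = cv2.imread(path1, cv2.IMREAD_GRAYSCALE)
    if img0 is None or img1 is None:
        raise FileNotFoundError("could not read the input images")
    v_img = cv2.cvtColor(img0, cv2.COLOR_GRAY2BGR)   # color copy for drawing

    # find corners in the first frame, where the flow vectors start
    corners = get_corners(img0, max_corners=max_corners, min_distance=min_distance)
    if corners is None:
        raise SystemExit("no corners found")

    # optical flow algorithm
    optical_flow(img0, img1, corners, window_size, vis=v_img)

    cv2.imshow("optical flow", v_img)
    cv2.waitKey()
    cv2.destroyAllWindows()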
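As a sanity check that does not depend on image files, the least-squares step can be tried on a synthetic pair with a known shift. This is a NumPy-only sketch under stated assumptions: `frame` is a hypothetical periodic pattern, `lk_window` is an illustrative helper (not part of the listing above), and gradients come from `np.gradient` rather than the 2x2 kernels.

```python
import numpy as np

def frame(x, y, shift):
    """Hypothetical periodic pattern (period 32 px) translated by shift = (u, v)."""
    k = 2 * np.pi / 32
    return np.sin(k * (x - shift[0])) + np.sin(k * (y - shift[1]))

def lk_window(img0, img1, win):
    """Least-squares flow over one window; win is a (row_slice, col_slice) pair."""
    Iy, Ix = np.gradient(img0)    # axis 0 is y (rows), axis 1 is x (columns)
    It = img1 - img0
    A = np.stack((Ix[win].ravel(), Iy[win].ravel()), axis=1)
    b = It[win].ravel()
    return -(np.linalg.pinv(A) @ b)

y, x = np.mgrid[0:64, 0:64].astype(float)
win = (slice(16, 48), slice(16, 48))   # 32x32 window away from the border

errors = {}
for shift in [(0.2, 0.1), (6.0, 4.0)]:
    est = lk_window(frame(x, y, (0.0, 0.0)), frame(x, y, shift), win)
    errors[shift] = np.linalg.norm(est - np.array(shift))
    print(shift, est, errors[shift])
```

For the sub-pixel shift the estimate lands close to the true value, while for the shift of several pixels the first-order approximation breaks down and the estimate is off by more than a pixel. This matches the small motion assumption: the linearized constraint is only reliable when the displacement between frames is small.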
