FPCV Week 1, Introduction, Image Formation

양세종 · April 14, 2024

Summary

Notes for the First Principles of Computer Vision study group
Week 1 covers the lectures from Introduction through Image Formation

Introduction

Playlist Link

Overview

Video Link
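The goal of computer vision is to enable computers to 'see'
If neural networks can do everything these days, why should we learn the first principles of computer vision?
A1. Collecting data to solve a problem that first principles already solve is unnecessary
A2. When a neural network does not work well, first principles are the only hope of understanding why
A3. Collecting data and training a neural network is tedious, and sometimes impossible
And above all, isn't it fun? 😉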

What is Computer Vision?

Video Link
Why bother teaching computers the ability to 'see'?
A1. We can hand off everyday chores and focus on more important work
A2. Our visual system is remarkable, but it is not very precise
A3. Most importantly, computers can perceive things that humans cannot


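Components of a computer vision system

Computer vision is
- automating the human visual process
- an information processing task
- inverting image formation
- the inverse of graphics
- ...
- fun and useful!
An image is an array of pixels!
Pixels carry information about brightness, color, distance, texture, and more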


We have to extract information from these pixels

This is hard and draws on knowledge from many fields
optics, camera, signal processing, electronics, computer vision, biology, cognitive science, ...

What is Vision Used For?

Video Link
Visual inspection in factories
License plate recognition, book digitization, and other uses of Optical Character Recognition (OCR)
Fingerprint and iris authentication using biometrics
Face detection, identification, ...
Object tracking in the wild
Optical mouse
Entertainment and gaming (e.g. Kinect)
Visual effects for video generation
Augmented Reality : face manipulation
Visual search (e.g. Google photo search)
Vision for exploration (e.g. Mars, disaster zones, space, deep seas, jungle)
Autonomous navigation, driving
Remote sensing (e.g. satellite)
Medical imaging (X-ray, Ultrasound, MRA, CT, ...)

How Do Humans Do It?

Video Link
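Most of the information compression happens in the eye
We still have only a limited understanding of a few areas, such as the one that captures motion, and do not understand the exact mechanisms of the brain


The human eye and visual cortex

In fact, for many real-world tasks our visual system is not an oracle worth following
It is not accurate and often fools us (e.g. optical illusions); this happens for every attribute of perception we rely on, such as color, linearity, and brightness


A crater on top of a mountain	A mountain inside a crater

Sensing and thinking are different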

Topics Covered

Video Link
Image Formation and Optics : How is an image formed?
Image Sensors : How is an optical image converted into an electronic signal? (e.g. Binary Images)
Image Processing : Denoising, Sharpening
Feature Detection : (e.g. Edge, Corner), edge to boundary, Scale-Invariant Feature Transform (SIFT)
Application of Feature : Image Alignment, Stitching, Panorama generation, ...
Face Detection, ...
3D: Radiometry and Reflectance, Photometric Stereo (e.g. surface normal from brightness), Shape from Shading (Give assumption or hypothesis), Depth from Focus/Defocus, Active Illumination Methods
Camera Calibration, camera parameters (e.g. focal length)
Binocular Stereo Imaging (e.g. depth estimation)
Video: Motion and Optical Flow, Structure from Motion
Group pixels : Image Segmentation
Object Tracking
Appearance Matching : Object Recognition using Principal Component Analysis (PCA)
Artificial Neural Networks : the machine learning behind them

About the Lecture Series

Video Link
Introduction, Imaging, Features, Reconstruction 1 (single view point), Reconstruction 2 (multi-view or video), Perception

References and Credits

Video Link

Image Formation

Playlist Link

Overview

Video Link
Project 3D scene onto 2D plane, Camera Model, Geometric and Photometric relation between scene and image
Pinhole and Perspective Projection
Image Formation using Lenses
Lens Related Issues
Wide Angle Cameras

Pinhole and Perspective Projection

Video Link


Pinhole system, the simplest camera model

Camera Obscura (Dark Chamber), Snake Eye, No Lens Just Large Pinhole
perspective projection of a line: 3D line to 2D line
image magnification
- the magnification scale changes with focal length and depth
- if the range of scene depth is small compared to the average scene depth, magnification can be treated as constant
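The two magnification notes above can be checked numerically. The following is a minimal sketch, assuming the standard pinhole relations x_i = f·x/z and |m| = f/z with a virtual (non-flipped) image plane; the focal length and depths are made-up example values, not numbers from the lecture.

```python
def project_point(point, f):
    """Perspective-project a 3D point (x, y, z) given in camera coordinates.

    Assumes a virtual image plane in front of the pinhole, so the image is
    not flipped. The units of f and of the point must match.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the pinhole (z > 0)")
    return (f * x / z, f * y / z)


def magnification(f, z):
    """Magnification |m| = f / z of a small object at depth z."""
    return f / z


f = 0.05  # hypothetical 50 mm focal length, in metres
depths = [9.9, 10.0, 10.1]  # depth range (0.2 m) is small vs. the mean (10 m)
mags = [magnification(f, z) for z in depths]
relative_spread = (max(mags) - min(mags)) / magnification(f, 10.0)

print(project_point((1.0, 2.0, 10.0), f))
print(f"magnification varies by {relative_spread:.1%} across the scene")
```

With a depth range of 0.2 m around an average depth of 10 m, the magnification changes by only about 2%, which is why treating it as constant is a reasonable approximation in that regime.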


Parallel lines in 3D share the same vanishing point	Placing the object you want to emphasize at the vanishing point	An optical illusion using the vanishing point
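Why parallel lines share one vanishing point can be seen from the projection formula: a point far along a line P0 + t·d projects toward (f·dx/dz, f·dy/dz), which depends only on the direction d. A small sketch, where the function names and sample lines are illustrative assumptions:

```python
def vanishing_point(direction, f):
    """Image-plane vanishing point of all 3D lines with this direction."""
    dx, dy, dz = direction
    if dz == 0:
        return None  # line is parallel to the image plane: no finite vanishing point
    return (f * dx / dz, f * dy / dz)


def project_along_line(p0, direction, t, f):
    """Project the point p0 + t * direction with the pinhole model."""
    x = p0[0] + t * direction[0]
    y = p0[1] + t * direction[1]
    z = p0[2] + t * direction[2]
    return (f * x / z, f * y / z)


f = 1.0
d = (1.0, 0.0, 2.0)            # shared direction of two parallel lines
starts = [(1.0, 0.0, 5.0), (-3.0, 2.0, 8.0)]

vp = vanishing_point(d, f)
for p0 in starts:
    far_point = project_along_line(p0, d, 1e9, f)
    print(p0, "->", far_point, "vanishing point:", vp)
```

Both lines start at different 3D points, yet their projections approach the same image point as t grows.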

Ideal pinhole size
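- an inappropriate pinhole size produces a blurry result
- it should basically be small, but a pinhole that is too small causes diffraction
- it should be chosen with the focal length and the wavelength of light in mind
- with a suitably long exposure, a well-taken photo is well focused everywhere
- but... the exposure takes something like 12 seconds? That is why we use lenses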
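The notes above say the pinhole size depends on focal length and wavelength but do not give a formula. A commonly quoted rule of thumb is d ≈ 2·sqrt(f·λ); the sketch below assumes that rule, and the 50 mm focal length and 550 nm wavelength are example values.

```python
import math


def ideal_pinhole_diameter(focal_length_m, wavelength_m):
    """Rule-of-thumb pinhole diameter d = 2 * sqrt(f * lambda), in metres.

    Assumption: this balances geometric blur (hole too large) against
    diffraction blur (hole too small). The constant 2 is a rule of thumb.
    """
    return 2.0 * math.sqrt(focal_length_m * wavelength_m)


d = ideal_pinhole_diameter(0.05, 550e-9)  # 50 mm, green light
print(f"ideal pinhole diameter: {d * 1e3:.2f} mm")
```

Under these assumptions the diameter comes out at roughly a third of a millimetre, which also hints at why so little light gets through and exposures become long.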

Image Formation using Lenses

Video Link


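Using a lens to gather light

When the object is infinitely far from the lens, the distance between the lens and the image plane equals the focal length
As for magnification, because the lens refracts the light, a line is magnified differently depending on where it is located
A two-lens system solves this problem while keeping the object-to-lens distance fixed (the principle behind zooming)
Image brightness can be controlled by adjusting the aperture of the lens
The f-number of a lens makes it easy to set the aperture relative to the focal length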
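These lens notes can be tied together with the thin-lens (Gaussian lens) law 1/i + 1/o = 1/f, the magnification m = i/o, and the f-number N = f/D. The sketch below assumes an ideal thin lens; the function names and sample numbers are illustrative.

```python
import math


def image_distance(f, o):
    """Image distance i from the thin-lens law 1/i + 1/o = 1/f."""
    if math.isinf(o):
        return f  # object at infinity: image forms at the focal length
    if o == f:
        raise ValueError("object at the focal point: image forms at infinity")
    return 1.0 / (1.0 / f - 1.0 / o)  # negative result means a virtual image


def magnification(f, o):
    """Magnification m = i / o, which changes with the object distance o."""
    return image_distance(f, o) / o


def aperture_diameter(f, n):
    """Aperture diameter D = f / N for a given f-number N."""
    return f / n


f = 0.05  # hypothetical 50 mm lens
for o in (0.1, 1.0, 10.0, math.inf):
    i = image_distance(f, o)
    m = 0.0 if math.isinf(o) else magnification(f, o)
    print(f"o = {o} m -> i = {i:.4f} m, m = {m:.4f}")

print("aperture at f/2:", aperture_diameter(f, 2.0), "m")
```

The loop shows the image distance approaching the focal length as the object moves away, and the magnification shrinking with distance.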


Lens defocus problem: the blur circle is directly proportional to the aperture diameter and inversely proportional to the f-number

This can be fixed by moving the image plane or moving the lens
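The defocus blur can be estimated with similar triangles under the thin-lens model: if the sensor sits at distance s behind the lens and an object images at distance i', the blur circle diameter is b = D·|i' − s| / i' with D = f/N. A minimal sketch under those assumptions, with made-up lens values:

```python
def image_distance(f, o):
    """Thin-lens image distance for an object at distance o > f."""
    return f * o / (o - f)


def blur_circle_diameter(f, n, focus_distance, object_distance):
    """Blur circle diameter for an object when the lens is focused elsewhere.

    f: focal length, n: f-number, distances measured from the lens (metres).
    """
    aperture = f / n                               # D = f / N
    sensor = image_distance(f, focus_distance)     # where the sensor sits
    i_obj = image_distance(f, object_distance)     # where the object focuses
    return aperture * abs(i_obj - sensor) / i_obj


f = 0.05  # hypothetical 50 mm lens focused at 2 m
for n in (2.0, 4.0, 8.0):
    b = blur_circle_diameter(f, n, focus_distance=2.0, object_distance=4.0)
    print(f"f/{n:g}: blur circle = {b * 1e6:.1f} micrometres")
```

Doubling the f-number halves the blur circle, and moving the sensor so that s equals i' (refocusing) brings it to zero.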

Depth of Field

Youtube Link
Depth of field is the range of object distances that are "sufficiently well" focused in the image
For example, it can be taken as the range of object distances for which the blur circle stays smaller than the pixel size


Diagram of the DoF concept

Hyperfocal Distance
The trade-off of Aperture Size : DoF vs Brightness
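The hyperfocal distance and the aperture trade-off can be made concrete with the standard thin-lens depth-of-field approximations. This is a sketch, not the lecture's derivation: c is the largest acceptable blur circle (for example one pixel), and the 50 mm lens and 30 µm blur limit are assumed example values.

```python
import math


def hyperfocal_distance(f, n, c):
    """Focus distance beyond which everything out to infinity is acceptably sharp."""
    return f * f / (n * c) + f


def depth_of_field(f, n, c, s):
    """Near and far limits of acceptable focus when focused at distance s."""
    near = s * f * f / (f * f + n * c * (s - f))
    denom = f * f - n * c * (s - f)
    far = s * f * f / denom if denom > 0 else math.inf
    return near, far


def relative_brightness(n):
    """Image brightness scales with aperture area, i.e. with 1 / N^2."""
    return 1.0 / (n * n)


f, c = 0.05, 30e-6  # hypothetical 50 mm lens, 30 micrometre blur limit
for n in (2.0, 8.0, 16.0):
    near, far = depth_of_field(f, n, c, s=3.0)
    print(f"f/{n:g}: sharp from {near:.2f} m to {far:.2f} m, "
          f"brightness x{relative_brightness(n):.4f}, "
          f"hyperfocal {hyperfocal_distance(f, n, c):.1f} m")
```

A smaller aperture (larger f-number) widens the in-focus range but cuts brightness quadratically, which is the trade-off named above.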


largest	middle	smallest


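As long as light from the in-focus object still reaches the image, the object stays well focused even though it gets darker

One nice property of a lens: even something like dust on the camera lens has little effect on the image, as long as light still gets through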
Tilting the Lens

Lens Related Issues

Youtube Link
Compound Lenses


The complex lens arrangement of a real camera

Vignetting


Cause of vignetting	Result of vignetting

chromatic aberration


The difference in refractive index across wavelengths produces this result

geometric distortion
radial (barrel) distortion : distortion directed from the center toward the edges
tangential distortion
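Radial and tangential distortion are often modeled with a polynomial in the distance from the image center. The sketch below assumes the widely used Brown-Conrady form (the same coefficient naming k1, k2, p1, p2 appears in OpenCV); the notes do not say which model the lecture uses, and the coefficient values here are invented.

```python
def distort(x, y, k1=0.0, k2=0.0, p1=0.0, p2=0.0):
    """Map ideal normalized image coordinates to distorted ones.

    k1, k2: radial coefficients (k1 < 0 gives barrel distortion).
    p1, p2: tangential coefficients (lens not parallel to the sensor).
    """
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 * r2
    xd = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    yd = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
    return xd, yd


# Barrel distortion: points far from the center are pulled inward the most.
for x in (0.0, 0.25, 0.5, 1.0):
    xd, _ = distort(x, 0.0, k1=-0.2)
    print(f"x = {x:.2f} -> {xd:.4f}")
```

The image center is unchanged while the shift grows with the radius, matching the description of distortion running from the center toward the edges.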

Wide Angle Cameras

Video Link
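Imaging a hemisphere exactly would require an infinite image plane, which is physically impossible
Objects located where vignetting occurs can also be severely stretched
Lenses and mirrors are therefore used to solve these problems
Fisheye Lens Camera
Meniscus Lens : gathers light from a wide field
The lenses behind it : adjust the rays so that every region is well focused
The gathered information can also be reconstructed onto a perspective image plane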
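The need for an infinite image plane follows from the perspective mapping r = f·tan(θ), which blows up as the ray angle θ approaches 90°. The sketch below contrasts it with an equidistant fisheye mapping r = f·θ; the equidistant model is an assumption chosen for illustration, since real fisheye lenses follow various mappings.

```python
import math


def perspective_radius(f, theta):
    """Image radius of a ray at angle theta (radians) under perspective projection."""
    return f * math.tan(theta)


def equidistant_fisheye_radius(f, theta):
    """Image radius under the (assumed) equidistant fisheye model r = f * theta."""
    return f * theta


def fisheye_to_perspective_radius(f, r_fisheye):
    """Remap a fisheye radius onto a perspective image plane (valid for theta < 90 deg)."""
    theta = r_fisheye / f
    return f * math.tan(theta)


f = 1.0
for deg in (0, 30, 60, 80, 89, 89.9):
    theta = math.radians(deg)
    print(f"{deg:5.1f} deg: perspective r = {perspective_radius(f, theta):9.3f}, "
          f"fisheye r = {equidistant_fisheye_radius(f, theta):.3f}")
```

The perspective radius grows without bound near 90°, while the fisheye radius stays finite; the last function is the per-pixel relation one would use to re-render a fisheye image as a perspective view.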


Fisheye Lens Camera	Result

Placing two fisheye lenses back to back captures the complete sphere (e.g. Insta360)
Mirrors can also solve this problem. With a mirror, the camera behaves like a virtual camera located at the mirror-symmetric point (optical folding)
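The virtual camera position for a planar mirror is simply the reflection of the real camera center across the mirror plane. A minimal sketch, where the plane is described by a point on it and a normal vector (both hypothetical example values):

```python
import math


def reflect_point(p, plane_point, normal):
    """Reflect point p across the plane through plane_point with the given normal."""
    length = math.sqrt(sum(c * c for c in normal))
    n = tuple(c / length for c in normal)            # unit normal
    # Signed distance from p to the plane along the normal.
    dist = sum((pi - qi) * ni for pi, qi, ni in zip(p, plane_point, n))
    return tuple(pi - 2.0 * dist * ni for pi, ni in zip(p, n))


camera_center = (0.0, 0.0, 0.0)
mirror_point = (0.0, 0.0, 1.0)    # mirror plane z = 1
mirror_normal = (0.0, 0.0, 1.0)

virtual_camera = reflect_point(camera_center, mirror_point, mirror_normal)
print("virtual camera center:", virtual_camera)
```

The real camera looking into the mirror sees the scene as if it were placed at the reflected position, which is what makes folding the optical path possible.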


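Benefit of using a planar mirror

With a hyperboloidal mirror, the focus can be controlled as well


Benefit of using a hyperboloidal mirror

With a paraboloidal mirror, perspective imaging of a wide field becomes possible (a panorama in a single shot...!) (e.g. Sony Bloggie, Kogeto "Dot" for iPhone)


Benefit of using a paraboloidal mirror	Example

These techniques are used in telescopes and similar instruments (they have to fit very, very distant objects into a narrow field...) (e.g. James Webb Space Telescope)
Scallops also have eyes that use mirrors
Our eye has a mirror and a lens formed by the ellipsoidal cornea and the limbus


The image captured by our eye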

Animal Eyes

Video Link
Trilobite
Primitive Eyes
Evolution of the Eye : evolved from a flat surface to a sphere, then toward adding a lens
Image Formation in the Eye
Optics in Human Eye
Human Eye: Iris Control System
Accommodation (Focusing) in different distance
Change in Accommodation with Age
Glasses for Myopia (Near-Sightedness) : a concave lens diverges the light
Glasses for Hyperopia (Far-Sightedness) : a convex lens converges the light
Liquid Lens : a voltage can be applied to control the liquid lens (e.g. zooming)
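The sign of the corrective lens for myopia and hyperopia can be illustrated with the thin-lens law, treating the glasses as a thin lens in contact with the eye. This is a simplified textbook approximation, not something stated in the notes; the far point, near point, and the 0.25 m reading distance are assumed example values.

```python
def myopia_lens_power(far_point_m):
    """Diopters needed so that objects at infinity appear at the eye's far point.

    Negative power means a concave (diverging) lens.
    """
    return -1.0 / far_point_m


def hyperopia_lens_power(near_point_m, desired_near_m=0.25):
    """Diopters needed so that an object at desired_near_m appears at the near point.

    Positive power means a convex (converging) lens.
    """
    return 1.0 / desired_near_m - 1.0 / near_point_m


print("myopia, far point 0.5 m:", myopia_lens_power(0.5), "D")
print("hyperopia, near point 1.0 m:", hyperopia_lens_power(1.0), "D")
```

The myopic case yields a negative power and the hyperopic case a positive one, matching the concave and convex lenses listed above.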

This blog has moved (2024.09.12). See the homepage (https://yangspace.co.kr/)
