[2025/W44] 🤗 Weekly AI Research

Sky · October 31, 2025

Weekly AI Research Digest


Next-generation AI agents evolving through 2D-3D spatial learning, recursive code (ReCode), and latent-space reasoning
Exploring interaction on ambiguous queries, proactive robot control, infinite 3D world generation, and data agent autonomy

Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations

Paper, Project
Inspired by how humans learn spatial concepts through multiple senses, this paper proposes 'Concerto', a new self-supervised learning method that uses 2D images and 3D point cloud data together. The model combines learning within the 3D data itself with learning the relationship between 2D and 3D data. As a result, it outperforms existing standalone 2D or 3D models on 3D scene perception tasks and achieves state-of-the-art (SOTA) results on major benchmarks such as ScanNet. It can also be extended to open-world perception by linking it with video or language (CLIP).

ReCode: Unify Plan and Action for Universal Granularity Control

Paper, Project
The paper points out that existing LLM agents handle high-level 'planning' and low-level 'action' separately, which makes it hard for them to respond flexibly to the situation at hand. It proposes a new paradigm, 'ReCode', that unifies planning and action into a single representation: 'recursive code generation'. A high-level plan is treated as an abstract function and is recursively decomposed until it reaches executable primitive actions. This lets the agent dynamically adjust the granularity of its decisions, and the approach showed strong results in both training data efficiency and inference performance.
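The recursive decomposition idea can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the primitive set, the `EXPANSIONS` table (a fixed stand-in for what an LLM would generate at run time), and the `name:arg:arg` step format are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical primitive actions that the environment can execute directly.
PRIMITIVES = {"goto", "pick", "place"}

# Stand-in for the LLM (an assumption): maps an abstract placeholder function
# to finer-grained steps. A real agent would generate these expansions on demand.
EXPANSIONS: Dict[str, List[str]] = {
    "tidy_desk": ["move_item:cup:shelf", "move_item:book:shelf"],
    "move_item": ["goto:{0}", "pick:{0}", "goto:{1}", "place:{1}"],
}


def expand(step: str) -> List[str]:
    """Expand one abstract step into sub-steps, filling in its arguments."""
    name, *args = step.split(":")
    return [template.format(*args) for template in EXPANSIONS[name]]


def run(step: str, trace: List[str], depth: int = 0, max_depth: int = 5) -> None:
    """Execute a step: primitives are recorded, abstract steps are expanded recursively."""
    name = step.split(":")[0]
    if name in PRIMITIVES:
        trace.append(step)  # base case: an executable primitive action
        return
    if depth >= max_depth:
        raise RecursionError(f"could not ground {step!r} within {max_depth} levels")
    for sub_step in expand(step):
        run(sub_step, trace, depth + 1, max_depth)


trace: List[str] = []
run("tidy_desk", trace)
print(trace)
```

Plan and action share one representation here: a step is either executed or expanded, so the depth of the recursion determines how fine-grained the decision is.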

InteractComp: Evaluating Search Agents With Ambiguous Queries

Paper, Project
Most search agents assume that the user's query is clear, but real queries are often 'ambiguous' and need to be clarified through interaction. This paper proposes 'InteractComp', a new benchmark that evaluates whether an agent recognizes ambiguity in a query and 'actively interacts' to resolve it. An evaluation of 17 models found that, in ambiguous situations, models tend to be 'overconfident' and give wrong answers instead of asking questions. The paper reports that search performance improved sevenfold over the past 15 months while interaction capability stagnated, and it stresses that the benchmark matters for both evaluating and training agents' interaction skills.

JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence

Paper, Project
The paper focuses on the need for code intelligence to go beyond textual code and also understand the 'visual outputs' that programs produce (charts, UIs, and so on). To address this, the authors (1) developed a toolkit that efficiently generates high-quality multimodal code data and used it to build the large-scale corpus 'JanusCode-800K', and (2) trained the 'JanusCoder' models on this data to generate code from text, visual input, or a combination of the two. The models perform strongly on both text-centric and vision-centric coding tasks, and on some of them they surpass commercial models.

DeepAgent: A General Reasoning Agent with Scalable Toolsets

Paper, Project
The paper proposes 'DeepAgent', an agent that uses external tools and carries out long-horizon interactions to handle complex real-world tasks. To address the context-length explosion and error accumulation that existing agents suffer from because of long interaction histories, it introduces an 'autonomous memory folding' mechanism. This compresses past history into structured memory, preserving important information while reducing errors. The agent also learns efficient tool use through a reinforcement learning strategy called 'ToolPO'. DeepAgent outperformed existing models on 8 benchmarks.
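To make the folding idea concrete, here is a minimal sketch that compresses a raw interaction history into a small structured memory. It is not DeepAgent's mechanism: in the paper the agent itself decides when and how to fold, whereas this example assumes a fixed length threshold, rule-based compression, and hypothetical `ok:`, `err:`, and `note:` event prefixes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FoldedMemory:
    """Hypothetical structured memory; the field names are illustrative."""
    completed: List[str] = field(default_factory=list)
    notes: List[str] = field(default_factory=list)


@dataclass
class AgentContext:
    history: List[str] = field(default_factory=list)
    memory: FoldedMemory = field(default_factory=FoldedMemory)
    max_history: int = 4  # assumed threshold; a real agent would choose when to fold

    def add(self, event: str) -> None:
        """Append an event and fold once the raw history grows too long."""
        self.history.append(event)
        if len(self.history) > self.max_history:
            self.fold()

    def fold(self) -> None:
        """Compress raw history: keep successes and notes, drop failed attempts."""
        for event in self.history:
            if event.startswith("ok:"):
                self.memory.completed.append(event[len("ok:"):])
            elif event.startswith("note:"):
                self.memory.notes.append(event[len("note:"):])
            # "err:" events are discarded so old mistakes do not linger in context
        self.history.clear()


ctx = AgentContext(max_history=3)
for event in ["ok:search", "err:timeout", "note:API needs key", "ok:fetch"]:
    ctx.add(event)
print(ctx.memory, ctx.history)
```

After folding, the working context holds only the structured memory, so its size no longer grows with the number of raw steps.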

Video-Thinker: Sparking "Thinking with Videos" via Reinforcement Learning

Paper, Project
The paper proposes 'Video-Thinker', a method that takes MLLMs beyond "thinking with images" so that they can "think with videos". During reasoning, the model autonomously uses the MLLM's own "grounding" and "captioning" capabilities, without external tools, to generate reasoning clues. To this end, the authors (1) built a reasoning dataset that includes this autonomous tool use (Video-Thinker-10K) and (2) used a training strategy that combines supervised fine-tuning (SFT) with reinforcement learning (GRPO). As a result, the model achieves state-of-the-art (SOTA) results on several video reasoning benchmarks.
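For readers unfamiliar with GRPO, its core step is scoring each sampled response relative to the other responses drawn for the same prompt. The sketch below shows only that group-relative advantage computation. The reward values are made up, the use of the population standard deviation is an assumption, and nothing here reflects Video-Thinker's actual reward design.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize each reward against the mean and std of its own sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four responses sampled from the same video question.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(advantages)
```

Responses that beat their group's average get a positive advantage and the rest get a negative one, so no separate value network is needed to provide a baseline.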

Scaling Latent Reasoning via Looped Language Models

Paper, Project
The paper points out the limits of how existing LLMs reason, which is by explicitly 'generating' text as in CoT (Chain-of-Thought). It proposes the 'Looped Language Models (LoopLM)' paradigm, which builds the reasoning process into the pre-training stage, along with the 'Ouro' models. Ouro reasons by performing iterative computation in 'latent space' rather than in text. As a result, the relatively small 1.4B and 2.6B Ouro models match or outperform SOTA LLMs of up to 12B parameters. The paper shows that this advantage comes not from greater knowledge capacity but from a superior 'ability to manipulate and use knowledge'.
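The looping idea can be illustrated with a toy example in which one weight-tied block is applied repeatedly to a hidden state. This is only a sketch of the general concept: the residual `tanh` update, the 2x2 weight matrix, and the fixed loop count are assumptions, and it does not reproduce Ouro's architecture.

```python
import math
from typing import List


def shared_block(h: List[float], w: List[List[float]]) -> List[float]:
    """One weight-tied layer: residual update h + tanh(W h)."""
    return [
        h_i + math.tanh(sum(w_ij * h_j for w_ij, h_j in zip(row, h)))
        for h_i, row in zip(h, w)
    ]


def looped_forward(h: List[float], w: List[List[float]], n_loops: int) -> List[float]:
    """Reuse the same block n_loops times; compute grows while parameters stay fixed."""
    for _ in range(n_loops):
        h = shared_block(h, w)
    return h


weights = [[0.0, 0.5], [0.5, 0.0]]  # hypothetical shared parameters
hidden = [1.0, -1.0]
print(looped_forward(hidden, weights, n_loops=1))
print(looped_forward(hidden, weights, n_loops=4))
```

The extra reasoning steps take place in the hidden state rather than in generated tokens, and they add no parameters because every iteration reuses `weights`.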

A Survey of Data Agents: Emerging Paradigm or Overstated Hype?

Paper, Project
This survey paper points out that the term 'data agent' is currently used loosely without a clear definition, causing problems such as a mismatch between user expectations and actual performance. In the spirit of the SAE levels (L0 to L5) for autonomous driving, it proposes the first hierarchical taxonomy that defines the 'autonomy level' of data agents in six stages. Using this taxonomy, it systematically reviews existing research, analyzes in particular the technical challenges of the current transition from L2 to L3, and lays out a roadmap toward fully autonomous agents (L5).

RoboOmni: Proactive Robot Manipulation in Omni-modal Context

Paper, Project
Moving beyond existing robots' reliance on explicit instructions, the paper proposes 'RoboOmni', a robot that proactively infers intent and acts by understanding the full 'omni-modal context' found in real environments, such as the user's speech, ambient sounds, and visual cues. To this end, the authors (1) developed an omni-modal LLM-based framework that integrates dialogue, sound, and visual information, and (2) built the large-scale 'OmniAction' dataset for training this kind of proactive intent recognition. In experiments, RoboOmni outperformed text-based and speech recognition (ASR)-based models in both task success rate and proactive assistance.

WorldGrow: Generating Infinite 3D World

Paper, Project
The paper tackles the problem of generating an 'infinitely extendable 3D world' that stays geometrically and visually consistent. To overcome the limits of existing 3D models, which are either object-centric or hard to scale up, 'WorldGrow' proposes a hierarchical framework. The core ideas are to (1) use a pre-trained 3D model to generate structured 'scene blocks', (2) extend the scene in a context-aware way with a '3D block inpainting' technique, and (3) capture both global structure and fine detail with a 'coarse-to-fine' strategy. As a result, it achieves state-of-the-art (SOTA) generation of realistic, structurally consistent, unbounded 3D scenes.
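Block-by-block growth can be sketched on a 2D grid, where each new block is generated from the neighbors that already exist. This is a heavily simplified illustration, not the WorldGrow pipeline: a block is reduced to a single "height" value, `inpaint_block` is a hypothetical stand-in for the 3D block inpainting model, and the coarse-to-fine stage is omitted.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]


def inpaint_block(neighbors: List[float]) -> float:
    """Stand-in for a block inpainting model (an assumption): average the context."""
    return sum(neighbors) / len(neighbors) if neighbors else 0.0


def grow_world(seed: float, width: int, depth: int) -> Dict[Coord, float]:
    """Grow a width x depth grid of blocks outward from a single seed block."""
    world: Dict[Coord, float] = {(0, 0): seed}
    for x in range(width):
        for y in range(depth):
            if (x, y) in world:
                continue
            # Condition the new block on the neighbors generated so far.
            context = [
                world[c]
                for c in ((x - 1, y), (x, y - 1), (x - 1, y - 1))
                if c in world
            ]
            world[(x, y)] = inpaint_block(context)
    return world


world = grow_world(seed=2.0, width=3, depth=2)
print(sorted(world.items()))
```

Because each block depends only on its local context, the grid can keep growing in any direction without regenerating what already exists.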
