[2025/W36] 🤗 Weekly AI Research

Sky · September 6, 2025

Weekly AI Research Digest


AI agents evolving through reinforcement learning open a new horizon for autonomous intelligence
From code security and robot control to 3D generation: the latest research trends aimed at real-world problems

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper, Project
This paper is a comprehensive survey of "Agentic Reinforcement Learning (Agentic RL)", a new paradigm for working with large language models (LLMs). Where conventional reinforcement learning treated the LLM as a passive text generator, Agentic RL marks a conceptual shift: the LLM is viewed as an autonomous agent that makes its own decisions in complex environments. The paper proposes a systematic taxonomy organized by core agent capabilities and application domains, and argues that reinforcement learning is the key mechanism for turning those capabilities into actual agent behavior. Synthesizing more than 500 recent studies, it maps the overall landscape of the fast-moving AI agent field and offers a roadmap for future research.

A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code

Paper
This paper proposes A.S.E, a new benchmark for evaluating the security of AI-generated code more realistically. Existing benchmarks either look only at isolated code snippets or suffer from poor reproducibility. To address this, A.S.E is built on entire code repositories that contained real security vulnerabilities (CVEs), so evaluation happens within the full context of the project. Container technology provides a stable, reproducible evaluation environment. In the experiments, Claude-3.7-Sonnet achieved the best overall performance, and the authors found that for security patching a concise, fast approach is more effective than complex reasoning.

A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers

Paper, Project
This survey analyzes the development of LLMs specialized for scientific research, or Sci-LLMs, from a data-centric perspective. It argues that Sci-LLM progress is a process of co-evolution with scientific data, which carries its own complexities such as being multimodal and multi-scale. The paper systematically analyzes more than 270 datasets and more than 190 benchmarks, and projects that Sci-LLM-based autonomous agents will develop into "closed-loop" systems that run experiments and verify knowledge themselves, becoming genuine partners in scientific discovery.

R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning

Paper, Project
R-4B is an efficient multimodal LLM (MLLM) that decides for itself, based on the difficulty of the problem, whether to activate its thinking process. Step-by-step reasoning is effective for complex problems but wastes computation on simple ones, so the model is first trained in both a "thinking mode" and a "non-thinking mode". Reinforcement learning then optimizes its ability to pick the appropriate mode for each problem, which lets it reach reasoning performance comparable to much larger models at lower computational cost.

Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth

Paper, Project
Drivel-ology introduces a distinctive linguistic phenomenon, "nonsense with depth", and shows that current LLMs are limited in their ability to understand it. Drivelology refers to utterances that are grammatically flawless yet carry hidden paradox, emotion, or rhetorical intent. The researchers built a multilingual benchmark to evaluate LLMs and found that models either mistake such expressions for plain nonsense or fail to grasp the hidden meaning. This shows that the statistical fluency of LLMs is not the same as cognitive understanding.

UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

Paper, Project
This technical report details the development and results of UI-TARS-2, an agent model that autonomously operates graphical user interfaces (GUIs). Through a systematic training methodology, the model tackles the main open problems of earlier GUI agents: data scalability, stable multi-turn reinforcement learning, and integration with environments beyond the GUI. As a result, it achieves top-tier performance on major benchmarks such as Mind2Web and OSWorld, surpassing strong existing models, and also demonstrates strong generalization across tasks such as games and software engineering.

SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper, Project
SimpleTIR is an algorithm for stably training, with reinforcement learning, LLM reasoning that uses external tools over multiple turns. The authors found that during multi-turn tool-use training, "void turns" (turns that fail to produce a valid result) are the main cause of training instability and performance collapse. SimpleTIR takes a simple approach: it excludes training data containing such problematic turns from the policy update. This prevents harmful gradient explosions and stabilizes training, which in turn lifts model performance on math reasoning benchmarks to an unprecedented level.
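The filtering step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Turn` and `Trajectory` records, their field names, and the exact test for a void turn (a turn that yields neither a tool call nor a final answer) are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Turn:
    # Hypothetical per-turn record; the field names are illustrative only.
    tool_call: Optional[str] = None     # tool/code invocation emitted in this turn, if any
    final_answer: Optional[str] = None  # final answer emitted in this turn, if any


@dataclass
class Trajectory:
    turns: List[Turn] = field(default_factory=list)
    reward: float = 0.0


def is_void_turn(turn: Turn) -> bool:
    """Assumed definition: the turn produced neither a tool call nor a final answer."""
    return not turn.tool_call and not turn.final_answer


def filter_void_trajectories(batch: List[Trajectory]) -> List[Trajectory]:
    """Keep only trajectories without any void turn.

    The dropped trajectories never reach the policy update, so they
    contribute no gradient.
    """
    return [
        traj for traj in batch
        if not any(is_void_turn(turn) for turn in traj.turns)
    ]
```

In a training loop, a filter like this would sit between rollout collection and the loss computation, so the optimizer itself does not need to change.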

LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model

Paper, Project
LLaVA-Critic-R1 challenges the established practice of keeping the "critic model", which evaluates responses, separate from the "policy model", which generates them. The researchers apply reinforcement learning directly to a generative model using data meant for training critics, producing LLaVA-Critic-R1, a unified model that both critiques and generates. Surprisingly, the model is not only a strong critic but also matches or outperforms specialized generative models, and the authors show that "self-critique" at inference time can substantially improve performance without any additional training.
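One way to picture inference-time self-critique is a best-of-n loop in which the same model both proposes answers and judges them. The sketch below is only an illustration under that assumption: `generate` and `prefer` are hypothetical callables standing in for the model's generation and pairwise-judgment calls, and the selection scheme is not claimed to be the one used in the paper.

```python
from typing import Callable, List


def self_critic_select(
    prompt: str,
    generate: Callable[[str], str],          # hypothetical: samples one response
    prefer: Callable[[str, str, str], int],  # hypothetical: 0 if first is better, 1 if second
    n: int = 4,
) -> str:
    """Sample n candidates, then keep the one the critic prefers pairwise."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    best = candidates[0]
    for challenger in candidates[1:]:
        # The same model acts as critic: it compares the current best
        # with the challenger and the winner is carried forward.
        if prefer(prompt, best, challenger) == 1:
            best = challenger
    return best
```

No weights are updated here; any gain comes purely from spending extra inference compute on sampling and comparison.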

EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control

Paper, Project
EmbodiedOneVision proposes EO-1, a general-purpose robot control model that flexibly combines multimodal reasoning and physical interaction the way humans do. Its core components are a unified architecture that processes images, text, and actions without distinction, and the EO-Data1.5M dataset, which contains more than 1.5 million vision-text-action samples. Trained on this dataset, EO-1 demonstrates deep understanding of the real world and strong generalization on long, complex manipulation tasks across several kinds of robots.

Droplet3D: Commonsense Priors from Videos Facilitate 3D Generation

Paper, Project
Droplet3D addresses the shortage of 3D data by extracting commonsense priors from the videos that are abundant on the internet and using them to aid 3D content generation. Videos offer two useful cues: multiple viewpoints of an object provide spatial consistency, and rich context conveys semantic information. To exploit these, the researchers built Droplet3D-4M, described as the world's first large-scale multi-view video dataset, and released a generative model trained on it. The resulting model generates high-quality 3D content that is spatially consistent and semantically plausible.

I'm Sky, and I'm very interested in XR and AI.
