AI 😎

AI EXPO 2026 (국제인공지능대전, International Artificial Intelligence Exhibition) Visit Review

I visited AI EXPO 2026, held at COEX. Event info links: COEX: https://www.coex.co.kr/exhibitions/국제인공지능대전-2/ 국제인공지능대전: http://www.aiexpo.co.kr/home/v4.php?s=34 Time: 05/06 (Wed) - 05/08 (Fri), 10:00 - 17:00 Venue: COEX Hall ...

May 7, 2026
·
0 comments
·

[Paper Review] Easy Turn: Integrating Acoustic and Linguistic Modalities for Robust Turn-Taking in Full-Duplex Spoken Dialogue Systems

#Full-duplex spoken dialogue systems #turn taking detection #Can using acoustics and language together produce more natural conversation? ✔️ Background Recently, spoken dialogue systems, simply “answering when asked”

May 3, 2026
·
0 comments
·

PersonaPlex Review

PersonaPlex: Voice and Role Control for Full Duplex Conversational Speech Models ✔️ Background Voice AI has recently been expanding rapidly beyond TTS that merely synthesizes natural-sounding speech, toward conversational voice agents that listen, speak, interrupt, backchannel, and play situation-appropriate roles in real time. ...

April 26, 2026
·
0 comments
·

Speaker Diarization Basics (2) - VAD, UBM

Work in progress

April 13, 2026
·
0 comments
·

[Paper Review] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

https://arxiv.org/abs/2306.00978 Work in progress..

April 11, 2026
·
0 comments
·

Speaker Diarization Basics (1) - MFCC

Before we begin... In case the terminology is confusing, here is a brief summary of the differences between spectrum, spectrogram, mel spectrogram, and MFCC. (Image source) waveform → (pre-emphasis) → STFT(framing(hamming window, overlap, hop size) → DFT on each frame (in practice, the computation ...

April 9, 2026
·
0 comments
·

[Paper Review] Emotion Concepts and their Function in a Large Language Model

https://transformer-circuits.pub/2026/emotions/index.html Work in progress..

April 8, 2026
·
0 comments
·

chown – Changing File Ownership

chown – Changing File Ownership ✔️ Basic concept chown is a command that changes the owner and the group of a file or directory. ✔️ Usage example sudo chown -R [owner]:[group] [target directory] ✔️ Meaning sudo : run with administrator privileges chown : change ownership -R : apply recursively, including subdirectories
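
The usage pattern above can be tried without sudo. The following is a minimal sketch, assuming a Linux shell with standard coreutils; the temporary directory and the variable names `demo_dir`, `owner`, and `group` are hypothetical and exist only for illustration. It sets the owner and group to the current user's own numeric IDs, which normally needs no administrator privileges. Changing ownership to a different user would usually require `sudo`, as in the example from the post.

```shell
# Create a throwaway directory tree to practice on (hypothetical paths).
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/sub"
touch "$demo_dir/sub/file.txt"

# Use the current user's numeric UID and GID as [owner]:[group].
owner="$(id -u)"
group="$(id -g)"

# -R applies the change to demo_dir and everything beneath it.
chown -R "$owner:$group" "$demo_dir"

# List the tree with numeric owner and group IDs to inspect the result.
ls -lnR "$demo_dir"
```

Running it on a scratch directory first is a cheap way to confirm what `-R` touches before applying the command to real data.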

April 7, 2026
·
0 comments
·

Setting Up Kaldi

Setting up Kaldi..

April 6, 2026
·
0 comments
·

How to Set Up a GPU on Google Cloud

Work in progress...

April 2, 2026
·
0 comments
·

[Paper Review] – Qwen3-TTS

qwen3-tts

April 2, 2026
·
0 comments
·

[Paper Review] – StreamFlow: Streaming Audio Generation from Discrete Tokens via Streaming Flow Matching

#streaming_decoder(≈vocoder)

March 31, 2026
·
0 comments
·

DNS/URGENT Challenges

An introduction to the DNS Challenge and the URGENT Challenge

March 24, 2026
·
0 comments
·

On-Device AI

Today I want to jot down a few light notes on on-device AI. I relied heavily on internet sources, so the content still needs more verification, haha;; Anyway... On-device AI performs AI inference inside the device itself instead of on a cloud server. Advantages in terms of security, latency, offline availability, and cost

March 22, 2026
·
0 comments
·

[Paper Review] FSPEN: An Ultra-Lightweight Network for Real Time Speech Enhancement

Source: https://research.samsung.com/blog/FSPEN-AN-ULTRA-LIGHTWEIGHT-NETWORK-FOR-REAL-TIME-SPEECH-ENAHNCMENT I have recently been looking into the speech enhancement field, and lightweight ...

March 12, 2026
·
0 comments
·

Amplitude Modulation Spectrum Analysis

Source: https://www.youtube.com/watch?v=7g1BCQk226A The speech we hear in everyday life is not made up of pitch and loudness alone. Human speech carries not only linguistic information (what is being said) but also non-linguistic information (emotion, health condition, environment, and so on). To understand this varied information more deeply, researchers have long ...

March 8, 2026
·
0 comments
·

Accent · Prosody · Emotion · Duration Summary

I want to summarize what I read a while back (in 2024) on accent / prosody / emotion / duration modeling in VC/TTS. How should features be defined and modeled? First, this is the most fundamental question. f0, duration, en

March 1, 2026
·
0 comments
·

[Paper Review] GLASS Flows

GLASS Flows: Transition Sampling for Alignment of Flow and Diffusion

February 22, 2026
·
0 comments
·

[Paper Review] moshi - temporal/depth transformer

moshi - temporal and depth transformer

February 8, 2026
·
0 comments
·