Currently pursuing my Ph.D. at GIST, I am deeply intrigued by the field of speaker diarization and committed to making meaningful contributions to it.

2024, Prompt-driven Target Speech Diarization (PTSD) [ICASSP]

Prompt-driven Target Speech Diarization

November 16, 2023 · 0 comments

2022, VBx [Computer Speech and Language]

Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: theory, implementation and analysis on standard tasks

November 16, 2023 · 0 comments

2023, Disentangling Voice and Content with Self-Supervision [NeurIPS]

November 16, 2023 · 0 comments

2021, Xi-Vector [SPL]

Neural Speaker Embeddings with Uncertainty

November 16, 2023 · 0 comments

2023, Frame-wise and overlap-robust speaker embeddings [ICASSP]

Paper review

November 16, 2023 · 0 comments

2023, DR-DESA [ICASSP]

DR-DESA: "Advancing the dimensionality reduction of speaker embeddings for speaker diarisation: disentangling noise and informing speech activity"

November 16, 2023 · 0 comments

2021, AutoEncoder, attention-aggregation [Interspeech]

AA+DR+NS: "Adapting Speaker Embeddings for Speaker Diarisation", in Proc. Interspeech, 2021. (Naver)

November 16, 2023 · 0 comments

2019, DNC [SLT]

Discriminative Neural Clustering (DNC)

June 5, 2023 · 0 comments

2023, Simulated Conversations [ICASSP]

From simulated mixtures to simulated conversations (BUT)

June 1, 2023 · 0 comments

2023, In Search of Strong Embedding Extractors For Speaker Diarization [ICASSP]

Published at ICASSP 2023, Naver CLOVA

May 24, 2023 · 1 comment

2021, TalkNet-ASD [MMSP]

Published at ACM MMSP 2021

May 22, 2023 · 0 comments

2022, DAB-DETR [ICLR]

Published at ICLR 2022, Tsinghua University

May 17, 2023 · 0 comments

2023, Seq2Seq-TS-VAD [ICASSP]

May 17, 2023 · 0 comments

2022, End-to-End Audio-Visual Neural Speaker Diarization [Interspeech]

MISP baseline, paper, GitHub. Multimodal inputs: uses audio features, lip regions of interest, and i-vector embeddings. I-vectors are the key point to solve

May 17, 2023 · 0 comments

2023, WHU-Alibaba [MISP 2022]

System description. Visual front-end: modified ResNet18-3D model for processing lip videos. They make three changes to the standard PyTorch imp

May 17, 2023 · 0 comments

2022, AV-HuBERT [ICLR]

Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction, in Proc. ICLR 2022

May 17, 2023 · 0 comments

2022, Rethinking Audio-Visual Synchronization for Active Speaker Detection [MLSP]

IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2022)

April 26, 2023 · 0 comments

2020, Active Speakers in Context (ASC) [CVPR]

2020, "Active Speakers in Context", in CVPR

April 21, 2023 · 0 comments

2023, LoCoNet: Long-Short Context Network for Active Speaker Detection [CVPR]

2023, LoCoNet: Long-Short Context Network for Active Speaker Detection, in CVPR

April 20, 2023 · 0 comments