[ICML 2024] SleepFM: Multi-modal Representation for Sleep Across Brain Activity, ECG and Respiratory Signals

Sarah Lee · May 18, 2025

Foundation Models for Health


🖋️ This paper builds a sleep foundation model from data on 14,000 patients collected directly at Stanford University. It was accepted at ICML 2024, and the code is public: https://github.com/rthapa84/sleepfm-codebase. What struck me is that a biomedical AI study was published at an AI conference rather than in a medical journal. I reviewed it to see what methodology and novelty made that possible, since that should help a lot when I submit to an AI conference myself.

1. Introduction

  1. Polysomnography (PSG), the gold standard in sleep research, can be divided into three broad modalities:
    • Brain Activity Signals (BAS): electroencephalogram (EEG), electrooculogram (EOG), and electromyogram (EMG). These are the signals most commonly used to determine sleep stages. 10 channels in total.
    • Electrocardiogram (ECG): measures the electrical activity of the heart over the cardiac cycle. It can be used to detect sleep-disordered breathing (SDB) events. 2 channels in total.
    • Respiratory sensors: chest and abdominal movement, pulse, nasal flow, and oral flow. These are used directly for SDB detection. 7 channels in total.
  2. Limitations of prior sleep research:
    • Automated sleep scoring has been studied only in settings that rely on labeled data.
    • Each model applies to a single task only.
  3. Contrastive Learning
    • Prior work using contrastive learning (CL) in the sleep domain: single-channel ECG, and ECG + electronic health records (EHR).
    • This study is the first to analyze PSG data with a multi-modal CL approach.
  4. Contribution
    • The first contrastive-learning-based foundation model for sleep data, built on 100,000 hours of recordings collected from 14,000 patients at the Stanford Sleep Clinic.
    • SleepFM outperformed the baseline (an end-to-end CNN model) on demographic attribute classification (age, gender), sleep stage classification, and SDB event detection.
    • The paper introduces a new Leave-one-out CL scheme and shows that it beats Pairwise CL on the downstream tasks.
  5. Machine Learning for Analyzing Sleep Data
    • Autoencoders, CNNs, RNNs, and DNNs have been used for sleep stage classification.
    • Respiratory event detection mostly relies on ECG, EEG, and respiratory channels. There were also multi-modal (EEG, EOG, EMG), multi-task (e.g. sleep stages, arousal, leg movements, and sleep-disordered breathing) learning models, but all of them were supervised.
  6. Contrastive Learning
    • A self-supervised learning technique that started and matured in computer vision. Examples include InfoNCE, SimCLR, MoCo, and SupCon. Most of these can be seen as image-based, uni-modal contrastive approaches.
    • In contrast, Contrastive Language-Image Pretraining (CLIP) is a multi-modal model that uses image and text embeddings.
    • ConVIRT: uses chest radiographs and their reports (multi-modal).
    • Multi-modal contrastive learning used outside computer vision, in sleep medicine: ECG + structured records / ECG + EHR + clinical notes.
    • SleepFM is the first multi-modal contrastive model to use all 19 channels of PSG data.

3. Method

3.1. Dataset and Preprocessing

  • Recordings are cut into 30-second units, which the paper calls clips.
  • Signals are resampled to 256 Hz.
  • Sleep stage: a classification problem over Wake, Stage 1, Stage 2, Stage 3, and REM.
  • SDB: binary label.
  • The full dataset is split by patient into pretrain/train/validation/test = 11,261/1,265/141/1,401.
  • The pretraining dataset is used only for pretraining the foundation model.
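
To make the clip definition concrete, here is a minimal sketch of cutting a recording into 30-second clips at 256 Hz. The function name `split_into_clips`, the list-of-lists data layout, and the choice to drop a trailing partial clip are my own assumptions, not taken from the paper or its codebase.

```python
SAMPLE_RATE_HZ = 256   # target sampling rate stated above
CLIP_SECONDS = 30      # clip length stated above
CLIP_SAMPLES = SAMPLE_RATE_HZ * CLIP_SECONDS  # 7,680 samples per channel


def split_into_clips(recording):
    """Cut a multi-channel recording into non-overlapping 30-second clips.

    `recording` is a list of channels, each a list of samples that has
    already been resampled to 256 Hz. A trailing partial clip is dropped
    (an assumption made for this sketch).
    """
    n_samples = min(len(channel) for channel in recording)
    n_clips = n_samples // CLIP_SAMPLES
    clips = []
    for i in range(n_clips):
        start = i * CLIP_SAMPLES
        clips.append([channel[start:start + CLIP_SAMPLES] for channel in recording])
    return clips


# Toy example: 2 channels, 65 seconds of zeros -> 2 full clips.
toy = [[0.0] * (65 * SAMPLE_RATE_HZ) for _ in range(2)]
clips = split_into_clips(toy)
print(len(clips), len(clips[0]), len(clips[0][0]))  # 2 2 7680
```

Each clip therefore holds 7,680 samples per channel, and the three modality encoders would each see their own subset of channels from the same clip.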

3.2. Embedding Model

  • Three multi-channel 1D convolutional networks based on the EfficientNet architecture serve as the embedding models (encoders), one per modality.
  • They use depthwise separable convolutions, dropout layers, and residual connections.
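
As a side note on why depthwise separable convolutions keep such encoders small, the sketch below compares weight counts (biases ignored) for a standard 1D convolution and its depthwise separable counterpart. The layer sizes (10 input channels as in BAS, 64 output channels, kernel size 7) are hypothetical values chosen for illustration, not the paper's configuration.

```python
def standard_conv1d_params(c_in, c_out, kernel_size):
    """Weights of a regular 1D convolution: every output channel sees every input channel."""
    return c_in * c_out * kernel_size


def depthwise_separable_conv1d_params(c_in, c_out, kernel_size):
    """Depthwise step (one filter per input channel) plus a 1x1 pointwise mixing step."""
    depthwise = c_in * kernel_size
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer: 10 input channels, 64 output channels, kernel size 7.
standard = standard_conv1d_params(10, 64, 7)
separable = depthwise_separable_conv1d_params(10, 64, 7)
print(standard, separable)  # 4480 710
```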

3.3. Multi-modal Contrastive Learning

  • Positive pairs: different modalities at the same time, i.e. temporally aligned 30-second clips across modalities.
  • Negative pairs: all other non-matching instances within the training batch are treated as negatives.
  • Two types of contrastive learning are compared (one of them, Leave-one-out CL, is newly proposed here):
    • Pairwise CL: compute a contrastive loss for each combination of 2 out of the 3 modalities, then sum the losses from all combinations.
    • Leave-one-out CL: pair the average of two modalities' embeddings with the embedding of the remaining modality.
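
The two objectives above can be sketched in plain Python as follows. This is my own illustration rather than the authors' implementation: the use of a symmetric InfoNCE loss on cosine similarities, the temperature of 0.1, and all function names are assumptions.

```python
import math
from itertools import combinations


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(anchors, targets, temperature):
    """Row i of `anchors` should match row i of `targets`; other rows are negatives."""
    anchors = [normalize(a) for a in anchors]
    targets = [normalize(t) for t in targets]
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [sum(x * y for x, y in zip(a, t)) / temperature for t in targets]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum - logits[i]
    return total / len(anchors)


def symmetric_info_nce(a, b, temperature):
    return info_nce(a, b, temperature) + info_nce(b, a, temperature)


def pairwise_loss(embeddings, temperature=0.1):
    """Sum the contrastive loss over every pair of modalities."""
    return sum(
        symmetric_info_nce(embeddings[m1], embeddings[m2], temperature)
        for m1, m2 in combinations(embeddings, 2)
    )


def leave_one_out_loss(embeddings, temperature=0.1):
    """Contrast each modality against the clip-wise average of the other modalities."""
    loss = 0.0
    for held_out in embeddings:
        others = [embeddings[m] for m in embeddings if m != held_out]
        averaged = [
            [sum(values) / len(values) for values in zip(*clip_vectors)]
            for clip_vectors in zip(*others)
        ]
        loss += symmetric_info_nce(embeddings[held_out], averaged, temperature)
    return loss


# Toy batch of 2 clips with 2-dimensional embeddings per modality.
batch = {
    "BAS": [[1.0, 0.0], [0.0, 1.0]],
    "ECG": [[0.9, 0.1], [0.1, 0.9]],
    "RESP": [[0.8, 0.2], [0.2, 0.8]],
}
print(pairwise_loss(batch), leave_one_out_loss(batch))
```

With 3 modalities, the pairwise version has 3 loss terms (one per pair) and the leave-one-out version also has 3 (one per held-out modality); the difference lies in what each embedding is contrasted against.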

3.4. Model Training

  • Baseline model: a CNN-based model using the 1D EfficientNet architecture.
  • For the downstream tasks, embeddings for the training, validation, and test sets are extracted from each modality encoder. A Logistic Regression classifier is then trained on them for the sleep stage and SDB tasks.
  • Looking at the code, multi-class Logistic Regression is trained in a one-vs-rest (OvR) fashion. It looks like XGBClassifier was also tried, and Logistic Regression was presumably chosen because it performed better.
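
To show what a one-vs-rest linear probe on frozen embeddings looks like, here is a small dependency-free sketch. It is not the repository's code (which, per the note above, uses a library Logistic Regression); the gradient-descent trainer, learning rate, epoch count, and the toy 2-dimensional "embeddings" are all hypothetical.

```python
import math


def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_binary_logreg(X, y, lr=0.5, epochs=500):
    """Batch gradient descent on the logistic loss; returns (weights, bias)."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, target in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - target
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def fit_ovr(X, labels):
    """One binary classifier per class, each trained as 'this class vs. the rest'."""
    return {
        c: fit_binary_logreg(X, [1.0 if label == c else 0.0 for label in labels])
        for c in sorted(set(labels))
    }


def predict_ovr(models, x):
    """Pick the class whose binary classifier gives the highest score."""
    scores = {c: sum(wi * xi for wi, xi in zip(w, x)) + b for c, (w, b) in models.items()}
    return max(scores, key=scores.get)


# Toy frozen embeddings (2-D) for three sleep stages.
X = [[0.0, 0.0], [0.2, 0.1], [3.0, 0.0], [2.8, 0.2], [0.0, 3.0], [0.1, 2.9]]
labels = ["Wake", "Wake", "Stage2", "Stage2", "REM", "REM"]
models = fit_ovr(X, labels)
print(predict_ovr(models, [2.9, 0.1]))
```

The encoder is never updated here: only the small linear classifiers are fit, which is what makes this a probe of embedding quality.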

💡 My question: why do foundation model papers use linear probing with simple machine learning models such as Ridge/Logistic Regression? Why not use a deep neural network (FC layers or CNN layers)? Below is ChatGPT's answer.

  • What we want to know is how informative the foundation model's embeddings are, and whether they are linearly separable.
  • If a DNN were used instead, it could:
    • Overfit to small downstream data
    • Hide the weakness of poor embeddings by adding capacity
  • Logistic/ridge regression therefore removes model capacity as a confounding variable, so classification performance depends only on the quality of the foundation model's embeddings.

4. Experiments and Results

🖋️ This is the experiments part, the highlight of the paper. It shows SleepFM clearly outperforming the baseline across several downstream tasks, and it evaluates the effect of multi-modality, few-shot performance, and external validation on a public dataset.

4.1. Demographic Attributes Classification

  1. Age, framed as a classification problem over 4 age groups:
    • The Leave-One-Out approach performed best.
  2. Gender, a male vs. female classification problem:
    • Leave-One-Out again performed best.

4.2. Retrieval Analysis

  • Given an embedding from one modality, find the closest embeddings from another modality using cosine similarity.
  • Count a hit if the correct pair is among the top 10.
  • Run on a total of 90,000 randomly selected clips.
  • Picking 10 at random here gives Recall@10 = 10/90000 ≈ 0.0001, so a useful model has to score above this.

  • Result: Pairwise CL performed far better. Since the task itself computes pairwise cosine similarity, training with a pairwise objective was probably the more effective choice.
  • However, retrieving other modalities from Respiratory embeddings worked comparatively poorly. The explanation given is that BAS and ECG measure electrical signals directly, whereas the respiratory channels record movement indirectly and so can be affected by body posture and non-breathing-related motion.
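
The retrieval metric described above can be sketched as follows. This is a brute-force illustration with hypothetical function names; it assumes that the query at index i and the candidate at index i come from the same clip.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recall_at_k(queries, candidates, k=10):
    """Fraction of queries whose true match (same index) is among the k nearest candidates."""
    hits = 0
    for i, q in enumerate(queries):
        sims = [cosine(q, c) for c in candidates]
        top_k = sorted(range(len(candidates)), key=lambda j: sims[j], reverse=True)[:k]
        hits += i in top_k
    return hits / len(queries)


# Chance level for the setup above: 10 random picks out of 90,000 clips.
chance = 10 / 90_000
print(f"{chance:.6f}")  # 0.000111
```

At 90,000 clips a real evaluation would compute the similarity matrix with a vectorized library rather than nested Python loops; the logic stays the same.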

4.3. Downstream Classification Tasks

  1. Sleep stage classification performance

  2. SDB classification performance (binary)

  • Supervised CNN: trained on the pretraining and training datasets combined.
  • SleepFM (Leave-One-Out / Pairwise): the foundation model is trained on the pretraining data, then linear probing is done on the training data.
  • Result: among the SleepFM variants, Leave-One-Out performed better.
  • In addition, the same tasks were run using each single-modality embedding alone. BAS did well on sleep stage classification, and the respiratory signal did well on SDB event detection. Even a single modality on its own gives reasonable performance.
  • Comparing task performance across different age and gender groups showed no large differences.

4.4. Few-Shot Evaluation

  • To compare few-shot performance, the number of patients the model sees is increased from k=1 up to the maximum (1,265 patients).
  • Supervised CNN: trained on the few-shot examples only.
  • SleepFM: a logistic regression model is trained on the few-shot examples only, on top of the already pretrained model.
  • As expected, SleepFM with Leave-One-Out gave the best results.
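
One way to set up such a few-shot sweep is to subsample the training patients at increasing sizes, as sketched below. The intermediate sizes (10 and 100), the nested-prefix design, the fixed seed, and the patient ID format are hypothetical choices of mine; the post only states that k runs from 1 to 1,265.

```python
import random


def few_shot_subsets(patient_ids, sizes, seed=0):
    """Return one patient subset per requested size k.

    The subsets are nested: the k-patient set is a prefix of every larger
    set, so adding patients never removes earlier ones (a design assumption).
    """
    rng = random.Random(seed)
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    return {k: shuffled[:k] for k in sizes if k <= len(shuffled)}


train_patients = [f"patient_{i:04d}" for i in range(1265)]
subsets = few_shot_subsets(train_patients, sizes=[1, 10, 100, 1265])
print({k: len(v) for k, v in subsets.items()})  # {1: 1, 10: 10, 100: 100, 1265: 1265}
```

Sampling by patient rather than by clip keeps all clips from one person on the same side of the comparison, consistent with the patient-level split described in the dataset section.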

4.5. Benefit of Multi-Modal Pretraining

  • Overall, using all 3 modalities gave the best performance.
  • Foundation models were also pretrained with 2 modalities and with a single modality.
  • From the foundation models trained with 3, 2, and 1 modalities, the BAS embedding was extracted to train logistic regression for sleep stage classification. Likewise, for SDB detection only the respiratory embedding was extracted and used for training, and the results were compared.
  • For sleep staging, BAS is the most relevant modality, so foundation models pretrained with BAS / BAS+RESP / BAS+ECG / BAS+ECG+RESP were compared.
  • For SDB, the respiratory signal is the most relevant, so foundation models pretrained with RESP / RESP+BAS / RESP+ECG / RESP+BAS+ECG were compared.
  • Performance was compared in a few-shot setting, gradually increasing the number of patients in the training dataset.
  • In the end, the two-modality BAS-ECG and RESP-ECG models performed similarly to (or even better than) the three-modality model, which suggests that the ECG signal in particular enriches the representation during pretraining.
  • Pretraining on BAS or RESP alone consistently gave poor performance.

4.6. External Validation

  • The pretraining stage uses the dataset collected at Stanford, and the downstream tasks use the PhysioNet2018 data, to demonstrate generalizability.
  • Comparison: the Supervised CNN was trained with supervised learning on the external dataset only.
  • Result on the test set of 100: SleepFM did better than the Supervised CNN.
  • In other words, a pretrained foundation model that has never seen any data from the other hospital's domain can still adapt to a dataset from a new site.

5. Discussion and Conclusion

Future Work

  • The experiments used sleep data from a single institution, Stanford. -> Measuring generalizability across more diverse sites would be worthwhile.
  • Only sleep staging and SDB detection were used as downstream tasks, but tasks such as arousal detection, periodic leg movements, and disease classification should also be possible.
  • Comparing performance against SSL methods other than contrastive learning.