MLOps - Deployment

yozzum · January 25, 2025

[Key challenges in "Deploy in production"]

  • Software Engineering
    • Real-time or batch prediction
    • Cloud vs. edge/browser
    • Compute resources (CPU/GPU/memory)
    • Latency, throughput (QPS, queries per second)
    • Logging
    • Security and privacy

[Key challenges in "Monitor & maintain system"]

  • ★ Concept drift: changes in the relationship between the input features and the target variable
  • ★ Data drift: changes in the distribution of input data itself
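
Data drift on a single numeric feature can be checked by comparing a reference sample (for example, training data) with recent production data. The sketch below uses the Population Stability Index (PSI), which is one common choice and not something these notes prescribe. The function name `psi`, the bin count, and the epsilon are illustrative assumptions. Concept drift cannot be detected this way, since it requires labeled outcomes to compare against predictions.

```python
import math
from typing import List, Sequence


def psi(reference: Sequence[float], current: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index between two samples of one numeric feature.

    Bins are equal-width over the range of the reference sample (an assumption;
    quantile bins are another common choice).
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # guard against a constant reference feature

    def bucket_shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp so values outside the reference range land in the edge bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        eps = 1e-6  # avoids log(0) for empty bins
        return [max(c / len(values), eps) for c in counts]

    ref = bucket_shares(reference)
    cur = bucket_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Hypothetical data: production values shifted upward relative to training.
training = [i / 100 for i in range(1000)]
production = [x + 5 for x in training]
drift_score = psi(training, production)
```

A frequently quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, but the right threshold depends on the feature and should be tuned per system.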

[Deployment patterns]

  • Shadow mode
    • The ML system shadows the human and runs in parallel.
    • The ML system's output is not used for any decisions during this phase.
  • Canary deployment
    • Roll out to a small fraction (say 5%) of traffic initially.
    • Monitor the system and ramp up traffic gradually.
  • Blue-green deployment
    • Use a router to switch traffic between the old version (blue) and the new version (green).
    • Rollback is easy: point the router back at the old version.
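
The canary and blue-green patterns above can both be sketched as one router that decides, per request, which model version to call. This is a minimal illustration under assumptions: the `Router` class, `traffic_bucket`, and the use of a hashed request ID are hypothetical names and choices, not part of any specific serving framework.

```python
import hashlib
from typing import Any, Callable

Model = Callable[[Any], Any]


def traffic_bucket(request_id: str, buckets: int = 10_000) -> float:
    """Map a request ID to a stable value in [0, 1) so routing is repeatable."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    return (int.from_bytes(digest[:8], "big") % buckets) / buckets


class Router:
    """Sends a configurable fraction of traffic to the new model version."""

    def __init__(self, old_model: Model, new_model: Model, new_fraction: float = 0.05):
        self.old_model = old_model
        self.new_model = new_model
        self.set_fraction(new_fraction)

    def set_fraction(self, fraction: float) -> None:
        """Canary ramp-up: raise the fraction step by step while monitoring."""
        if not 0.0 <= fraction <= 1.0:
            raise ValueError("fraction must be between 0 and 1")
        self.new_fraction = fraction

    def rollback(self) -> None:
        """Send all traffic back to the old version."""
        self.set_fraction(0.0)

    def predict(self, request_id: str, features: Any) -> Any:
        use_new = traffic_bucket(request_id) < self.new_fraction
        model = self.new_model if use_new else self.old_model
        return model(features)


# Placeholder models standing in for two deployed versions.
router = Router(old_model=lambda x: "old", new_model=lambda x: "new", new_fraction=0.05)
results = [router.predict(f"req-{i}", None) for i in range(10_000)]
canary_share = results.count("new") / len(results)  # close to 0.05
```

With `set_fraction(1.0)` the same router behaves like a blue-green switch: all traffic goes to the new version at once, and `rollback()` returns it to the old one.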

[Degrees of automation]

  • From least to most automated: Human only → Shadow mode → AI assistance → Partial automation → Full automation

[Monitoring]

  • Brainstorm the things that could go wrong.
  • Brainstorm a few statistics/metrics that would detect each problem.
  • Software metrics: memory, compute, latency, throughput, server load, ...
  • Input metrics: average input length, average input volume, number of missing values, ...
  • Output metrics: number of missing values, number of outliers, ...
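
As a rough illustration of the input metrics listed above, the sketch below computes a few statistics for one batch of text inputs and compares them against allowed ranges. The names `InputMetrics`, `compute_input_metrics`, and `find_alerts`, as well as the example ranges, are assumptions made for illustration; real thresholds would come from observing the system over time.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Tuple


@dataclass
class InputMetrics:
    volume: int              # number of inputs in the batch
    avg_input_length: float  # mean length of the non-missing inputs
    num_missing: int         # inputs that arrived as None


def compute_input_metrics(inputs: Sequence[Optional[str]]) -> InputMetrics:
    present = [text for text in inputs if text is not None]
    avg_length = sum(len(t) for t in present) / len(present) if present else 0.0
    return InputMetrics(
        volume=len(inputs),
        avg_input_length=avg_length,
        num_missing=len(inputs) - len(present),
    )


def find_alerts(
    metrics: InputMetrics, allowed: Dict[str, Tuple[float, float]]
) -> List[str]:
    """Return one message per metric that falls outside its allowed range."""
    alerts = []
    for name, (low, high) in allowed.items():
        value = getattr(metrics, name)
        if not low <= value <= high:
            alerts.append(f"{name}={value} outside [{low}, {high}]")
    return alerts


# Hypothetical batch and thresholds.
batch = ["hello", None, "hi"]
metrics = compute_input_metrics(batch)
alerts = find_alerts(metrics, {"avg_input_length": (1, 500), "num_missing": (0, 0)})
```

The same pattern extends to output metrics by computing statistics over the model's predictions instead of its inputs.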

When monitoring flags a problem, the model is retrained, either manually or automatically.
