250101

Kim YeonJu · January 1, 2025

2025 Paper


Today I read papers on accelerating DiT inference.

Adaptive Caching for Faster Video Generation with Diffusion Transformers

The method chooses to cache residual computations.
The caching schedule is content-dependent.

Adaptive caching

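The L1 distance between residual features is computed as shown above.
I'm not sure exactly how far apart k places the two features being compared.
If the change is large, a small caching rate is used; if the change is small, a large caching rate is used.

A predefined cache rate is looked up from a codebook and applied.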
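
The distance-to-cache-rate lookup described above can be sketched roughly as follows. This is only a minimal illustration, not the paper's implementation: the thresholds and rates in `CODEBOOK`, the helper names, and the choice to normalize the L1 distance by the number of elements are all assumptions made for the example.

```python
# Hypothetical codebook: (upper bound on distance, cache rate).
# These values are illustrative only and do not come from the paper.
CODEBOOK = [(0.05, 6), (0.10, 4), (0.20, 2)]
DEFAULT_RATE = 1  # large change: recompute at every step


def l1_distance(a, b):
    """Mean absolute difference between two equally sized feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_cache_rate(distance, codebook=CODEBOOK, default=DEFAULT_RATE):
    """Return the rate of the first codebook entry whose threshold exceeds the distance."""
    for threshold, rate in codebook:
        if distance < threshold:
            return rate
    return default


# Small change between residual features: reuse the cache for more steps.
prev_residual = [0.10, 0.20, 0.30]
curr_residual = [0.11, 0.21, 0.29]
small_change_rate = select_cache_rate(l1_distance(prev_residual, curr_residual))

# Large change: fall back to recomputing every step.
large_change_rate = select_cache_rate(0.5)
```

Ordering the codebook by ascending threshold keeps the lookup a simple first-match scan, so a smaller distance always maps to a cache rate at least as large.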

Motion regularization

The optimal number of denoising steps depends on the motion content.
Because motion is hard to measure during generation,
a noisy-latent motion score is computed instead. It is computed from residual frame differences.

This is what the motion score is computed from.
It takes the difference between the first frame and the last frame.

A motion gradient across diffusion steps is also computed to measure motion.

The motion score and the motion gradient are used as scaling factors for the distance metric.
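
A rough sketch of how these pieces might fit together is given below. It rests on several assumptions: the frame offset `gap`, the finite-difference form of the gradient, and combining the two terms as `score + gradient` are illustrative choices for this example, and all function names are hypothetical.

```python
def motion_score(frames, gap=1):
    """Mean absolute difference between latent frames that are `gap` apart.

    `frames` is a list of equally sized lists of floats (one per frame).
    """
    if len(frames) <= gap:
        raise ValueError("need more frames than the frame gap")
    total, count = 0.0, 0
    for earlier, later in zip(frames[:-gap], frames[gap:]):
        for x, y in zip(earlier, later):
            total += abs(y - x)
            count += 1
    return total / count


def motion_gradient(score_now, score_prev, step_gap):
    """Finite-difference change of the motion score across diffusion steps."""
    return (score_now - score_prev) / step_gap


def regularized_distance(distance, score, gradient):
    """Scale the feature distance by the motion terms (assumed combination)."""
    return distance * (score + gradient)


# Three latent frames, each moving by 1.0 per frame.
frames = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
score = motion_score(frames)
gradient = motion_gradient(score, score_prev=0.5, step_gap=2)
scaled = regularized_distance(0.1, score, gradient)
```

With this scaling, content with more motion yields a larger effective distance, which in turn selects a smaller cache rate and therefore more recomputation.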

Delta-dit: A training-free acceleration method tailored for diffusion transformers

∆-DiT: using a designed cache mechanism to accelerate the rear DiT
blocks in the early sampling stages and the front DiT blocks in the later stages

If I end up going with a UNet-based approach, I should read the papers above as well.

What distinguishes the DiT architecture, and how that architecture affects generation, has been studied less, which makes research on accelerating DiT difficult.

Early layers establish the outline, while later layers add finer detail.
This paper proposes a cache method specialized for DiT.
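
The stage-dependent choice of which blocks to cache (rear blocks in the early sampling stages, front blocks in the later stages) can be sketched as below. This only illustrates block selection and is not the paper's implementation: the function name, the `boundary` fraction that splits early from late steps, and the fixed `num_cached` count are assumptions for the example, and what actually gets stored in the cache is not modeled.

```python
def cached_block_indices(step, num_steps, num_blocks, num_cached, boundary=0.5):
    """Return the indices of DiT blocks to serve from cache at a sampling step.

    Early steps (before `boundary * num_steps`) cache the rear blocks;
    later steps cache the front blocks.
    """
    if not 0 <= step < num_steps:
        raise ValueError("step must be in [0, num_steps)")
    if not 0 <= num_cached <= num_blocks:
        raise ValueError("num_cached must be in [0, num_blocks]")
    is_early = step < boundary * num_steps
    if is_early:
        # Outline is being formed: keep front blocks live, cache the rear.
        return list(range(num_blocks - num_cached, num_blocks))
    # Details are being refined: keep rear blocks live, cache the front.
    return list(range(num_cached))


early = cached_block_indices(step=0, num_steps=10, num_blocks=8, num_cached=3)
late = cached_block_indices(step=9, num_steps=10, num_blocks=8, num_cached=3)
```

This matches the intuition in the notes above: the blocks responsible for the outline stay active while the outline is forming, and the blocks responsible for detail stay active afterwards.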
