[ICLR 2026] ScalingCache: Extreme Acceleration of DiTs through Difference Scaling and Dynamic Interval Caching


Lihui Gu1*‡, Jingbin He2‡, Lianghao Su2, Kang He2, Wenxiao Wang1†, Yuliang Liu2†
(* Work done during an internship at Kling AI Infra, Kuaishou Technology; ‡ equal contribution; † corresponding author)

1Zhejiang University, 2Kuaishou Technology

🔥 News

  • 2024/09/30 🚀🚀 We release ScalingCache for Wan2.1, HunyuanVideo, and FLUX.

  • 2024/09/20 🤗🤗 We release the ScalingCache project page.

🚀 Main Performance

Text to Video

| Methods | Speedup ↑ | PSNR ↑ | SSIM ↑ | LPIPS ↓ | VBench (%) ↑ |
| --- | --- | --- | --- | --- | --- |
| Wan2.1 1.3B (T = 50) | – | – | – | – | 83.31 |
| + 40% steps | 2.5× | 14.50 | 0.523 | 0.437 | 80.30 |
| + TeaCache₀.₀₈ | 2.0× | 22.57 | 0.806 | 0.128 | 81.04 |
| + TaylorSeer | 1.9× | 13.52 | 0.510 | 0.447 | 81.97 |
| + EasyCache | 2.5× | 25.24 | 0.834 | 0.095 | 82.48 |
| + Ours₁₀ | 2.5× | 26.61 | 0.890 | 0.071 | 82.92 |

| Methods | Speedup ↑ | PSNR ↑ | SSIM ↑ | LPIPS ↓ | VBench (%) ↑ |
| --- | --- | --- | --- | --- | --- |
| Wan2.1 14B (T = 50) | – | – | – | – | 84.05 |
| + 50% steps | 2.0× | 15.82 | 0.696 | 0.336 | 79.36 |
| + TeaCache₀.₁₄ | 1.5× | 18.60 | 0.688 | 0.244 | 83.95 |
| + MixCache | 1.8× | 23.45 | 0.814 | 0.124 | 83.97 |
| + Ours₁₀ | 2.5× | 25.63 | 0.861 | 0.083 | 83.87 |

| Methods | Speedup ↑ | PSNR ↑ | SSIM ↑ | LPIPS ↓ | VBench (%) ↑ |
| --- | --- | --- | --- | --- | --- |
| HunyuanVideo (T = 50) | – | – | – | – | 81.40 |
| + 50% steps | 2.0× | 17.57 | 0.734 | 0.247 | 78.78 |
| + TeaCache₀.₁ | 1.5× | 23.85 | 0.819 | 0.173 | 80.87 |
| + MixCache | 1.8× | 26.86 | 0.906 | 0.060 | 80.98 |
| + TaylorSeer | 2.8× | 26.57 | 0.860 | 0.135 | 80.74 |
| + EasyCache | 2.2× | 29.20 | 0.904 | 0.063 | 80.69 |
| + Ours₁₂ | 2.2× | 30.80 | 0.930 | 0.049 | 81.13 |

Text to Image

| Methods | Speedup ↑ | PSNR ↑ | SSIM ↑ | LPIPS ↓ | CLIP Score (%) ↑ |
| --- | --- | --- | --- | --- | --- |
| FLUX.1-dev (T = 50) | – | – | – | – | 80.17 |
| + 50% steps | 2.0× | 29.36 | 0.683 | 0.318 | 78.88 |
| + TeaCache₀.₆ | 2.0× | 28.08 | 0.400 | 0.690 | 81.79 |
| + TaylorSeer₃ | 2.8× | 30.76 | 0.780 | 0.230 | 80.17 |
| + Ours₁₀ | 3.0× | 32.28 | 0.819 | 0.131 | 80.25 |

🛠️ Usage

ScalingCache operates in two main stages:

  1. offline computation of scaling coefficients
  2. online inference

We provide precomputed coefficient dictionaries under assets/alpha_dict/ for Wan2.1, HunyuanVideo, and FLUX, so you can skip stage 1 and proceed directly to inference.
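This README does not document how the coefficient dictionaries are fitted, so the following is only a rough sketch of what an offline stage *could* look like: one scalar coefficient per timestep, fit by least squares so that the previous step-to-step output difference, scaled by the coefficient, best predicts the current difference over a set of calibration trajectories. The function name `fit_alpha_dict`, the scalar-per-step form, and the least-squares criterion are all assumptions for illustration, not the paper's actual procedure.

```python
def fit_alpha_dict(output_traces):
    """Hypothetical offline stage: fit one scalar alpha per timestep.

    output_traces: list of calibration trajectories; each trajectory is
    the sequence of model outputs over the sampling steps (scalars here
    for simplicity; real outputs would be latent tensors).

    For each step t, alpha[t] is the least-squares scale mapping the
    previous difference d_{t-1} = y_{t-1} - y_{t-2} onto the current
    difference d_t = y_t - y_{t-1}, pooled over all trajectories.
    """
    n_steps = len(output_traces[0])
    alpha = {}
    for t in range(2, n_steps):
        num = den = 0.0
        for trace in output_traces:
            d_prev = trace[t - 1] - trace[t - 2]
            d_cur = trace[t] - trace[t - 1]
            num += d_cur * d_prev      # cross term of the LS solution
            den += d_prev * d_prev     # energy of the previous difference
        # Fall back to 1.0 (plain difference reuse) when d_prev vanishes.
        alpha[t] = num / den if den else 1.0
    return alpha
```

Such a dictionary would then be serialized once per model (e.g. into assets/alpha_dict/) and simply loaded at inference time.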

Detailed instructions for each supported model are provided in their respective directories.
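To make the two stages concrete, here is a minimal, hypothetical sketch of the online stage: at refresh steps the model runs fully and the cache is updated; at cached steps the output is approximated as the last cached output plus the precomputed coefficient times the last observed output difference. All names (`scaling_cache_sample`, `alpha_dict`, `interval`) are illustrative assumptions, and the fixed refresh interval is a simplification of the paper's *dynamic* interval policy; the real per-model scripts live in the respective model directories.

```python
def scaling_cache_sample(model_fn, x, timesteps, alpha_dict, interval):
    """Illustrative sketch of cached inference with difference scaling.

    model_fn(x, t) -> model output (a scalar here; a latent in practice).
    alpha_dict maps a timestep to its precomputed scaling coefficient.
    Every `interval`-th step runs the model; the rest reuse the cache.
    """
    cached_out = None    # last fully computed model output
    cached_diff = None   # last observed output-to-output difference
    outputs = []
    for i, t in enumerate(timesteps):
        if cached_out is None or i % interval == 0:
            out = model_fn(x, t)                  # full forward pass
            if cached_out is not None:
                cached_diff = out - cached_out    # refresh the difference
            cached_out = out
        else:
            scale = alpha_dict.get(t, 1.0)        # precomputed coefficient
            drift = cached_diff if cached_diff is not None else 0.0
            out = cached_out + scale * drift      # scaled-difference estimate
        outputs.append(out)
    return outputs
```

With `interval = 2` every other model call is skipped, roughly matching a 2× speedup; the quality of the skipped steps depends entirely on how well the scaled difference tracks the true output drift.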

👍 Acknowledgements

  • Thanks to TaylorSeer for proposing the use of Taylor expansion for feature prediction and caching.
  • Thanks to EasyCache for inspiring our dynamic caching strategy.
  • Thanks to DiT for the great work and codebase upon which we build ScalingCache-DiT.
  • Thanks to FLUX for the great work and codebase upon which we build ScalingCache-FLUX.
  • Thanks to HunyuanVideo for the great work and codebase upon which we build ScalingCache-HunyuanVideo.
  • Thanks to Wan2.1 for the great work and codebase upon which we build ScalingCache-Wan2.1.
  • Thanks to VBench for text-to-video quality evaluation.
  • Thanks to DrawBench for providing the text-to-image dataset.

📌 Citation

📧 Contact

If you have any questions, please email glh9803@outlook.com.
