
Conversation

@rka97 (Contributor) commented Jan 12, 2026

  1. Increase the number of data-loader workers.
  2. Add caching so that we don't have to rescan the file system every time we run the PyTorch workload (this made a big difference while debugging!). A rough sketch of the idea follows this list.
  3. Use torch's built-in attention for the vision transformer. The custom module was much slower than JAX's built-in attention on the A100 (but crucially not on the V100, where the two were the same speed!). See the sketch after this list.
  4. Compile the ViT workload as well; without compilation it is much slower than JAX.
  5. Add a test that measures the speed difference between the JAX and PyTorch workloads. Currently this depends on conda envs, but we can modify it to use Docker instead. A rough sketch also follows the list.
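For item 1, this is just the `num_workers` argument to `torch.utils.data.DataLoader`. For item 2, the caching is conceptually along these lines (a minimal sketch; the function name and cache path are illustrative, not the actual workload code):

```python
# Sketch of the file-listing cache: walk the data directory once, persist the
# result, and reuse it on later runs instead of rescanning the file system.
import os
import pickle

def list_files_cached(data_dir: str, cache_path: str = "/tmp/file_list.pkl"):
    # Reuse a previously saved listing if one exists.
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    # Otherwise do the (slow) scan once and persist the result.
    files = sorted(
        os.path.join(root, name)
        for root, _, names in os.walk(data_dir)
        for name in names
    )
    with open(cache_path, "wb") as f:
        pickle.dump(files, f)
    return files
```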
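For items 3 and 4, the change amounts to routing the ViT attention through `torch.nn.functional.scaled_dot_product_attention` and wrapping the model in `torch.compile`. A minimal sketch, assuming PyTorch >= 2.0 (the module below is illustrative, not the workload's actual class):

```python
# Self-attention built on PyTorch's fused attention op instead of an
# explicit softmax(QK^T)V written out by hand.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, d = x.shape
        # (b, n, 3 * dim) -> three tensors of shape (b, heads, n, head_dim)
        qkv = self.qkv(x).reshape(b, n, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4).unbind(0)
        # Dispatches to a fused kernel where one is available.
        out = F.scaled_dot_product_attention(q, k, v)
        out = out.transpose(1, 2).reshape(b, n, d)
        return self.proj(out)

# Item 4 is then a one-liner on top of the assembled model:
# model = torch.compile(model)
```

Using the built-in op also means any future kernel improvements come for free instead of living in a hand-written module.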
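For item 5, the test is, in spirit, a timing of a few steps of each implementation (a rough sketch only; the real test runs each workload in its own conda env, `jax_step`/`pytorch_step` are placeholders for those entry points, and the tolerance is made up):

```python
# Sketch of the JAX-vs-PyTorch speed check: time a few steps of each
# implementation and assert the PyTorch version stays within a tolerance.
import time

def mean_step_time(step_fn, num_steps: int = 10) -> float:
    start = time.monotonic()
    for _ in range(num_steps):
        step_fn()
    return (time.monotonic() - start) / num_steps

def compare_speeds(jax_step, pytorch_step, tolerance: float = 1.1) -> None:
    jax_time = mean_step_time(jax_step)
    torch_time = mean_step_time(pytorch_step)
    # Tolerance is illustrative, not the threshold used by the actual test.
    assert torch_time <= tolerance * jax_time, (
        f"PyTorch step {torch_time:.3f}s vs JAX step {jax_time:.3f}s"
    )
```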

@rka97 requested a review from priyakasimbeg January 12, 2026 04:26
@rka97 self-assigned this Jan 12, 2026
@rka97 requested a review from a team as a code owner January 12, 2026 04:26
@github-actions (MLCommons CLA bot): All contributors have signed the MLCommons CLA ✍️ ✅

@priyakasimbeg merged commit 056281e into a100 Jan 12, 2026
43 checks passed
@github-actions bot locked and limited conversation to collaborators Jan 12, 2026
