@vsabavat commented Dec 31, 2025

Summary

This PR adds --container-cache to Pyxis to enable persistent reuse of the unpacked Enroot rootfs across jobs on the same node, reducing warm-start latency.

It also introduces opportunistic GC/LRU eviction for cache entries to prevent the cache filesystem from filling up, and adds a Bats test suite for cache-mode behavior.

User-facing behavior

  • New flag: srun --container-cache (also PYXIS_CONTAINER_CACHE=1)
  • Cache mode:
    • Requires --container-image
    • Rejects --container-writable and --container-save
    • Forces read-only rootfs (ENROOT_ROOTFS_WRITABLE=n)
    • Derives a stable cached container name: pyxis_cache_u<uid>_<hash>
    • Uses a name that is not job-scoped, so the cached rootfs survives Epilog cleanup patterns that match per-job container names
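
The cache-mode rules above can be sketched as a small validation helper. This is an illustration only, not the plugin's code: the function name, argument convention, and error messages are hypothetical, and only the rules themselves come from the list above.

```shell
#!/usr/bin/env bash
# Hypothetical helper, NOT Pyxis code: mirrors the cache-mode rules listed above.
# Arguments:
#   $1  image     value of --container-image ("" if the flag is absent)
#   $2  writable  1 if --container-writable was given, else 0
#   $3  save      value of --container-save ("" if the flag is absent)
check_cache_mode_args() {
    local image=$1 writable=$2 save=$3

    if [ -z "$image" ]; then
        echo "error: --container-cache requires --container-image" >&2
        return 1
    fi
    if [ "$writable" = 1 ]; then
        echo "error: --container-cache rejects --container-writable" >&2
        return 1
    fi
    if [ -n "$save" ]; then
        echo "error: --container-cache rejects --container-save" >&2
        return 1
    fi

    # Cache mode forces a read-only rootfs.
    echo "ENROOT_ROOTFS_WRITABLE=n"
}
```

A call with only an image set succeeds and prints the forced Enroot setting; any of the three rejected combinations returns a non-zero status.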

Configuration

Cache mode requires an explicit cache root configured via plugstack:

  • container_cache_data_path=/raid/containers/data (required for cache mode)
  • Optional GC tuning:
    • container_cache_gc_high=85
    • container_cache_gc_low=80

Example (cluster-specific):

required /path/to/spank_pyxis.so container_cache_data_path=/raid/containers/data container_cache_gc_high=85 container_cache_gc_low=80

Cache directory layout

  • Cache root: <base> = container_cache_data_path
  • Per-user Enroot data dir: <base>/<uid> (mode 0700, owned by <uid>:<gid>)
  • Cached rootfs: <base>/<uid>/pyxis_cache_u<uid>_<hash>
  • Pyxis sets ENROOT_DATA_PATH=<base>/<uid> for cache mode.
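
As a rough sketch of the layout above, the path of a cached rootfs can be derived as follows. The hash scheme is an assumption: this description does not say what is hashed or how, so the sketch uses the first 16 hex characters of the SHA-256 of the image reference. All function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, NOT Pyxis code.

# Stable name pyxis_cache_u<uid>_<hash>.
# ASSUMPTION: <hash> = first 16 hex chars of SHA-256(image reference).
cache_name() {
    local uid=$1 image=$2 hash
    hash=$(printf '%s' "$image" | sha256sum | cut -c1-16)
    printf 'pyxis_cache_u%s_%s\n' "$uid" "$hash"
}

# Full path <base>/<uid>/pyxis_cache_u<uid>_<hash>.
cache_rootfs_path() {
    local base=$1 uid=$2 image=$3
    printf '%s/%s/%s\n' "$base" "$uid" "$(cache_name "$uid" "$image")"
}

# Create the per-user Enroot data dir <base>/<uid> with mode 0700 and print it.
# Changing ownership to <uid>:<gid> needs privileges and is left out here.
ensure_user_data_dir() {
    local dir="$1/$2"
    mkdir -p "$dir" && chmod 0700 "$dir" && printf '%s\n' "$dir"
}
```

Because the name depends only on the uid and the image reference, two jobs from the same user with the same image resolve to the same directory, which is what makes reuse possible. The value printed by ensure_user_data_dir corresponds to what ENROOT_DATA_PATH is set to in cache mode.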

GC / LRU eviction

  • Triggered opportunistically at job start in cache mode only when we’re about to create a new cached rootfs (cold path).
  • Uses the high/low watermarks (container_cache_gc_high / container_cache_gc_low) to decide when to evict.
  • Candidate selection:
    • Scans <base>/*/pyxis_cache_* (global across users)
    • Uses directory mtime as LRU (Pyxis touches the dir on use)
  • Safety:
    • Each cached rootfs has .pyxis_cache_lock
    • Jobs hold a shared lock for their lifetime
    • GC attempts an exclusive non-blocking lock; if it can’t lock, it treats the entry as in-use and skips it
    • GC is serialized with <base>/pyxis-container-cache-gc.lock
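
The eviction flow above can be approximated in shell with flock(1). This is a sketch under stated assumptions, not the plugin's implementation: the function names are hypothetical, the watermarks are treated as filesystem usage percentages, eviction is assumed to start above the high watermark and stop once usage is at or below the low one, and the GC lock is taken in blocking mode.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, NOT Pyxis code. Requires GNU stat and util-linux flock.

# Usage of the filesystem holding $1, as an integer percentage.
fs_usage_pct() {
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Cached rootfs dirs across all users, least recently used first (by dir mtime).
# Paths containing newlines are not handled.
lru_candidates() {
    local base=$1 d
    for d in "$base"/*/pyxis_cache_*; do
        [ -d "$d" ] || continue
        printf '%s %s\n' "$(stat -c %Y "$d")" "$d"
    done | sort -n | cut -d' ' -f2-
}

# Remove one entry unless a job holds its lock. Returns non-zero if skipped.
try_evict() {
    local entry=$1 lock="$1/.pyxis_cache_lock"
    [ -e "$lock" ] || return 1          # no lock file: be conservative and skip
    (
        exec 8<"$lock" || exit 1
        flock -x -n 8 || exit 1         # in use (shared lock held): skip
        rm -rf -- "$entry"
    )
}

# One GC pass over <base>, serialized through the global GC lock.
# ASSUMPTION: high/low are usage percentages of the cache filesystem.
gc_run() {
    local base=$1 high=$2 low=$3 entry
    (
        flock -x 9 || exit 1
        [ "$(fs_usage_pct "$base")" -gt "$high" ] || exit 0
        lru_candidates "$base" | while IFS= read -r entry; do
            [ "$(fs_usage_pct "$base")" -gt "$low" ] || break
            try_evict "$entry" || true  # skipped entries do not abort the pass
        done
    ) 9>"$base/pyxis-container-cache-gc.lock"
}
```

With the example configuration, a pass would be invoked as gc_run /raid/containers/data 85 80. The non-blocking exclusive lock is what keeps GC from deleting a rootfs that a running job still holds with a shared lock.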

Tests

  • Added bats tests/container_cache.bats for:
    • Policy enforcement (writable/save/name-flags)
    • Stable naming/layout under <base>/<uid>
    • Read-only enforcement
    • Env var enablement (PYXIS_CONTAINER_CACHE=1)
    • GC behavior (including cross-user eviction) when the cache filesystem is above the configured high watermark

To run tests (cluster-specific; adjust paths as needed):

export SLURM_ROOT=/cm/local/apps/slurm/24.11
export PATH="$SLURM_ROOT/bin:$SLURM_ROOT/sbin:$PATH"
export LD_LIBRARY_PATH="$SLURM_ROOT/lib64:${LD_LIBRARY_PATH:-}"
export SLURM_CONF=/etc/slurm/slurm.conf

bats tests/container_cache.bats

(Optional if your squashfs image differs: PYXIS_TEST_SQSH_IMAGE=/path/to/image.sqsh bats tests/container_cache.bats)

@vsabavat changed the title from "cache rootfs of containers" to "pyxis: add --container-cache persistent rootfs reuse + GC/LRU + tests" on Dec 31, 2025