
Conversation

@are-ces
Contributor

@are-ces are-ces commented Dec 11, 2025

Description

  • Consuming the downstream llama-stack image (in server mode)
  • Bumped up to RHOAI 3.2 (llama-stack 0.3.4)
  • Unified configs (removed the separate library-mode config)
  • Updated run.yaml, run-azure.yaml, and run-ci.yaml to the new format

Breaking change: consumers of server mode will need to request access to the downstream llama-stack image from AIPCC.
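For reference, a minimal sketch of the resulting pull flow (the image path is taken from this PR's test.containerfile; the credentials are assumed to be a quay.io account or robot token granted by AIPCC):

    # Log in once with the quay.io credentials granted by AIPCC
    podman login quay.io

    # Pull the downstream image referenced by this PR
    podman pull quay.io/aipcc/llama-stack/cpu-ubi9:rhoai-3.2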

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: NA
  • Generated by: NA

Related Tickets & Documents

  • Related Issue: LCORE-1107, LCORE-1073
  • Closes: LCORE-1107, LCORE-1073

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features

    • Optional debug logging for the llama-stack service; CI workflows now perform authenticated Docker login for quay access.
  • Documentation

    • Added registry login guidance before starting services.
  • Bug Fixes

    • Improved CI configuration discovery and clearer error messages when configs are missing; tests now expect llama-stack v0.3.4.
  • Chores

    • Bumped llama-stack and client to v0.3.4, consolidated runtime storage to the user home, and updated base container image and related configs.


@coderabbitai
Contributor

coderabbitai bot commented Dec 11, 2025

Walkthrough

Replaces mode-based e2e workflow config selection with environment-driven CONFIG_ENVIRONMENT, adds quay.io Docker login steps in multiple workflows, bumps llama-stack packages to 0.3.4 (and test/constants), centralizes storage to ~/.llama/storage with unified kv/sql backends, removes legacy library/server-mode configs, and adds LLAMA_STACK_LOGGING env support.

Changes

Cohort / File(s) Summary
Dependency / version updates
pyproject.toml, requirements.aarch64.txt, requirements.x86_64.txt, src/constants.py, tests/e2e/features/info.feature
Bump llama-stack and llama-stack-client 0.3.0 → 0.3.4; update MAXIMAL_SUPPORTED_LLAMA_STACK_VERSION to 0.3.4; update e2e test expectation to 0.3.4; multiple dependency hash/version refreshes across platform requirements files.
CI / workflow changes
.github/workflows/e2e_tests.yaml, .github/workflows/e2e_tests_rhaiis.yaml, .github/workflows/e2e_tests_rhelai.yaml
Replace CONFIG_MODE with CONFIG_ENVIRONMENT and always select tests/e2e/configs/run-$ENVIRONMENT.yaml (copy to run.yaml or fail with available list); add Docker login to quay.io using workflow secrets (QUAY_* credentials); validate presence of tests/e2e/configs.
Storage / config restructuring
run.yaml, tests/e2e/configs/run-*.yaml (e.g., tests/e2e/configs/run-ci.yaml, .../run-azure.yaml), tests/e2e/configs/run-library.yaml (deleted), run_library.yaml (deleted)
Consolidate per-section db_path entries into centralized storage.backends (kv_default, sql_default) and storage.stores; migrate paths from /tmp to ~/.llama/storage; introduce registered_resources (models, tool_groups, etc.); remove inline models blocks and delete the library-mode config files (see the sketch after this table).
Container & compose changes
Containerfile, test.containerfile, docker-compose.yaml
Change created directories from .llama/distributions/ollama to .llama/storage; update the test base image in test.containerfile to quay.io/aipcc/llama-stack/cpu-ubi9:rhoai-3.2; add the LLAMA_STACK_LOGGING=${LLAMA_STACK_LOGGING:-} env var to docker-compose.yaml.
Docs
README.md
Add IMPORTANT note about quay.io membership and provide podman/docker login guidance before starting services.
Removed test/config files
tests/e2e/configuration/lightspeed-stack-library-mode.yaml, tests/e2e/configuration/lightspeed-stack-server-mode.yaml
Remove legacy Lightspeed library/server mode configuration files (complete deletions).
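
To make the "Storage / config restructuring" row concrete, here is a minimal sketch of the new layout pieced together from the review comments below; key names and file names are assumptions, not text copied from the repository's run.yaml:

    storage:
      backends:
        kv_default:
          type: kv_sqlite                            # key-value backend (SQLite)
          db_path: ~/.llama/storage/kv_default.db    # file name assumed
        sql_default:
          type: sql_sqlite                           # relational backend (SQLite)
          db_path: ~/.llama/storage/sql_default.db   # file name assumed
      stores:
        metadata:
          backend: kv_default
          namespace: metadata          # KV stores reference a backend plus a namespace
        inference:
          backend: sql_default
          table_name: inference_store  # SQL stores reference a backend plus a table_name
    registered_resources:
      models: []                       # CI leaves this empty; models register at runtime
      tool_groups:
      - toolgroup_id: builtin::rag     # field names assumed from llama-stack conventions
        provider_id: rag-runtime

Providers then point at these shared backends instead of carrying their own db_path, which is what the per-file summaries above describe.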

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Areas needing extra attention:
    • run.yaml and tests/e2e/configs/* — persistence/storage schema changes, path migrations, and resource registry adjustments.
    • Workflow changes (.github/workflows/*) — CONFIG_ENVIRONMENT handling, file discovery logic, and Docker login secret names/placement.
    • Dependency bumps (pyproject.toml, requirements files, src/constants.py) and their runtime/test impacts.

Suggested labels

ok-to-test

Suggested reviewers

  • tisnik

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled.
Title check | ✅ Passed | The title accurately reflects the main changes: version bump to RHOAI 3.2 (llama-stack 0.3.4) and consuming the downstream image. It directly maps to the primary objectives and file modifications.
Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f86a818 and fa629c8.

📒 Files selected for processing (2)
  • requirements.aarch64.txt (9 hunks)
  • requirements.x86_64.txt (9 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-08-18T10:57:39.266Z
Learnt from: matysek
Repo: lightspeed-core/lightspeed-stack PR: 292
File: pyproject.toml:59-59
Timestamp: 2025-08-18T10:57:39.266Z
Learning: In the lightspeed-stack project, transitive dependencies like faiss-cpu are intentionally pinned as top-level dependencies to maintain better control over the dependency graph and avoid version conflicts when bundling ML/LLM tooling packages.

Applied to files:

  • requirements.x86_64.txt
  • requirements.aarch64.txt
📚 Learning: 2025-08-18T10:58:14.951Z
Learnt from: matysek
Repo: lightspeed-core/lightspeed-stack PR: 292
File: pyproject.toml:47-47
Timestamp: 2025-08-18T10:58:14.951Z
Learning: psycopg2-binary is required by some llama-stack providers in the lightspeed-stack project, so it cannot be replaced with psycopg v3 or moved to optional dependencies without breaking llama-stack functionality.

Applied to files:

  • requirements.x86_64.txt
  • requirements.aarch64.txt
📚 Learning: 2025-08-18T10:56:55.349Z
Learnt from: matysek
Repo: lightspeed-core/lightspeed-stack PR: 292
File: pyproject.toml:0-0
Timestamp: 2025-08-18T10:56:55.349Z
Learning: The lightspeed-stack project intentionally uses a "generic image" approach, bundling many dependencies directly in the base runtime image to work for everyone, rather than using lean base images with optional dependency groups.

Applied to files:

  • requirements.x86_64.txt
  • requirements.aarch64.txt
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-pr
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
🔇 Additional comments (16)
requirements.x86_64.txt (7)

2-2: Keep the generator command in sync with CI/docs (now explicitly refreshing llama-stack packages).
Since this lockfile is auto-generated, ensure any documented regen steps and CI workflows use the same uv pip compile ... --refresh-package llama-stack --refresh-package llama-stack-client invocation to avoid churn/diffs.
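
A hedged sketch of such an invocation (only the two --refresh-package flags come from the text above; every other flag and the output path are assumptions, since the full command is elided):

    uv pip compile pyproject.toml -o requirements.x86_64.txt \
      --python-version 3.12 --python-platform x86_64-unknown-linux-gnu \
      --generate-hashes \
      --refresh-package llama-stack --refresh-package llama-stack-client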


233-235: Authlib bump: verify downstream usage paths still authenticate correctly.
authlib==1.6.6 can affect OAuth/OIDC flows and token validation behaviors; please confirm any integration points (e.g., provider auth, callback handlers) still pass e2e.


659-661: FastAPI bump: double-check FastAPI↔Starlette compatibility and behavior changes.
You’re on fastapi==0.124.2 with starlette==0.50.0; please confirm this pairing is what FastAPI expects and that server-mode endpoints (incl. streaming/SSE if used) still behave the same.

Also applies to: 3373-3380


1396-1402: llama-stack/client bump to 0.3.4: validate runtime API/config compatibility with RHOAI 3.2 expectations.
Given this PR’s primary goal is moving to llama-stack==0.3.4 / llama-stack-client==0.3.4, please ensure the server-mode image boots with the new config format and that any client calls used by lightspeed-stack (or tests) still match the updated APIs.


1965-1967: OpenAI SDK bump: verify litellm/llama-stack integrations and any API surface changes.
openai==2.11.0 can change request/response typing and error shapes; please confirm integrations (direct or via litellm) still work and tests cover key paths.


1972-2022: OpenTelemetry suite bump: good that versions are consistent; verify exporter wiring and runtime compatibility.
All opentelemetry-* pins look aligned (1.39.1 and 0.60b1). Please confirm traces/metrics export still works in your target env (collector/export endpoints, grpc/http exporters) and that no semantic-conventions changes affect dashboards/alerts.


3325-3325: SQLAlchemy hash-only changes: ensure lock reproducibility and wheel availability for the target platform.
Since sqlalchemy==2.0.45 didn’t change but hashes did, please confirm this lock still installs cleanly for python 3.12 on x86_64-unknown-linux-gnu and that CI won’t see intermittent hash mismatches.

Also applies to: 3332-3332, 3351-3351, 3353-3353
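
One way to check this locally, sketched with a throwaway virtualenv (pip verifies every pinned hash in --require-hashes mode before installing anything):

    python3.12 -m venv /tmp/lockcheck && . /tmp/lockcheck/bin/activate
    pip install --require-hashes --no-deps -r requirements.x86_64.txt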

requirements.aarch64.txt (9)

1-2: Keep the uv compile command in-sync with CI/release build inputs (platform, py=3.12, refresh-package).
Since this file is generated, any mismatch between this command and CI’s actual compile/install settings can cause non-reproducible dependency sets (or hash mismatches). Consider ensuring the exact command (or its inputs) is enforced/documented in the workflow that regenerates requirements.


233-235: Authlib bump: verify downstream integrations (OIDC/JWT) still work with 1.6.6.
If any providers rely on specific Authlib APIs/behaviors, this can be a subtle runtime break.


659-665: FastAPI bump: double-check compatibility across FastAPI ↔ Starlette ↔ Pydantic pins.
Given FastAPI/Pydantic are tightly coupled, please confirm the full set works together at runtime (especially request/response models + validation errors + OpenAPI generation).

Also applies to: 2618-2627, 3374-3380


1396-1406: llama-stack / llama-stack-client bump to 0.3.4: confirm it matches the downstream image’s expectations.
Since the PR objective is “consume downstream llama-stack image (server mode)”, please verify the Python deps here don’t drift from what the downstream image ships/assumes (client/server protocol, config schema expectations, optional providers).


1965-1972: OpenAI SDK bump to 2.11.0: verify compatibility with litellm + any direct SDK usage.
If you have any code paths importing openai directly (vs via litellm), this is worth a quick smoke test for breaking API changes.


1972-2022: OpenTelemetry family bumps: verify exporter wiring + semantic conventions behavior doesn’t change unexpectedly.
In particular, check OTLP HTTP/GRPC export, resource attributes, and any auto-instrumentation expectations in your runtime env.


1643-1699: matplotlib / sentence-transformers bumps: confirm aarch64 wheels are available for Py3.12 and installs don’t fall back to source builds.
This is a common failure mode on aarch64 if a wheel is missing for the exact python/platform tag.

Also applies to: 3292-3295


3314-3369: SQLAlchemy hash refresh: ensure it’s not masking a resolver change.
A few hashes are marked as changed—worth confirming this was a controlled uv refresh (same version) and not a subtle platform wheel swap that breaks your runtime constraints.


642-658: Keeping faiss-cpu and psycopg2-binary top-level pins is consistent with this repo’s dependency policy.
No concerns here given prior rationale about controlling transitive deps and llama-stack provider requirements. Based on learnings, this is the right shape.

Also applies to: 2433-2504



Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (3)
README.md (1)

645-647: Uncomment the login command for clarity.

The login guidance includes a commented-out podman login quay.io command, which might confuse users. Consider uncommenting it or removing the comment, as users need to execute this command before starting services.

Apply this diff to uncomment the login command:

 # Login to quay.io to access the downstream llama-stack image
-# podman login quay.io
+podman login quay.io
 
 # Start both services
 podman compose up --build
.github/workflows/e2e_tests.yaml (1)

142-147: Quay login step is fine; add minimal hardening

The quay login is wired correctly to QUAY_DOWNSTREAM_USERNAME / QUAY_DOWNSTREAM_TOKEN and will fail fast on bad credentials.

To make it a bit more robust:

  • Quote the secret to avoid edge cases with shell metacharacters (a combined sketch follows below):
    echo "$QUAY_DOWNSTREAM_TOKEN" | docker login quay.io -u="$QUAY_DOWNSTREAM_USERNAME" --password-stdin
  • Optionally add a simple guard so missing secrets fail with a clearer message before attempting login.

Also please confirm in repo settings that both secrets are defined for this workflow’s environment(s).
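
Putting both suggestions together, a minimal sketch of the hardened step (the step name and the env-var indirection are assumptions, not the workflow's actual text):

    - name: Login to quay.io
      env:
        QUAY_USER: ${{ secrets.QUAY_DOWNSTREAM_USERNAME }}
        QUAY_TOKEN: ${{ secrets.QUAY_DOWNSTREAM_TOKEN }}
      run: |
        # Fail fast with a clear message if either secret is missing
        if [ -z "$QUAY_USER" ] || [ -z "$QUAY_TOKEN" ]; then
          echo "::error::QUAY_DOWNSTREAM_USERNAME / QUAY_DOWNSTREAM_TOKEN are not configured"
          exit 1
        fi
        echo "$QUAY_TOKEN" | docker login quay.io -u "$QUAY_USER" --password-stdin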

.github/workflows/e2e_tests_rhaiis.yaml (1)

125-130: Verify quay secret names and mirror hardening from main e2e workflow

The quay login step is structurally fine, but this workflow uses QUAY_ROBOT_USERNAME / QUAY_ROBOT_TOKEN whereas the main e2e workflow uses QUAY_DOWNSTREAM_*. That’s OK if both pairs are intentionally defined, but it’s an easy place for config drift.

  • Confirm that QUAY_ROBOT_USERNAME / QUAY_ROBOT_TOKEN exist in the repo/Org secrets and are the intended credentials for this job.
  • For consistency and safety, you can also quote the secret:
    echo "$QUAY_ROBOT_TOKEN" | docker login quay.io -u="$QUAY_ROBOT_USERNAME" --password-stdin
    and optionally align the secret naming with the other workflows if they are meant to share the same robot account.
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 84b8d92 and 8daa53b.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (17)
  • .github/workflows/e2e_tests.yaml (2 hunks)
  • .github/workflows/e2e_tests_rhaiis.yaml (1 hunks)
  • .github/workflows/e2e_tests_rhelai.yaml (1 hunks)
  • Containerfile (1 hunks)
  • README.md (2 hunks)
  • docker-compose.yaml (1 hunks)
  • pyproject.toml (1 hunks)
  • requirements.aarch64.txt (1 hunks)
  • requirements.x86_64.txt (1 hunks)
  • run.yaml (3 hunks)
  • run_library.yaml (0 hunks)
  • src/constants.py (1 hunks)
  • test.containerfile (1 hunks)
  • tests/e2e/configs/run-azure.yaml (2 hunks)
  • tests/e2e/configs/run-ci.yaml (2 hunks)
  • tests/e2e/configs/run-library.yaml (0 hunks)
  • tests/e2e/features/info.feature (1 hunks)
💤 Files with no reviewable changes (2)
  • run_library.yaml
  • tests/e2e/configs/run-library.yaml
🧰 Additional context used
📓 Path-based instructions (4)
pyproject.toml

📄 CodeRabbit inference engine (CLAUDE.md)

pyproject.toml: Configure pylint with source-roots = "src"
Exclude src/auth/k8s.py from pyright type checking

Files:

  • pyproject.toml
tests/e2e/features/**/*.feature

📄 CodeRabbit inference engine (CLAUDE.md)

Use behave (BDD) framework with Gherkin feature files for end-to-end tests

Files:

  • tests/e2e/features/info.feature
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: Use absolute imports for internal modules in LCS project (e.g., from auth import get_auth_dependency)
All modules must start with descriptive docstrings explaining their purpose
Use logger = logging.getLogger(__name__) pattern for module logging
All functions must include complete type annotations for parameters and return types, using modern syntax (str | int) and Optional[Type] or Type | None
All functions must have docstrings with brief descriptions following Google Python docstring conventions
Function names must use snake_case with descriptive, action-oriented names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying input parameters
Use async def for I/O operations and external API calls
All classes must include descriptive docstrings explaining their purpose following Google Python docstring conventions
Class names must use PascalCase with descriptive names and standard suffixes: Configuration for config classes, Error/Exception for exceptions, Resolver for strategy patterns, Interface for abstract base classes
Abstract classes must use ABC with @abstractmethod decorators
Include complete type annotations for all class attributes in Python classes
Use import logging and module logger pattern with standard log levels: debug, info, warning, error

Files:

  • src/constants.py
src/**/constants.py

📄 CodeRabbit inference engine (CLAUDE.md)

Define shared constants in central constants.py file with descriptive comments

Files:

  • src/constants.py
🧠 Learnings (3)
📚 Learning: 2025-08-18T10:56:55.349Z
Learnt from: matysek
Repo: lightspeed-core/lightspeed-stack PR: 292
File: pyproject.toml:0-0
Timestamp: 2025-08-18T10:56:55.349Z
Learning: The lightspeed-stack project intentionally uses a "generic image" approach, bundling many dependencies directly in the base runtime image to work for everyone, rather than using lean base images with optional dependency groups.

Applied to files:

  • README.md
  • requirements.x86_64.txt
  • requirements.aarch64.txt
  • test.containerfile
📚 Learning: 2025-08-18T10:58:14.951Z
Learnt from: matysek
Repo: lightspeed-core/lightspeed-stack PR: 292
File: pyproject.toml:47-47
Timestamp: 2025-08-18T10:58:14.951Z
Learning: psycopg2-binary is required by some llama-stack providers in the lightspeed-stack project, so it cannot be replaced with psycopg v3 or moved to optional dependencies without breaking llama-stack functionality.

Applied to files:

  • requirements.x86_64.txt
  • requirements.aarch64.txt
📚 Learning: 2025-08-18T10:57:39.266Z
Learnt from: matysek
Repo: lightspeed-core/lightspeed-stack PR: 292
File: pyproject.toml:59-59
Timestamp: 2025-08-18T10:57:39.266Z
Learning: In the lightspeed-stack project, transitive dependencies like faiss-cpu are intentionally pinned as top-level dependencies to maintain better control over the dependency graph and avoid version conflicts when bundling ML/LLM tooling packages.

Applied to files:

  • requirements.x86_64.txt
  • requirements.aarch64.txt
🪛 GitHub Actions: E2E Tests
test.containerfile

[error] 2-2: Docker build failed: failed to resolve source metadata for quay.io/aipcc/llama-stack/cpu-ubi9:rhoai-3.2 (401 UNAUTHORIZED).

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build-pr
🔇 Additional comments (16)
docker-compose.yaml (1)

26-28: LGTM!

The optional debug logging environment variable is a good addition for troubleshooting. The default empty value ensures no behavior change unless explicitly enabled.

README.md (1)

626-628: LGTM - Clear access requirement documentation.

The IMPORTANT note clearly communicates the authentication requirement for accessing the downstream llama-stack image.

test.containerfile (1)

10-11: Storage path migration is consistent.

The change from .llama/distributions/ollama to .llama/storage aligns with the storage path consolidation across the codebase.

pyproject.toml (1)

31-32: Both llama-stack==0.3.4 and llama-stack-client==0.3.4 are available on PyPI. The specified versions are valid and no action is required.

.github/workflows/e2e_tests_rhelai.yaml (1)

126-132: Secrets are already properly configured and used across workflows.

The QUAY_ROBOT_USERNAME and QUAY_ROBOT_TOKEN secrets are referenced in multiple workflow files (e2e_tests_rhelai.yaml, e2e_tests_rhaiis.yaml, and e2e_tests.yaml), confirming they are configured in the repository. No action needed.

src/constants.py (1)

5-5: The version bump to llama-stack 0.3.4 is safe. The release contains only bug fixes and backports (Starlette dependency bump, queue initialization fixes, adapter policies, error handling improvements, and RBAC hardening) with no breaking changes reported.

requirements.aarch64.txt (2)

1396-1399: llama-stack and llama-stack-client bumped to 0.3.4 — versions aligned across all files.

Version bump verified:

  • Both requirements.aarch64.txt and requirements.x86_64.txt pin llama-stack==0.3.4 with matching hashes
  • pyproject.toml specifies llama-stack==0.3.4 and llama-stack-client==0.3.4
  • No stale 0.3.0–0.3.3 pins remain
  • aarch64 and x86_64 maintain parity
  • Test expectations (tests/e2e/features/info.feature) already expect version 0.3.4

1400-1406: llama-stack-client bumped to 0.3.4 — version-aligned with server.

Confirmed: all four requirements files (aarch64, hermetic, torch, x86_64) and pyproject.toml pin llama-stack-client==0.3.4. No stale pins remain. The version is consistent across the codebase and actively used in 70+ locations. Approve.

requirements.x86_64.txt (1)

1396-1402: llama-stack / llama-stack-client bump looks consistent

The pins to llama-stack==0.3.4 and llama-stack-client==0.3.4 align with the PR’s stated upgrade and the updated e2e expectations; no issues from this file alone.

Please double‑check that:

  • pyproject.toml and requirements.aarch64.txt are on the same versions, and
  • the e2e info tests pass against a real deployment with 0.3.4.
tests/e2e/features/info.feature (1)

19-19: llama-stack version assertion matches the upgrade; confirm LCS version

The info scenario now correctly expects llama-stack version 0.3.4, which matches the dependency bump.

Please just confirm that the service’s own version reported in the same payload is still 0.3.1 (line 18). If the LCS app version was also bumped as part of this work, that expectation should be updated in lockstep to avoid flaky e2e failures.

tests/e2e/configs/run-ci.yaml (1)

17-27: Centralized storage layout and path changes look coherent; confirm against llama-stack 0.3.4 schema

The run-ci config now:

  • Moves conversations_store, inference_store, and metadata_store under ~/.llama/storage/..., avoiding ephemeral /tmp-style locations.
  • Introduces a storage block with:
    • backends.kv_default / sql_default pointing to dedicated KV/SQL SQLite files.
    • stores for metadata, inference, conversations, and prompts, wired by namespace/table_name.
  • Updates providers (agents, batches, datasetio, eval, files, vector_io) to use those backends via persistence / kvstore / metadata_store settings.
  • Registers the builtin rag-runtime tool group under registered_resources.tool_groups.

From a structure and wiring standpoint this all hangs together, and it matches the PR’s goal of unifying storage.

Since this depends on the exact config schema in llama-stack==0.3.4, please verify against the current upstream docs that:

  • The combination of legacy *_store keys with the new storage section is supported (or, if only storage is honored now, that any remaining legacy fields are safe to ignore), and
  • The chosen namespaces/table names (agents_state, agents_responses, faiss_store, etc.) match how your code queries these stores so you don’t silently bifurcate data between old and new locations.

Also applies to: 32-38, 44-45, 51-52, 64-65, 71-73, 102-105, 110-142

tests/e2e/configs/run-azure.yaml (3)

16-26: Clarify duplication of storage store definitions.

The configuration defines storage stores in two places: top-level conversations_store, inference_store, metadata_store (lines 16–26) with direct db_path entries, and under the new storage.stores section (lines 120–134) with backend references. This duplication may cause confusion or runtime conflicts.

Confirm whether:

  1. Both definitions are intentional (serving different purposes)
  2. The top-level definitions are legacy and should be removed
  3. Which one takes precedence during initialization

69-75: Verify storage_dir configuration in files provider.

The files provider uses a mix of centralized storage (metadata_store via sql_default backend) and a direct filesystem path (storage_dir). Confirm that:

  1. storage_dir: ~/.llama/storage is the correct location for file storage and is distinct from database storage
  2. The directory is properly created and has appropriate permissions at runtime
  3. This mixed approach (DB metadata + filesystem storage) is the intended design for the files provider

112-134: Configuration pattern is correct for llama-stack 0.3.4.

The centralized storage/backends structure with kv_sqlite and sql_sqlite backend types is the standard, documented pattern for llama-stack 0.3.x. The configuration correctly uses named backends and references them via backend + namespace (for KV stores) or table_name (for SQL stores), matching documented examples. All namespace and table_name identifiers used by providers follow the established pattern.

run.yaml (2)

1-1: No action required. The version: 2 schema is already validated by the existing test suite in the repository. All test configurations and examples consistently use version: '2', and the E2E tests would fail if llama-stack 0.3.4 did not recognize this version or support the new storage/backends configuration structure. The schema validation is implicitly confirmed through these passing test fixtures.


77-83: No action needed. The empty config {} for sentence-transformers is valid and intentional—the inline::sentence-transformers provider in llama-stack uses sensible defaults and explicitly documents empty config as the standard configuration pattern. The provider functions correctly for embedding tasks without additional configuration fields, provided the sentence-transformers library is available in the runtime environment.

Likely an incorrect or invalid review comment.

Comment on lines +96 to 126

+          CONFIG_ENVIRONMENT: ${{ matrix.environment || 'ci' }}
           run: |
             CONFIGS_DIR="tests/e2e/configs"
-            MODE="$CONFIG_MODE"
+            ENVIRONMENT="$CONFIG_ENVIRONMENT"
-            echo "Deployment mode: $MODE"
             echo "Looking for configurations in $CONFIGS_DIR/"
-            # Select config based on mode:
-            # - library mode: run-library.yaml (llama-stack 0.3.0 format)
-            # - server mode: run-ci.yaml (original format)
-            if [ "$MODE" == "library" ]; then
-              CONFIG_FILE="$CONFIGS_DIR/run-library.yaml"
+            # List available configurations
+            if [ -d "$CONFIGS_DIR" ]; then
+              echo "Available configurations:"
+              ls -la "$CONFIGS_DIR"/*.yaml 2>/dev/null || echo "No YAML files found in $CONFIGS_DIR/"
             else
-              CONFIG_FILE="$CONFIGS_DIR/run-ci.yaml"
+              echo "Configs directory '$CONFIGS_DIR' not found!"
+              exit 1
             fi
-            echo "Using configuration: $CONFIG_FILE"
+            # Determine which config file to use
+            CONFIG_FILE="$CONFIGS_DIR/run-$ENVIRONMENT.yaml"
+            echo "Looking for: $CONFIG_FILE"
-            if [ ! -f "$CONFIG_FILE" ]; then
-              echo "❌ Configuration not found: $CONFIG_FILE"
-              echo "Available configs:"
-              ls -la "$CONFIGS_DIR"/*.yaml
+            if [ -f "$CONFIG_FILE" ]; then
+              echo "✅ Found config for environment: $ENVIRONMENT"
+              cp "$CONFIG_FILE" run.yaml
+              echo "✅ Config copied to run.yaml"
             else
+              echo "❌ Configuration file not found: $CONFIG_FILE"
+              echo "Available files in $CONFIGS_DIR:"
+              ls -la "$CONFIGS_DIR/"
               exit 1
             fi
Contributor

⚠️ Potential issue | 🟡 Minor

Config selection is correct; fix hard-coded source path in summary

The new CONFIG_ENVIRONMENT / run-$ENVIRONMENT.yaml selection logic is sound and will work for ci and azure environments.

However, the summary step always prints:

echo "Source config: tests/e2e/configs/run-ci.yaml"

so azure runs will be misreported.

Consider deriving this from the matrix value:

-echo "Source config: tests/e2e/configs/run-ci.yaml"
+echo "Source config: tests/e2e/configs/run-${{ matrix.environment }}.yaml"

This keeps logs accurate if more environments are added later.

Also applies to: 133-133

🤖 Prompt for AI Agents
In .github/workflows/e2e_tests.yaml around lines 96-126 (and also line 133), the
summary log prints a hard-coded path ("tests/e2e/configs/run-ci.yaml") which
misreports the actual chosen config for other environments; update the echo to
derive the path from the variables you already set — e.g. use CONFIGS_DIR and
CONFIG_ENVIRONMENT or the CONFIG_FILE variable (echo "Source config:
$CONFIG_FILE" or echo "Source config:
$CONFIGS_DIR/run-$CONFIG_ENVIRONMENT.yaml") so the log reflects the real
selected file for all matrix environments.

Comment on lines +78 to 79
RUN mkdir -p /opt/app-root/src/.llama/storage /opt/app-root/src/.llama/providers.d && \
chown -R 1001:1001 /opt/app-root/src/.llama
Contributor

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for any remaining references to the old storage path structure
rg -n "distributions/ollama" --type yaml --type py

Repository: lightspeed-core/lightspeed-stack

Length of output: 7007


Storage path migration is incomplete and will cause runtime failures.

The change in the Containerfile creates the new .llama/storage directory, but 65+ references to the old .llama/distributions/ollama path remain active across configuration files (tests/, examples/). These files still expect databases and storage at the old location. The migration must include updating all db_path, storage_dir, and related environment variable defaults in YAML configuration files to use .llama/storage instead of .llama/distributions/ollama.

🤖 Prompt for AI Agents
In Containerfile around lines 78-79, the Dockerfile creates .llama/storage but
the project still references the old .llama/distributions/ollama path across
config and example files; update all YAML and env defaults (db_path,
storage_dir, STORAGE_DIR, DB_PATH, etc.) in tests/, examples/, configs/ and any
scripts to point to .llama/storage (or make path configurable via a single env
var), ensure relative paths and permissions remain correct, and run a
project-wide search/replace for ".llama/distributions/ollama" to replace with
".llama/storage", then update any README or docs referencing the old path and
add a compatibility note or symlink creation step if backward compatibility is
required.
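
A sketch of the mechanical part of that migration (assumes rg and GNU sed are available; review each hit before committing, since some paths may be intentionally different):

    rg -l '\.llama/distributions/ollama' tests examples \
      | xargs sed -i 's#\.llama/distributions/ollama#.llama/storage#g'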

@@ -1,13 +1,13 @@
 # Custom Red Hat llama-stack image with missing dependencies
-FROM quay.io/opendatahub/llama-stack:rhoai-v3.0-latest
+FROM quay.io/aipcc/llama-stack/cpu-ubi9:rhoai-3.2
Contributor

⚠️ Potential issue | 🔴 Critical

Critical: Pipeline failure due to unauthorized access to base image.

The GitHub Actions pipeline is failing with 401 UNAUTHORIZED when attempting to pull quay.io/aipcc/llama-stack/cpu-ubi9:rhoai-3.2. This prevents the PR from being tested and merged.

Root causes:

  1. The GitHub Actions workflow may not have access to the aipcc organization on quay.io
  2. The authentication step added in .github/workflows/e2e_tests_rhelai.yaml may need to be added to other workflows as well
  3. The secrets (QUAY_ROBOT_USERNAME, QUAY_ROBOT_TOKEN) may not be configured or may lack permissions

Resolution steps:

  1. Verify that the robot account has pull permissions for quay.io/aipcc/llama-stack/cpu-ubi9:rhoai-3.2
  2. Add Docker login steps to all workflows that build or pull this image (similar to the change in .github/workflows/e2e_tests_rhelai.yaml)
  3. Verify the image tag rhoai-3.2 exists in the registry

Run this script to check which workflow files might need the login step:

#!/bin/bash
# Find workflow files that reference the new base image or use test.containerfile
rg -l "test\.containerfile|quay\.io/aipcc" .github/workflows/
🧰 Tools
🪛 GitHub Actions: E2E Tests

[error] 2-2: Docker build failed: failed to resolve source metadata for quay.io/aipcc/llama-stack/cpu-ubi9:rhoai-3.2 (401 UNAUTHORIZED).

🤖 Prompt for AI Agents
In test.containerfile around lines 2 to 2, the base image
quay.io/aipcc/llama-stack/cpu-ubi9:rhoai-3.2 is causing 401 UNAUTHORIZED in CI;
ensure the robot account can pull the image, confirm the quay robot credentials
(QUAY_ROBOT_USERNAME and QUAY_ROBOT_TOKEN) are configured in GitHub secrets with
pull permissions, add Docker/login (quay) steps to every GitHub Actions workflow
that builds or pulls this image (mirror the login step from
.github/workflows/e2e_tests_rhelai.yaml), and verify the rhoai-3.2 tag exists in
the registry; to find all workflows needing the change, run the provided search
script to list workflows referencing test.containerfile or quay.io/aipcc and
update each to perform quay.io authentication before any docker pull/build.

@are-ces
Contributor Author

are-ces commented Dec 11, 2025

PS: I will update other configs (RHAIIS, RHEL AI) in another PR

@are-ces are-ces force-pushed the bumpup-downstream-rhoai3.2 branch from 8daa53b to 681b7e2 on December 12, 2025 08:51
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
.github/workflows/e2e_tests.yaml (1)

133-133: Fix hard-coded config path in log.

The summary still prints a hard-coded path that will misreport the actual config for the azure environment.

Apply this diff to use the actual config file:

-          echo "Source config: tests/e2e/configs/run-ci.yaml"
+          echo "Source config: $CONFIG_FILE"
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8daa53b and 681b7e2.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (19)
  • .github/workflows/e2e_tests.yaml (2 hunks)
  • .github/workflows/e2e_tests_rhaiis.yaml (1 hunks)
  • .github/workflows/e2e_tests_rhelai.yaml (1 hunks)
  • Containerfile (1 hunks)
  • README.md (2 hunks)
  • docker-compose.yaml (1 hunks)
  • pyproject.toml (1 hunks)
  • requirements.aarch64.txt (1 hunks)
  • requirements.x86_64.txt (1 hunks)
  • run.yaml (3 hunks)
  • run_library.yaml (0 hunks)
  • src/constants.py (1 hunks)
  • test.containerfile (1 hunks)
  • tests/e2e/configs/run-azure.yaml (2 hunks)
  • tests/e2e/configs/run-ci.yaml (2 hunks)
  • tests/e2e/configs/run-library.yaml (0 hunks)
  • tests/e2e/configuration/lightspeed-stack-library-mode.yaml (0 hunks)
  • tests/e2e/configuration/lightspeed-stack-server-mode.yaml (0 hunks)
  • tests/e2e/features/info.feature (1 hunks)
💤 Files with no reviewable changes (4)
  • tests/e2e/configs/run-library.yaml
  • run_library.yaml
  • tests/e2e/configuration/lightspeed-stack-library-mode.yaml
  • tests/e2e/configuration/lightspeed-stack-server-mode.yaml
🚧 Files skipped from review as they are similar to previous changes (5)
  • tests/e2e/features/info.feature
  • Containerfile
  • requirements.x86_64.txt
  • src/constants.py
  • README.md
🧰 Additional context used
📓 Path-based instructions (1)
pyproject.toml

📄 CodeRabbit inference engine (CLAUDE.md)

pyproject.toml: Configure pylint with source-roots = "src"
Exclude src/auth/k8s.py from pyright type checking

Files:

  • pyproject.toml
🧠 Learnings (3)
📚 Learning: 2025-08-18T10:58:14.951Z
Learnt from: matysek
Repo: lightspeed-core/lightspeed-stack PR: 292
File: pyproject.toml:47-47
Timestamp: 2025-08-18T10:58:14.951Z
Learning: psycopg2-binary is required by some llama-stack providers in the lightspeed-stack project, so it cannot be replaced with psycopg v3 or moved to optional dependencies without breaking llama-stack functionality.

Applied to files:

  • requirements.aarch64.txt
📚 Learning: 2025-08-18T10:57:39.266Z
Learnt from: matysek
Repo: lightspeed-core/lightspeed-stack PR: 292
File: pyproject.toml:59-59
Timestamp: 2025-08-18T10:57:39.266Z
Learning: In the lightspeed-stack project, transitive dependencies like faiss-cpu are intentionally pinned as top-level dependencies to maintain better control over the dependency graph and avoid version conflicts when bundling ML/LLM tooling packages.

Applied to files:

  • requirements.aarch64.txt
📚 Learning: 2025-08-18T10:56:55.349Z
Learnt from: matysek
Repo: lightspeed-core/lightspeed-stack PR: 292
File: pyproject.toml:0-0
Timestamp: 2025-08-18T10:56:55.349Z
Learning: The lightspeed-stack project intentionally uses a "generic image" approach, bundling many dependencies directly in the base runtime image to work for everyone, rather than using lean base images with optional dependency groups.

Applied to files:

  • requirements.aarch64.txt
  • test.containerfile
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build-pr
🔇 Additional comments (16)
test.containerfile (2)

2-2: LGTM! Verify image access is configured.

The base image change to quay.io/aipcc/llama-stack/cpu-ubi9:rhoai-3.2 aligns with the PR objectives to consume the downstream image. The Docker login steps have been added to the three e2e test workflows, which should resolve the previous 401 UNAUTHORIZED issue.

Ensure the robot account credentials (QUAY_ROBOT_USERNAME/QUAY_ROBOT_TOKEN or QUAY_DOWNSTREAM_USERNAME/QUAY_DOWNSTREAM_TOKEN) are configured in GitHub secrets and have pull permissions for this image. Also verify the rhoai-3.2 tag exists in the registry.


10-10: LGTM! Storage path updated consistently.

The directory change from .llama/distributions/ollama to .llama/storage aligns with the centralized storage model introduced across all configuration files in this PR.

.github/workflows/e2e_tests_rhelai.yaml (1)

126-132: LGTM! Docker authentication added correctly.

The Docker login step is properly positioned before service startup and uses secure password-stdin input. This enables authenticated pulls of the new downstream base image.

.github/workflows/e2e_tests.yaml (2)

96-126: LGTM! Config selection logic improved.

The environment-based configuration selection using CONFIG_ENVIRONMENT is cleaner than the previous mode-based branching. The validation and error handling are appropriate.


142-148: LGTM! Note secret name inconsistency across workflows.

The Docker login step is correctly implemented. However, this workflow uses QUAY_DOWNSTREAM_USERNAME/QUAY_DOWNSTREAM_TOKEN while the RHELAI and RHAIIS workflows use QUAY_ROBOT_USERNAME/QUAY_ROBOT_TOKEN. Ensure both sets of secrets are configured, or standardize on one naming convention.

Verify that the secrets QUAY_DOWNSTREAM_USERNAME and QUAY_DOWNSTREAM_TOKEN are configured in the GitHub repository settings.

.github/workflows/e2e_tests_rhaiis.yaml (1)

125-131: LGTM! Docker authentication added correctly.

The Docker login step matches the implementation in the RHELAI workflow and is properly positioned before service startup.

tests/e2e/configs/run-ci.yaml (3)

17-27: LGTM! Storage paths centralized.

The consolidation of database paths under ~/.llama/storage/ improves organization and aligns with the new storage model introduced in this PR.


32-106: LGTM! Provider configs migrated to namespace/backend pattern.

The migration from direct db_path entries to the namespace/backend pattern for persistence and kvstore configurations is a clean refactoring that centralizes storage management. This approach aligns with llama-stack 0.3.4's two-level storage architecture (backend definitions + store references), allowing multiple stores to reuse a single backend and avoiding duplicated configuration.


110-142: Storage and registered_resources configuration is correct for llama-stack 0.3.4.

The new storage section with backends and stores aligns with llama-stack 0.3.4 documented structure (a breaking change introduced in 0.3.x that consolidates backend declarations and reduces credential duplication). The empty models list in registered_resources is intentional and consistent across CI and test configs in this codebase—models are registered at runtime. The tool_groups entry mapping builtin::rag to rag-runtime is correctly configured and matches the pattern used throughout example and configuration files.

tests/e2e/configs/run-azure.yaml (3)

17-27: LGTM! Storage paths centralized consistently.

The storage consolidation matches the pattern in run-ci.yaml, ensuring consistency across environments.


83-85: LGTM! Sentence-transformers provider added.

The addition of the sentence-transformers inference provider with an empty config is appropriate for embedding support.


112-148: LGTM! Storage and model registration configured for Azure.

The storage backend structure is consistent with run-ci.yaml, and the pre-registered gpt-4o-mini model in registered_resources is appropriate for the Azure environment, unlike the CI environment which leaves the models list empty.

run.yaml (2)

17-27: LGTM! Main config storage paths centralized.

The storage path updates in the main run.yaml are consistent with the test configurations, ensuring uniform storage management across all environments.


32-142: LGTM! Provider and storage configuration unified.

The migration to the namespace/backend pattern for provider configs and the addition of the storage and registered_resources sections bring the main config in line with the new llama-stack 0.3.4 architecture. The changes are consistent with the test configurations.

pyproject.toml (1)

31-32: Version bump to llama-stack 0.3.4 and llama-stack-client 0.3.4 is safe and compatible.

The bump introduces no breaking changes (only bugfixes and chores per release notes). The project's Python requirement (>=3.12,<3.14) is compatible with llama-stack-client 0.3.4's requirement of Python 3.12+. Additionally, the pyproject.toml correctly implements the required coding guidelines: pylint is configured with source-roots = "src" and pyright excludes src/authentication/k8s.py from type checking.

requirements.aarch64.txt (1)

1396-1406: Version alignment and regeneration look good. Both llama-stack==0.3.4 and llama-stack-client==0.3.4 are properly pinned, Python 3.12 is supported by this release, and the identical hashes across requirements.aarch64.txt and requirements.x86_64.txt confirm synchronized regeneration with no cross-arch drift.

Comment on lines +26 to +28
# Enable debug logging if needed
- LLAMA_STACK_LOGGING=${LLAMA_STACK_LOGGING:-}

Contributor

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Find references to LLAMA_STACK_LOGGING and locate the containerfile
rg "LLAMA_STACK_LOGGING" -i -A 3 -B 3

Repository: lightspeed-core/lightspeed-stack

Length of output: 939


🏁 Script executed:

# Locate and examine the test.containerfile referenced in docker-compose.yaml
fd "test.containerfile" -x cat {}

Repository: lightspeed-core/lightspeed-stack

Length of output: 574


🏁 Script executed:

# Check for any documentation or README about environment variables
fd -e "md" -e "txt" | head -20 | xargs grep -l "LLAMA_STACK_LOGGING" 2>/dev/null || echo "No docs found with LLAMA_STACK_LOGGING"

Repository: lightspeed-core/lightspeed-stack

Length of output: 114


🏁 Script executed:

# Find and examine run.yaml
fd "run.yaml" -x cat {}

Repository: lightspeed-core/lightspeed-stack

Length of output: 34362


🏁 Script executed:

# Check for any logging-related configuration or comments in the repo
rg "logging|LOGGING" -i -l | head -10 | xargs cat 2>/dev/null | grep -i "logging\|debug" | head -20

Repository: lightspeed-core/lightspeed-stack

Length of output: 794


🏁 Script executed:

# Check README, docs, or scripts that might explain LLAMA_STACK_LOGGING
fd -e "md" -e "sh" -e "py" -e "txt" | xargs grep -l "LLAMA_STACK" 2>/dev/null | head -10

Repository: lightspeed-core/lightspeed-stack

Length of output: 212


🏁 Script executed:

# Look for any CI/CD or entrypoint scripts that might use or document this
fd -e "sh" -e "py" | xargs grep -B2 -A2 "LLAMA_STACK_LOGGING" 2>/dev/null

Repository: lightspeed-core/lightspeed-stack

Length of output: 60


🏁 Script executed:

# Check if there's a Makefile or setup guide
fd "Makefile\|makefile\|setup\|SETUP\|README" | head -10 | xargs cat 2>/dev/null | grep -i "llama\|logging" | head -20

Repository: lightspeed-core/lightspeed-stack

Length of output: 58


🏁 Script executed:

# Read README to understand LLAMA_STACK_LOGGING context
cat README.md | grep -i "logging\|environment\|variable" -A 3 -B 3 | head -50

Repository: lightspeed-core/lightspeed-stack

Length of output: 1228


🏁 Script executed:

# Check the tutorial script for LLAMA_STACK usage patterns
cat scripts/llama_stack_tutorial.sh | head -100

Repository: lightspeed-core/lightspeed-stack

Length of output: 2830


🏁 Script executed:

# Check constants.py for any logging-related constants
cat src/constants.py

Repository: lightspeed-core/lightspeed-stack

Length of output: 4784


Document expected values for LLAMA_STACK_LOGGING or remove if unused.

LLAMA_STACK_LOGGING is not documented in this repository. The variable defaults to an empty value, and the comment "Enable debug logging if needed" doesn't clarify which values are accepted or whether the llama-stack container actually reads it. Either document the supported values (e.g., debug|info|warn|error) or confirm the variable is intentionally optional and can safely be omitted when unset in the host environment.

🤖 Prompt for AI Agents
In docker-compose.yaml around lines 26 to 28, the LLAMA_STACK_LOGGING env var is
undocumented and ambiguous; either document its accepted values or remove it if
unused — update the file to one of these: a) replace the placeholder comment
with a clear comment listing supported values (e.g.,
LLAMA_STACK_LOGGING=debug|info|warn|error) and add that the default is
empty/unspecified, or b) if the llama-stack image does not consume this
variable, remove the LLAMA_STACK_LOGGING line entirely from the service
definition and any related docs; ensure any chosen option is reflected in repo
README or an environment variable example file.
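
If option (a) is taken, the compose entry might look like the sketch below; upstream llama-stack documents semicolon-separated category=level pairs for this variable (e.g. all=debug or server=debug;core=info), but treat that syntax as an assumption to verify against the downstream image:

    environment:
      # Optional llama-stack log control; accepted values (assumed):
      # semicolon-separated category=level pairs, e.g. all=debug
      # or server=debug;core=info. Empty keeps the image defaults.
      - LLAMA_STACK_LOGGING=${LLAMA_STACK_LOGGING:-}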

Contributor

@asimurka asimurka left a comment

LGTM

Contributor

@tisnik tisnik left a comment

LGTM

@radofuchs radofuchs merged commit 6bae4a2 into lightspeed-core:main Dec 12, 2025
21 of 25 checks passed