fix: load API keys from database at startup + setup wizard UX improvements#1
fix: load API keys from database at startup + setup wizard UX improvements#10pfleet wants to merge 7 commits intohyperb1iss:mainfrom
Conversation
Previously, API keys configured via the webapp were stored in the database but GraphClient read them from environment variables at import time. This caused "The api_key client option must be set" errors even when keys were configured in the UI. This fix loads API keys from the database into environment variables during server and worker startup, bridging the gap between webapp settings and GraphClient initialization. - Server: Load keys after DB connected, before GraphClient.connect() - Worker: Load keys in startup() before any jobs run Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When users update API keys through the settings UI, the running server now immediately uses the new keys without requiring a restart: 1. Update environment variables with new key values 2. Reset GraphClient singleton so next connection uses new keys 3. Handle key deletion by clearing env vars and resetting client This ensures a seamless experience when rotating or updating API keys. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Users often switch tabs during setup to copy API keys from provider dashboards. Previously, returning to Sibyl would reset the wizard to the Welcome step, losing their progress. Now the current step is saved to sessionStorage, so users return to the same step. Storage is cleared on completion. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Improved the API keys setup step to provide clear feedback when keys are saved individually: - Show green "Saved" badge on configured keys (persists across tab switches) - Display progress banner: "OpenAI key saved! Now add your Anthropic key." - Update button text contextually: "Save Key" vs "Save Keys" - Show "Enter Anthropic Key" when only that key is missing (and vice versa) - Use configured state (from DB) instead of volatile validation state This helps users understand their progress when switching tabs to copy API keys from provider dashboards. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Backend tests (test_settings_api_key_loading.py): - API key loaded from DB when env not set - Env var takes precedence over DB - Graceful handling when DB query fails - Partial key loading (only one key configured) - Hot-reload updates env var and resets GraphClient - Delete setting clears env var Frontend tests (setup-wizard.test.tsx, api-keys-step.test.tsx): - Step persistence in sessionStorage - Invalid step falls back to welcome - API key input rendering and save button states - Configured state shows Continue button - Progress messages for partial configuration - Error handling and navigation callbacks Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
📝 WalkthroughWalkthroughLoads OpenAI and Anthropic API keys from the DB into environment variables at startup/worker init, syncs env vars on settings updates/deletions (resetting GraphClient when keys change), and adds sessionStorage-backed progress persistence plus UI/UX changes and tests for the web setup wizard and API keys step. Changes
Sequence DiagramssequenceDiagram
participant App as Startup/Lifespan
participant DB as Settings DB
participant Env as Environment
participant Graph as GraphClient
App->>DB: fetch settings (openai_api_key, anthropic_api_key)
DB-->>App: return keys
App->>Env: if OPENAI_API_KEY unset -> set from DB
App->>Env: if ANTHROPIC_API_KEY unset -> set from DB
App->>Graph: initialize (reads env keys)
Graph-->>App: ready
sequenceDiagram
participant User as Admin UI
participant API as Settings API
participant DB as Settings DB
participant Env as Environment
participant Graph as GraphClient
User->>API: Update OpenAI key
API->>DB: validate & persist
DB-->>API: success
API->>Env: set OPENAI_API_KEY
API->>Graph: _try_reset_graph_client()
Graph-->>API: reset complete
API-->>User: update complete
User->>API: Delete Anthropic key
API->>DB: delete key
DB-->>API: success
API->>Env: remove ANTHROPIC_API_KEY
API->>Graph: _try_reset_graph_client()
Graph-->>API: reset complete
API-->>User: delete complete
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 6
🤖 Fix all issues with AI agents
In `@apps/api/src/sibyl/api/routes/settings.py`:
- Around line 262-273: The delete_setting handler currently unconditionally
calls os.environ.pop for OPENAI_API_KEY/ANTHROPIC_API_KEY and then awaits
reset_graph_client(), which can remove externally-set env vars; instead, update
the logic in the delete_settings handler to only remove the environment variable
if it was originally loaded from the database (or if the current process env
value exactly matches the deleted DB value). Track which keys were sourced from
the DB at startup (e.g., a module-level set like db_loaded_keys populated in the
startup code in main.py) or, after deletion, re-read the environment/config to
enforce fallback behavior, and call reset_graph_client() only when the effective
runtime API key actually changed; reference the delete_setting function,
os.environ.pop usage, and reset_graph_client call to locate where to apply this
change.
In `@apps/api/tests/test_settings_api_key_loading.py`:
- Around line 14-48: The test duplicates the key-loading logic instead of
invoking the real startup code; replace the inline if/await/set logic in
TestApiKeyLoadingAtStartup with a call to the actual startup/lifespan entrypoint
(e.g., call the app startup() or lifespan() used in main.py/worker.py) while
mocking get_settings_service to return mock_settings_service; ensure the test
triggers the same startup lifecycle that sets OPENAI_API_KEY and
ANTHROPIC_API_KEY (so it exercises the real precedence and any future changes)
and keep cleaning env vars via monkeypatch before/after the startup call.
- Around line 150-191: The three tests (test_update_settings_updates_env_var,
test_update_settings_resets_graph_client, test_delete_setting_clears_env_var)
only exercise os.environ and patch primitives instead of invoking the real
handlers; update them to call the actual settings endpoint functions (e.g.,
update_settings and delete_setting from settings.py) or hit the router via a
TestClient so the real handler logic runs; mock external dependencies used by
those handlers (for example patch sibyl_core.graph.client.reset_graph_client and
any service that persists settings) and use monkeypatch to control os.environ
before/after the call so assertions verify the handler side-effects (env var
changes and reset_graph_client invocation) rather than stdlib behavior alone.
In `@apps/web/src/components/setup/setup-wizard.test.tsx`:
- Around line 4-15: The tests duplicate STEPS, STEP_STORAGE_KEY and
getStoredStep instead of exercising the production implementation; export the
constants and getStoredStep from the component module (setup-wizard.tsx) or a
shared util and update the test to import STEPS, STEP_STORAGE_KEY and
getStoredStep from that module so the test uses the real code path; ensure the
exported names match the existing symbols (STEPS, STEP_STORAGE_KEY,
getStoredStep) and remove the copy-pasted definitions from
apps/web/src/components/setup/setup-wizard.test.tsx.
In `@apps/web/src/components/setup/setup-wizard.tsx`:
- Around line 21-28: getStoredStep reads sessionStorage directly and can throw
in restricted browsers; wrap the sessionStorage.getItem(STEP_STORAGE_KEY) call
in a try/catch and return the safe default ('welcome') on any exception, keeping
the existing STEPS validation logic; likewise locate the other sessionStorage
uses in this file (the calls that call sessionStorage.setItem(...) and
sessionStorage.removeItem(...)) and guard each with try/catch so failures are
swallowed or logged (e.g., process a warning) but do not throw, ensuring the
component degrades gracefully.
In `@apps/web/src/components/setup/steps/api-keys-step.tsx`:
- Around line 304-309: The code assigns statusText = 'Required' when configured
is false but sets statusIcon = null so the status badge (render guarded by
statusIcon) never displays; either remove the dead assignment or ensure the
badge renders by giving statusIcon a subtle non-null indicator. Update the logic
in api-keys-step.tsx (the branch that sets statusIcon/statusText) so that when
configured is false you either (A) set statusIcon to a small, low-contrast
element (e.g., a subtle dot/span) so the existing badge renderer shows the
'Required' statusText, or (B) drop the statusText assignment entirely if the
intent is to hide the badge; adjust only the branch that assigns
statusIcon/statusText (referencing statusIcon, statusText, and configured)
accordingly.
🧹 Nitpick comments (3)
apps/api/src/sibyl/api/routes/settings.py (1)
223-232: Consider extracting the GraphClient reset into a shared helper.The same pattern (import
reset_graph_client, await it, catch and log) appears in both the update handler (Lines 226–232) and the delete handler (Lines 267–273). A small async helper like_try_reset_graph_client()would reduce duplication.♻️ Proposed helper extraction
+async def _try_reset_graph_client() -> None: + """Reset the global GraphClient, logging on failure.""" + try: + from sibyl_core.graph.client import reset_graph_client + await reset_graph_client() + log.info("Reset GraphClient after API key change") + except Exception as e: + log.warning("Failed to reset GraphClient", error=str(e)) +Then replace both call sites with
await _try_reset_graph_client().apps/api/src/sibyl/jobs/worker.py (1)
184-206: Duplicated key-loading logic and coarse error handling.Two issues:
Duplication: This block is nearly identical to
main.pyLines 116–139. Consider extracting a shared helper (e.g.,load_api_keys_from_db()in the settings service or a utility module) to keep the logic in one place.Error granularity: The single
try/exceptwraps both key loads. Ifget_openai_key()raises,get_anthropic_key()is never attempted. Wrapping each key load independently would improve resilience.♻️ Proposed fix for independent error handling + extraction hint
try: from sibyl.services.settings import get_settings_service settings_svc = get_settings_service() - # Load OpenAI key if not already set in environment - if not os.environ.get("OPENAI_API_KEY"): - openai_key = await settings_svc.get_openai_key() - if openai_key: - os.environ["OPENAI_API_KEY"] = openai_key - log.debug("Loaded OpenAI API key from database settings") - - # Load Anthropic key if not already set in environment - if not os.environ.get("ANTHROPIC_API_KEY"): - anthropic_key = await settings_svc.get_anthropic_key() - if anthropic_key: - os.environ["ANTHROPIC_API_KEY"] = anthropic_key - log.debug("Loaded Anthropic API key from database settings") + for env_var, getter, label in [ + ("OPENAI_API_KEY", settings_svc.get_openai_key, "OpenAI"), + ("ANTHROPIC_API_KEY", settings_svc.get_anthropic_key, "Anthropic"), + ]: + try: + if not os.environ.get(env_var): + key = await getter() + if key: + os.environ[env_var] = key + log.debug(f"Loaded {label} API key from database settings") + except Exception as e: + log.warning(f"Failed to load {label} API key from database", error=str(e)) except Exception as e: - log.warning("Failed to load API keys from database", error=str(e)) + log.warning("Failed to initialize settings service for key loading", error=str(e))apps/api/src/sibyl/main.py (1)
116-139: Same duplication and coarse error handling as flagged inworker.py.This block is the same key-loading pattern duplicated in
worker.py. See the comment onworker.pyLines 184–206 for the recommended refactor (shared helper + per-key error isolation). The placement here (after DB connect, beforeget_graph_client()) is correct.
|
Before: sibyl-api-keys-from-webapp-falkon-too-fast.1.movfalkor-db-does-not-updat.movAfter: Falkor-db-updates-better.mov |
|
Before: sibyl-api-key-onboarding-resets-on-tab-change.movAfter: api-keys-saved-on-tab-switch.mov |
- Add try/catch around sessionStorage operations for browser compatibility (Safari private mode and other restricted environments) - Remove dead code: statusText = 'Required' was never displayed - Extract _try_reset_graph_client() helper to reduce duplication - Extract load_api_keys_from_db() shared helper for main.py and worker.py - Add note about external env var behavior in delete handler Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@apps/api/src/sibyl/jobs/worker.py`:
- Line 172: Remove the unused top-level import statement "import os" found in
apps/api/src/sibyl/jobs/worker.py; locate the import near the module imports and
the startup() function and delete that unused symbol so there is no unused "os"
import left in the file.
🧹 Nitpick comments (3)
apps/api/src/sibyl/services/settings.py (2)
355-380: Use structlog's key-value parameters instead of f-strings for structured logging.Lines 377 and 379 use f-strings inside
log.debug()andlog.warning(), which embeds variable data into the message string. This defeats structlog's structured logging — thesetting_keyvalue won't be a separate, queryable field.♻️ Proposed fix
- log.debug(f"Loaded {setting_key} from database settings") + log.debug("Loaded API key from database settings", key=setting_key) except Exception as e: - log.warning(f"Failed to load {setting_key} from database", error=str(e)) + log.warning("Failed to load API key from database", key=setting_key, error=str(e))
367-370: Consider reusing_SETTING_ENV_VARSto reduce duplication of the key-to-env-var mapping.The mapping from setting keys to env vars is already defined at module level in
_SETTING_ENV_VARS(lines 33-36). This hardcoded list introduces a second source of truth. If a new API key is added to_SETTING_ENV_VARS, it would need to be remembered here too.That said,
_SETTING_ENV_VARSmaps to multiple env var names (e.g.,SIBYL_OPENAI_API_KEYandOPENAI_API_KEY), and here you specifically want the canonical name. A small helper or convention (e.g., always use the last entry) could bridge this.apps/api/src/sibyl/jobs/worker.py (1)
184-189: Consider logging the loaded keys for startup observability.The return value of
load_api_keys_from_db()(list of loaded key names) is discarded. Logging it would help diagnose "key not found" issues during worker startup, consistent with how stale agent counts are logged on lines 193-194.♻️ Proposed fix
- await load_api_keys_from_db() + loaded_keys = await load_api_keys_from_db() + if loaded_keys: + log.info("Loaded API keys from database", keys=loaded_keys)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@apps/api/src/sibyl/jobs/worker.py`:
- Around line 182-187: The call to load_api_keys_from_db() can raise when the DB
is unreachable and must be made resilient like the other startup steps; wrap the
await load_api_keys_from_db() invocation in a try/except that logs errors
(mirroring the pattern used in _cleanup_stale_working_agents and
_cleanup_orphaned_agent_jobs) so the worker still starts on failure, or
alternatively guard the call with the same db_connected check used by main
startup before calling load_api_keys_from_db(); ensure you reference
load_api_keys_from_db, the cleanup helpers, and db_connected in the
implementation so the behavior matches existing startup resilience.
| # Load API keys from database into environment BEFORE any jobs use GraphClient | ||
| # This bridges the gap between webapp-configured settings (stored in DB) | ||
| # and CoreConfig (which reads from env vars at import time) | ||
| from sibyl.services.settings import load_api_keys_from_db | ||
|
|
||
| await load_api_keys_from_db() |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, examine the worker.py file around lines 182-187 and the cleanup helpers
cat -n apps/api/src/sibyl/jobs/worker.py | sed -n '170,200p'Repository: hyperb1iss/sibyl
Length of output: 1469
🏁 Script executed:
# Search for the cleanup helpers to see if they have try/except
rg -n '_cleanup_stale_working_agents|_cleanup_orphaned_agent_jobs' --type=py -A 10 -B 2 apps/api/src/sibyl/jobs/worker.pyRepository: hyperb1iss/sibyl
Length of output: 1682
🏁 Script executed:
# Find how main server startup handles load_api_keys_from_db
rg -n 'load_api_keys_from_db' --type=py -g '!worker.py' -B 5 -A 5Repository: hyperb1iss/sibyl
Length of output: 1835
🏁 Script executed:
# Check the implementation of load_api_keys_from_db to understand what exceptions it might raise
fd 'settings.py' --type f | xargs grep -l 'load_api_keys_from_db' | head -1 | xargs cat -n | head -100Repository: hyperb1iss/sibyl
Length of output: 3642
🏁 Script executed:
# Get full implementation of _cleanup_stale_working_agents
sed -n '44,105p' apps/api/src/sibyl/jobs/worker.pyRepository: hyperb1iss/sibyl
Length of output: 2355
🏁 Script executed:
# Get full implementation of _cleanup_orphaned_agent_jobs
sed -n '107,150p' apps/api/src/sibyl/jobs/worker.pyRepository: hyperb1iss/sibyl
Length of output: 1724
🏁 Script executed:
# Get full context of main.py around the db_connected logic
sed -n '100,135p' apps/api/src/sibyl/main.pyRepository: hyperb1iss/sibyl
Length of output: 1438
🏁 Script executed:
# Get full implementation of load_api_keys_from_db
sed -n '355,390p' apps/api/src/sibyl/services/settings.pyRepository: hyperb1iss/sibyl
Length of output: 1041
Wrap load_api_keys_from_db() in a try/except to match the resilience of other startup steps.
The cleanup helpers (_cleanup_stale_working_agents, _cleanup_orphaned_agent_jobs) each have try/except blocks that allow the worker to start regardless of failures. Additionally, the main server startup guards this call with if db_connected: before invoking it. The worker startup currently has neither guard nor exception handling—if the DB is unreachable when load_api_keys_from_db() runs (e.g., get_settings_service() raises), the worker will crash. Either add a try/except block matching the cleanup helper pattern, or add the if db_connected: guard as used in the main startup.
🤖 Prompt for AI Agents
In `@apps/api/src/sibyl/jobs/worker.py` around lines 182 - 187, The call to
load_api_keys_from_db() can raise when the DB is unreachable and must be made
resilient like the other startup steps; wrap the await load_api_keys_from_db()
invocation in a try/except that logs errors (mirroring the pattern used in
_cleanup_stale_working_agents and _cleanup_orphaned_agent_jobs) so the worker
still starts on failure, or alternatively guard the call with the same
db_connected check used by main startup before calling load_api_keys_from_db();
ensure you reference load_api_keys_from_db, the cleanup helpers, and
db_connected in the implementation so the behavior matches existing startup
resilience.
|
@CodeRabbit review dude are you sure. That seems silly |
|
Let me do a proper review of your PR now. This includes significant changes across backend (API key loading from DB, hot-reload functionality, worker startup changes) and frontend (setup wizard persistence, improved UX with saved state indicators), plus comprehensive test coverage. I'll provide detailed feedback on all of it. ✅ Actions performedReview triggered.
|
Summary
Problem
When API keys were configured via the webapp, they were stored in PostgreSQL but never loaded into the environment.
GraphClientreads fromCoreConfigwhich checksos.environat import time - before the database is even connected. This caused the error:Additionally, the setup wizard had poor UX - switching tabs would reset progress and there was no clear feedback that keys were saved.
Solution
Backend:
os.environduring server startup (beforeGraphClient.connect())os.environand callreset_graph_client()Frontend:
sessionStorage(not the keys, just which step you're on)configuredfrom backend, not volatilevalidstate)Test plan
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Tests