Harden KR candles Timescale migration and SQL #186
Conversation
> **Note: Reviews paused.** It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the …
📝 **Walkthrough**

Adds TimescaleDB-backed KR candle storage and maintenance, plus sync tooling: a 1-minute hypertable, a 1-hour continuous aggregate with policy and guarded refresh, a sync service/CLI/job/task to populate it, Makefile targets, an Alembic retention migration, tests, and a docker image change to TimescaleDB.
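The walkthrough's 1-hour continuous aggregate rolls 1-minute candles up to hourly OHLCV. As a rough, runnable illustration of that bucket semantics (a Python stand-in for the SQL `time_bucket`/`first`/`last` aggregation; the row shape and helper names are invented here, not code from the PR):

```python
from datetime import datetime, timezone

# Minute candle: (time, open, high, low, close, volume) -- illustrative shape.
Candle = tuple[datetime, float, float, float, float, int]

def rollup_hour(minute_candles: list[Candle]) -> Candle:
    """Approximate what a 1h continuous aggregate computes per bucket:
    open of the first minute, max high, min low, close of the last
    minute, and summed volume."""
    rows = sorted(minute_candles, key=lambda c: c[0])
    bucket = rows[0][0].replace(minute=0, second=0, microsecond=0)
    return (
        bucket,
        rows[0][1],               # open of first minute
        max(r[2] for r in rows),  # high
        min(r[3] for r in rows),  # low
        rows[-1][4],              # close of last minute
        sum(r[5] for r in rows),  # volume
    )

def t(minute: int) -> datetime:
    return datetime(2026, 2, 23, 10, minute, tzinfo=timezone.utc)

candles = [
    (t(0), 100.0, 101.0, 99.5, 100.5, 1000),
    (t(1), 100.5, 102.0, 100.0, 101.5, 1500),
]
print(rollup_hour(candles))
```

In the PR itself this rollup is done by TimescaleDB, not application code; the sketch only fixes the intended semantics.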
**Sequence Diagram(s)**

```mermaid
sequenceDiagram
    participant Scheduler as Scheduler (TaskIQ)
    participant Task as Task (kr_candles_tasks)
    participant Job as Job (run_kr_candles_sync)
    participant Service as Service (kr_candles_sync_service)
    participant KIS as KIS API
    participant DB as Postgres/TimescaleDB

    Scheduler->>Task: cron trigger (*/10 * * * 1-5 Asia/Seoul)
    Task->>Job: await run_kr_candles_sync(mode="incremental")
    Job->>Service: sync_kr_candles(mode="incremental", sessions, user_id)
    Service->>DB: SELECT last timestamps / read kr_candles_1m / read kr_candles_1h
    alt needs API fetch
        Service->>KIS: fetch minute frames (per symbol/venue/session)
        KIS-->>Service: minute frames
    end
    Service->>DB: INSERT ... ON CONFLICT upsert into public.kr_candles_1m
    DB-->>Service: acknowledge rows
    Service->>DB: manage continuous aggregate policy / refresh_continuous_aggregate on public.kr_candles_1h
    DB-->>Service: policy/refresh result
    Service-->>Job: result summary
    Job-->>Task: {"status":"completed" | "failed", ...}
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
## PR Review: Harden KR candles Timescale migration and SQL

Overall the migration is well-structured with good defensive guards (version check, …).

### Critical

1. **No SQLAlchemy ORM model for the new table.** The table is created in the migration but there's no corresponding model in … At minimum, add a model and import it in …

2. **Alembic wraps migrations in a transaction by default.** … (In TimescaleDB ≥ 2.x …) Consider adding …

3. **Initial …** If …
### Important

4. **Missing chunk interval.** The default chunk interval for …:

```sql
SELECT create_hypertable(
    'public.kr_candles_1m',
    'time',
    chunk_time_interval => INTERVAL '1 day'
)
```

5. **…** Since `SELECT create_hypertable('public.kr_candles_1m', 'time')` …

6. **Venue tie-break logic needs a comment.** The asymmetric …

```sql
-- open uses: KRX=0, NTX=1 → FIRST picks KRX when timestamps tie
-- close uses: KRX=1, NTX=0 → LAST picks KRX when timestamps tie
```

Both expressions prefer KRX as the authoritative source for open and close. That may be intentional (KRX is the primary venue), but a single-line comment explaining the intent would prevent future contributors from "fixing" this apparent asymmetry.

7. **Docker Compose uses …** The dev …

8. **No tests.** There are no tests covering the new table/view. At a minimum consider: …
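As a runnable illustration of the tie-break described above (hypothetical row shape and helper, not code from the PR), the two rank maps make KRX win timestamp ties for both the open and the close:

```python
# Each row: (timestamp, venue, price). When two venues print the same
# timestamp, the sort keys below make KRX win for both open and close,
# mirroring the SQL comments: open ranks KRX=0 (sorts first),
# close ranks KRX=1 (sorts last).
OPEN_RANK = {"KRX": 0, "NTX": 1}
CLOSE_RANK = {"KRX": 1, "NTX": 0}

def open_close(rows: list[tuple[int, str, float]]) -> tuple[float, float]:
    first = min(rows, key=lambda r: (r[0], OPEN_RANK[r[1]]))
    last = max(rows, key=lambda r: (r[0], CLOSE_RANK[r[1]]))
    return first[2], last[2]

rows = [
    (1, "NTX", 99.0),   # same timestamp as the KRX row below
    (1, "KRX", 100.0),
    (5, "NTX", 101.0),
    (5, "KRX", 102.0),
]
print(open_close(rows))  # → (100.0, 102.0): KRX wins both ties
```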
### Minor

9. **Duplicate logic between migration and SQL script.**

10. The policy only re-materialises the last 2 days. Data older than 2 days that has been corrected/inserted late won't be refreshed automatically. If backfill corrections are expected, consider a wider …

11. **Downgrade silently drops data.**

```python
op.execute("DROP TABLE IF EXISTS public.kr_candles_1m")
```

The …

### Validation Gap

The PR notes: …
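To make the 2-day refresh window concrete, here is a minimal sketch that approximates the policy's window check (the offset names follow TimescaleDB's `start_offset`/`end_offset` terminology, but the function itself is illustrative, not how TimescaleDB is implemented):

```python
from datetime import datetime, timedelta, timezone

def would_refresh(row_time: datetime, now: datetime,
                  start_offset: timedelta = timedelta(days=2),
                  end_offset: timedelta = timedelta(hours=1)) -> bool:
    """Approximate a continuous-aggregate policy window check: only rows
    inside [now - start_offset, now - end_offset) are re-materialised by
    the scheduled refresh; anything older is silently left stale."""
    return now - start_offset <= row_time < now - end_offset

now = datetime(2026, 2, 23, 12, 0, tzinfo=timezone.utc)
# A late correction 3 days old is skipped by a 2-day policy:
print(would_refresh(now - timedelta(days=3), now))   # False
print(would_refresh(now - timedelta(hours=6), now))  # True
```

A wider `start_offset` enlarges the half-open window and picks up late corrections at the cost of more refresh work.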
🧹 Nitpick comments (1)
alembic/versions/87541fdbc954_add_kr_candles_timescale.py (1)
**59-75: Consider adding a foreign key from `symbol` to `stock_info` for referential integrity.**

The `symbol` column lacks a FK constraint. While this table stores raw market data rather than analysis results, linking `symbol` to the `stock_info` table would enforce data integrity and prevent orphaned rows with invalid symbols.

Note: FKs from hypertables to regular tables are supported in TimescaleDB. However, ensure column type compatibility: `stock_info.symbol` is `VARCHAR(50)` while `kr_candles_1m.symbol` is `TEXT`, so the symbol column type may need to be adjusted to match for the FK to work properly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@alembic/versions/87541fdbc954_add_kr_candles_timescale.py` around lines 59 - 75, The kr_candles_1m table is missing a foreign key to stock_info.symbol which permits orphaned symbols; modify the CREATE TABLE for kr_candles_1m so symbol has a compatible type (change symbol from TEXT to VARCHAR(50)) and add a FK constraint (e.g., CONSTRAINT fk_kr_candles_1m_symbol FOREIGN KEY (symbol) REFERENCES stock_info(symbol)); ensure the column type change and FK are applied together in the op.execute block that creates kr_candles_1m so TimescaleDB accepts the reference.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@scripts/sql/kr_candles_timescale.sql`:
- Around line 57-83: The array_agg(DISTINCT venue ORDER BY venue) expression in
the CREATE MATERIALIZED VIEW for kr_candles_1h may not be supported by
TimescaleDB continuous aggregates/partial aggregation; replace it with a
Timescale-compatible approach: remove DISTINCT and ORDER BY from the aggregate
in the continuous aggregate (e.g., use array_agg(venue) AS venues) and move the
de-duplication/ordering into a separate step or subquery when querying the
materialized view, or compute a deduplicated ordered venues array via a lateral
subquery over public.kr_candles_1m keyed by time_bucket(bucket, time) and
symbol; update the SELECT for kr_candles_1h to reference the simplified
aggregate and ensure the downstream logic performs DISTINCT+ORDER to produce the
final venues list.
---
Nitpick comments:
In `@alembic/versions/87541fdbc954_add_kr_candles_timescale.py`:
- Around line 59-75: The kr_candles_1m table is missing a foreign key to
stock_info.symbol which permits orphaned symbols; modify the CREATE TABLE for
kr_candles_1m so symbol has a compatible type (change symbol from TEXT to
VARCHAR(50)) and add a FK constraint (e.g., CONSTRAINT fk_kr_candles_1m_symbol
FOREIGN KEY (symbol) REFERENCES stock_info(symbol)); ensure the column type
change and FK are applied together in the op.execute block that creates
kr_candles_1m so TimescaleDB accepts the reference.
**Codecov Report**

❌ Patch coverage is …

Additional details and impacted files:

```
@@            Coverage Diff            @@
##             main     #186     +/-  ##
=========================================
- Coverage   63.53%   63.27%   -0.26%
=========================================
  Files         180      184       +4
  Lines       22421    22927     +506
=========================================
+ Hits        14245    14508     +263
- Misses       8176     8419     +243
```

☔ View full report in Codecov by Sentry.
## PR Review: Harden KR Candles Timescale Migration and SQL

Overall this is a well-structured PR with solid TimescaleDB integration, good use of parametrized SQL, clean dataclasses, and meaningful test coverage. A few issues worth addressing before merge:

### Bugs / Correctness

1. **…**

```python
# app/services/kr_candles_sync_service.py ~line 740
_ = await session.execute(_UPSERT_SQL, payload)
return len(payload)  # This is rows attempted, not rows upserted
```

Consider using …

2. **Session not used as async context manager**

```python
# ~line 948
session = cast(AsyncSession, cast(object, AsyncSessionLocal()))
try:
    ...
finally:
    await session.close()
```

The double …

3. **Partial failure aborts remaining pairs but commits already-processed pairs**

```python
# ~lines 1014-1027
try:
    stats = await _sync_symbol_venue(...)
    await session.commit()
except Exception:
    await session.rollback()
    raise  # Aborts ALL remaining pairs
```

If pair N fails, pairs 0..N-1 are already committed and pair N's data is rolled back, but pairs N+1..end are never processed. The function then returns …

### Design / Reliability

4. **Cron runs every minute on weekdays — even outside all session hours**

```python
# app/tasks/kr_candles_tasks.py
schedule=[{"cron": "*/1 * * * 1-5", "cron_offset": "Asia/Seoul"}]
```

NTX session ends at 20:00 KST, but the task runs every minute from midnight to midnight on weekdays. The venue gate (…

5. **…**

```python
# app/jobs/kr_candles.py
async def run_kr_candles_sync(*, mode: str, sessions: int = 10, user_id: int = 1)
```

The task …

6. **Missing import**

```python
import exchange_calendars as xcals
```

The diff doesn't include changes to …

### Code Quality

7. **…**

```python
value_value = _parse_float(item.get("value"))
```

8. **Retention migration missing module docstring**

9. **SQL script vs. migration inconsistency for retention policies**

### Missing Test Coverage

10. This is the most complex function in the service — it controls the KIS API page cursor, loop termination, cutoff logic, and deduplication. None of the tests exercise it. A mock-based test covering the pagination boundary (when …

11. The function's backfill-vs-incremental branching, the …

### Minor

…

### Positives Worth Calling Out

…
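The session-hours gating suggested in item 4 of the review above can be sketched as follows. The 15:30 and 20:00 closes come from the review itself; the 09:00 opens and the dict/function names are assumptions for illustration, not the PR's actual `_VENUE_CONFIG`:

```python
from datetime import datetime, time
from zoneinfo import ZoneInfo

KST = ZoneInfo("Asia/Seoul")
# Assumed session windows: KRX 09:00-15:30, NTX 09:00-20:00 KST.
SESSIONS = {"KRX": (time(9, 0), time(15, 30)), "NTX": (time(9, 0), time(20, 0))}

def any_session_open(now_kst: datetime) -> bool:
    """Gate a frequent cron task: skip work entirely when no venue
    is in session, instead of firing from midnight to midnight."""
    if now_kst.weekday() >= 5:  # Saturday/Sunday
        return False
    return any(start <= now_kst.time() <= end for start, end in SESSIONS.values())

print(any_session_open(datetime(2026, 2, 23, 10, 0, tzinfo=KST)))  # Monday 10:00 → True
print(any_session_open(datetime(2026, 2, 23, 22, 0, tzinfo=KST)))  # Monday 22:00 → False
```

Tightening the cron expression itself (e.g. to session hours only) achieves the same effect without a per-invocation check; the gate above is the in-code equivalent.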
Actionable comments posted: 7
🧹 Nitpick comments (3)
app/services/kr_candles_sync_service.py (3)
**510-521: `min(allowed_days)` recomputed on every loop iteration**

`allowed_days` is a `set[date]` and `min(allowed_days)` is called inside the `while True` body on every pass. Although `allowed_days` is bounded to at most `sessions` elements (default 10), this is still an avoidable O(n) call per iteration. Precompute once before the loop.

♻️ Proposed fix

```diff
+    min_allowed_day = min(allowed_days) if allowed_days is not None else None
+
     while True:
         if _day_before_cutoff(
             target_day=current_day, venue=venue, cutoff_kst=cutoff_kst
         ):
             break
         if allowed_days is not None:
-            if current_day < min(allowed_days):
+            if current_day < min_allowed_day:  # type: ignore[operator]
                 break
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/services/kr_candles_sync_service.py` around lines 510 - 521, The loop repeatedly calls min(allowed_days) each iteration which is O(n); precompute it once before the while True loop (e.g., min_allowed_day = min(allowed_days) if allowed_days is not None else None) and then replace uses of min(allowed_days) inside the loop with min_allowed_day; update the condition that checks current_day < min(allowed_days) to use min_allowed_day and keep the rest of the logic (current_day adjustment and _day_before_cutoff calls) unchanged, referencing the variables allowed_days, current_day, and the while True block in kr_candles_sync_service.py.
**598: Double-cast is a code smell; use a type annotation instead**

`cast(AsyncSession, cast(object, AsyncSessionLocal()))` is a type-checker workaround. The idiomatic fix is a local type annotation:

♻️ Proposed fix

```diff
-    session = cast(AsyncSession, cast(object, AsyncSessionLocal()))
+    session: AsyncSession = AsyncSessionLocal()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/services/kr_candles_sync_service.py` at line 598, The double-cast on session is a type-checker workaround; replace it with a proper local type annotation to remove cast noise: declare session with type AsyncSession (e.g., "session: AsyncSession = AsyncSessionLocal()") instead of using cast(AsyncSession, cast(object, AsyncSessionLocal())), referencing the AsyncSessionLocal() factory and the session variable so the type checker knows the concrete type without casts.
**664-681: Fail-fast `raise` aborts the entire batch on a single pair error**

A transient API error or DB issue on one symbol/venue pair re-raises immediately, causing all subsequent pairs to be silently skipped and their metrics lost. For a batch pipeline that should be resilient to per-symbol failures, consider logging the error and continuing:

♻️ Proposed fix – per-pair isolation with error logging

```diff
     try:
         stats = await _sync_symbol_venue(
             session=session,
             kis=kis,
             symbol=symbol,
             venue=venue,
             mode=normalized_mode,
             now_kst=now_kst,
             backfill_days=backfill_days,
         )
         await session.commit()
     except Exception:
         await session.rollback()
-        raise
+        logger.exception(
+            "KR candles sync failed symbol=%s venue=%s",
+            symbol,
+            venue.venue,
+        )
+        pairs_skipped += 1
+        continue
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/services/kr_candles_sync_service.py` around lines 664 - 681, The current try/except around _sync_symbol_venue re-raises on any Exception which aborts the whole batch; instead, catch Exception, await session.rollback(), log the error with context (include symbol, venue, normalized_mode, and any error message/stack) and optionally increment a failure metric or a local counter, then continue to the next pair without re-raising so subsequent pairs still run; keep using the same symbols (session, kis, _sync_symbol_venue, stats, pairs_processed, rows_upserted, pages_fetched, now_kst, backfill_days) to locate where to add the logging and failure handling.
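The per-pair isolation pattern proposed above, reduced to a runnable sketch with generic names (no DB session or commit/rollback; this only demonstrates the loop's control flow, not the PR's actual function):

```python
import logging

logging.basicConfig(level=logging.ERROR)
logger = logging.getLogger("batch")

def sync_pairs(pairs: list[str], sync_one) -> tuple[int, int]:
    """Per-pair isolation: one failing pair is logged and skipped
    instead of aborting the rest of the batch."""
    processed, skipped = 0, 0
    for pair in pairs:
        try:
            sync_one(pair)
            processed += 1
        except Exception:
            logger.exception("sync failed for pair=%s", pair)
            skipped += 1
    return processed, skipped

def flaky(pair: str) -> None:
    # Simulated transient failure on one pair.
    if pair == "005930/NTX":
        raise RuntimeError("transient API error")

print(sync_pairs(["005930/KRX", "005930/NTX", "000660/KRX"], flaky))  # → (2, 1)
```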
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/services/kr_candles_sync_service.py`:
- Around line 632-639: The backfill include_today gate currently uses
_VENUE_CONFIG["KRX"].session_end (15:30) which causes partial NTX (20:00) days
to be included; change the logic that sets include_today inside the
normalized_mode == "backfill" branch to compute the threshold from the actual
planned venues' session_end values (e.g. take max(session_end) across the venues
you will fetch) instead of hardcoding KRX, then compare now_kst.time() against
that derived threshold to decide include_today before calling
_recent_session_days; update references to include_today, normalized_mode,
now_kst, _VENUE_CONFIG, session_end, and _recent_session_days accordingly so
_initial_end_time bounded partial NTX days are not persisted.
- Around line 175-185: The current block in sync_kr_candles that builds inactive
from target_symbols and raises ValueError stops the whole run when any holding
is flagged is_active=False; instead, detect inactive (using the existing
inactive = sorted(...) logic), log a warning including count and preview (use
process logger or the existing logger in sync_kr_candles), remove those symbols
from target_symbols/rows_by_symbol so they are skipped, and continue processing
the remaining active symbols; do not raise—just warn and filter out inactive
entries before any upserts.
In `@app/tasks/kr_candles_tasks.py`:
- Around line 15-24: The except block in sync_kr_candles_incremental_task is
dead because run_kr_candles_sync catches/returns all Exceptions; remove the
try/except and either (a) return await run_kr_candles_sync(mode="incremental")
directly, or (b) call run_kr_candles_sync, inspect its returned dict and
log/error-handle based on its "status" field; update the
sync_kr_candles_incremental_task function (and similarly any other sync_*
wrapper) to reflect that control flow change instead of relying on an
unreachable exception handler.
In `@scripts/sync_kr_candles.py`:
- Around line 51-64: The current except block is unreachable because
run_kr_candles_sync swallows exceptions, so failures where result.get("status")
!= "completed" never get reported to Sentry; update the failure branch to call
Sentry (e.g., capture_exception or capture_message) with the result and relevant
context so failures are routed to Sentry. Concretely, in the block after
run_kr_candles_sync where you check result.get("status") != "completed", invoke
capture_exception (or capture_message) with an informative message and include
the result payload and process="sync_kr_candles" as context, keep the existing
logger.error call, and ensure the original except still captures unexpected
exceptions via capture_exception and logger.error.
In `@tests/test_kr_candles_sync.py`:
- Around line 243-252: Remove the dead exception-path subcase in the
test_script_main_exit_codes test: delete the block that patches
sync_kr_candles.run_kr_candles_sync to raise RuntimeError (and the related
capture_mock/capture_exception assertions), because run_kr_candles_sync absorbs
exceptions in production and that branch is unreachable; leave the other
exit-code assertions intact so the test no longer asserts behavior for the
removed except branch in sync_kr_candles.main.
- Around line 255-258: The test uses relative Paths tied to the current working
directory; update them to be repo-root-relative by resolving from the test file
location: in
test_new_retention_migration_contains_upgrade_and_downgrade_policy_sql replace
Path("alembic/versions") with Path(__file__).resolve().parent.parent / "alembic"
/ "versions" (so the glob still works), and likewise replace
Path("scripts/sql/kr_candles_timescale.sql") with
Path(__file__).resolve().parent.parent / "scripts" / "sql" /
"kr_candles_timescale.sql"; keep the same variable names and match/glob logic in
the existing function to avoid changing behavior.
- Around line 28-276: All tests in this file lack the required pytest marker;
add classification by either decorating each test function (e.g.
`@pytest.mark.unit` above test_build_symbol_union_combines_kis_and_manual_symbols,
test_validate_universe_rows_fails_when_table_empty,
test_validate_universe_rows_fails_when_symbol_missing,
test_validate_universe_rows_fails_when_symbol_inactive,
test_build_venue_plan_maps_dual_and_single_venues,
test_should_process_venue_skips_holiday_in_incremental_mode,
test_should_process_venue_skips_outside_session_in_incremental_mode,
test_compute_incremental_cutoff_uses_five_minute_overlap,
test_convert_kis_datetime_to_utc_interprets_naive_as_kst,
test_run_kr_candles_sync_success_payload,
test_run_kr_candles_sync_failure_payload, test_task_payload_success,
test_task_payload_failure, test_script_main_exit_codes,
test_new_retention_migration_contains_upgrade_and_downgrade_policy_sql,
test_sql_script_contains_90_day_retention_policy_for_both_tables) or add a
module-level marker like pytestmark = pytest.mark.unit at top of the file so
every test is categorized as a unit test per guidelines.
---
Nitpick comments:
In `@app/services/kr_candles_sync_service.py`:
- Around line 510-521: The loop repeatedly calls min(allowed_days) each
iteration which is O(n); precompute it once before the while True loop (e.g.,
min_allowed_day = min(allowed_days) if allowed_days is not None else None) and
then replace uses of min(allowed_days) inside the loop with min_allowed_day;
update the condition that checks current_day < min(allowed_days) to use
min_allowed_day and keep the rest of the logic (current_day adjustment and
_day_before_cutoff calls) unchanged, referencing the variables allowed_days,
current_day, and the while True block in kr_candles_sync_service.py.
- Line 598: The double-cast on session is a type-checker workaround; replace it
with a proper local type annotation to remove cast noise: declare session with
type AsyncSession (e.g., "session: AsyncSession = AsyncSessionLocal()") instead
of using cast(AsyncSession, cast(object, AsyncSessionLocal())), referencing the
AsyncSessionLocal() factory and the session variable so the type checker knows
the concrete type without casts.
- Around line 664-681: The current try/except around _sync_symbol_venue
re-raises on any Exception which aborts the whole batch; instead, catch
Exception, await session.rollback(), log the error with context (include symbol,
venue, normalized_mode, and any error message/stack) and optionally increment a
failure metric or a local counter, then continue to the next pair without
re-raising so subsequent pairs still run; keep using the same symbols (session,
kis, _sync_symbol_venue, stats, pairs_processed, rows_upserted, pages_fetched,
now_kst, backfill_days) to locate where to add the logging and failure handling.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- Makefile
- alembic/versions/d31f0a2b4c6d_add_kr_candles_retention_policy.py
- app/jobs/kr_candles.py
- app/services/kr_candles_sync_service.py
- app/tasks/__init__.py
- app/tasks/kr_candles_tasks.py
- scripts/sql/kr_candles_timescale.sql
- scripts/sync_kr_candles.py
- tests/test_kr_candles_sync.py
```python
inactive = sorted(
    symbol
    for symbol in target_symbols
    if symbol in rows_by_symbol and not rows_by_symbol[symbol].is_active
)
if inactive:
    preview = ", ".join(inactive[:10])
    raise ValueError(
        f"KR symbol is inactive in kr_symbol_universe: "
        f"count={len(inactive)} symbols=[{preview}]"
    )
```
Inactive symbols in holdings abort the entire sync run
If any symbol returned by kis.fetch_my_stocks() or the manual holdings is flagged is_active=False in kr_symbol_universe, the entire sync_kr_candles call raises before a single candle is upserted. In live portfolios it is normal for a ticker to become suspended or delisted while still appearing in holdings. This hard-raise blocks candle syncing for all other active symbols.
Consider filtering inactive symbols out (with a warning) and continuing rather than raising:
🛡️ Proposed fix – warn and filter instead of raise
```diff
-    inactive = sorted(
-        symbol
-        for symbol in target_symbols
-        if symbol in rows_by_symbol and not rows_by_symbol[symbol].is_active
-    )
-    if inactive:
-        preview = ", ".join(inactive[:10])
-        raise ValueError(
-            f"KR symbol is inactive in kr_symbol_universe: "
-            f"count={len(inactive)} symbols=[{preview}]"
-        )
-
-    return {symbol: rows_by_symbol[symbol] for symbol in target_symbols}
+    inactive = sorted(
+        symbol
+        for symbol in target_symbols
+        if symbol in rows_by_symbol and not rows_by_symbol[symbol].is_active
+    )
+    if inactive:
+        preview = ", ".join(inactive[:10])
+        logger.warning(
+            "KR symbol is inactive in kr_symbol_universe, skipping: "
+            "count=%d symbols=[%s]",
+            len(inactive),
+            preview,
+        )
+
+    active_symbols = target_symbols - set(inactive)
+    return {symbol: rows_by_symbol[symbol] for symbol in active_symbols}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/services/kr_candles_sync_service.py` around lines 175 - 185, The current
block in sync_kr_candles that builds inactive from target_symbols and raises
ValueError stops the whole run when any holding is flagged is_active=False;
instead, detect inactive (using the existing inactive = sorted(...) logic), log
a warning including count and preview (use process logger or the existing logger
in sync_kr_candles), remove those symbols from target_symbols/rows_by_symbol so
they are skipped, and continue processing the remaining active symbols; do not
raise—just warn and filter out inactive entries before any upserts.
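A runnable reduction of the warn-and-filter behavior described above (the `active_by_symbol` mapping is a hypothetical stand-in for the universe rows; names are illustrative):

```python
import logging

logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger("kr_candles_sync")

def filter_active(target_symbols: set[str],
                  active_by_symbol: dict[str, bool]) -> set[str]:
    """Warn about inactive symbols and return only the active ones,
    instead of aborting the whole sync run with a ValueError."""
    inactive = sorted(
        s for s in target_symbols if not active_by_symbol.get(s, True)
    )
    if inactive:
        logger.warning(
            "skipping inactive symbols: count=%d symbols=[%s]",
            len(inactive), ", ".join(inactive[:10]),
        )
    return target_symbols - set(inactive)

print(sorted(filter_active({"005930", "000660"},
                           {"005930": True, "000660": False})))  # → ['005930']
```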
```python
backfill_days: list[date] | None = None
if normalized_mode == "backfill":
    include_today = now_kst.time() >= _VENUE_CONFIG["KRX"].session_end
    backfill_days = _recent_session_days(
        now_kst,
        session_count,
        include_today=include_today,
    )
```
include_today threshold is hardcoded to KRX session end, leaving partial NTX data in backfill runs
_VENUE_CONFIG["KRX"].session_end (15:30 KST) is used to decide whether to include today in backfill_days for all venues. NTX closes at 20:00 KST, so between 15:30 and 20:00 KST a backfill run will include today in backfill_days, fetch partial NTX today data (bounded by _initial_end_time), and store an incomplete day. The per-venue session_end should drive the gate for each venue, or the threshold should be the latest close across all planned venues.
🛡️ Proposed fix – derive threshold from the latest planned venue
```diff
-        include_today = now_kst.time() >= _VENUE_CONFIG["KRX"].session_end
+        latest_close = max(
+            v.session_end
+            for venues in venue_plan.values()
+            for v in venues
+        )
+        include_today = now_kst.time() >= latest_close
         backfill_days = _recent_session_days(
             now_kst,
             session_count,
             include_today=include_today,
         )
```
Verify each finding against the current code and only fix it if needed.
In `@app/services/kr_candles_sync_service.py` around lines 632 - 639, The backfill
include_today gate currently uses _VENUE_CONFIG["KRX"].session_end (15:30) which
causes partial NTX (20:00) days to be included; change the logic that sets
include_today inside the normalized_mode == "backfill" branch to compute the
threshold from the actual planned venues' session_end values (e.g. take
max(session_end) across the venues you will fetch) instead of hardcoding KRX,
then compare now_kst.time() against that derived threshold to decide
include_today before calling _recent_session_days; update references to
include_today, normalized_mode, now_kst, _VENUE_CONFIG, session_end, and
_recent_session_days accordingly so _initial_end_time bounded partial NTX days
are not persisted.
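A runnable version of the derived-threshold gate discussed above. The 15:30/20:00 session-end times come from the review; the dict and function names are illustrative, not the PR's actual helpers:

```python
from datetime import datetime, time
from zoneinfo import ZoneInfo

KST = ZoneInfo("Asia/Seoul")
# Assumed session ends for the two venues discussed in the review.
SESSION_END = {"KRX": time(15, 30), "NTX": time(20, 0)}

def include_today(now_kst: datetime, planned_venues: list[str]) -> bool:
    """Only include today in a backfill once every planned venue has
    closed, so a partial NTX day is never persisted."""
    latest_close = max(SESSION_END[v] for v in planned_venues)
    return now_kst.time() >= latest_close

now = datetime(2026, 2, 23, 16, 0, tzinfo=KST)  # 16:00 KST: KRX closed, NTX still open
print(include_today(now, ["KRX"]))         # True
print(include_today(now, ["KRX", "NTX"]))  # False
```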
```python
async def sync_kr_candles_incremental_task() -> dict[str, object]:
    try:
        return await run_kr_candles_sync(mode="incremental")
    except Exception as exc:
        logger.error("TaskIQ KR candles sync failed: %s", exc, exc_info=True)
        return {
            "status": "failed",
            "mode": "incremental",
            "error": str(exc),
        }
```
Dead except block — run_kr_candles_sync never raises.
run_kr_candles_sync absorbs all Exception subclasses and returns a status dict. Lines 18-24 are unreachable, which creates a maintenance trap (developers may expect this logger path to fire).
♻️ Simplify to reflect the actual control flow
```diff
 async def sync_kr_candles_incremental_task() -> dict[str, object]:
-    try:
-        return await run_kr_candles_sync(mode="incremental")
-    except Exception as exc:
-        logger.error("TaskIQ KR candles sync failed: %s", exc, exc_info=True)
-        return {
-            "status": "failed",
-            "mode": "incremental",
-            "error": str(exc),
-        }
+    return await run_kr_candles_sync(mode="incremental")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/tasks/kr_candles_tasks.py` around lines 15 - 24, The except block in
sync_kr_candles_incremental_task is dead because run_kr_candles_sync
catches/returns all Exceptions; remove the try/except and either (a) return
await run_kr_candles_sync(mode="incremental") directly, or (b) call
run_kr_candles_sync, inspect its returned dict and log/error-handle based on its
"status" field; update the sync_kr_candles_incremental_task function (and
similarly any other sync_* wrapper) to reflect that control flow change instead
of relying on an unreachable exception handler.
```python
try:
    result = await run_kr_candles_sync(
        mode=args.mode,
        sessions=max(args.sessions, 1),
        user_id=args.user_id,
    )
except Exception as exc:
    capture_exception(exc, process="sync_kr_candles")
    logger.error("KR candles sync crashed: %s", exc, exc_info=True)
    return 1

if result.get("status") != "completed":
    logger.error("KR candles sync failed: %s", result)
    return 1
```
capture_exception is unreachable — sync failures bypass Sentry.
run_kr_candles_sync swallows all Exception subclasses internally, so the except on line 57 is dead code and capture_exception is never called. The status != "completed" branch on line 62 only logs locally; no Sentry event is emitted for sync failures.
🛡️ Proposed fix — route failure status to Sentry
```diff
-    try:
-        result = await run_kr_candles_sync(
-            mode=args.mode,
-            sessions=max(args.sessions, 1),
-            user_id=args.user_id,
-        )
-    except Exception as exc:
-        capture_exception(exc, process="sync_kr_candles")
-        logger.error("KR candles sync crashed: %s", exc, exc_info=True)
-        return 1
+    result = await run_kr_candles_sync(
+        mode=args.mode,
+        sessions=max(args.sessions, 1),
+        user_id=args.user_id,
+    )
     if result.get("status") != "completed":
         logger.error("KR candles sync failed: %s", result)
+        capture_exception(
+            RuntimeError(f"KR candles sync failed: {result}"),
+            process="sync_kr_candles",
+        )
         return 1
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
    result = await run_kr_candles_sync(
        mode=args.mode,
        sessions=max(args.sessions, 1),
        user_id=args.user_id,
    )
    if result.get("status") != "completed":
        logger.error("KR candles sync failed: %s", result)
        capture_exception(
            RuntimeError(f"KR candles sync failed: {result}"),
            process="sync_kr_candles",
        )
        return 1
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/sync_kr_candles.py` around lines 51 - 64, The current except block is
unreachable because run_kr_candles_sync swallows exceptions, so failures where
result.get("status") != "completed" never get reported to Sentry; update the
failure branch to call Sentry (e.g., capture_exception or capture_message) with
the result and relevant context so failures are routed to Sentry. Concretely, in
the block after run_kr_candles_sync where you check result.get("status") !=
"completed", invoke capture_exception (or capture_message) with an informative
message and include the result payload and process="sync_kr_candles" as context,
keep the existing logger.error call, and ensure the original except still
captures unexpected exceptions via capture_exception and logger.error.
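The status-dict error routing the review asks for can be reduced to a runnable sketch. The `capture_exception` here is a local stub standing in for the project's Sentry helper; only the control flow is the point:

```python
# Stub standing in for the real capture_exception helper, so the
# routing of a failure status dict to an error reporter is visible.
captured: list[BaseException] = []

def capture_exception(exc: BaseException, **context: object) -> None:
    captured.append(exc)

def handle_result(result: dict[str, object]) -> int:
    """Failures are signalled via the returned status dict, not an
    exception, so the failure branch must report explicitly."""
    if result.get("status") != "completed":
        capture_exception(
            RuntimeError(f"KR candles sync failed: {result}"),
            process="sync_kr_candles",
        )
        return 1
    return 0

print(handle_result({"status": "completed"}))            # → 0
print(handle_result({"status": "failed", "error": "x"}))  # → 1
```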
```python
def test_build_symbol_union_combines_kis_and_manual_symbols() -> None:
    from app.services import kr_candles_sync_service as svc

    kis_holdings = [
        {"pdno": "5930"},
        {"pdno": "035420"},
        {"pdno": None},
    ]
    manual_holdings = [
        SimpleNamespace(ticker="005930"),
        SimpleNamespace(ticker="000660"),
    ]

    symbols = svc._build_symbol_union(kis_holdings, manual_holdings)

    assert symbols == {"005930", "035420", "000660"}


def test_validate_universe_rows_fails_when_table_empty() -> None:
    from app.services import kr_candles_sync_service as svc

    with pytest.raises(ValueError, match="kr_symbol_universe is empty"):
        svc._validate_universe_rows(
            target_symbols={"005930"},
            universe_rows=[],
            table_has_rows=False,
        )


def test_validate_universe_rows_fails_when_symbol_missing() -> None:
    from app.services import kr_candles_sync_service as svc

    row = _make_universe_row("005930", nxt_eligible=True, is_active=True)

    with pytest.raises(ValueError, match="not registered"):
        svc._validate_universe_rows(
            target_symbols={"005930", "000660"},
            universe_rows=[row],
            table_has_rows=True,
        )


def test_validate_universe_rows_fails_when_symbol_inactive() -> None:
    from app.services import kr_candles_sync_service as svc

    inactive_row = _make_universe_row("005930", nxt_eligible=False, is_active=False)

    with pytest.raises(ValueError, match="inactive"):
        svc._validate_universe_rows(
            target_symbols={"005930"},
            universe_rows=[inactive_row],
            table_has_rows=True,
        )


def test_build_venue_plan_maps_dual_and_single_venues() -> None:
    from app.services import kr_candles_sync_service as svc

    rows_by_symbol = {
        "005930": _make_universe_row("005930", nxt_eligible=True, is_active=True),
        "000660": _make_universe_row("000660", nxt_eligible=False, is_active=True),
    }

    plan = svc._build_venue_plan(rows_by_symbol)

    assert [v.venue for v in plan["005930"]] == ["KRX", "NTX"]
    assert [v.market_code for v in plan["005930"]] == ["J", "NX"]
    assert [v.venue for v in plan["000660"]] == ["KRX"]


def test_should_process_venue_skips_holiday_in_incremental_mode() -> None:
    from app.services import kr_candles_sync_service as svc

    venue = svc._VENUE_CONFIG["KRX"]
    now_kst = datetime(2026, 1, 1, 10, 0, tzinfo=svc._KST)

    should_process, reason = svc._should_process_venue(
        mode="incremental",
        now_kst=now_kst,
        is_session_day=False,
        venue=venue,
    )

    assert should_process is False
    assert reason == "holiday"


def test_should_process_venue_skips_outside_session_in_incremental_mode() -> None:
    from app.services import kr_candles_sync_service as svc

    venue = svc._VENUE_CONFIG["KRX"]
    now_kst = datetime(2026, 2, 23, 8, 10, tzinfo=svc._KST)

    should_process, reason = svc._should_process_venue(
        mode="incremental",
        now_kst=now_kst,
        is_session_day=True,
        venue=venue,
    )

    assert should_process is False
    assert reason == "outside_session"


def test_compute_incremental_cutoff_uses_five_minute_overlap() -> None:
    from app.services import kr_candles_sync_service as svc

    cursor_utc = datetime(2026, 2, 23, 1, 30, tzinfo=UTC)

    cutoff_kst = svc._compute_incremental_cutoff_kst(cursor_utc)

    assert cutoff_kst is not None
    assert cutoff_kst.tzinfo == svc._KST
    assert cutoff_kst.strftime("%Y-%m-%d %H:%M:%S") == "2026-02-23 10:25:00"


def test_convert_kis_datetime_to_utc_interprets_naive_as_kst() -> None:
    from app.services import kr_candles_sync_service as svc

    converted = svc._convert_kis_datetime_to_utc(datetime(2026, 2, 23, 9, 0, 0))

    assert converted.tzinfo == UTC
    assert converted.strftime("%Y-%m-%d %H:%M:%S") == "2026-02-23 00:00:00"


@pytest.mark.asyncio
async def test_run_kr_candles_sync_success_payload(
    monkeypatch: pytest.MonkeyPatch,
) -> None:
    from app.jobs import kr_candles

    monkeypatch.setattr(
        kr_candles,
        "sync_kr_candles",
        AsyncMock(return_value={"mode": "incremental", "rows_upserted": 11}),
    )

    result = await kr_candles.run_kr_candles_sync(mode="incremental")

    assert result["status"] == "completed"
    assert result["mode"] == "incremental"
    assert result["rows_upserted"] == 11


@pytest.mark.asyncio
async def test_run_kr_candles_sync_failure_payload(
    monkeypatch: pytest.MonkeyPatch,
) -> None:
    from app.jobs import kr_candles

    monkeypatch.setattr(
        kr_candles,
        "sync_kr_candles",
        AsyncMock(side_effect=RuntimeError("sync failure")),
    )

    result = await kr_candles.run_kr_candles_sync(mode="incremental")

    assert result["status"] == "failed"
    assert "sync failure" in str(result["error"])


@pytest.mark.asyncio
async def test_task_payload_success(monkeypatch: pytest.MonkeyPatch) -> None:
    from app.tasks import kr_candles_tasks

    monkeypatch.setattr(
        kr_candles_tasks,
        "run_kr_candles_sync",
        AsyncMock(return_value={"status": "completed", "rows_upserted": 3}),
    )

    result = await kr_candles_tasks.sync_kr_candles_incremental_task()

    assert result["status"] == "completed"
    assert result["rows_upserted"] == 3


@pytest.mark.asyncio
async def test_task_payload_failure(monkeypatch: pytest.MonkeyPatch) -> None:
    from app.tasks import kr_candles_tasks

    monkeypatch.setattr(
        kr_candles_tasks,
        "run_kr_candles_sync",
        AsyncMock(side_effect=RuntimeError("task crash")),
    )

    result = await kr_candles_tasks.sync_kr_candles_incremental_task()

    assert result["status"] == "failed"
    assert "task crash" in str(result["error"])


@pytest.mark.asyncio
async def test_script_main_exit_codes(monkeypatch: pytest.MonkeyPatch) -> None:
    from scripts import sync_kr_candles

    monkeypatch.setattr(sync_kr_candles, "init_sentry", lambda **_: None)
    monkeypatch.setattr(
        sync_kr_candles,
        "run_kr_candles_sync",
        AsyncMock(return_value={"status": "completed", "rows_upserted": 1}),
    )
    success_code = await sync_kr_candles.main(["--mode", "incremental"])
    assert success_code == 0

    monkeypatch.setattr(
        sync_kr_candles,
        "run_kr_candles_sync",
        AsyncMock(return_value={"status": "failed", "error": "boom"}),
    )
    failed_status_code = await sync_kr_candles.main(["--mode", "incremental"])
    assert failed_status_code == 1

    capture_mock = MagicMock()
    monkeypatch.setattr(sync_kr_candles, "capture_exception", capture_mock)
    monkeypatch.setattr(
        sync_kr_candles,
        "run_kr_candles_sync",
        AsyncMock(side_effect=RuntimeError("hard crash")),
    )
    exception_code = await sync_kr_candles.main(["--mode", "incremental"])
    assert exception_code == 1
    capture_mock.assert_called_once()


def test_new_retention_migration_contains_upgrade_and_downgrade_policy_sql() -> None:
    versions_dir = Path("alembic/versions")
    matches = sorted(versions_dir.glob("*_add_kr_candles_retention_policy.py"))
    assert matches, "retention migration file is missing"

    content = matches[-1].read_text(encoding="utf-8")

    assert "add_retention_policy" in content
    assert "remove_retention_policy" in content
    assert "kr_candles_1m" in content
    assert "kr_candles_1h" in content
    assert "90 days" in content


def test_sql_script_contains_90_day_retention_policy_for_both_tables() -> None:
    content = Path("scripts/sql/kr_candles_timescale.sql").read_text(encoding="utf-8")

    assert "add_retention_policy" in content
    assert "remove_retention_policy" in content
    assert "public.kr_candles_1m" in content
    assert "public.kr_candles_1h" in content
    assert "90 days" in content
```
All tests are missing required category markers — coding guideline violation.
Every test function lacks @pytest.mark.unit (or integration/slow). All tests here are unit tests (no DB, no network). Per coding guidelines, all tests must be categorized with registered markers.
♻️ Example fix for the first few functions

```diff
+@pytest.mark.unit
 def test_build_symbol_union_combines_kis_and_manual_symbols() -> None:

+@pytest.mark.unit
 def test_validate_universe_rows_fails_when_table_empty() -> None:

+@pytest.mark.unit
 @pytest.mark.asyncio
 async def test_run_kr_candles_sync_success_payload(
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_kr_candles_sync.py` around lines 28 - 276, All tests in this file
lack the required pytest marker; add classification by either decorating each
test function (e.g. `@pytest.mark.unit` above
test_build_symbol_union_combines_kis_and_manual_symbols,
test_validate_universe_rows_fails_when_table_empty,
test_validate_universe_rows_fails_when_symbol_missing,
test_validate_universe_rows_fails_when_symbol_inactive,
test_build_venue_plan_maps_dual_and_single_venues,
test_should_process_venue_skips_holiday_in_incremental_mode,
test_should_process_venue_skips_outside_session_in_incremental_mode,
test_compute_incremental_cutoff_uses_five_minute_overlap,
test_convert_kis_datetime_to_utc_interprets_naive_as_kst,
test_run_kr_candles_sync_success_payload,
test_run_kr_candles_sync_failure_payload, test_task_payload_success,
test_task_payload_failure, test_script_main_exit_codes,
test_new_retention_migration_contains_upgrade_and_downgrade_policy_sql,
test_sql_script_contains_90_day_retention_policy_for_both_tables) or add a
module-level marker like pytestmark = pytest.mark.unit at top of the file so
every test is categorized as a unit test per guidelines.
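The module-level alternative mentioned at the end of this prompt can be sketched as follows (the `unit` marker name is taken from the review above; the test body is a placeholder, not project code):

```python
import pytest

# Module-level marker: applies the `unit` category to every test in this
# file, equivalent to decorating each function with `@pytest.mark.unit`.
pytestmark = pytest.mark.unit


def test_example() -> None:
    assert 1 + 1 == 2
```

With `pytestmark` set, `pytest -m unit` selects every test in the module without touching the individual functions.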
```python
    capture_mock = MagicMock()
    monkeypatch.setattr(sync_kr_candles, "capture_exception", capture_mock)
    monkeypatch.setattr(
        sync_kr_candles,
        "run_kr_candles_sync",
        AsyncMock(side_effect=RuntimeError("hard crash")),
    )
    exception_code = await sync_kr_candles.main(["--mode", "incremental"])
    assert exception_code == 1
    capture_mock.assert_called_once()
```
test_script_main_exit_codes exception-path branch tests dead production code.
The test patches run_kr_candles_sync to raise RuntimeError to cover lines 57-60 of sync_kr_candles.py. Those lines are unreachable in production (as run_kr_candles_sync absorbs all exceptions). If the dead except is removed per the fix suggested above, this test sub-case will need to be updated or removed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_kr_candles_sync.py` around lines 243 - 252, Remove the dead
exception-path subcase in the test_script_main_exit_codes test: delete the block
that patches sync_kr_candles.run_kr_candles_sync to raise RuntimeError (and the
related capture_mock/capture_exception assertions), because run_kr_candles_sync
absorbs exceptions in production and that branch is unreachable; leave the other
exit-code assertions intact so the test no longer asserts behavior for the
removed except branch in sync_kr_candles.main.
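The "single error boundary" layering implied by this prompt can be sketched in isolation (function names mirror the PR's, but the bodies are hypothetical stand-ins, not the actual job/script code):

```python
import asyncio


async def run_sync() -> dict[str, str]:
    # Job layer: let exceptions propagate instead of absorbing them,
    # so the script layer below is the single error boundary.
    raise RuntimeError("sync failure")


async def main() -> int:
    # Script layer: maps success/failure onto process exit codes.
    try:
        result = await run_sync()
    except Exception as exc:  # reachable, because run_sync re-raises
        print(f"sync failed: {exc}")
        return 1
    return 0 if result.get("status") == "completed" else 1


print(asyncio.run(main()))  # → 1
```

If instead the job layer swallows every exception and returns a status dict, the script's `except` branch becomes unreachable dead code — which is exactly the situation the review flags.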
| def test_new_retention_migration_contains_upgrade_and_downgrade_policy_sql() -> None: | ||
| versions_dir = Path("alembic/versions") | ||
| matches = sorted(versions_dir.glob("*_add_kr_candles_retention_policy.py")) | ||
| assert matches, "retention migration file is missing" |
Relative Path("alembic/versions") ties the test to the repo-root working directory.
This is fine when run via make test (repo root), but fragile if run from any other directory. Prefer anchoring to __file__:
♻️ Proposed fix

```diff
-    versions_dir = Path("alembic/versions")
+    versions_dir = Path(__file__).parent.parent / "alembic" / "versions"
```

The same fix applies to line 270 (`Path("scripts/sql/kr_candles_timescale.sql")`):

```diff
-    content = Path("scripts/sql/kr_candles_timescale.sql").read_text(encoding="utf-8")
+    content = (Path(__file__).parent.parent / "scripts" / "sql" / "kr_candles_timescale.sql").read_text(encoding="utf-8")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_kr_candles_sync.py` around lines 255 - 258, The test uses relative
Paths tied to the current working directory; update them to be
repo-root-relative by resolving from the test file location: in
test_new_retention_migration_contains_upgrade_and_downgrade_policy_sql replace
Path("alembic/versions") with Path(__file__).resolve().parent.parent / "alembic"
/ "versions" (so the glob still works), and likewise replace
Path("scripts/sql/kr_candles_timescale.sql") with
Path(__file__).resolve().parent.parent / "scripts" / "sql" /
"kr_candles_timescale.sql"; keep the same variable names and match/glob logic in
the existing function to avoid changing behavior.
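The `__file__`-anchored pattern recommended above, as a standalone sketch (the `tests/` → repo-root directory layout is an assumption):

```python
from pathlib import Path

# Anchor lookups to this file's location rather than the current working
# directory, so the test passes no matter where pytest is invoked from.
# (Layout assumed: <repo>/tests/this_file.py -> parents[1] == <repo>.)
repo_root = Path(__file__).resolve().parents[1]
versions_dir = repo_root / "alembic" / "versions"
print(versions_dir.is_absolute())  # → True
```

`Path.resolve()` also normalizes symlinks, so the computed root is stable even when the repo is checked out behind a symlinked path.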
Switch docker-compose.yml to timescale/timescaledb-ha:pg17 to support KR candles hypertable and continuous aggregate features.
PR Review: Harden KR Candles Timescale Migration and SQL

Overall this is a well-structured PR with solid TimescaleDB integration, good idempotent SQL guards, and respectable test coverage. The microsecond tie-break keys for deterministic FIRST/LAST ordering are a clever approach. Below are findings from high to low severity.

**Critical**

1. Session not used as async context manager

```python
# Current – fragile double-cast, not a proper context manager
session = cast(AsyncSession, cast(object, AsyncSessionLocal()))
try:
    ...
finally:
    await session.close()
```

This bypasses the SQLAlchemy session lifecycle. If the session factory returns an async context manager, prefer:

```python
async with AsyncSessionLocal() as session:
    ...
    # no explicit close needed
```

The double `cast` can then be dropped as well.

2. `min(allowed_days)` is recomputed on every loop iteration

```python
while True:
    ...
    if allowed_days is not None:
        if current_day < min(allowed_days):  # O(n) on every iteration
            break
```

Hoist the `min()` out of the loop:

```python
earliest_allowed = min(allowed_days) if allowed_days is not None else None
while True:
    if earliest_allowed is not None and current_day < earliest_allowed:
        break
```

**Bugs**

3. Upsert row count includes no-ops

```python
_ = await session.execute(_UPSERT_SQL, payload)
return len(payload)  # counts no-ops too
```

The returned count is the payload size, so no-op conflicts are counted as upserted rows.

4. Layered exception handling creates dead code in `scripts/sync_kr_candles.py`

```python
try:
    result = await run_kr_candles_sync(...)  # never raises
except Exception as exc:
    capture_exception(exc, ...)  # dead code for sync errors
    return 1
```

Either have the job layer re-raise (so the script layer is the single error boundary), or remove the redundant try/except in the script.

**Code Quality**

5. Missing module docstring in retention migration

The retention migration file is missing the module-level docstring and the …

6. Undocumented venue priority logic in the continuous aggregate

The FIRST/LAST tie-break keys use opposite priority for open vs. close. This appears intentional (NTX closes later so it holds the canonical last trade price for an hour), but there is no comment explaining the reasoning. This logic will confuse future maintainers. A short inline comment is warranted.

7. Variable naming

```python
value_value = _parse_float(item.get("value"))
```

Consider a less redundant name.

8. Redundant … — applied in both …

9. Tests missing markers — per CLAUDE.md, unit tests should be decorated with `@pytest.mark.unit`.

10. …

```python
if len(text_value) == 6 and text_value.isalnum():
    return text_value
return None
```

Any symbol that is 7+ characters returns `None` …

11. Dual-type handling in …

```python
if isinstance(item, dict):
    raw_symbol = cast(object | None, item.get("pdno"))
else:
    raw_symbol = getattr(item, "pdno", None)
```

If …

**Minor Observations (no action required)**

…

Summary: Two items need attention before merge (session lifecycle, …).
PR Review: Harden KR candles Timescale migration and SQL

Overall this is a well-structured addition. The architecture is clean — the sync service is decomposed into small, testable pure functions, the migration version-guards are solid, and the job/task separation follows the existing project patterns. Below are findings by severity.

🔴 High — Session creation anti-pattern

File: …

```python
session = cast(AsyncSession, cast(object, AsyncSessionLocal()))
```

The double-cast is a workaround to suppress type errors and is fragile. If …, prefer:

```python
async with AsyncSessionLocal() as session:
    # all sync logic here
    ...
```

This makes rollback/close automatic and matches every other service in the codebase.

🟡 Medium — …
Actionable comments posted: 2
🧹 Nitpick comments (3)
docs/plans/2026-02-23-kr-candles-local-backfill-validation-design.md (1)
**40-42: Add a precondition-check query and remove the fixed `LIMIT 20` cap.**

Two small but concrete gaps in the validation coverage:

- The failure criteria flag an empty/inactive `kr_symbol_universe` as a hard blocker, but no SQL query is provided to check it before starting ingestion — making that precondition opaque.
- `LIMIT 20` in Query 1 silently hides any symbols beyond the 20th row. If the universe is larger, incomplete coverage passes unnoticed.

📄 Suggested additions

Add a precondition query under Validation Queries (or in a new Precondition Queries subsection):

```sql
-- Precondition: confirm active symbols exist
SELECT symbol, venue, is_active
FROM public.kr_symbol_universe
WHERE is_active = TRUE
ORDER BY symbol;
```

Drop the arbitrary `LIMIT 20` from Query 1, or replace it with a count cross-check:

```diff
-ORDER BY cnt DESC
-LIMIT 20;
+ORDER BY cnt DESC;
```

Also applies to: 49-55
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/plans/2026-02-23-kr-candles-local-backfill-validation-design.md` around lines 40 - 42, Add an explicit precondition-check SQL query to verify active symbols exist in the kr_symbol_universe before ingestion (e.g., select symbol, venue, is_active where is_active = TRUE ordered by symbol) and place it under "Validation Queries" or a new "Precondition Queries" subsection; also remove the fixed "LIMIT 20" in Query 1 (or replace it with a count-based cross-check that validates the returned rows equal the universe size) so the validation covers the full kr_symbol_universe rather than silently truncating results.docker-compose.yml (2)
**12-12: Heads-up: existing `pg_data` volume must be destroyed before first use.**

The `pg_data` volume populated by the previous `postgres:17` image is incompatible with `timescale/timescaledb-ha`. Attempting to start the container against an existing vanilla-PostgreSQL data directory will fail or produce a broken state. Developers (and CI environments) must run `docker compose down -v` before pulling the new image.

Consider adding a note to the project README or a `docker-compose.override.yml` comment to signal this one-time teardown requirement.
Verify each finding against the current code and only fix it if needed. In `@docker-compose.yml` at line 12, The docker volume pg_data used in docker-compose.yml is incompatible with the new image timescale/timescaledb-ha and must be removed before first use; update documentation and/or the compose config to warn users to run docker compose down -v (or docker-compose down -v) before starting the new service, and add a clear comment near the pg_data entry in docker-compose.yml (reference symbol: pg_data) and a short note in the README describing the one-time teardown step and rationale (reference image names: postgres:17 → timescale/timescaledb-ha).
4-4: Pin the image to a specific TimescaleDB version tag.
timescale/timescaledb-ha:pg17is a floating tag that resolves to the latest TimescaleDB build at pull time. The codebase enforces a minimum TimescaleDB version of 2.8.1 (checked in migrations), and using an unpinned tag means different developers or CI runs may silently pull different versions, defeating that guarantee.Use a pinned tag like
pg17-ts2.24(or a more recent major.minor series) to ensure reproducible environments:🔧 Proposed fix
- image: timescale/timescaledb-ha:pg17 + image: timescale/timescaledb-ha:pg17-ts2.24🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docker-compose.yml` at line 4, Replace the floating Docker image tag "timescale/timescaledb-ha:pg17" in docker-compose.yml with a pinned TimescaleDB release that satisfies the project's minimum (e.g., use "timescale/timescaledb-ha:pg17-ts2.24" or a newer pg17-ts<major.minor> tag); update the image line so CI and developers always pull the same TimescaleDB minor version that meets the migrations' minimum (>=2.8.1).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/plans/2026-02-23-kr-candles-local-backfill-validation-design.md`:
- Around line 48-62: Validation queries scan entire tables and can be fooled by
pre-existing rows; add explicit time-bounded predicates anchored to the
recent-3-trading-sessions window so they only validate newly targeted ranges.
Update the first query on public.kr_candles_1m to include a WHERE time >=
:start_time (or computed start_time for the 3-session window) and the second
query on public.kr_candles_1h to include WHERE bucket >= :start_bucket (or
equivalent timestamp) and, if helpful, AND bucket <= :end_bucket; use these
parameters when running the validation so results reflect only the current
backfill window.
- Around line 28-30: Add an explicit TimescaleDB continuous-aggregate refresh
between the ingestion step and validation to avoid stale/empty results: after
running the backfill (scripts/sync_kr_candles.py --mode backfill) call a refresh
for the public.kr_candles_1h continuous aggregate (using
refresh_continuous_aggregate or equivalent) so the 1h aggregate is up-to-date
before running the validation query against kr_candles_1h.
---
Nitpick comments:
In `@docker-compose.yml`:
- Line 12: The docker volume pg_data used in docker-compose.yml is incompatible
with the new image timescale/timescaledb-ha and must be removed before first
use; update documentation and/or the compose config to warn users to run docker
compose down -v (or docker-compose down -v) before starting the new service, and
add a clear comment near the pg_data entry in docker-compose.yml (reference
symbol: pg_data) and a short note in the README describing the one-time teardown
step and rationale (reference image names: postgres:17 →
timescale/timescaledb-ha).
- Line 4: Replace the floating Docker image tag "timescale/timescaledb-ha:pg17"
in docker-compose.yml with a pinned TimescaleDB release that satisfies the
project's minimum (e.g., use "timescale/timescaledb-ha:pg17-ts2.24" or a newer
pg17-ts<major.minor> tag); update the image line so CI and developers always
pull the same TimescaleDB minor version that meets the migrations' minimum
(>=2.8.1).
In `@docs/plans/2026-02-23-kr-candles-local-backfill-validation-design.md`:
- Around line 40-42: Add an explicit precondition-check SQL query to verify
active symbols exist in the kr_symbol_universe before ingestion (e.g., select
symbol, venue, is_active where is_active = TRUE ordered by symbol) and place it
under "Validation Queries" or a new "Precondition Queries" subsection; also
remove the fixed "LIMIT 20" in Query 1 (or replace it with a count-based
cross-check that validates the returned rows equal the universe size) so the
validation covers the full kr_symbol_universe rather than silently truncating
results.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
docker-compose.ymldocs/plans/2026-02-23-kr-candles-local-backfill-validation-design.mddocs/plans/2026-02-23-kr-candles-local-backfill-validation-implementation-plan.md
✅ Files skipped from review due to trivial changes (1)
- docs/plans/2026-02-23-kr-candles-local-backfill-validation-implementation-plan.md
2. Ingestion:
   - Run:
     - `uv run python scripts/sync_kr_candles.py --mode backfill --sessions 3`
Add a manual continuous-aggregate refresh step before querying kr_candles_1h.
TimescaleDB continuous aggregates are NOT refreshed in real-time — they are driven by a background policy. On a local dev instance that policy may not have fired, meaning kr_candles_1h will be empty (or stale) even after a successful backfill of kr_candles_1m. The current Ingestion step goes straight from script execution to validation without triggering a refresh, which will produce a false failure every time the policy hasn't run yet.
Add an explicit refresh call between ingestion and validation:
📄 Proposed addition to the Ingestion step
2. Ingestion:
- Run:
- `uv run python scripts/sync_kr_candles.py --mode backfill --sessions 3`
+ - Then manually refresh the continuous aggregate so `kr_candles_1h` is up-to-date:
+ ```sql
+ CALL refresh_continuous_aggregate('public.kr_candles_1h', NULL, NULL);
+ ```📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```
2. Ingestion:
   - Run:
     - `uv run python scripts/sync_kr_candles.py --mode backfill --sessions 3`
   - Then manually refresh the continuous aggregate so `kr_candles_1h` is up-to-date:
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/plans/2026-02-23-kr-candles-local-backfill-validation-design.md` around
lines 28 - 30, Add an explicit TimescaleDB continuous-aggregate refresh between
the ingestion step and validation to avoid stale/empty results: after running
the backfill (scripts/sync_kr_candles.py --mode backfill) call a refresh for the
public.kr_candles_1h continuous aggregate (using refresh_continuous_aggregate or
equivalent) so the 1h aggregate is up-to-date before running the validation
query against kr_candles_1h.
## Validation Queries

```sql
SELECT symbol, venue, COUNT(*) AS cnt, MIN(time) AS min_time, MAX(time) AS max_time
FROM public.kr_candles_1m
GROUP BY symbol, venue
ORDER BY cnt DESC
LIMIT 20;
```

```sql
SELECT symbol, bucket, open, high, low, close, volume, value, venues
FROM public.kr_candles_1h
ORDER BY bucket DESC
LIMIT 50;
```
Validation queries lack a time-bound filter — false positives from pre-existing data.
Both Query 1 and Query 2 scan the full table with no WHERE time/bucket >= ... predicate anchored to the "recent 3 trading sessions" window stated in the Goal. If any rows from a previous run already exist, the queries will report success even if the current backfill wrote nothing.
📄 Proposed time-bounded variants
```diff
-SELECT symbol, venue, COUNT(*) AS cnt, MIN(time) AS min_time, MAX(time) AS max_time
-FROM public.kr_candles_1m
-GROUP BY symbol, venue
-ORDER BY cnt DESC
-LIMIT 20;
+SELECT symbol, venue, COUNT(*) AS cnt, MIN(time) AS min_time, MAX(time) AS max_time
+FROM public.kr_candles_1m
+WHERE time >= now() - INTERVAL '3 days'  -- adjust to session-start boundary as needed
+GROUP BY symbol, venue
+ORDER BY cnt DESC;
```

```diff
-SELECT symbol, bucket, open, high, low, close, volume, value, venues
-FROM public.kr_candles_1h
-ORDER BY bucket DESC
-LIMIT 50;
+SELECT symbol, bucket, open, high, low, close, volume, value, venues
+FROM public.kr_candles_1h
+WHERE bucket >= now() - INTERVAL '3 days'
+ORDER BY bucket DESC;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/plans/2026-02-23-kr-candles-local-backfill-validation-design.md` around
lines 48 - 62, Validation queries scan entire tables and can be fooled by
pre-existing rows; add explicit time-bounded predicates anchored to the
recent-3-trading-sessions window so they only validate newly targeted ranges.
Update the first query on public.kr_candles_1m to include a WHERE time >=
:start_time (or computed start_time for the 3-session window) and the second
query on public.kr_candles_1h to include WHERE bucket >= :start_bucket (or
equivalent timestamp) and, if helpful, AND bucket <= :end_bucket; use these
parameters when running the validation so results reflect only the current
backfill window.
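Computing the `:start_time` bind parameter mentioned in this prompt could look like the following sketch (a fixed KST-midnight offset stands in for real trading-session boundaries, which would come from a calendar):

```python
from datetime import datetime, timedelta, timezone

KST = timezone(timedelta(hours=9))


def window_start(now: datetime, days: int = 3) -> datetime:
    """Stand-in for the ':start_time' bind parameter: anchor the validation
    window N days back at KST midnight. (A real implementation would use the
    trading calendar, not a fixed day offset.)"""
    local = now.astimezone(KST)
    midnight = local.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight - timedelta(days=days)


start = window_start(datetime(2026, 2, 23, 10, 0, tzinfo=KST))
print(start.isoformat())  # → 2026-02-20T00:00:00+09:00
```

The resulting timestamp is what would be bound into `WHERE time >= :start_time`, so the validation only inspects rows the current backfill was expected to write.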
PR #186 Code Review: KR Candles Timescale Migration and SQL

Overall this is a well-structured PR that introduces TimescaleDB-backed minute candle storage and a DB-first hourly aggregation layer. The test coverage is comprehensive and the migration design is careful. Below are the key findings, from most to least critical.

**Bugs / Correctness Issues**

1. OHLC accuracy in current-hour reconstruction (…)

When merging KRX and NTX data for the current hour:

```python
# combined is built as one row per minute, each with source.high/source.low
high_ = max(m.high for m in combined)
low_ = min(m.low for m in combined)
```

If NTX has a higher high or lower low than KRX at the same minute, those extremes are silently discarded. The true hourly high/low should take the maximum/minimum across all venues per minute before aggregating:

```python
high_ = max(max(r.high for r in group.values()) for group in minutes_by_time.values())
low_ = min(min(r.low for r in group.values()) for group in minutes_by_time.values())
```

2. CLOSE tie-break direction in continuous aggregate (migration + SQL script)

```sql
LAST(
    close,
    ((extract(epoch from time) * 1000000)::bigint * 2
     + CASE WHEN venue = 'KRX' THEN 1 ELSE 0 END)
) AS close
```

…

**Code Quality / Design Issues**

3. Session not used as a context manager (…)

```python
session = cast(AsyncSession, cast(object, AsyncSessionLocal()))
try:
    ...
finally:
    await session.close()
```

The double-cast is a type-checking workaround that bypasses …

```python
async with AsyncSessionLocal() as session:
    ...
```

If using the context manager form is impractical here due to the commit-per-pair loop, explicitly document why. The current pattern works but is fragile and misleads readers — in …

4. Bare …

5. Strict row count guard may fail during initial deployment (…)

```python
if len(out) < capped_count:
    raise ValueError(
        f"DB does not have enough KR 1h candles for {universe.symbol}: ..."
    )
```

This raises on any call before the backfill is complete, including any tool call that requests more rows than exist in the DB. Consider returning available rows (up to …).

**Infrastructure / Breaking Change**

6. …

```diff
- image: postgres:17
+ image: timescale/timescaledb-ha:pg17
```

This changes the database engine for all developers. The …

**Minor Issues**

7. Planning docs contain internal AI agent instructions (…)

The files added under … These are internal AI workflow instructions and should be stripped before merging, or the …

8. Missing module docstring in retention migration

9. 90-day retention on …

Applying a separate 90-day retention policy to the continuous aggregate view will drop hourly bucket data even if the underlying 1m data is also retained for 90 days. Since the 1h view is rebuilt on demand via …

**What's Done Well**

…
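Finding 1's proposed per-minute merge can be checked with a standalone sketch (the `MinuteRow` structure and its field names are assumptions for illustration, not the PR's actual types):

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class MinuteRow:
    time: int    # minute bucket (e.g. epoch minutes)
    venue: str   # "KRX" or "NTX"
    high: float
    low: float


def hourly_high_low(rows: list[MinuteRow]) -> tuple[float, float]:
    """Take the extreme across venues within each minute first, so a higher
    NTX high at the same minute as a KRX row is not discarded."""
    minutes_by_time: dict[int, list[MinuteRow]] = defaultdict(list)
    for row in rows:
        minutes_by_time[row.time].append(row)
    high_ = max(max(r.high for r in group) for group in minutes_by_time.values())
    low_ = min(min(r.low for r in group) for group in minutes_by_time.values())
    return high_, low_


rows = [
    MinuteRow(0, "KRX", high=101.0, low=99.0),
    MinuteRow(0, "NTX", high=102.0, low=98.5),  # NTX extremes must survive the merge
    MinuteRow(1, "KRX", high=100.0, low=99.5),
]
print(hourly_high_low(rows))  # → (102.0, 98.5)
```

A one-row-per-minute merge that picks a single source venue would report (101.0, 99.0) for minute 0 here, which is the silent-discard bug the review describes.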
Actionable comments posted: 3
🧹 Nitpick comments (4)
tests/test_kr_hourly_candles_read_service.py (1)
**144-174: DummyDB is duplicated across all 9 tests with minor variations.**

Each test defines its own `DummyDB` class with nearly identical structure — only the `nxt_eligible`, returned hour/minute rows, and sometimes the active flag differ. Consider extracting a reusable factory or fixture to reduce ~200 lines of boilerplate.
def _make_dummy_db( *, symbol: str, nxt_eligible: bool = True, is_active: bool = True, hour_rows: list[dict] | None = None, minute_rows: list[dict] | None = None, ): class DummyDB: def __init__(self): self.calls: list[str] = [] async def execute(self, query, params=None): sql = str(getattr(query, "text", query)) self.calls.append(sql) if "FROM public.kr_symbol_universe" in sql and "LIMIT 1" in sql: return _ScalarResult(symbol) if "FROM public.kr_symbol_universe" in sql and "WHERE symbol" in sql: return _MappingsResult( [{"symbol": symbol, "nxt_eligible": nxt_eligible, "is_active": is_active}] ) if "FROM public.kr_candles_1h" in sql: return _MappingsResult(hour_rows or []) if "FROM public.kr_candles_1m" in sql: rows = list(minute_rows or []) if isinstance(params, dict) and params.get("start_time") is not None: start, end = params["start_time"], params["end_time"] rows = [r for r in rows if start <= r["time"] < end] return _MappingsResult(rows) raise AssertionError(f"unexpected sql: {sql}") return DummyDB()Also applies to: 289-315, 372-391, 454-473, 522-541, 613-632, 678-697, 748-767, 793-812, 836-855
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_kr_hourly_candles_read_service.py` around lines 144 - 174, Multiple tests duplicate the DummyDB class with only small variations; replace the repeated classes with a single reusable factory/fixture. Create a function (e.g., _make_dummy_db) that accepts parameters symbol, nxt_eligible, is_active, hour_rows, minute_rows and returns an instance whose execute method implements the existing SQL branching logic (matching "FROM public.kr_symbol_universe", "FROM public.kr_candles_1h", "FROM public.kr_candles_1m") and time-range filtering; then update each test to call _make_dummy_db(...) instead of redefining DummyDB. Ensure the returned object's .calls list and behavior exactly mirror the original DummyDB so existing assertions keep working.app/services/kr_hourly_candles_read_service.py (3)
29-30: Unnecessary double cast.

`cast(AsyncSession, cast(object, AsyncSessionLocal()))` is overly convoluted. A single cast suffices.

♻️ Simplify
```diff
 def _async_session() -> AsyncSession:
-    return cast(AsyncSession, cast(object, AsyncSessionLocal()))
+    return cast(AsyncSession, AsyncSessionLocal())
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/services/kr_hourly_candles_read_service.py` around lines 29 - 30, The _async_session function uses an unnecessary double cast; replace cast(AsyncSession, cast(object, AsyncSessionLocal())) with a single cast of the AsyncSessionLocal() return to AsyncSession (e.g., cast(AsyncSession, AsyncSessionLocal())) or, even better, annotate AsyncSessionLocal to return AsyncSession and remove the cast entirely; update the _async_session function (symbol: _async_session and AsyncSessionLocal) accordingly.
181-215: Multiple independent DB sessions for a single logical read.
`read_kr_hourly_candles_1h` opens up to 4 separate DB sessions (the universe check in `_resolve_universe_row`, hour rows in `_fetch_hour_rows`, minute rows in `_fetch_minute_rows` via `_build_current_hour_row`, plus the universe queries themselves open one session each). Consider accepting an optional session parameter or using a shared session across the read path to reduce connection overhead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/services/kr_hourly_candles_read_service.py` around lines 181 - 215, The code currently opens multiple independent DB sessions across the read path; modify the functions to accept an optional AsyncSession and reuse a single session instead of creating new ones: add an optional parameter session: Optional[AsyncSession] = None to read_kr_hourly_candles_1h, _resolve_universe_row, _fetch_hour_rows, _fetch_minute_rows and _build_current_hour_row (or whichever helpers call DB), thread the provided session through each call, and only create/close a new session in read_kr_hourly_candles_1h when session is None (i.e., use "async with _async_session() as session" only in the top-level function when needed); update calls to pass the session to avoid multiple connections and ensure you only commit/close the session if you created it.
444-463: KRX-priority merge discards NTX high/low extremes.

When both KRX and NTX report for the same minute, `source = group.get("KRX") or group.get("NTX")` takes OHLC entirely from KRX. The per-minute `high`/`low` could be more extreme on NTX, and those values are silently dropped before the hourly `max(high)`/`min(low)` aggregation at lines 469-470.

If this is intentional (KRX is the canonical price source), consider adding a brief inline comment to document the design choice. If the hourly candle should reflect the true high/low across both venues, the merge should take `max(high)` and `min(low)` across all venue rows per minute.

♻️ Option: true cross-venue high/low per minute
```diff
 source = group.get("KRX") or group.get("NTX")
 if source is None:
     continue
 volume = sum(r.volume for r in group.values())
 value = sum(r.value for r in group.values())
+high = max(r.high for r in group.values())
+low = min(r.low for r in group.values())
 combined.append(
     _MinuteRow(
         minute_time=minute_time,
         venue=source.venue,
         open=source.open,
-        high=source.high,
-        low=source.low,
+        high=high,
+        low=low,
         close=source.close,
         volume=volume,
         value=value,
     )
 )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/services/kr_hourly_candles_read_service.py` around lines 444 - 463, The current per-minute merge (iterating minutes_by_time and building combined with _MinuteRow) always selects OHLC from a single source via source = group.get("KRX") or group.get("NTX"), which discards NTX extrema; change the merge so minute high = max(r.high for r in group.values()) and minute low = min(r.low for r in group.values()) (keep open/close/venue selection as-is or document choice) and continue summing volume/value across group.values(); alternatively, if KRX dominance is intentional, add a concise inline comment next to the source selection explaining that KRX is canonical for OHLC to avoid silent data loss.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/plans/2026-02-23-kr-ohlcv-1h-v2.md`:
- Line 13: The headings jump from h1 to h3; change the task headings to h2 to
satisfy markdownlint MD001 — update the "### Task 1: Fix typing/LSP issues in
MCP integration tests" heading to "## Task 1: Fix typing/LSP issues in MCP
integration tests" and likewise change the "###" headings for Tasks 2–4 (the
headings referenced in the details block) to "##" so each task heading
increments properly from the document h1.
In `@tests/test_kr_hourly_candles_read_service.py`:
- Around line 95-96: Add the missing pytest unit markers by decorating each test
function with `@pytest.mark.unit` (ensure pytest is imported) — specifically add
`@pytest.mark.unit` above test_api_prefetch_plan_time_boundaries_for_nxt_eligible
and the other test functions referenced (test names at lines 250-251, 353-354,
410-411, 500-501, 581-582, 656-657, 715-716, 787-788, 829-830) so all nine tests
are explicitly marked as unit tests; keep the existing `@pytest.mark.asyncio`
decorator and place `@pytest.mark.unit` directly above or alongside it.
In `@tests/test_mcp_server_tools.py`:
- Around line 1430-1461: This test function
test_get_ohlcv_kr_equity_period_1h_includes_session_and_venues is fully mocked
and needs the pytest unit marker; add `@pytest.mark.unit` above the async def and
ensure pytest is imported at the top of the file if not already present so the
marker resolves.
---
Nitpick comments:
In `@app/services/kr_hourly_candles_read_service.py`:
- Around line 29-30: The _async_session function uses an unnecessary double
cast; replace cast(AsyncSession, cast(object, AsyncSessionLocal())) with a
single cast of the AsyncSessionLocal() return to AsyncSession (e.g.,
cast(AsyncSession, AsyncSessionLocal())) or, even better, annotate
AsyncSessionLocal to return AsyncSession and remove the cast entirely; update
the _async_session function (symbol: _async_session and AsyncSessionLocal)
accordingly.
- Around line 181-215: The code currently opens multiple independent DB sessions
across the read path; modify the functions to accept an optional AsyncSession
and reuse a single session instead of creating new ones: add an optional
parameter session: Optional[AsyncSession] = None to read_kr_hourly_candles_1h,
_resolve_universe_row, _fetch_hour_rows, _fetch_minute_rows and
_build_current_hour_row (or whichever helpers call DB), thread the provided
session through each call, and only create/close a new session in
read_kr_hourly_candles_1h when session is None (i.e., use "async with
_async_session() as session" only in the top-level function when needed); update
calls to pass the session to avoid multiple connections and ensure you only
commit/close the session if you created it.
- Around line 444-463: The current per-minute merge (iterating minutes_by_time
and building combined with _MinuteRow) always selects OHLC from a single source
via source = group.get("KRX") or group.get("NTX"), which discards NTX extrema;
change the merge so minute high = max(r.high for r in group.values()) and minute
low = min(r.low for r in group.values()) (keep open/close/venue selection as-is
or document choice) and continue summing volume/value across group.values();
alternatively, if KRX dominance is intentional, add a concise inline comment
next to the source selection explaining that KRX is canonical for OHLC to avoid
silent data loss.
In `@tests/test_kr_hourly_candles_read_service.py`:
- Around line 144-174: Multiple tests duplicate the DummyDB class with only
small variations; replace the repeated classes with a single reusable
factory/fixture. Create a function (e.g., _make_dummy_db) that accepts
parameters symbol, nxt_eligible, is_active, hour_rows, minute_rows and returns
an instance whose execute method implements the existing SQL branching logic
(matching "FROM public.kr_symbol_universe", "FROM public.kr_candles_1h", "FROM
public.kr_candles_1m") and time-range filtering; then update each test to call
_make_dummy_db(...) instead of redefining DummyDB. Ensure the returned object's
.calls list and behavior exactly mirror the original DummyDB so existing
assertions keep working.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- app/mcp_server/README.md
- app/mcp_server/tooling/market_data_quotes.py
- app/services/kr_hourly_candles_read_service.py
- docs/plans/2026-02-23-kr-ohlcv-1h-v2.md
- tests/test_kr_hourly_candles_read_service.py
- tests/test_mcp_server_tools.py
```markdown
### Task 1: Fix typing/LSP issues in MCP integration tests
```
There was a problem hiding this comment.
Heading level skips from h1 to h3.
Static analysis (markdownlint MD001) flags that heading levels should increment by one. Line 1 is `#` (h1), but line 13 jumps to `###` (h3), skipping h2.
📝 Fix heading levels
```diff
-### Task 1: Fix typing/LSP issues in MCP integration tests
+## Task 1: Fix typing/LSP issues in MCP integration tests
```

And similarly for Tasks 2–4 on lines 34, 57, 77.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
-### Task 1: Fix typing/LSP issues in MCP integration tests
+## Task 1: Fix typing/LSP issues in MCP integration tests
```
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)
[warning] 13-13: Heading levels should only increment by one level at a time
Expected: h2; Actual: h3
(MD001, heading-increment)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/plans/2026-02-23-kr-ohlcv-1h-v2.md` at line 13, The headings jump from
h1 to h3; change the task headings to h2 to satisfy markdownlint MD001 — update
the "### Task 1: Fix typing/LSP issues in MCP integration tests" heading to "##
Task 1: Fix typing/LSP issues in MCP integration tests" and likewise change the
"###" headings for Tasks 2–4 (the headings referenced in the details block) to
"##" so each task heading increments properly from the document h1.
```python
@pytest.mark.asyncio
async def test_api_prefetch_plan_time_boundaries_for_nxt_eligible(monkeypatch):
```
There was a problem hiding this comment.
Missing @pytest.mark.unit markers on all tests.
Per the coding guidelines, tests should be categorized using registered pytest markers (@pytest.mark.unit, @pytest.mark.integration, @pytest.mark.slow). These are all unit tests (mocked DB and API) but lack the @pytest.mark.unit marker.
📝 Example fix for each test
```diff
+@pytest.mark.unit
 @pytest.mark.asyncio
 async def test_api_prefetch_plan_time_boundaries_for_nxt_eligible(monkeypatch):
```

Apply the same pattern to all 9 test functions.
As per coding guidelines: "Use test markers (@pytest.mark.unit, @pytest.mark.integration, @pytest.mark.slow) to categorize tests appropriately."
Also applies to: 250-251, 353-354, 410-411, 500-501, 581-582, 656-657, 715-716, 787-788, 829-830
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_kr_hourly_candles_read_service.py` around lines 95 - 96, Add the
missing pytest unit markers by decorating each test function with
`@pytest.mark.unit` (ensure pytest is imported) — specifically add
`@pytest.mark.unit` above test_api_prefetch_plan_time_boundaries_for_nxt_eligible
and the other test functions referenced (test names at lines 250-251, 353-354,
410-411, 500-501, 581-582, 656-657, 715-716, 787-788, 829-830) so all nine tests
are explicitly marked as unit tests; keep the existing `@pytest.mark.asyncio`
decorator and place `@pytest.mark.unit` directly above or alongside it.
```python
async def test_get_ohlcv_kr_equity_period_1h_includes_session_and_venues(monkeypatch):
    tools = build_tools()
    df = _single_row_df()
    monkeypatch.setattr(settings, "kis_ohlcv_cache_enabled", False, raising=False)
    route_mock = AsyncMock(return_value="UN")
    monkeypatch.setattr(market_data_quotes, "_resolve_kr_intraday_route", route_mock)

    class DummyKISClient:
        async def inquire_time_dailychartprice(
            self, code, market, n, end_date=None, end_time=None
        ):
            del code, market, n, end_date, end_time
            return df

    df = pd.DataFrame(
        [
            {
                "datetime": pd.Timestamp("2026-02-23 09:00:00"),
                "date": date(2026, 2, 23),
                "time": datetime.time(9, 0, 0),
                "open": 100.0,
                "high": 110.0,
                "low": 90.0,
                "close": 105.0,
                "volume": 1000,
                "value": 105000.0,
                "session": "REGULAR",
                "venues": ["KRX", "NTX"],
            }
        ]
    )
    read_mock = AsyncMock(return_value=df)
    monkeypatch.setattr(market_data_quotes, "read_kr_hourly_candles_1h", read_mock)

    _patch_runtime_attr(monkeypatch, "KISClient", DummyKISClient)
    result = await tools["get_ohlcv"]("005930", market="kr", count=50, period="1h")

    read_mock.assert_awaited_once_with(symbol="005930", count=50, end_date=None)
    assert result["instrument_type"] == "equity_kr"
    assert result["period"] == "1h"
    assert result["source"] == "kis"
    route_mock.assert_awaited_once_with("005930")


@pytest.mark.asyncio
async def test_get_ohlcv_kr_equity_period_1h_backfills_multiple_days(monkeypatch):
    tools = build_tools()
    target_day = date(2026, 2, 19)
    monkeypatch.setattr(settings, "kis_ohlcv_cache_enabled", False, raising=False)
    route_mock = AsyncMock(return_value="UN")
    monkeypatch.setattr(market_data_quotes, "_resolve_kr_intraday_route", route_mock)
    calls: list[tuple[date | None, str | None, str]] = []

    class DummyKISClient:
        async def inquire_time_dailychartprice(
            self, code, market, n, end_date=None, end_time=None
        ):
            del code
            calls.append((end_date, end_time, market))
            requested_day = end_date or target_day
            assert n >= 1
            return pd.DataFrame(
                [
                    {
                        "datetime": pd.Timestamp(f"{requested_day} 09:00:00"),
                        "date": requested_day,
                        "time": pd.Timestamp("2026-01-01 09:00:00").time(),
                        "open": 100.0,
                        "high": 101.0,
                        "low": 99.0,
                        "close": 100.5,
                        "volume": 1000,
                        "value": 100500.0,
                    },
                    {
                        "datetime": pd.Timestamp(f"{requested_day} 10:00:00"),
                        "date": requested_day,
                        "time": pd.Timestamp("2026-01-01 10:00:00").time(),
                        "open": 100.5,
                        "high": 102.0,
                        "low": 100.0,
                        "close": 101.5,
                        "volume": 1100,
                        "value": 111650.0,
                    },
                ]
            )

    _patch_runtime_attr(monkeypatch, "KISClient", DummyKISClient)
    result = await tools["get_ohlcv"](
        "005930",
        market="kr",
        count=4,
        period="1h",
        end_date=target_day.isoformat(),
    )

    assert len(calls) >= 3
    assert calls[0][2] == "UN"
    assert calls[0][0] == target_day
    assert calls[1][0] == target_day
    assert calls[0][1] == "200000"
    assert calls[1][1] is not None
    assert calls[1][1] < calls[0][1]
    assert any(
        day == target_day - timedelta(days=1) and end_time == "200000"
        for day, end_time, _ in calls
    )
    assert len(result["rows"]) == 4
    assert result["period"] == "1h"
    assert result["instrument_type"] == "equity_kr"
    route_mock.assert_awaited_once_with("005930")


@pytest.mark.asyncio
async def test_get_ohlcv_kr_equity_period_1h_backfill_uses_j_close_for_history(
    monkeypatch,
):
    tools = build_tools()
    target_day = date(2026, 2, 19)
    monkeypatch.setattr(settings, "kis_ohlcv_cache_enabled", False, raising=False)
    monkeypatch.setattr(
        market_data_quotes,
        "_resolve_kr_intraday_route",
        AsyncMock(return_value="J"),
    )
    calls: list[tuple[date | None, str | None, str]] = []

    class DummyKISClient:
        async def inquire_time_dailychartprice(
            self, code, market, n, end_date=None, end_time=None
        ):
            del code
            calls.append((end_date, end_time, market))
            requested_day = end_date or target_day
            assert n >= 1
            return pd.DataFrame(
                [
                    {
                        "datetime": pd.Timestamp(f"{requested_day} 09:00:00"),
                        "date": requested_day,
                        "time": pd.Timestamp("2026-01-01 09:00:00").time(),
                        "open": 100.0,
                        "high": 101.0,
                        "low": 99.0,
                        "close": 100.5,
                        "volume": 1000,
                        "value": 100500.0,
                    },
                    {
                        "datetime": pd.Timestamp(f"{requested_day} 10:00:00"),
                        "date": requested_day,
                        "time": pd.Timestamp("2026-01-01 10:00:00").time(),
                        "open": 100.5,
                        "high": 102.0,
                        "low": 100.0,
                        "close": 101.5,
                        "volume": 1100,
                        "value": 111650.0,
                    },
                ]
            )

    _patch_runtime_attr(monkeypatch, "KISClient", DummyKISClient)
    result = await tools["get_ohlcv"](
        "005930",
        market="kr",
        count=4,
        period="1h",
        end_date=target_day.isoformat(),
    )

    assert len(calls) >= 2
    assert calls[0][2] == "J"
    assert calls[0][1] == "153000"
    assert any(
        day == target_day - timedelta(days=1) and end_time == "153000"
        for day, end_time, _ in calls
    )
    assert len(result["rows"]) == 4
    assert result["period"] == "1h"
    assert result["instrument_type"] == "equity_kr"


@pytest.mark.asyncio
async def test_get_ohlcv_kr_1h_paginates_within_same_day(monkeypatch):
    tools = build_tools()
    target_day = date(2026, 2, 19)
    monkeypatch.setattr(settings, "kis_ohlcv_cache_enabled", False, raising=False)
    monkeypatch.setattr(
        market_data_quotes,
        "_resolve_kr_intraday_route",
        AsyncMock(return_value="UN"),
    )
    calls: list[tuple[date | None, str | None, str]] = []

    class DummyKISClient:
        async def inquire_time_dailychartprice(
            self, code, market, n, end_date=None, end_time=None
        ):
            del code, n
            calls.append((end_date, end_time, market))
            requested_day = end_date or target_day
            if end_time is None or end_time >= "180000":
                hours = (19, 18)
            else:
                hours = (17, 16)
            return pd.DataFrame(
                [
                    {
                        "datetime": pd.Timestamp(f"{requested_day} {hour:02d}:00:00"),
                        "date": requested_day,
                        "time": pd.Timestamp(f"2026-01-01 {hour:02d}:00:00").time(),
                        "open": float(hour),
                        "high": float(hour) + 0.5,
                        "low": float(hour) - 0.5,
                        "close": float(hour) + 0.25,
                        "volume": 1000 + hour,
                        "value": float((1000 + hour) * hour),
                    }
                    for hour in hours
                ]
            )

    _patch_runtime_attr(monkeypatch, "KISClient", DummyKISClient)
    result = await tools["get_ohlcv"](
        "005930",
        market="kr",
        count=4,
        period="1h",
        end_date=target_day.isoformat(),
    )

    assert len(calls) >= 2
    assert len(result["rows"]) == 4
    assert all(day == target_day for day, _, _ in calls)
    assert all(market == "UN" for _, _, market in calls)
    assert calls[0][1] == "200000"
    assert calls[1][1] is not None
    assert calls[1][1] < calls[0][1]


@pytest.mark.asyncio
async def test_get_ohlcv_kr_1h_stops_when_cursor_stalls(monkeypatch):
    tools = build_tools()
    target_day = date(2026, 2, 19)
    monkeypatch.setattr(settings, "kis_ohlcv_cache_enabled", False, raising=False)
    monkeypatch.setattr(
        market_data_quotes,
        "_resolve_kr_intraday_route",
        AsyncMock(return_value="UN"),
    )
    calls: list[tuple[date | None, str | None, str]] = []

    class DummyKISClient:
        async def inquire_time_dailychartprice(
            self, code, market, n, end_date=None, end_time=None
        ):
            del code, n
            calls.append((end_date, end_time, market))
            requested_day = end_date or target_day
            if requested_day != target_day:
                return pd.DataFrame()
            return pd.DataFrame(
                [
                    {
                        "datetime": pd.Timestamp(f"{requested_day} 10:00:00"),
                        "date": requested_day,
                        "time": pd.Timestamp("2026-01-01 10:00:00").time(),
                        "open": 100.0,
                        "high": 101.0,
                        "low": 99.0,
                        "close": 100.5,
                        "volume": 1000,
                        "value": 100500.0,
                    },
                    {
                        "datetime": pd.Timestamp(f"{requested_day} 09:00:00"),
                        "date": requested_day,
                        "time": pd.Timestamp("2026-01-01 09:00:00").time(),
                        "open": 99.5,
                        "high": 100.5,
                        "low": 99.0,
                        "close": 100.0,
                        "volume": 900,
                        "value": 90000.0,
                    },
                ]
            )

    _patch_runtime_attr(monkeypatch, "KISClient", DummyKISClient)
    result = await tools["get_ohlcv"](
        "005930",
        market="kr",
        count=6,
        period="1h",
        end_date=target_day.isoformat(),
    )

    same_day_calls = [call for call in calls if call[0] == target_day]
    assert len(same_day_calls) == 2
    assert same_day_calls[0][1] == "200000"
    assert same_day_calls[1][1] is not None
    assert same_day_calls[1][1] < same_day_calls[0][1]
    assert len(result["rows"]) == 2
    assert result["rows"]
    row = result["rows"][0]
    assert row["session"] == "REGULAR"
    assert row["venues"] == ["KRX", "NTX"]
```
There was a problem hiding this comment.
Add a unit marker for this async test.
This test is unit-scoped (fully mocked) but lacks a unit/integration marker required for tests in this path.
🔧 Add a unit marker
```diff
-@pytest.mark.asyncio
-async def test_get_ohlcv_kr_equity_period_1h_includes_session_and_venues(monkeypatch):
+@pytest.mark.asyncio
+@pytest.mark.unit
+async def test_get_ohlcv_kr_equity_period_1h_includes_session_and_venues(monkeypatch):
```

As per coding guidelines, “tests/test_*.py: Use test markers (@pytest.mark.unit, @pytest.mark.integration, @pytest.mark.slow) to categorize tests appropriately”.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_mcp_server_tools.py` around lines 1430 - 1461, This test function
test_get_ohlcv_kr_equity_period_1h_includes_session_and_venues is fully mocked
and needs the pytest unit marker; add `@pytest.mark.unit` above the async def and
ensure pytest is imported at the top of the file if not already present so the
marker resolves.




Summary
Validation
Summary by CodeRabbit
New Features
Data Maintenance
Tasks & Scheduling
CLI & Automation
Infrastructure
Tests & Docs