refactor(mcp): remove duplicate yfinance calls on the same path (no cache, structural refactor) #188

Open
robin-watcha wants to merge 1 commit into main from refactor/yfinance-dedupe-no-cache

Conversation


robin-watcha commented Feb 24, 2026

Summary

Structurally removes duplicate network calls within a single request on the yfinance path. No cache is introduced; this is a structural refactor only.

Evidence of duplication

  1. A single analyze_stock(us) call invokes ticker.info twice
  2. A single analyze_stock(us) call fetches OHLCV three times
  3. get_support_resistance fetches the same 60-day OHLCV twice within one call
  4. sector_peers_us performs the info fetch before deduping, allowing unnecessary calls

Changes

Phase 1 (core duplicate removal)

  • Add the YFinanceSnapshot dataclass (fundamentals_sources_naver.py)
  • Add optional snapshot/session arguments to _fetch_valuation_yfinance
  • Add optional snapshot/session arguments to _fetch_investment_opinions_yfinance
  • Collect the snapshot once in _analyze_stock_impl(US) and pass it along (analysis_screening.py)
  • Remove the duplicate OHLCV fetch in _get_support_resistance_impl (fundamentals_handlers.py)

Phase 2 (extension across the full yfinance path)

  • Add an optional preloaded_df argument to _get_indicators_impl (portfolio_holdings.py)
  • Add an optional preloaded_df argument to _get_support_resistance_impl (fundamentals_handlers.py)
  • Fetch OHLCV once in _analyze_stock_impl and share it between indicators and support/resistance
  • Move the _fetch_sector_peers_us dedupe ahead of the network call

Performance improvements

Path | Before | After
analyze_stock(US) ticker.info calls | 2 | 1
get_support_resistance OHLCV fetches | 2 | 1
analyze_stock OHLCV fetches | 3 | 1

Public API/interface changes

  • MCP public tool names/parameters/response fields: unchanged
  • Only internal (private) function signatures gain optional arguments

Tests

Regression tests added

  1. test_get_support_resistance_uses_single_ohlcv_fetch - verifies OHLCV is fetched only once
  2. test_sector_peers_us_dedupes_before_network_call - verifies dedupe is applied before the network call
  3. test_analyze_stock_us_reuses_yfinance_info - verifies ticker.info is called only once
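A call-count regression test of this kind can be sketched like so; counting, fake_ohlcv, and analyze are illustrative names, not the repo's actual helpers:

```python
import asyncio

import pandas as pd


def counting(fn):
    """Wrap an async fetcher so a test can assert how often it ran."""
    async def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return await fn(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


async def fake_ohlcv(symbol: str, count: int) -> pd.DataFrame:
    return pd.DataFrame({"close": [1.0] * count})


async def analyze(symbol: str, fetch) -> pd.DataFrame:
    # Mirrors the fix under test: fetch once, then feed the same frame
    # to both consumers instead of letting each consumer fetch again.
    df = await fetch(symbol, 250)
    indicators_input, levels_input = df, df
    assert indicators_input is levels_input
    return df


fetch = counting(fake_ohlcv)
asyncio.run(analyze("AAPL", fetch))
assert fetch.calls == 1  # the single-fetch invariant the tests verify
```

In the real suite the counter would be installed via monkeypatch on the module under test; the invariant asserted is the same.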

Verification results

342 passed, 4 warnings in 17.21s

Constraints

  • No cache (TTL/global/cross-request) is introduced; structural refactor only
  • Because Sentry search_events is unavailable/login-restricted, detailed per-path trace aggregation was done from code review plus local instrumentation

Summary by CodeRabbit

  • Performance Improvements

    • Optimized data fetching by reusing market data across multiple analysis operations, reducing redundant network calls.
    • Implemented intelligent deduplication of related ticker symbols to minimize duplicate lookups.
    • Enhanced support/resistance and indicator calculations to leverage preloaded historical data for faster processing.
  • Tests

    • Added regression tests verifying single-fetch optimization and data deduplication behavior.


coderabbitai bot commented Feb 24, 2026

📝 Walkthrough

This PR optimizes financial analysis workflows by introducing preloaded OHLCV data reuse and yfinance snapshot reuse. Function signatures are extended across multiple modules to accept optional preloaded dataframes and snapshots, reducing duplicate network requests. Peer ticker deduplication logic is added to prevent redundant lookups.

Changes

  • OHLCV Data Reuse — app/mcp_server/tooling/analysis_screening.py, app/mcp_server/tooling/fundamentals_handlers.py, app/mcp_server/tooling/portfolio_holdings.py
    Extended _get_indicators_impl and _get_support_resistance_impl signatures with an optional preloaded_df parameter. Functions now reuse provided OHLCV data instead of fetching independently, with fallback to the existing fetch logic when data is unavailable.
  • YFinance Snapshot Pattern — app/mcp_server/tooling/fundamentals_sources_naver.py
    Introduced the _YFinanceSnapshot dataclass to hold ticker info, analyst targets, and upgrades/downgrades. Updated _fetch_valuation_yfinance and _fetch_investment_opinions_yfinance to accept optional snapshot and session parameters, enabling data reuse across multiple valuation/opinion calls. Added base-ticker deduplication logic in peer fetching.
  • Analysis Flow Optimization — app/mcp_server/tooling/analysis_screening.py
    Extended the equity_us analysis path to initialize a yfinance session and snapshot, preload OHLCV data once, and pass both to dependent functions. Incorporates per-symbol snapshot collection via a thread pool to avoid blocking the event loop.
  • Test Coverage — tests/test_mcp_server_tools.py
    Added regression tests validating single-fetch optimizations: OHLCV reuse in support/resistance, yfinance info reuse across valuation/opinions, and base-ticker deduplication in peer lookups. Includes mocks for internal fetch paths and yfinance interactions.
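The "snapshot collection via thread pool" mentioned above boils down to pushing blocking yfinance property access off the event loop; a minimal sketch, with collect_snapshot standing in for the real _collect_yf_snapshot:

```python
import asyncio
import time


def collect_snapshot() -> dict:
    # Stand-in for the blocking yfinance property accesses
    # (ticker.info etc.); the sleep simulates network latency.
    time.sleep(0.05)
    return {"info": {"symbol": "AAPL"}}


async def main() -> dict:
    loop = asyncio.get_running_loop()
    # run_in_executor keeps the event loop free while the blocking
    # snapshot collection runs in the default thread pool.
    snapshot = await loop.run_in_executor(None, collect_snapshot)
    return snapshot


snap = asyncio.run(main())
assert snap["info"]["symbol"] == "AAPL"
```

While the executor call is awaited, other scheduled tasks can continue running, which is the property the equity_us path relies on.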

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

size/XL

Poem

🐰 A bunny's ode to swift data flows:

Once fetches danced, now snapshots stay,
Preloaded frames save the day,
One call feeds many, no waste in sight,
Deduplicated tickers hop so light!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: Docstring coverage is 17.24%, below the required threshold of 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Description Check ✅ Passed: Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: The title is in Korean and uses technical jargon ('yfinance 동일 path 중복 호출 제거'), describing a refactoring that removes duplicate yfinance calls without caching. The raw summary confirms these are the primary changes: deduplication of OHLCV fetches, snapshot reuse, and peer deduplication.





claude bot commented Feb 24, 2026

Code review - PR #188 (removal of duplicate yfinance calls)

Overall this is a solid refactor with a clear goal and added tests. That said, a few bugs and performance regressions are flagged below.


Bugs / correctness issues

1. _fetch_sector_peers_us: get_base_ticker defined twice + target_base dead code

fundamentals_sources_naver.py

The PR adds a new get_base_ticker helper near the top of the function, but the original code already defines a function with the same name inside the function body (see the context of hunk @@ -671 +727 @@). As a result, _fetch_sector_peers_us ends up defining get_base_ticker twice. In addition, the post-network dedupe loop was removed but the line target_base = get_base_ticker(upper_symbol) was left behind; with seen_bases gone, that line is now completely dead code.

# line ~727 (new file): target_base is never used
target_base = get_base_ticker(upper_symbol)
# Dedupe already applied before network call, use peer_tickers directly
peers: list[dict[str, Any]] = []

Suggested fix: remove the original get_base_ticker definition and the dead target_base line.


2. A different DataFrame is passed to _calculate_volume_profile

fundamentals_handlers.py

The old code called _fetch_ohlcv_for_volume_profile separately. It needs to be verified that this function and _fetch_ohlcv_for_indicators return exactly the same columns/format. If _fetch_ohlcv_for_volume_profile normalizes volume data differently or uses different column names, _calculate_volume_profile can fail at runtime.

# before
volume_profile_df = await _fetch_ohlcv_for_volume_profile(normalized_symbol, market_type, 60)
volume_result = _calculate_volume_profile(volume_profile_df, bins=20)

# after: needs confirmation that both fetchers guarantee the same format
volume_result = _calculate_volume_profile(df, bins=20)

Suggested fix: confirm the two fetch functions return identical schemas, or add a test covering this.
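One lightweight way to cover the compatibility concern above is a test asserting both fetchers emit the same column schema; the frame builders below are illustrative stand-ins for the two real fetch functions:

```python
import pandas as pd

OHLCV_COLUMNS = ["open", "high", "low", "close", "volume"]


def indicator_frame() -> pd.DataFrame:
    # Stand-in for _fetch_ohlcv_for_indicators output.
    return pd.DataFrame({c: [1.0] for c in OHLCV_COLUMNS})


def volume_profile_frame() -> pd.DataFrame:
    # Stand-in for _fetch_ohlcv_for_volume_profile output.
    return pd.DataFrame({c: [1.0] for c in OHLCV_COLUMNS})


def assert_compatible(a: pd.DataFrame, b: pd.DataFrame) -> None:
    # Same column names in the same order; extend with dtype checks if
    # _calculate_volume_profile is sensitive to them.
    assert list(a.columns) == list(b.columns)


assert_compatible(indicator_frame(), volume_profile_frame())
```

Wiring the real fetchers into such a check would catch a schema drift between them before it surfaces as a runtime failure.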


Performance regressions

3. OHLCV + snapshot run sequentially, which can increase overall latency

analysis_screening.py, _analyze_stock_impl

This is the most important issue. Previously, _get_indicators_impl and _get_support_resistance_impl each fetched OHLCV internally and ran concurrently via asyncio.create_task. After this change, the OHLCV fetch and the yfinance snapshot run sequentially before any of the tasks start.

# now (US equity path):
ohlcv_df = await _fetch_ohlcv_for_indicators(...)          # serial wait 1
yf_snapshot = await loop.run_in_executor(None, _collect_yf_snapshot)  # serial wait 2
# only then do the concurrent tasks start

The number of network calls went down, but wall-clock latency can go up. Consider parallelizing the two operations with asyncio.gather.

# proposed improvement (US equity only)
yf_session = build_yfinance_tracing_session()
yf_ticker = yf.Ticker(normalized_symbol, session=yf_session)

def _collect_yf_snapshot() -> _YFinanceSnapshot: ...

ohlcv_df, yf_snapshot = await asyncio.gather(
    _fetch_ohlcv_for_indicators(normalized_symbol, market_type, count=250),
    loop.run_in_executor(None, _collect_yf_snapshot),
)

Code quality issues

4. Duplicated dedupe logic

fundamentals_sources_naver.py

The same dedupe-by-base-ticker code is copy-pasted verbatim in both the manual_peers branch and the else branch. Since a get_base_ticker helper has already been added inside the function, the dedupe logic should be extracted into a helper as well.

def _dedupe_by_base(tickers: list[str], seen: set[str]) -> list[str]:
    result = []
    for t in tickers:
        base = get_base_ticker(t)
        if base not in seen:
            seen.add(base)
            result.append(t)
    return result

5. _fetch_investment_opinions_yfinance: unnecessary Ticker creation when a snapshot is supplied

Even when a snapshot is provided, ticker = yf.Ticker(symbol, session=session) still executes. There is no need to create the ticker object when the snapshot is used.

# proposed: create the ticker only when no snapshot is given
if snapshot is not None:
    targets = snapshot.analyst_price_targets
    ud = snapshot.upgrades_downgrades
    info = snapshot.info
else:
    if session is None:
        session = build_yfinance_tracing_session()
    ticker = yf.Ticker(symbol, session=session)
    targets, ud, info = await loop.run_in_executor(None, _collect)

6. _YFinanceSnapshot type annotations

Declaring upgrades_downgrades as pd.DataFrame | None = None would be clearer than Any = None. There is also no blank line after the class definition.


Tests

7. Weak assertion in test_sector_peers_us_dedupes_before_network_call

assert 'BRK.B' not in yf_info_calls or 'BRK.A' not in yf_info_calls

This assertion only checks that at least one of the two tickers was skipped; it does not pin down how many BRK-base calls actually occurred. A stricter version is recommended:

brk_calls = [c for c in yf_info_calls if c.startswith('BRK')]
assert len(brk_calls) <= 1, f'BRK base ticker called {len(brk_calls)} times: {brk_calls}'

Summary

Priority | Item
High | Remove the duplicate get_base_ticker definition and the dead target_base line
High | Verify _fetch_ohlcv_for_volume_profile vs _fetch_ohlcv_for_indicators compatibility
Medium | Review the latency increase from serial OHLCV + snapshot execution (parallelize with asyncio.gather)
Medium | Extract the duplicated dedupe logic into a helper function
Low | Skip Ticker creation when a snapshot is supplied
Low | Strengthen test assertions; clean up type annotations


coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
app/mcp_server/tooling/fundamentals_handlers.py (1)

57-57: ⚠️ Potential issue | 🔴 Critical

Remove the unused _fetch_ohlcv_for_volume_profile import — this is causing a CI pipeline error (F401).

Volume profile computation now reuses the main OHLCV dataframe (df) via _calculate_volume_profile, so _fetch_ohlcv_for_volume_profile is no longer called anywhere in this file.

🔧 Proposed fix
 from app.mcp_server.tooling.market_data_indicators import (
     _calculate_fibonacci,
     _calculate_volume_profile,
     _cluster_price_levels,
     _compute_indicators,
     _fetch_ohlcv_for_indicators,
-    _fetch_ohlcv_for_volume_profile,
     _format_fibonacci_source,
     _split_support_resistance_levels,
 )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/mcp_server/tooling/fundamentals_handlers.py` at line 57, Remove the
unused import _fetch_ohlcv_for_volume_profile from the import list in this
module; search for any references to _fetch_ohlcv_for_volume_profile (e.g.,
inside functions like _calculate_volume_profile or callers that previously used
it) to confirm it's not used anywhere, then delete the import statement entry so
the linter/CI (F401) error is resolved.
app/mcp_server/tooling/fundamentals_sources_naver.py (1)

16-16: ⚠️ Potential issue | 🟡 Minor

Fix import ordering to resolve pipeline warning.

CI reports I001 Import block is un-sorted or un-formatted at line 16. Run ruff check --fix or reorder imports per your project's isort/Ruff configuration.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/mcp_server/tooling/fundamentals_sources_naver.py` at line 16, The import
block in fundamentals_sources_naver.py is not ordered per project style (CI
flagged the lone "import httpx"); reorder imports into the canonical groups
(standard library, third-party like httpx, then local app imports) and apply
your project's formatter (run `ruff check --fix` or `isort`/`ruff --fix`) to
automatically sort and format the imports; update the top of the module where
`import httpx` is declared (and any other imports in that file) so they follow
the configured ordering rules.
app/mcp_server/tooling/analysis_screening.py (1)

28-44: ⚠️ Potential issue | 🟡 Minor

Fix import ordering to resolve pipeline warning.

The CI pipeline reports I001 Import block is un-sorted or un-formatted at line 3. The pandas, yfinance, and monitoring imports are placed between two app.* import groups, breaking the expected sort order. Move them into the correct import group per Ruff/isort conventions (typically: stdlib → third-party → first-party).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/mcp_server/tooling/analysis_screening.py` around lines 28 - 44, The
import block is mis-ordered: move third-party imports (pandas, yfinance, and
build_yfinance_tracing_session) into the third-party group so first stdlib (if
any) → third-party → first-party (app.*) ordering is preserved; specifically,
relocate "import pandas as pd", "import yfinance as yf", and "from
app.monitoring import build_yfinance_tracing_session" to appear together before
the app.mcp_server.* imports (or adjust to match your project's isort/Ruff
configuration), keeping the existing named imports (_YFinanceSnapshot,
_fetch_...) and _fetch_... imports intact and properly grouped.
🧹 Nitpick comments (6)
app/mcp_server/tooling/portfolio_holdings.py (1)

544-548: Consider documenting/asserting the minimum row requirement for preloaded_df.

_get_indicators_impl fetches 250 rows when computing indicators internally. If a caller passes a preloaded_df with fewer rows (e.g. 60, fetched for support/resistance), indicator calculations proceed silently with reduced data quality. A lightweight guard or docstring clarifying the expected minimum would prevent future misuse.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/mcp_server/tooling/portfolio_holdings.py` around lines 544 - 548, The
code in _get_indicators_impl uses preloaded_df if provided but otherwise fetches
250 rows via _fetch_ohlcv_for_indicators; add a lightweight guard and/or
docstring to require a minimum number of rows (e.g. 250) for accurate indicator
computation: update the function docstring for _get_indicators_impl to state the
minimum row requirement for preloaded_df and add a runtime check (assert or
raise ValueError) after receiving preloaded_df that verifies
preloaded_df.shape[0] >= 250 (or a named constant) and raises a clear error
message advising callers to supply more history or omit preloaded_df so the
internal fetch runs; reference preloaded_df, df, and _fetch_ohlcv_for_indicators
when implementing the check and message.
tests/test_mcp_server_tools.py (1)

8728-8903: Add a pytest unit marker to the new regression tests.

These tests currently only have @pytest.mark.asyncio, but tests/test_*.py requires a strict marker (unit, integration, slow) to keep strict markers compliant.

🧭 Suggested fix
 @pytest.mark.asyncio
+@pytest.mark.unit
 async def test_get_support_resistance_uses_single_ohlcv_fetch(monkeypatch):
@@
 @pytest.mark.asyncio
+@pytest.mark.unit
 async def test_sector_peers_us_dedupes_before_network_call(monkeypatch):
@@
 @pytest.mark.asyncio
+@pytest.mark.unit
 async def test_analyze_stock_us_reuses_yfinance_info(monkeypatch):

As per coding guidelines: “tests/test_*.py: Use test markers (@pytest.mark.unit, @pytest.mark.integration, @pytest.mark.slow) to categorize tests appropriately”.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_mcp_server_tools.py` around lines 8728 - 8903, The three new async
tests (test_get_support_resistance_uses_single_ohlcv_fetch,
test_sector_peers_us_dedupes_before_network_call,
test_analyze_stock_us_reuses_yfinance_info) are missing the required strict test
marker; add the appropriate pytest marker (e.g., `@pytest.mark.unit`) alongside
`@pytest.mark.asyncio` above each of those test functions so they are categorized
as unit tests per the project guidelines. Ensure you import pytest if not
already present and apply the `@pytest.mark.unit` decorator to each of the three
test functions.
app/mcp_server/tooling/analysis_screening.py (3)

263-267: loop is obtained eagerly but only used in the equity_us branch.

asyncio.get_running_loop() at line 264 is only consumed at line 324 (loop.run_in_executor), inside the elif market_type == "equity_us" block. For equity_kr and crypto paths, loop is an unused local.

Move it inside the equity_us branch to keep scope minimal and intent clear.

Proposed fix
-    # Fetch OHLCV once for indicators and support/resistance (avoid duplicate fetch)
-    loop = asyncio.get_running_loop()
-    ohlcv_df = await _fetch_ohlcv_for_indicators(normalized_symbol, market_type, count=250)
+    # Fetch OHLCV once for indicators and support/resistance (avoid duplicate fetch)
+    ohlcv_df = await _fetch_ohlcv_for_indicators(normalized_symbol, market_type, count=250)

Then inside the equity_us block:

     elif market_type == "equity_us":
         # Collect yfinance snapshot once to avoid duplicate ticker.info calls
+        loop = asyncio.get_running_loop()
         yf_session = build_yfinance_tracing_session()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/mcp_server/tooling/analysis_screening.py` around lines 263 - 267, The
variable loop is created with asyncio.get_running_loop() eagerly but only used
in the equity_us branch; move the asyncio.get_running_loop() call into the elif
market_type == "equity_us" branch so loop is only defined where used, and keep
the existing loop.run_in_executor(...) call unchanged; update any references to
loop (e.g., loop.run_in_executor) to use the newly scoped loop inside that
branch and remove the unused top-level loop declaration near
_fetch_ohlcv_for_indicators/ohlcv_df.

301-344: Snapshot collection blocks the event loop before subsequent tasks are created.

_collect_yf_snapshot runs in an executor (line 324) and is awaited before valuation_task and opinions_task are created. This means the snapshot's three synchronous yfinance property accesses (which may each make an HTTP call) must complete sequentially before the Finnhub profile/news tasks and all other tasks start.

This is inherent to the design (snapshot must be ready before passing to downstream tasks), but worth noting that profile_task and news_task (which don't depend on the snapshot) could be created before awaiting the snapshot to recover some concurrency.

Sketch: overlap snapshot fetch with independent tasks
     elif market_type == "equity_us":
+        loop = asyncio.get_running_loop()
         yf_session = build_yfinance_tracing_session()
         yf_ticker = yf.Ticker(normalized_symbol, session=yf_session)

         def _collect_yf_snapshot() -> _YFinanceSnapshot:
             ...

-        yf_snapshot = await loop.run_in_executor(None, _collect_yf_snapshot)
+        # Start snapshot collection; don't await yet
+        snapshot_future = loop.run_in_executor(None, _collect_yf_snapshot)

+        # Create tasks that don't depend on the snapshot immediately
         profile_task = asyncio.create_task(
             _fetch_company_profile_finnhub(normalized_symbol),
         )
         tasks.append(profile_task)

         news_task = asyncio.create_task(
             _fetch_news_finnhub(normalized_symbol, "us", 5),
         )
         tasks.append(news_task)

+        yf_snapshot = await snapshot_future
+
         valuation_task = asyncio.create_task(
             _fetch_valuation_yfinance(normalized_symbol, snapshot=yf_snapshot, session=yf_session),
         )
         tasks.append(valuation_task)

         opinions_task = asyncio.create_task(
             _fetch_investment_opinions_yfinance(normalized_symbol, 10, snapshot=yf_snapshot, session=yf_session),
         )
         tasks.append(opinions_task)

Note: task ordering changes would require adjusting the task_idx unpacking below (lines 400–415), so verify consistency if you adopt this.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/mcp_server/tooling/analysis_screening.py` around lines 301 - 344, The
snapshot gathering (_collect_yf_snapshot) is awaited before creating independent
Finnhub tasks, blocking concurrency; to fix, create profile_task and news_task
(calls to _fetch_company_profile_finnhub and _fetch_news_finnhub) before
awaiting yf_snapshot so they run concurrently, then await loop.run_in_executor
to get yf_snapshot and only afterwards create valuation_task and opinions_task
(calls to _fetch_valuation_yfinance and _fetch_investment_opinions_yfinance)
which depend on the snapshot; also update any downstream task index unpacking
(the task_idx handling where tasks are consumed) to reflect the new creation
order.

306-324: Silent exception swallowing in _collect_yf_snapshot hinders debugging.

All three property accesses (ticker.info, analyst_price_targets, upgrades_downgrades) use bare except Exception: pass. If yfinance returns transient errors (rate-limiting, auth issues, network timeouts), every call silently degrades to None with no trace in logs. Since this is a performance-critical path that replaces previously separate calls, at least log at warning/debug level so operators can diagnose silent data loss.

Proposed fix (example for one property; apply similarly to the others)
+import logging
+
+logger = logging.getLogger(__name__)
+
 def _collect_yf_snapshot() -> _YFinanceSnapshot:
     info = None
     targets = None
     ud = None
     try:
         info = yf_ticker.info
     except Exception:
-        pass
+        logger.debug("yfinance ticker.info failed for %s", normalized_symbol, exc_info=True)
     try:
         targets = yf_ticker.analyst_price_targets
     except Exception:
-        pass
+        logger.debug("yfinance analyst_price_targets failed for %s", normalized_symbol, exc_info=True)
     try:
         ud = yf_ticker.upgrades_downgrades
     except Exception:
-        pass
+        logger.debug("yfinance upgrades_downgrades failed for %s", normalized_symbol, exc_info=True)
     return _YFinanceSnapshot(info=info, analyst_price_targets=targets, upgrades_downgrades=ud)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/mcp_server/tooling/analysis_screening.py` around lines 306 - 324, The
helper _collect_yf_snapshot currently swallows exceptions when accessing
yf_ticker.info, yf_ticker.analyst_price_targets, and
yf_ticker.upgrades_downgrades; update each try/except to catch Exception as e
and log the exception (at warning or debug) including context (which property
failed) before continuing, then return the _YFinanceSnapshot as before; locate
the function _collect_yf_snapshot and replace bare excepts with logged except
blocks that reference yf_ticker and the specific property name so operators can
diagnose silent failures.
app/mcp_server/tooling/fundamentals_sources_naver.py (1)

662-688: Dedup logic is duplicated between manual_peers and else branches.

Lines 664-673 and 679-688 are nearly identical — only the final slice differs (:limit vs : limit + 5). Extract the common dedup into a small helper to stay DRY.

Proposed refactor
+    def _dedupe_by_base(tickers: list[str], max_count: int) -> list[str]:
+        seen = {get_base_ticker(upper_symbol)}
+        result = []
+        for t in tickers:
+            base = get_base_ticker(t)
+            if base not in seen:
+                seen.add(base)
+                result.append(t)
+        return result[:max_count]
+
     if manual_peers:
         peer_tickers = [t.upper() for t in manual_peers if t.upper() != upper_symbol]
-        # Dedupe by base ticker BEFORE network call
-        target_base = get_base_ticker(upper_symbol)
-        seen_bases = {target_base}
-        deduped_peer_tickers = []
-        for ticker in peer_tickers:
-            peer_base = get_base_ticker(ticker)
-            if peer_base not in seen_bases:
-                seen_bases.add(peer_base)
-                deduped_peer_tickers.append(ticker)
-        peer_tickers = deduped_peer_tickers[:limit]
+        peer_tickers = _dedupe_by_base(peer_tickers, limit)
     else:
         peer_tickers: list[str] = await asyncio.to_thread(
             client.company_peers, upper_symbol
         )
         peer_tickers = [t for t in peer_tickers if t.upper() != upper_symbol]
-        # Dedupe by base ticker BEFORE network call
-        target_base = get_base_ticker(upper_symbol)
-        seen_bases = {target_base}
-        deduped_peer_tickers = []
-        for ticker in peer_tickers:
-            peer_base = get_base_ticker(ticker)
-            if peer_base not in seen_bases:
-                seen_bases.add(peer_base)
-                deduped_peer_tickers.append(ticker)
-        peer_tickers = deduped_peer_tickers[: limit + 5]
+        peer_tickers = _dedupe_by_base(peer_tickers, limit + 5)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/mcp_server/tooling/fundamentals_sources_naver.py` around lines 662 - 688,
The deduplication logic for peer_tickers is duplicated in both the manual_peers
and else branches; extract it into a small helper (e.g.,
dedupe_by_base(peer_list, target_symbol, extra_limit=0)) that uses
get_base_ticker to build seen_bases starting from target_symbol, filters unique
base tickers into deduped_peer_tickers, and returns the final slice using limit
or limit + 5 via the extra_limit parameter; then replace the duplicated loops in
the manual_peers branch and the asyncio.to_thread(client.company_peers,
upper_symbol) branch by calling this helper with extra_limit=0 for manual_peers
and extra_limit=5 for the network branch.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/mcp_server/tooling/fundamentals_handlers.py`:
- Around line 138-141: The code hardcodes volume_result["period_days"] = 60 even
though the reused df (often named preloaded_df or passed into
_analyze_stock_impl) may contain a different number of rows; update the
assignment to compute the actual period span from the dataframe used by
_calculate_volume_profile (e.g., set period_days = len(df) or derive calendar
days from df.index if it contains dates) so volume_result["period_days"]
reflects the true window for the computed profile; ensure you update the block
that sets volume_result (referencing volume_result, _calculate_volume_profile,
df, preloaded_df, and normalized_symbol).
- Around line 7-9: The import ordering is wrong: the third-party import "import
pandas as pd" is placed between standard-library imports (e.g., "from datetime
..." and "from typing import TYPE_CHECKING, Any"); move "import pandas as pd" so
all stdlib imports come first, then a blank line, then third-party imports
(including pandas), and then local imports if any; update the top of
fundamentals_handlers.py accordingly so "import pandas as pd" follows the stdlib
block and matches the Ruff I001 ordering convention.

In `@app/mcp_server/tooling/fundamentals_sources_naver.py`:
- Around line 657-660: Remove the duplicate get_base_ticker function definition
(the second/shadowing one) and delete the unused variable assignment target_base
that follows it; keep the original get_base_ticker implementation (the one
already defined earlier) and ensure any later code calls that existing
get_base_ticker, not the removed duplicate, so deduplication logic remains
intact.

In `@app/mcp_server/tooling/portfolio_holdings.py`:
- Around line 6-8: The imports in portfolio_holdings.py are out of order and
trigger Ruff I001 because the third-party import pandas is placed between stdlib
imports (asyncio) and typing; reorder the import block so all stdlib imports
(e.g., asyncio, typing) come first, then third-party imports (import pandas as
pd), and then local application imports, updating the top-of-file import
sequence around the existing symbols to satisfy isort/Ruff.

In `@tests/test_mcp_server_tools.py`:
- Around line 8784-8805: The test defines unused local variables causing Ruff
F841: remove or rename them to underscores so they aren't reported —
specifically delete the unused dict "calls" inside mock_yf_ticker or change it
to "_calls", remove the unused "result =" assignment (call
tools["get_sector_peers"] without assigning) and remove or rename
"original_ticker" (or restore it where needed) so no unused locals remain;
update mock_yf_ticker, the test invocation using tools["get_sector_peers"], and
the restoration of yf.Ticker (original_ticker) accordingly.

---

Outside diff comments:
In `@app/mcp_server/tooling/analysis_screening.py`:
- Around line 28-44: The import block is mis-ordered: move third-party imports
(pandas, yfinance, and build_yfinance_tracing_session) into the third-party
group so first stdlib (if any) → third-party → first-party (app.*) ordering is
preserved; specifically, relocate "import pandas as pd", "import yfinance as
yf", and "from app.monitoring import build_yfinance_tracing_session" to appear
together before the app.mcp_server.* imports (or adjust to match your project's
isort/Ruff configuration), keeping the existing named imports
(_YFinanceSnapshot, _fetch_...) and _fetch_... imports intact and properly
grouped.

In `@app/mcp_server/tooling/fundamentals_handlers.py`:
- Line 57: Remove the unused import _fetch_ohlcv_for_volume_profile from the
import list in this module; search for any references to
_fetch_ohlcv_for_volume_profile (e.g., inside functions like
_calculate_volume_profile or callers that previously used it) to confirm it's
not used anywhere, then delete the import statement entry so the linter/CI
(F401) error is resolved.

In `@app/mcp_server/tooling/fundamentals_sources_naver.py`:
- Line 16: The import block in fundamentals_sources_naver.py is not ordered per
project style (CI flagged the lone "import httpx"); reorder imports into the
canonical groups (standard library, third-party like httpx, then local app
imports) and apply your project's formatter (run `ruff check --fix` or
`isort`/`ruff --fix`) to automatically sort and format the imports; update the
top of the module where `import httpx` is declared (and any other imports in
that file) so they follow the configured ordering rules.

---

Nitpick comments:
In `@app/mcp_server/tooling/analysis_screening.py`:
- Around line 263-267: The variable loop is created with
asyncio.get_running_loop() eagerly but only used in the equity_us branch; move
the asyncio.get_running_loop() call into the elif market_type == "equity_us"
branch so loop is only defined where used, and keep the existing
loop.run_in_executor(...) call unchanged; update any references to loop (e.g.,
loop.run_in_executor) to use the newly scoped loop inside that branch and remove
the unused top-level loop declaration near _fetch_ohlcv_for_indicators/ohlcv_df.
- Around line 301-344: The snapshot gathering (_collect_yf_snapshot) is awaited
before creating independent Finnhub tasks, blocking concurrency; to fix, create
profile_task and news_task (calls to _fetch_company_profile_finnhub and
_fetch_news_finnhub) before awaiting yf_snapshot so they run concurrently, then
await loop.run_in_executor to get yf_snapshot and only afterwards create
valuation_task and opinions_task (calls to _fetch_valuation_yfinance and
_fetch_investment_opinions_yfinance) which depend on the snapshot; also update
any downstream task index unpacking (the task_idx handling where tasks are
consumed) to reflect the new creation order.
- Around line 306-324: The helper _collect_yf_snapshot currently swallows
exceptions when accessing yf_ticker.info, yf_ticker.analyst_price_targets, and
yf_ticker.upgrades_downgrades; update each try/except to catch Exception as e
and log the exception (at warning or debug) including context (which property
failed) before continuing, then return the _YFinanceSnapshot as before; locate
the function _collect_yf_snapshot and replace bare excepts with logged except
blocks that reference yf_ticker and the specific property name so operators can
diagnose silent failures.
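The task-ordering suggestion above can be sketched as follows. The `fetch_*` coroutines and `collect_snapshot_blocking` are placeholders for the real `_fetch_company_profile_finnhub` / `_fetch_news_finnhub` / `_fetch_valuation_yfinance` helpers and the yfinance snapshot fetch; the sleeps are illustrative only:

```python
import asyncio

async def fetch_profile():          # stands in for _fetch_company_profile_finnhub
    await asyncio.sleep(0.05)
    return "profile"

async def fetch_news():             # stands in for _fetch_news_finnhub
    await asyncio.sleep(0.05)
    return "news"

def collect_snapshot_blocking():    # stands in for the blocking yfinance snapshot
    return "snapshot"

async def fetch_valuation(snapshot):  # depends on the snapshot
    await asyncio.sleep(0.01)
    return f"valuation:{snapshot}"

async def analyze():
    loop = asyncio.get_running_loop()
    # Independent tasks are created first, so they overlap with the
    # blocking snapshot collection below instead of waiting behind it.
    profile_task = asyncio.create_task(fetch_profile())
    news_task = asyncio.create_task(fetch_news())

    yf_snapshot = await loop.run_in_executor(None, collect_snapshot_blocking)

    # Snapshot-dependent work is only scheduled once the snapshot exists.
    valuation_task = asyncio.create_task(fetch_valuation(yf_snapshot))
    return await asyncio.gather(profile_task, news_task, valuation_task)

results = asyncio.run(analyze())
print(results)
```

With this ordering the two Finnhub round-trips cost nothing extra, since they run while the executor thread is busy with yfinance.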
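The logged-exception suggestion for `_collect_yf_snapshot` might look like this minimal sketch; `DummyTicker` is a made-up stand-in that mimics the yfinance property names, and returning a plain dict (rather than `_YFinanceSnapshot`) keeps the example self-contained:

```python
import logging

logger = logging.getLogger(__name__)

def collect_yf_snapshot(yf_ticker):
    """Collect yfinance properties, logging (not swallowing) each failure."""
    snapshot = {}
    for prop in ("info", "analyst_price_targets", "upgrades_downgrades"):
        try:
            snapshot[prop] = getattr(yf_ticker, prop)
        except Exception as exc:
            # Context-rich log line so operators can see which property failed.
            logger.warning("yfinance property %s failed for %r: %s",
                           prop, yf_ticker, exc)
            snapshot[prop] = None
    return snapshot

class DummyTicker:
    info = {"shortName": "Test Inc"}

    @property
    def analyst_price_targets(self):
        raise RuntimeError("rate limited")

    upgrades_downgrades = None

print(collect_yf_snapshot(DummyTicker()))
```

The failing property still degrades to `None` as before, but the warning now records the property name and ticker for diagnosis.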

In `@app/mcp_server/tooling/fundamentals_sources_naver.py`:
- Around line 662-688: The deduplication logic for peer_tickers is duplicated in
both the manual_peers and else branches; extract it into a small helper (e.g.,
dedupe_by_base(peer_list, target_symbol, extra_limit=0)) that uses
get_base_ticker to build seen_bases starting from target_symbol, filters unique
base tickers into deduped_peer_tickers, and returns the final slice using limit
or limit + 5 via the extra_limit parameter; then replace the duplicated loops in
the manual_peers branch and the asyncio.to_thread(client.company_peers,
upper_symbol) branch by calling this helper with extra_limit=0 for manual_peers
and extra_limit=5 for the network branch.
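The extracted helper could be a small pure function like the sketch below; the name and signature follow the review suggestion, and the sample tickers are made up for illustration:

```python
def get_base_ticker(ticker: str) -> str:
    # "AAPL.MX" -> "AAPL"; tickers without a suffix pass through unchanged.
    return ticker.split(".")[0] if "." in ticker else ticker

def dedupe_by_base(peer_list, target_symbol, limit, extra_limit=0):
    """Keep the first peer per base ticker, excluding the target's own base."""
    seen_bases = {get_base_ticker(target_symbol)}
    deduped = []
    for peer in peer_list:
        base = get_base_ticker(peer)
        if base in seen_bases:
            continue
        seen_bases.add(base)
        deduped.append(peer)
    return deduped[: limit + extra_limit]

print(dedupe_by_base(["AAPL", "AAPL.MX", "MSFT", "GOOG"], "AAPL", limit=2))
```

Both branches then call the helper, with `extra_limit=0` for `manual_peers` and `extra_limit=5` for the network path, so the dedupe rules live in exactly one place.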

In `@app/mcp_server/tooling/portfolio_holdings.py`:
- Around line 544-548: The code in _get_indicators_impl uses preloaded_df if
provided but otherwise fetches 250 rows via _fetch_ohlcv_for_indicators; add a
lightweight guard and/or docstring to require a minimum number of rows (e.g.
250) for accurate indicator computation: update the function docstring for
_get_indicators_impl to state the minimum row requirement for preloaded_df and
add a runtime check (assert or raise ValueError) after receiving preloaded_df
that verifies preloaded_df.shape[0] >= 250 (or a named constant) and raises a
clear error message advising callers to supply more history or omit preloaded_df
so the internal fetch runs; reference preloaded_df, df, and
_fetch_ohlcv_for_indicators when implementing the check and message.
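A sketch of the suggested guard, assuming a named constant for the threshold; `fetch_ohlcv` stands in for `_fetch_ohlcv_for_indicators` and the function body is reduced to the check itself:

```python
import pandas as pd

MIN_ROWS = 250  # hypothetical constant; the real threshold lives in the module

def fetch_ohlcv(symbol):  # stand-in for _fetch_ohlcv_for_indicators
    return pd.DataFrame({"close": [100.0 + i for i in range(MIN_ROWS)]})

def get_indicators(symbol, preloaded_df=None):
    """Compute indicators.

    preloaded_df, if given, must hold at least MIN_ROWS rows of OHLCV
    history; otherwise omit it so the internal fetch runs.
    """
    if preloaded_df is not None and len(preloaded_df) < MIN_ROWS:
        raise ValueError(
            f"preloaded_df has {len(preloaded_df)} rows but {MIN_ROWS} are "
            "required; supply more history or omit preloaded_df"
        )
    df = preloaded_df if preloaded_df is not None else fetch_ohlcv(symbol)
    return df

short_df = pd.DataFrame({"close": [1.0, 2.0]})
try:
    get_indicators("TEST", preloaded_df=short_df)
except ValueError as exc:
    print("rejected:", exc)
```

Failing fast here is preferable to silently computing 200-day averages over a 60-row frame.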

In `@tests/test_mcp_server_tools.py`:
- Around line 8728-8903: The three new async tests
(test_get_support_resistance_uses_single_ohlcv_fetch,
test_sector_peers_us_dedupes_before_network_call,
test_analyze_stock_us_reuses_yfinance_info) are missing the required strict test
marker; add the appropriate pytest marker (e.g., `@pytest.mark.unit`) alongside
`@pytest.mark.asyncio` above each of those test functions so they are categorized
as unit tests per the project guidelines. Ensure you import pytest if not
already present and apply the `@pytest.mark.unit` decorator to each of the three
test functions.
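For reference, the stacked decorators would look like this; the test body is elided, and `unit` is the marker name assumed from the project guideline:

```python
import pytest

@pytest.mark.unit
@pytest.mark.asyncio
async def test_analyze_stock_us_reuses_yfinance_info():
    ...  # body unchanged from the PR

# Both marks end up attached to the function object:
print(sorted(m.name for m in test_analyze_stock_us_reuses_yfinance_info.pytestmark))
```

The same two-decorator stack applies to the other two new tests.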

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ea285d2 and dc547a4.

📒 Files selected for processing (5)
  • app/mcp_server/tooling/analysis_screening.py
  • app/mcp_server/tooling/fundamentals_handlers.py
  • app/mcp_server/tooling/fundamentals_sources_naver.py
  • app/mcp_server/tooling/portfolio_holdings.py
  • tests/test_mcp_server_tools.py

Comment on lines +7 to 9

import pandas as pd
from typing import TYPE_CHECKING, Any

⚠️ Potential issue | 🟡 Minor

Import ordering: same Ruff I001 pattern as portfolio_holdings.py.

import pandas as pd (third-party) is placed between stdlib imports (datetime and typing). Move it after all stdlib imports.

🔧 Proposed fix
 import asyncio
 import datetime
-
-import pandas as pd
 from typing import TYPE_CHECKING, Any
 
+import pandas as pd
+
 from app.mcp_server.tooling.fundamentals_sources_binance import (
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/mcp_server/tooling/fundamentals_handlers.py` around lines 7 - 9, The
import ordering is wrong: the third-party import "import pandas as pd" is placed
between standard-library imports (e.g., "from datetime ..." and "from typing
import TYPE_CHECKING, Any"); move "import pandas as pd" so all stdlib imports
come first, then a blank line, then third-party imports (including pandas), and
then local imports if any; update the top of fundamentals_handlers.py
accordingly so "import pandas as pd" follows the stdlib block and matches the
Ruff I001 ordering convention.

Comment on lines +138 to 141
# Reuse the same OHLCV dataframe for volume profile (avoid duplicate fetch)
volume_result = _calculate_volume_profile(df, bins=20)
volume_result["symbol"] = normalized_symbol
volume_result["period_days"] = 60

⚠️ Potential issue | 🟡 Minor

period_days is hardcoded to 60 but will be inaccurate when preloaded_df spans a larger window.

Before this PR, the volume profile was computed from a separately-fetched 60-row dataframe, so period_days=60 was accurate. Now that the same df is reused, a preloaded_df with 250 rows (fetched by the parent _analyze_stock_impl) will produce a volume profile computed over ~250 days but still report period_days=60.

🔧 Proposed fix
-        volume_result["period_days"] = 60
+        volume_result["period_days"] = len(df)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/mcp_server/tooling/fundamentals_handlers.py` around lines 138 - 141, The
code hardcodes volume_result["period_days"] = 60 even though the reused df
(often named preloaded_df or passed into _analyze_stock_impl) may contain a
different number of rows; update the assignment to compute the actual period
span from the dataframe used by _calculate_volume_profile (e.g., set period_days
= len(df) or derive calendar days from df.index if it contains dates) so
volume_result["period_days"] reflects the true window for the computed profile;
ensure you update the block that sets volume_result (referencing volume_result,
_calculate_volume_profile, df, preloaded_df, and normalized_symbol).
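A quick sketch of deriving the window from the dataframe itself instead of hardcoding 60; the 250-row daily frame is illustrative, and either the row count or the calendar span could back `period_days` depending on the intended semantics:

```python
import pandas as pd

df = pd.DataFrame(
    {"close": [100.0 + i for i in range(250)]},
    index=pd.date_range("2025-01-01", periods=250, freq="D"),
)

# Option 1: row count, i.e. what len(df) would report.
period_rows = len(df)

# Option 2: calendar span of the DatetimeIndex, inclusive of both endpoints.
period_calendar_days = (df.index[-1] - df.index[0]).days + 1

print(period_rows, period_calendar_days)
```

For gap-free daily data the two agree; with trading-day data the calendar span exceeds the row count, so the choice should match how consumers read `period_days`.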

Comment on lines +657 to +660
def get_base_ticker(ticker: str) -> str:
    if "." in ticker:
        return ticker.split(".")[0]
    return ticker

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, let's see the file structure and the specific lines mentioned
fd 'fundamentals_sources_naver.py' --type f

Repository: mgh3326/auto_trader

Length of output: 115


🏁 Script executed:

#!/bin/bash
# Read the file to see content around lines 657-660 and 724-729
wc -l app/mcp_server/tooling/fundamentals_sources_naver.py

Repository: mgh3326/auto_trader

Length of output: 119


🏁 Script executed:

#!/bin/bash
# Check the specific sections mentioned in the review
sed -n '650,680p' app/mcp_server/tooling/fundamentals_sources_naver.py

Repository: mgh3326/auto_trader

Length of output: 1251


🏁 Script executed:

#!/bin/bash
# Check lines around 724-729
sed -n '715,735p' app/mcp_server/tooling/fundamentals_sources_naver.py

Repository: mgh3326/auto_trader

Length of output: 791


🏁 Script executed:

#!/bin/bash
# Search for all occurrences of target_base to see if it's used after line 729
grep -n 'target_base' app/mcp_server/tooling/fundamentals_sources_naver.py

Repository: mgh3326/auto_trader

Length of output: 304


🏁 Script executed:

#!/bin/bash
# Check lines after 729 to see if target_base is ever used in the rest of the function
sed -n '729,800p' app/mcp_server/tooling/fundamentals_sources_naver.py | head -80

Repository: mgh3326/auto_trader

Length of output: 2585


Remove duplicate get_base_ticker definition and unused target_base variable at lines 724–729.

The function get_base_ticker is already defined at line 657. The identical second definition at line 724 shadows the first and is not necessary. Additionally, target_base assigned at line 729 is never referenced afterward—the deduplication logic was already applied earlier (lines 665–673 and 680–688), making this assignment dead code.

Proposed fix: remove the duplicate definition and unused variable
-    def get_base_ticker(ticker: str) -> str:
-        if "." in ticker:
-            return ticker.split(".")[0]
-        return ticker
-
-    target_base = get_base_ticker(upper_symbol)
     # Dedupe already applied before network call, use peer_tickers directly
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/mcp_server/tooling/fundamentals_sources_naver.py` around lines 657 - 660,
Remove the duplicate get_base_ticker function definition (the second/shadowing
one) and delete the unused variable assignment target_base that follows it; keep
the original get_base_ticker implementation (the one already defined earlier)
and ensure any later code calls that existing get_base_ticker, not the removed
duplicate, so deduplication logic remains intact.

Comment on lines +6 to 8

import pandas as pd
from typing import TYPE_CHECKING, Any, cast

⚠️ Potential issue | 🟡 Minor

Fix import ordering to resolve the Ruff I001 pipeline warning.

import pandas as pd (third-party) is inserted between two stdlib imports (asyncio and typing), breaking isort ordering and causing the CI warning.

🔧 Proposed fix
 import asyncio
-
-import pandas as pd
 from typing import TYPE_CHECKING, Any, cast
 
+import pandas as pd
+
 import app.services.brokers.upbit.client as upbit_service
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-import pandas as pd
-from typing import TYPE_CHECKING, Any, cast
+import asyncio
+from typing import TYPE_CHECKING, Any, cast
+
+import pandas as pd
+
+import app.services.brokers.upbit.client as upbit_service
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/mcp_server/tooling/portfolio_holdings.py` around lines 6 - 8, The imports
in portfolio_holdings.py are out of order and trigger Ruff I001 because the
third-party import pandas is placed between stdlib imports (asyncio) and typing;
reorder the import block so all stdlib imports (e.g., asyncio, typing) come
first, then third-party imports (import pandas as pd), and then local
application imports, updating the top-of-file import sequence around the
existing symbols to satisfy isort/Ruff.

Comment on lines +8784 to +8805
def mock_yf_ticker(symbol, session=None):
    calls = {"symbol": symbol}

    class MockTicker:
        @property
        def info(self):
            yf_info_calls.append(symbol)
            return {
                "shortName": f"{symbol} Inc",
                "currentPrice": 100.0,
                "previousClose": 99.0,
                "trailingPE": 15.0,
                "priceToBook": 2.0,
                "marketCap": 1000000000,
                "industry": "Tech",
            }

    return MockTicker()

monkeypatch.setattr(fundamentals_sources_naver.yf, "Ticker", mock_yf_ticker)

result = await tools["get_sector_peers"]("TEST", market="us", limit=5)

⚠️ Potential issue | 🟡 Minor

Remove unused locals flagged by Ruff (F841).

Line 8785 (calls), Line 8805 (result), and Line 8847 (original_ticker) are unused and already breaking CI.

🧹 Suggested fix
 def mock_yf_ticker(symbol, session=None):
-    calls = {"symbol": symbol}
-
     class MockTicker:
         @property
         def info(self):
@@
-    result = await tools["get_sector_peers"]("TEST", market="us", limit=5)
+    await tools["get_sector_peers"]("TEST", market="us", limit=5)
@@
-    original_ticker = fundamentals_sources_naver.yf.Ticker
-
     def mock_yf_ticker(symbol, session=None):

Also applies to: 8847-8849

🧰 Tools
🪛 GitHub Actions: Test

[error] 8785-8785: F841 Local variable calls is assigned to but never used.


[error] 8805-8805: F841 Local variable result is assigned to but never used.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_mcp_server_tools.py` around lines 8784 - 8805, The test defines
unused local variables causing Ruff F841: remove or rename them to underscores
so they aren't reported — specifically delete the unused dict "calls" inside
mock_yf_ticker or change it to "_calls", remove the unused "result =" assignment
(call tools["get_sector_peers"] without assigning) and remove or rename
"original_ticker" (or restore it where needed) so no unused locals remain;
update mock_yf_ticker, the test invocation using tools["get_sector_peers"], and
the restoration of yf.Ticker (original_ticker) accordingly.

