Support Athena managed query result storage #665

Merged: laughingman7743 merged 5 commits into master from fix/managed-query-result-storage on Feb 19, 2026
laughingman7743 commented Feb 18, 2026

Summary

Fixes #664

When a workgroup has managed query result storage enabled, GetQueryExecution does not return ResultConfiguration.OutputLocation. This caused file-based cursors (Pandas, Arrow, Polars, S3FS) to return empty results even though data was available via the GetQueryResults API.

Changes

API fallback for managed storage (result_set.py, pandas/result_set.py, arrow/result_set.py, polars/result_set.py, s3fs/result_set.py)

  • When output_location is None but the query succeeded, fall back to the GetQueryResults API to fetch the data
  • S3 file reading remains the primary path when output_location is available
  • Shared helpers: _fetch_all_rows(), _parse_result_rows(), _rows_to_columnar()
  • _fetch_all_rows() and _as_*_from_api() accept an optional converter parameter (defaults to DefaultTypeConverter)
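
The fallback path described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `fetch_all_rows_from_api` is a hypothetical name, `client` is assumed to be a boto3 Athena client (or anything exposing `get_query_results`), and the header-row check is an assumed heuristic.

```python
from typing import Any, Dict, List, Optional, Tuple

Row = List[Optional[str]]


def fetch_all_rows_from_api(client: Any, query_execution_id: str) -> Tuple[List[str], List[Row]]:
    """Collect every row of a finished query by paging through GetQueryResults.

    Hypothetical helper: it only illustrates the NextToken loop, not PyAthena's
    actual _fetch_all_rows().
    """
    columns: List[str] = []
    rows: List[Row] = []
    next_token: Optional[str] = None
    first_page = True
    while True:
        kwargs: Dict[str, Any] = {"QueryExecutionId": query_execution_id}
        if next_token:
            kwargs["NextToken"] = next_token
        response = client.get_query_results(**kwargs)
        result_set = response["ResultSet"]
        # A NULL cell arrives as an empty dict, so .get() yields None.
        page = [
            [cell.get("VarCharValue") for cell in row.get("Data", [])]
            for row in result_set.get("Rows", [])
        ]
        if first_page:
            columns = [c["Name"] for c in result_set["ResultSetMetadata"]["ColumnInfo"]]
            # Assumed heuristic: the first row of the first page may repeat the column labels.
            if page and page[0] == columns:
                page = page[1:]
            first_page = False
        rows.extend(page)
        next_token = response.get("NextToken")
        if not next_token:
            return columns, rows
```

Values come back as strings (or None); type conversion would be a separate step, which is where the optional converter parameter fits in.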

Refactor result parsing (result_set.py)

  • Remove is_first_page flag from _parse_result_rows; move _is_first_row_column_labels detection to callers (_pre_fetch, _fetch_all_rows)
  • Change _process_rows to accept parsed rows and offset instead of raw response
  • Add optional converter parameter to _get_rows and reuse it in _fetch_all_rows to eliminate the duplicated row-conversion loop
  • Remove unnecessary cast() calls
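
To make the shape of this refactor concrete, here is a rough sketch under stated assumptions. The function names mirror the ones in the list but are standalone stand-ins, and `default_converter` is a simplified placeholder for a real type converter, not DefaultTypeConverter itself.

```python
from typing import Any, Callable, Dict, List, Optional, Sequence, Tuple

RawRow = List[Optional[str]]
Converter = Callable[[str, Optional[str]], Any]


def default_converter(type_name: str, value: Optional[str]) -> Any:
    """Simplified stand-in converter (assumption): handles only a few Athena types."""
    if value is None:
        return None
    if type_name in ("integer", "bigint"):
        return int(value)
    if type_name == "double":
        return float(value)
    return value


def parse_result_rows(response: Dict[str, Any]) -> List[RawRow]:
    """Pure parsing: no knowledge of pages or header rows."""
    rows = response.get("ResultSet", {}).get("Rows", [])
    return [[cell.get("VarCharValue") for cell in row.get("Data", [])] for row in rows]


def is_first_row_column_labels(rows: Sequence[RawRow], column_names: Sequence[str]) -> bool:
    """Header detection now lives with the caller (assumed heuristic)."""
    return bool(rows) and list(rows[0]) == list(column_names)


def process_rows(
    rows: Sequence[RawRow],
    offset: int,
    column_types: Sequence[str],
    converter: Converter = default_converter,
) -> List[Tuple[Any, ...]]:
    """Convert already-parsed rows, skipping the first `offset` rows."""
    return [
        tuple(converter(t, v) for t, v in zip(column_types, row))
        for row in rows[offset:]
    ]
```

The caller decides the offset (1 when the first row holds column labels, otherwise 0), so the parser no longer needs an is_first_page flag.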

s3_staging_dir="" support (connection.py)

  • Pass empty string to explicitly disable AWS_ATHENA_S3_STAGING_DIR env var fallback
  • Required for managed workgroups where ResultConfiguration conflicts with ManagedQueryResultsConfiguration
  • Updated docstrings in Connection.__init__() and connect()
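
One way to picture the None versus "" distinction is the sketch below. `resolve_s3_staging_dir` is a hypothetical helper written for illustration; how connection.py actually implements this is not shown in this PR description.

```python
import os
from typing import Mapping, Optional

ENV_S3_STAGING_DIR = "AWS_ATHENA_S3_STAGING_DIR"


def resolve_s3_staging_dir(
    s3_staging_dir: Optional[str],
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Hypothetical resolution rule (assumption, not the library's code).

    None -> caller did not choose, so fall back to the environment variable.
    ""   -> caller explicitly opted out, so ignore the environment variable.
    """
    if s3_staging_dir is None:
        return environ.get(ENV_S3_STAGING_DIR)
    # An empty string collapses to "no staging dir"; any other value is used as given.
    return s3_staging_dir or None
```

With a rule like this, passing s3_staging_dir="" to connect() for a managed workgroup would keep a globally exported AWS_ATHENA_S3_STAGING_DIR from leaking into the request.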

assert → ProgrammingError (11 occurrences across 8 files)

  • Replaced all assert statements used for input validation with proper ProgrammingError exceptions
  • assert can be stripped by python -O, making validation unreliable
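
The pattern looks roughly like this. It is an illustrative sketch: `ProgrammingError` is defined locally so the snippet runs on its own, and `require_query_id` with its message text is a made-up example rather than one of the 11 actual call sites.

```python
class ProgrammingError(Exception):
    """Local stand-in for the driver's ProgrammingError (assumption for this sketch)."""


def require_query_id(query_id):
    # Before (validation silently disappears under `python -O`):
    #     assert query_id is not None, "QueryExecutionId is none or empty."
    # After: an explicit check that always runs.
    if query_id is None:
        raise ProgrammingError("QueryExecutionId is none or empty.")
    return query_id
```

Raising a DB-API style exception also gives callers something specific to catch, whereas AssertionError is not part of the driver's documented error hierarchy.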

Omit empty ResultConfiguration (common.py)

  • Don't send empty ResultConfiguration dict in StartQueryExecution when no S3 staging dir or encryption is configured
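
A sketch of the idea, assuming the request is assembled as a plain dict before being passed to StartQueryExecution. `build_start_query_request` is a hypothetical helper; only the request keys (QueryString, ResultConfiguration, OutputLocation, EncryptionConfiguration, WorkGroup) come from the Athena API.

```python
from typing import Any, Dict, Optional


def build_start_query_request(
    query: str,
    work_group: Optional[str] = None,
    s3_staging_dir: Optional[str] = None,
    encryption_option: Optional[str] = None,
    kms_key: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical builder: adds ResultConfiguration only when it has content."""
    request: Dict[str, Any] = {"QueryString": query}
    result_configuration: Dict[str, Any] = {}
    if s3_staging_dir:
        result_configuration["OutputLocation"] = s3_staging_dir
    if encryption_option:
        encryption: Dict[str, str] = {"EncryptionOption": encryption_option}
        if kms_key:
            encryption["KmsKey"] = kms_key
        result_configuration["EncryptionConfiguration"] = encryption
    # Key point: an empty dict is omitted entirely instead of being sent.
    if result_configuration:
        request["ResultConfiguration"] = result_configuration
    if work_group:
        request["WorkGroup"] = work_group
    return request
```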

Tests (5 test files + tests/__init__.py)

  • test_fetch_all_rows added to all cursor types, parametrized by workgroup (default vs. managed) through an indirect fixture
  • AWS_ATHENA_MANAGED_WORKGROUP env var controls managed workgroup name; tests skip when not set
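
For readers unfamiliar with indirect parametrization, the test layout might look something like the following. This is a sketch only: the fixture name `workgroup`, the helper `resolve_workgroup`, and the `"primary"` default are assumptions, and the test body is a placeholder rather than the real test.

```python
import os
from typing import Mapping, Optional

import pytest

ENV_MANAGED_WORKGROUP = "AWS_ATHENA_MANAGED_WORKGROUP"
DEFAULT_WORKGROUP = "primary"  # assumed name of the non-managed workgroup


def resolve_workgroup(mode: str, environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Map a test mode to a workgroup name; None means 'not configured'."""
    if mode == "managed":
        return environ.get(ENV_MANAGED_WORKGROUP) or None
    return DEFAULT_WORKGROUP


@pytest.fixture
def workgroup(request):
    # With indirect=True, the parametrize value arrives here as request.param.
    name = resolve_workgroup(request.param)
    if name is None:
        pytest.skip(f"{ENV_MANAGED_WORKGROUP} is not set")
    return name


@pytest.mark.parametrize("workgroup", ["default", "managed"], indirect=True)
def test_fetch_all_rows(workgroup):
    # Placeholder body: a real test would open a cursor against `workgroup`.
    assert workgroup
```

Routing the parameter through the fixture keeps the skip logic in one place, so each cursor's test file only needs the parametrize decorator.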

CI & Docs

  • GitHub Actions: added AWS_ATHENA_MANAGED_WORKGROUP env var
  • CloudFormation: added ManagedWorkGroup resource
  • docs/testing.md: documented managed workgroup setup
  • docs/usage.md: added managed query result storage section

Test plan

  • All 10 test_fetch_all_rows tests pass (5 cursor types × 2 workgroup modes)
  • make fmt / make chk (ruff + mypy) pass
  • CI with managed workgroup (pyathena-managed)

🤖 Generated with Claude Code

When a workgroup has managed query result storage enabled,
GetQueryExecution does not return ResultConfiguration.OutputLocation.
File-based cursors (Pandas, Arrow, Polars, S3FS) checked output_location
and returned empty results even though data was available via API.

Fall back to GetQueryResults API when output_location is None but the
query succeeded. S3 file reading remains the primary path when
output_location is available.

- Add _fetch_all_rows() to AthenaResultSet base class for paginated
  API fetching with DefaultTypeConverter
- Add _parse_result_rows() to share response parsing between
  _process_rows (normal path) and _fetch_all_rows (fallback path)
- Add _rows_to_columnar() for shared row-to-columnar conversion
- Extract __get_query_results() from __fetch() for reuse
- Add _as_*_from_api() fallback methods to each result set subclass
- Remove empty ResultConfiguration from StartQueryExecution request
  when no S3 staging dir or encryption is configured
- Emit warning log when falling back to API
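
As an illustration of the row-to-columnar step and the warning emitted on fallback, consider the sketch below. The names `rows_to_columnar` and `columnar_from_api` are standalone stand-ins, and the log message wording is an assumption.

```python
import logging
from typing import Any, Dict, List, Sequence

_logger = logging.getLogger(__name__)


def rows_to_columnar(
    column_names: Sequence[str], rows: Sequence[Sequence[Any]]
) -> Dict[str, List[Any]]:
    """Pivot row tuples into a dict of column lists.

    Note: duplicate column names would collapse into one key in this sketch.
    """
    columnar: Dict[str, List[Any]] = {name: [] for name in column_names}
    for row in rows:
        for name, value in zip(column_names, row):
            columnar[name].append(value)
    return columnar


def columnar_from_api(
    column_names: Sequence[str], rows: Sequence[Sequence[Any]]
) -> Dict[str, List[Any]]:
    """Fallback entry point: warn, then convert rows fetched via the API."""
    _logger.warning(
        "output_location is not available; falling back to the GetQueryResults API."
    )
    return rows_to_columnar(column_names, rows)
```

A dict of equal-length lists is a convenient common shape because pandas.DataFrame, pyarrow.table, and polars.DataFrame can each be constructed from one.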

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…with ProgrammingError

- Add test_fetch_all_rows to all cursor test files (Cursor, Pandas, Arrow, Polars, S3FS)
  parameterized by workgroup (default vs managed) using fixture indirect
- Support s3_staging_dir="" to explicitly disable env var fallback, required
  for managed workgroups where ResultConfiguration conflicts with
  ManagedQueryResultsConfiguration
- Replace all assert statements with ProgrammingError exceptions across
  connection.py, cursor files, result_set files (11 occurrences)
- Add AWS_ATHENA_MANAGED_WORKGROUP env var to tests/Env, GitHub Actions, docs
- Add ManagedWorkGroup resource to CloudFormation template

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
laughingman7743 force-pushed the fix/managed-query-result-storage branch from e679a31 to cd45053 on February 19, 2026 02:18
laughingman7743 and others added 3 commits February 19, 2026 23:07
…nversion logic

- Remove is_first_page parameter from _parse_result_rows; move
  _is_first_row_column_labels detection to callers (_pre_fetch, _fetch_all_rows)
- Change _process_rows to accept parsed rows and offset instead of raw response
- Add optional converter parameter to _get_rows and reuse it in _fetch_all_rows
  to eliminate duplicated row conversion loop
- Remove unnecessary cast() calls from _process_rows and _is_first_row_column_labels

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Allow custom type converter to be passed through the API fallback
chain. Defaults to DefaultTypeConverter for backward compatibility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…aging_dir="" docs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
laughingman7743 merged commit a13bd3c into master on Feb 19, 2026
5 checks passed
laughingman7743 deleted the fix/managed-query-result-storage branch on February 19, 2026 14:48
Linked issue: Fetching results does not work with Athena managed query result storage