Support Athena managed query result storage #665
Merged
laughingman7743 merged 5 commits into master on Feb 19, 2026
When a workgroup has managed query result storage enabled, GetQueryExecution does not return ResultConfiguration.OutputLocation. File-based cursors (Pandas, Arrow, Polars, S3FS) checked output_location and returned empty results even though data was available via the API.

Fall back to the GetQueryResults API when output_location is None but the query succeeded. S3 file reading remains the primary path when output_location is available.

- Add _fetch_all_rows() to AthenaResultSet base class for paginated API fetching with DefaultTypeConverter
- Add _parse_result_rows() to share response parsing between _process_rows (normal path) and _fetch_all_rows (fallback path)
- Add _rows_to_columnar() for shared row-to-columnar conversion
- Extract __get_query_results() from __fetch() for reuse
- Add _as_*_from_api() fallback methods to each result set subclass
- Remove empty ResultConfiguration from StartQueryExecution request when no S3 staging dir or encryption is configured
- Emit warning log when falling back to API

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
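To make the fallback path concrete, here is a minimal sketch of paginated fetching through `GetQueryResults` followed by a row-to-columnar conversion. It is not the code from this PR: `fetch_all_rows` and `rows_to_columnar` are hypothetical stand-ins for `_fetch_all_rows()` and `_rows_to_columnar()`, and `client` is assumed to be any object exposing boto3's Athena `get_query_results` call. The sketch also assumes that the first row of the first page repeats the column labels, which is what the `_is_first_row_column_labels` check mentioned in a later commit suggests.

```python
from typing import Any, Dict, List, Optional, Tuple

Row = List[Optional[str]]


def fetch_all_rows(client: Any, query_id: str, page_size: int = 1000) -> Tuple[List[str], List[Row]]:
    """Hypothetical helper: collect every data row via the GetQueryResults API."""
    labels: List[str] = []
    rows: List[Row] = []
    next_token: Optional[str] = None
    first_page = True
    while True:
        kwargs: Dict[str, Any] = {"QueryExecutionId": query_id, "MaxResults": page_size}
        if next_token:
            kwargs["NextToken"] = next_token
        response = client.get_query_results(**kwargs)
        result_set = response["ResultSet"]
        # A missing VarCharValue key is read as NULL.
        page = [
            [cell.get("VarCharValue") for cell in row.get("Data", [])]
            for row in result_set.get("Rows", [])
        ]
        if first_page:
            column_info = result_set.get("ResultSetMetadata", {}).get("ColumnInfo", [])
            labels = [c.get("Label") or c.get("Name") for c in column_info]
            # Assumption: a first row equal to the labels is a header row, not data.
            if page and page[0] == labels:
                page = page[1:]
            first_page = False
        rows.extend(page)
        next_token = response.get("NextToken")
        if not next_token:
            return labels, rows


def rows_to_columnar(labels: List[str], rows: List[Row]) -> Dict[str, List[Optional[str]]]:
    """Hypothetical helper: pivot row-oriented data into one list per column."""
    columns: Dict[str, List[Optional[str]]] = {label: [] for label in labels}
    for row in rows:
        for label, value in zip(labels, row):
            columns[label].append(value)
    return columns
```

A columnar dict of this shape is a convenient intermediate form because DataFrame and table constructors in Pandas, Arrow, and Polars generally accept a mapping of column name to values.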
…with ProgrammingError

- Add test_fetch_all_rows to all cursor test files (Cursor, Pandas, Arrow, Polars, S3FS) parameterized by workgroup (default vs managed) using fixture indirect
- Support s3_staging_dir="" to explicitly disable env var fallback, required for managed workgroups where ResultConfiguration conflicts with ManagedQueryResultsConfiguration
- Replace all assert statements with ProgrammingError exceptions across connection.py, cursor files, result_set files (11 occurrences)
- Add AWS_ATHENA_MANAGED_WORKGROUP env var to tests/Env, GitHub Actions, docs
- Add ManagedWorkGroup resource to CloudFormation template

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
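The assert replacement can be illustrated with a small before/after sketch. The function names and the error message are illustrative rather than taken from the diff, and `ProgrammingError` is redefined locally only so the snippet runs on its own (the real class lives in `pyathena.error`).

```python
from typing import Optional


class ProgrammingError(Exception):
    """Local stand-in for pyathena.error.ProgrammingError."""


def require_schema_with_assert(schema_name: Optional[str]) -> str:
    # Before: this check disappears when Python runs with -O,
    # so a missing argument slips through silently.
    assert schema_name, "Required argument `schema_name` not found."
    return schema_name


def require_schema(schema_name: Optional[str]) -> str:
    # After: an explicit raise is kept regardless of interpreter flags.
    if not schema_name:
        raise ProgrammingError("Required argument `schema_name` not found.")
    return schema_name
```

Callers that already catch DB API errors then handle invalid input through the same exception hierarchy instead of a bare `AssertionError`.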
…nversion logic

- Remove is_first_page parameter from _parse_result_rows; move _is_first_row_column_labels detection to callers (_pre_fetch, _fetch_all_rows)
- Change _process_rows to accept parsed rows and offset instead of raw response
- Add optional converter parameter to _get_rows and reuse it in _fetch_all_rows to eliminate duplicated row conversion loop
- Remove unnecessary cast() calls from _process_rows and _is_first_row_column_labels

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Allow custom type converter to be passed through the API fallback chain. Defaults to DefaultTypeConverter for backward compatibility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
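A rough sketch of what passing an optional converter through a row conversion step might look like. `SimpleConverter` and `convert_rows` are hypothetical names, not PyAthena's classes. The only assumption about the real converter is that it exposes a `convert(type_, value)` method, and the default here merely stands in for `DefaultTypeConverter`.

```python
from typing import Any, List, Optional, Sequence, Tuple


class SimpleConverter:
    """Hypothetical default: map an Athena type name and raw string to a Python value."""

    def convert(self, type_: str, value: Optional[str]) -> Any:
        if value is None:
            return None
        if type_ in ("integer", "bigint"):
            return int(value)
        if type_ == "double":
            return float(value)
        if type_ == "boolean":
            return value.lower() == "true"
        return value


def convert_rows(
    rows: Sequence[Sequence[Optional[str]]],
    types: Sequence[str],
    converter: Optional[Any] = None,
) -> List[Tuple[Any, ...]]:
    """Convert raw API strings column by column, using the caller's converter if given."""
    # Falling back to the default keeps existing callers working unchanged.
    conv = converter if converter is not None else SimpleConverter()
    return [tuple(conv.convert(t, v) for t, v in zip(types, row)) for row in rows]
```

Keeping the parameter optional means a single conversion loop can serve both the normal path and the API fallback path.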
…aging_dir="" docs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
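The distinction between leaving `s3_staging_dir` unset and passing an empty string could be modeled as below. This is a sketch under assumptions, not the logic in `connection.py`: `resolve_staging_dir` is a hypothetical helper, and the only names taken from the PR are the `s3_staging_dir` argument and the `AWS_ATHENA_S3_STAGING_DIR` environment variable.

```python
import os
from typing import Mapping, Optional


def resolve_staging_dir(
    s3_staging_dir: Optional[str],
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Hypothetical helper: decide which S3 staging dir, if any, to use."""
    env = environ if environ is not None else os.environ
    if s3_staging_dir is None:
        # Argument not given: fall back to the environment variable.
        return env.get("AWS_ATHENA_S3_STAGING_DIR")
    # "" means the caller explicitly opted out, so the env var is ignored.
    return s3_staging_dir or None
```

Treating `None` and `""` differently lets a process that has the environment variable set globally still connect to a managed workgroup without sending an output location.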
## Summary

Fixes #664

When a workgroup has managed query result storage enabled, `GetQueryExecution` does not return `ResultConfiguration.OutputLocation`. This caused file-based cursors (Pandas, Arrow, Polars, S3FS) to return empty results even though data was available via the `GetQueryResults` API.

## Changes

### API fallback for managed storage (`result_set.py`, `pandas/result_set.py`, `arrow/result_set.py`, `polars/result_set.py`, `s3fs/result_set.py`)

- When `output_location` is `None` but the query succeeded, fall back to the `GetQueryResults` API to fetch data
- S3 file reading remains the primary path when `output_location` is available
- New shared helpers: `_fetch_all_rows()`, `_parse_result_rows()`, `_rows_to_columnar()`
- `_fetch_all_rows()` and `_as_*_from_api()` accept an optional `converter` parameter (defaults to `DefaultTypeConverter`)

### Refactor result parsing (`result_set.py`)

- Remove `is_first_page` flag from `_parse_result_rows`; move `_is_first_row_column_labels` detection to callers (`_pre_fetch`, `_fetch_all_rows`)
- Change `_process_rows` to accept parsed rows and offset instead of raw response
- Add optional `converter` parameter to `_get_rows` and reuse it in `_fetch_all_rows` to eliminate duplicated row conversion loop
- Remove unnecessary `cast()` calls

### `s3_staging_dir=""` support (`connection.py`)

- `s3_staging_dir=""` explicitly disables the `AWS_ATHENA_S3_STAGING_DIR` env var fallback
- Required for managed workgroups, where `ResultConfiguration` conflicts with `ManagedQueryResultsConfiguration`
- Applies to `Connection.__init__()` and `connect()`

### `assert` → `ProgrammingError` (11 occurrences across 8 files)

- Replace `assert` statements used for input validation with proper `ProgrammingError` exceptions
- `assert` can be stripped by `python -O`, making validation unreliable

### Omit empty `ResultConfiguration` (`common.py`)

- Omit the empty `ResultConfiguration` dict in `StartQueryExecution` when no S3 staging dir or encryption is configured

### Tests (5 test files + `tests/__init__.py`)

- `test_fetch_all_rows` added to all cursor types, parameterized by workgroup (default vs managed) using fixture `indirect`
- `AWS_ATHENA_MANAGED_WORKGROUP` env var controls managed workgroup name; tests skip when not set

### CI & Docs

- GitHub Actions: added `AWS_ATHENA_MANAGED_WORKGROUP` env var
- CloudFormation: added `ManagedWorkGroup` resource
- `docs/testing.md`: documented managed workgroup setup
- `docs/usage.md`: added managed query result storage section

## Test plan

- `test_fetch_all_rows` tests pass (5 cursor types × 2 workgroup modes)
- `make fmt` / `make chk` (ruff + mypy) pass
- Verified against the managed workgroup (`pyathena-managed`)

🤖 Generated with Claude Code
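For the change in `common.py` that omits an empty `ResultConfiguration`, the request assembly might look roughly like the sketch below. `build_start_query_request` is a hypothetical function, not the one in the PR. The dictionary keys follow the `StartQueryExecution` request shape (`QueryString`, `WorkGroup`, `ResultConfiguration`, `OutputLocation`, `EncryptionConfiguration`).

```python
from typing import Any, Dict, Optional


def build_start_query_request(
    query: str,
    work_group: Optional[str] = None,
    s3_staging_dir: Optional[str] = None,
    encryption_option: Optional[str] = None,
    kms_key: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build StartQueryExecution kwargs without empty sections."""
    request: Dict[str, Any] = {"QueryString": query}
    if work_group:
        request["WorkGroup"] = work_group

    result_configuration: Dict[str, Any] = {}
    if s3_staging_dir:
        result_configuration["OutputLocation"] = s3_staging_dir
    if encryption_option:
        encryption: Dict[str, str] = {"EncryptionOption": encryption_option}
        if kms_key:
            encryption["KmsKey"] = kms_key
        result_configuration["EncryptionConfiguration"] = encryption

    # Only attach the key when it has content, so a managed workgroup
    # never receives an empty ResultConfiguration.
    if result_configuration:
        request["ResultConfiguration"] = result_configuration
    return request
```

With neither a staging dir nor encryption configured, the request carries only the query string and workgroup, which is the shape a managed workgroup needs.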