-
Notifications
You must be signed in to change notification settings - Fork 107
Description
Problem
When AioPandasCursor or AioPolarsCursor is used with the chunksize option, execute() creates a lazy reader (e.g., pandas TextFileReader) instead of loading all data at once. Subsequent calls to fetchone(), fetchmany(), fetchall(), as_pandas() iteration, or async for trigger chunk-by-chunk S3 reads that are not wrapped in asyncio.to_thread(), blocking the event loop.
Without chunksize, this is not an issue — execute() downloads all data via asyncio.to_thread() and fetch methods operate on in-memory data only.
Root cause
AioPandasCursor and AioPolarsCursor inherit sync fetchone()/fetchmany()/fetchall() from WithAsyncFetch (in pyathena/aio/common.py). These methods call result_set.fetchone() directly, which in the chunked case calls next(self._iterrows) → PandasDataFrameIterator.__next__ → next(self._reader) on a TextFileReader, triggering blocking S3 I/O.
Proposed solution
Override fetchone(), fetchmany(), fetchall(), and __anext__() in AioPandasCursor and AioPolarsCursor to wrap the calls in asyncio.to_thread() when chunksize is set, following the same pattern used by AioS3FSCursor:
async def fetchone(self):
if not self.has_result_set:
raise ProgrammingError("No result set.")
result_set = cast(AthenaPandasResultSet, self.result_set)
return await asyncio.to_thread(result_set.fetchone)Two approaches:
- Conditional: Only use
asyncio.to_thread()whenself._chunksizeis set; call synchronously otherwise to avoid overhead for the common non-chunked case. - Unconditional: Always wrap in
asyncio.to_thread()for simplicity. The overhead for in-memory operations is minimal.
Related
- PRs Add native asyncio cursor support (Phase 1) #666, Add native asyncio specialized cursors and DRY refactors (Phase 2) #667, Add native asyncio S3FS and Spark cursors with boilerplate deduplication (Phase 3) #668 — native asyncio cursor implementation
- PR Add documentation for native asyncio cursors #671 — documentation for aio cursors (includes a note about this limitation)