Skip to content

AioPandasCursor/AioPolarsCursor: chunksize fetch blocks the event loop #672

@laughingman7743

Description

@laughingman7743

Problem

When AioPandasCursor or AioPolarsCursor is used with the chunksize option, execute() creates a lazy reader (e.g., pandas TextFileReader) instead of loading all data at once. Subsequent calls to fetchone(), fetchmany(), fetchall(), as_pandas() iteration, or async for trigger chunk-by-chunk S3 reads that are not wrapped in asyncio.to_thread(), blocking the event loop.

Without chunksize, this is not an issue — execute() downloads all data via asyncio.to_thread() and fetch methods operate on in-memory data only.

Root cause

AioPandasCursor and AioPolarsCursor inherit sync fetchone()/fetchmany()/fetchall() from WithAsyncFetch (in pyathena/aio/common.py). These methods call result_set.fetchone() directly, which in the chunked case calls next(self._iterrows)PandasDataFrameIterator.__next__next(self._reader) on a TextFileReader, triggering blocking S3 I/O.

Proposed solution

Override fetchone(), fetchmany(), fetchall(), and __anext__() in AioPandasCursor and AioPolarsCursor to wrap the calls in asyncio.to_thread() when chunksize is set, following the same pattern used by AioS3FSCursor:

async def fetchone(self):
    if not self.has_result_set:
        raise ProgrammingError("No result set.")
    result_set = cast(AthenaPandasResultSet, self.result_set)
    return await asyncio.to_thread(result_set.fetchone)

Two approaches:

  1. Conditional: Only use asyncio.to_thread() when self._chunksize is set; call synchronously otherwise to avoid overhead for the common non-chunked case.
  2. Unconditional: Always wrap in asyncio.to_thread() for simplicity. The overhead for in-memory operations is minimal.

Related

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions