Summary
Compare performance of S3ThreadPoolExecutor (sync, current default) vs S3AioExecutor (async, new) to validate the switch to AioS3FileSystem as the default in v3.30.0.
Background
PR #684 introduced the S3Executor strategy pattern, replacing hardcoded ThreadPoolExecutor usage with a pluggable interface. This eliminates thread-in-thread nesting when aio cursors use S3FileSystem. Before making AioS3FileSystem the default for async paths, we need empirical performance data.
Related:
- #684: Add S3Executor strategy pattern for async S3 operations
- #644: Comprehensive cursor benchmark with memory and performance metrics
Benchmark Scope
Scenarios
| Scenario | Description |
|---|---|
| Query result fetch | AioS3FSCursor fetch performance (small/medium/large result sets) |
| Large file read | Multipart range read via _fetch_range |
| Large file write | Multipart upload via commit |
| Parallel copy | _copy_object_with_multipart_upload |
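The large-file scenarios depend on splitting one byte range into fixed-size parts that the executor can process concurrently. A minimal sketch of that splitting step, assuming half-open `(start, end)` ranges; `split_ranges` and `block_size` are hypothetical names for illustration and are not taken from the `_fetch_range` implementation:

```python
from typing import List, Tuple


def split_ranges(start: int, end: int, block_size: int) -> List[Tuple[int, int]]:
    """Split the half-open byte range [start, end) into parts of at most block_size bytes.

    Hypothetical helper for benchmark setup; the real library may split differently.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    # Each part starts at a multiple of block_size from `start`; the last part
    # is clamped to `end` so it may be shorter than the others.
    return [
        (offset, min(offset + block_size, end))
        for offset in range(start, end, block_size)
    ]
```

Recording the number of parts alongside each timing makes results comparable across file sizes, since the part count bounds how much parallelism either executor can exploit.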
Metrics
- Wall-clock time (latency)
- Throughput (MB/s)
- Concurrency behavior under varying max_workers
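The metrics above could be collected with a small harness along these lines. This is a sketch under stated assumptions: `fake_fetch` is a hypothetical stand-in that sleeps instead of calling S3, and `run_benchmark` is an illustrative name, not part of the library. A real benchmark would replace the stand-in with actual range reads or part uploads.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Dict


def fake_fetch(nbytes: int) -> int:
    """Hypothetical stand-in for one S3 range read; sleeps to simulate I/O latency."""
    time.sleep(0.001)
    return nbytes


def run_benchmark(max_workers: int, n_chunks: int = 32,
                  chunk_bytes: int = 1024 * 1024) -> Dict[str, float]:
    """Measure wall-clock time and throughput for n_chunks simulated fetches."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        total_bytes = sum(pool.map(fake_fetch, [chunk_bytes] * n_chunks))
    elapsed = time.perf_counter() - start
    return {
        "max_workers": max_workers,
        "seconds": elapsed,
        "bytes": total_bytes,
        "mb_per_s": total_bytes / (1024 * 1024) / elapsed,
    }


if __name__ == "__main__":
    # Sweep max_workers to observe how latency and throughput scale.
    for workers in (1, 2, 4, 8, 16):
        print(run_benchmark(workers))
```

Running each configuration several times and reporting the median would reduce noise from network variance.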
Comparison
- S3FileSystem + S3ThreadPoolExecutor (sync baseline)
- AioS3FileSystem + S3AioExecutor (async candidate)
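To keep the comparison fair, both paths should run the same task list under the same concurrency limit. The sketch below illustrates that shape with standard-library primitives only. `sync_task`, `async_task`, `run_sync`, `run_async`, and `timed` are hypothetical stand-ins, and the mapping of `max_workers` to an `asyncio.Semaphore` bound is an assumption, not a description of how S3AioExecutor limits concurrency.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def sync_task(n: int) -> int:
    """Hypothetical blocking operation (stands in for a sync S3 call)."""
    time.sleep(0.005)
    return n


async def async_task(n: int) -> int:
    """Hypothetical non-blocking operation (stands in for an async S3 call)."""
    await asyncio.sleep(0.005)
    return n


def run_sync(tasks: List[int], max_workers: int) -> List[int]:
    """Thread-pool baseline: at most max_workers tasks run at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(sync_task, tasks))


async def _run_async(tasks: List[int], max_concurrency: int) -> List[int]:
    # A semaphore bounds in-flight coroutines so both paths share one limit.
    sem = asyncio.Semaphore(max_concurrency)

    async def bounded(n: int) -> int:
        async with sem:
            return await async_task(n)

    return await asyncio.gather(*(bounded(n) for n in tasks))


def run_async(tasks: List[int], max_concurrency: int) -> List[int]:
    """Async candidate: same tasks, same concurrency bound, one event loop."""
    return asyncio.run(_run_async(tasks, max_concurrency))


def timed(fn: Callable[..., List[int]], *args) -> Tuple[List[int], float]:
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    work = list(range(64))
    for limit in (1, 4, 16):
        _, t_sync = timed(run_sync, work, limit)
        _, t_async = timed(run_async, work, limit)
        print(f"limit={limit} sync={t_sync:.3f}s async={t_async:.3f}s")
```

Checking that both paths return identical results before comparing timings guards against a benchmark that measures two different workloads.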
Acceptance Criteria
- Benchmark script(s) covering the scenarios above
- Results showing no significant regression for async path
- Summary with recommendation for v3.30.0 default switch