Forward-merge main into pandas3 by AyodeAwe · Pull Request #21509 · rapidsai/cudf

AyodeAwe · 2026-02-20T18:00:15Z

Forward-merge triggered by automated cron job to keep pandas3 up-to-date with main.

If this PR has conflicts, it will remain open for manual resolution.

Follow-up to #21464

Removes some testing configuration left behind in that PR.

Authors:
- James Lamb (https://github.com/jameslamb)

Approvers:
- Kyle Edwards (https://github.com/KyleFromNVIDIA)

URL: #21494

AyodeAwe · 2026-02-20T18:00:18Z

FAILURE - Unable to forward-merge automatically, manual merge is necessary.

cc @Matt711 @galipremsagar @mroeschke

Do not use the Resolve conflicts option in this PR. Follow these instructions: https://docs.rapids.ai/maintainers/forward-merger/

IMPORTANT: When merging this PR, do not use the auto-merger (i.e. the /merge comment). Instead, an admin must manually merge by changing the merging strategy to Create a Merge Commit. Otherwise, history will be lost and the branches become incompatible.

Follows up #21469 to handle the stricter dtype validation at array creation time introduced in cupy 14.

Ops-Bot-Merge-Barrier: true

Authors:
- Matthew Murray (https://github.com/Matt711)

Approvers:
- Tom Augspurger (https://github.com/TomAugspurger)
- Bradley Dice (https://github.com/bdice)
- James Lamb (https://github.com/jameslamb)

URL: #21504

…argument. (#21503)

Handle the breaking change introduced in rapidsai/rapidsmpf#871

Authors:
- Mads R. B. Kristensen (https://github.com/madsbk)
- Tom Augspurger (https://github.com/TomAugspurger)

Approvers:
- Matthew Murray (https://github.com/Matt711)

URL: #21503

review-notebook-app · 2026-02-20T22:12:53Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

This has been used almost nowhere.

Authors:
- Michael Schellenberger Costa (https://github.com/miscco)
- Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
- David Wendt (https://github.com/davidwendt)
- Shruti Shivakumar (https://github.com/shrshi)

URL: #21477

Close #21512

This PR fixes a misaligned memory access bug. Previously, we aligned data to 8-byte boundaries, which caused issues for 16-byte types such as `decimal128` that require 16-byte alignment. The fix updates the alignment to 16 bytes. Note that this change may introduce additional padding, but the padding overhead is negligible compared to the usable data.

Authors:
- Yunsong Wang (https://github.com/PointKernel)

Approvers:
- Vyas Ramasubramani (https://github.com/vyasr)
- Bradley Dice (https://github.com/bdice)
- David Wendt (https://github.com/davidwendt)

URL: #21513
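To make the alignment change above concrete, here is a minimal Python sketch of the usual "round up to a power-of-two boundary" arithmetic. The helper names `align_up` and `padding_needed` are hypothetical and are not taken from libcudf; the sketch only illustrates why an offset that satisfies 8-byte alignment can still be misaligned for a 16-byte type, and how much padding a 16-byte boundary may add.

```python
def align_up(offset: int, alignment: int) -> int:
    """Round offset up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    # Adding (alignment - 1) and masking off the low bits rounds upward.
    return (offset + alignment - 1) & ~(alignment - 1)


def padding_needed(offset: int, alignment: int) -> int:
    """Number of padding bytes inserted to reach the aligned offset."""
    return align_up(offset, alignment) - offset


# Byte offset 24 is a valid 8-byte boundary but not a 16-byte one,
# so a 16-byte value placed there would be misaligned.
offset = 24
print(align_up(offset, 8))         # stays at 24
print(align_up(offset, 16))        # moves to 32
print(padding_needed(offset, 16))  # 8 bytes of padding
```

The padding added per buffer is always less than the alignment itself (at most 15 bytes for a 16-byte boundary), which is consistent with the note that the overhead is small relative to the data.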

This PR attempts to collect the common bits of logic used by the various ColumnBase subclasses' `find_and_replace` implementations. I noticed some of this duplicated code while working on other refactorings and collected the changes together here.

Authors:
- Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
- Matthew Roeschke (https://github.com/mroeschke)

URL: #21500

…21410)

### Summary
- Refactors join benchmark input table generation to produce deterministic results across runs
- Moves `generate_input_tables` implementation from header to separate `.cu` file
- Replaces random sampling with deterministic `thrust::tabulate` + `thrust::shuffle` approach

### Changes

**`cpp/benchmarks/common/generate_input.cu`**
- Rewrote `create_distinct_rows_column` for numeric types to use `cudf::sequence` followed by `thrust::shuffle` instead of random sampling
- This ensures unique values are generated deterministically given the same seed

**`cpp/benchmarks/join/generate_input_tables.cu`** (new file)
- Moved implementation from header file
- Build table gather map: uses `thrust::tabulate` with modulo to cycle through unique keys, then shuffles with fixed seed (12345)
- Probe table gather map: uses `thrust::tabulate` to assign matching keys (cycling through unique build keys) for first `selectivity * probe_rows` entries, non-matching keys for the rest, then shuffles with fixed seed (67890)

**`cpp/benchmarks/join/generate_input_tables.cuh`**
- Reduced to declarations only (moved CUDA kernels and implementation to `.cu` file)

**`cpp/benchmarks/CMakeLists.txt`**
- Added `join/generate_input_tables.cu` to the build

Authors:
- Shruti Shivakumar (https://github.com/shrshi)

Approvers:
- Yunsong Wang (https://github.com/PointKernel)
- David Wendt (https://github.com/davidwendt)
- Bradley Dice (https://github.com/bdice)

URL: #21410
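The tabulate-then-shuffle idea described in this commit can be sketched on the CPU in plain Python. This is an analog only, not the benchmark code: the function names are hypothetical, `random.Random(seed).shuffle` stands in for `thrust::shuffle`, a list comprehension stands in for `thrust::tabulate`, and representing non-matching probe keys as values at or above `num_unique` is an assumption made purely for illustration. Only the seeds 12345 and 67890 come from the commit message.

```python
import random


def build_gather_map(build_rows: int, num_unique: int, seed: int = 12345) -> list[int]:
    """Cycle through the unique keys with a modulo, then shuffle with a fixed seed."""
    gather = [i % num_unique for i in range(build_rows)]  # stand-in for tabulate
    random.Random(seed).shuffle(gather)                   # stand-in for a seeded shuffle
    return gather


def probe_gather_map(
    probe_rows: int, num_unique: int, selectivity: float, seed: int = 67890
) -> list[int]:
    """First selectivity * probe_rows entries match a build key; the rest do not."""
    num_matching = int(selectivity * probe_rows)
    gather = [
        # Assumption: ids >= num_unique never appear in the build side.
        i % num_unique if i < num_matching else num_unique + i
        for i in range(probe_rows)
    ]
    random.Random(seed).shuffle(gather)
    return gather


build = build_gather_map(build_rows=8, num_unique=4)
probe = probe_gather_map(probe_rows=10, num_unique=4, selectivity=0.5)
matching = sum(1 for key in probe if key < 4)
print(build, probe, matching)
```

Because both the generated sequence and the shuffle seed are fixed, repeated calls with the same arguments return the same maps, and the fraction of matching probe keys is exact rather than an expected value as it would be with random sampling.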

Towards #21229

One of the two large changes needed to natively support pandas extension types in cuDF. It is now possible because we consistently use the `ColumnBase.create` API to preserve pandas extension types: `dtype` arguments will now pass through extension types instead of being coerced to numpy types in the `def dtype` function.

* Some changes were needed in `DatetimeTZColumn` to be more accommodating to `pandas.ArrowDtype` so that the cudf test suite passes.
* IIRC Dask, by default, will try to use the `pandas.StringDtype(storage="pyarrow")` type if pyarrow is installed, even with pandas < 2. I turned off this feature in some tests, as is already done in other tests, and I expect we should be able to remove this with pandas 3 support, when that string type is the default.
* The tests added to `conftest-patch.py` appear to be largely due to column APIs still not entirely resolving the resulting dtype correctly (like `DatetimeColumn.strftime`). Those improvements can be made in a follow-up.

The next change will be to preserve input data that are pandas objects with extension types.

Authors:
- Matthew Roeschke (https://github.com/mroeschke)

Approvers:
- Lawrence Mitchell (https://github.com/wence-)
- GALI PREM SAGAR (https://github.com/galipremsagar)

URL: #21499

check-nightly-ci: remove testing config (#21494)

ef712cc


AyodeAwe requested a review from a team as a code owner February 20, 2026 18:00

AyodeAwe requested review from bdice and removed request for a team February 20, 2026 18:00

github-actions bot assigned AyodeAwe Feb 20, 2026

rapids-bot bot requested a review from a team as a code owner February 20, 2026 18:27

rapids-bot bot requested review from brandon-b-miller and removed request for a team February 20, 2026 18:27

github-actions bot added the Python (Affects Python cuDF API.) and cudf.pandas (Issues specific to cudf.pandas) labels Feb 20, 2026

github-project-automation bot added this to cuDF Python Feb 20, 2026

GPUtester moved this to In Progress in cuDF Python Feb 20, 2026

github-actions bot added the cudf-polars (Issues specific to cudf-polars) label Feb 20, 2026

rapids-bot bot requested a review from a team as a code owner February 21, 2026 00:22

rapids-bot bot requested review from kingcrimsontianyu and removed request for a team February 21, 2026 00:22

github-actions bot added the libcudf (Affects libcudf (C++/CUDA) code.) label Feb 21, 2026

PointKernel and others added 3 commits February 21, 2026 00:31

rapids-bot bot requested a review from a team as a code owner February 21, 2026 02:09

github-actions bot added the CMake (CMake build issue) label Feb 21, 2026

rapids-bot bot requested a review from a team as a code owner February 21, 2026 05:44

AyodeAwe wants to merge 8 commits into pandas3 from main

9 participants
