Skip to content

Comments

Fix Python 3.12 CI failures#971

Open
chrisspalm wants to merge 4 commits intodask:mainfrom
chrisspalm:fix/phase-race-condition
Open

Fix Python 3.12 CI failures#971
chrisspalm wants to merge 4 commits intodask:mainfrom
chrisspalm:fix/phase-race-condition

Conversation

@chrisspalm
Copy link
Contributor

@chrisspalm chrisspalm commented Feb 20, 2026

Summary

  • Fix flaky test_create_cluster_validates_name assertion that raced between "Created" and "Pending" phases on Python 3.12 (relates to DaskCluster stuck in Pending state - Possible race condition? #968)
  • Replace import pkg_resources with importlib.metadata.entry_points() in the controller for Python 3.12 compatibility (setuptools 82 removed pkg_resources)
  • Pin setuptools<81 and jsonschema==4.17.3 in CI to fix k8s-crd-resolver / prance backend detection failure (openapi-spec-validator<0.5.0 still needs pkg_resources at import time)

Details

Test fix: The test previously asserted an exact phase ("Created"), but on Python 3.12 the async handler chain runs faster, causing a race where the poll catches "Pending" instead. Changed to assert != "Error" for valid names, which is correct across all Python versions.

Controller fix: pkg_resources was removed from setuptools 82. The controller used it for plugin discovery via iter_entry_points. Replaced with importlib.metadata.entry_points() (stdlib since Python 3.9).

CI fix: openapi-spec-validator 0.4.0 (required by k8s-crd-resolver) imports pkg_resources.resource_filename at module level. With setuptools 82, this import fails, causing prance to report "No validation backend available!". Pinning setuptools<81 restores pkg_resources availability. Also re-pins jsonschema==4.17.3 after dask[complete] installs, since openapi-spec-validator<0.5.0 needs jsonschema._legacy_validators (removed in 4.18+).

Test plan

  • Reproduced prance backend detection failure locally with setuptools 82 + Python 3.12
  • Verified fix locally: setuptools<81 restores openapi-spec-validator importability
  • CI passes for Python 3.10, 3.11, 3.12

The test expected phase "Created" for valid cluster names, but "Created"
is a transient state that immediately transitions to "Pending" when
daskcluster_create_components fires. On Python 3.12, faster asyncio
internals cause this transition to complete before the test poll reads
the status, resulting in "Pending" instead of "Created".

Change the test to check != "Error" for valid names instead of expecting
a specific transient phase. The test validates name validation logic, not
phase lifecycle, so this accurately reflects its intent and works across
all Python versions.

Fixes Python 3.12 failure in:
https://github.com/dask/dask-kubernetes/actions/runs/21955249462/job/63417295766

Related to dask#968
pkg_resources (from setuptools) is no longer bundled by default in
Python 3.12 CI environments, causing ModuleNotFoundError at import time.

Replace with importlib.metadata.entry_points which is in the stdlib
since Python 3.9 (project requires >=3.10).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@chrisspalm chrisspalm changed the title Fix flaky test_create_cluster_validates_name on Python 3.12 Fix Python 3.12 test failures: replace pkg_resources and fix flaky phase assertion Feb 20, 2026
@chrisspalm chrisspalm changed the title Fix Python 3.12 test failures: replace pkg_resources and fix flaky phase assertion Fix Python 3.12 CI failures Feb 20, 2026
@chrisspalm chrisspalm force-pushed the fix/phase-race-condition branch from 7e6df02 to 08e8503 Compare February 20, 2026 13:03
dask[complete] pulls in jupyter packages that upgrade jsonschema to
4.18+, which removed _legacy_validators. k8s-crd-resolver requires
openapi-spec-validator<0.5.0 which depends on that internal module.

Re-pin jsonschema==4.17.3 at the end of ci/install-deps.sh so the pin
survives the dask[complete] install.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@chrisspalm chrisspalm force-pushed the fix/phase-race-condition branch from 08e8503 to a45a401 Compare February 20, 2026 13:08
setuptools 82 removed pkg_resources, which openapi-spec-validator<0.5.0
needs at import time. This caused prance to fail backend detection with
"No validation backend available!" in the customresources test fixture.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@chrisspalm chrisspalm force-pushed the fix/phase-race-condition branch from 4499f42 to 914a360 Compare February 20, 2026 13:47
@chrisspalm
Copy link
Contributor Author

Hey! Heads-up: I'm not deeply familiar with this codebase. I noticed the Python 3.12 CI failures and used an AI assistant to help diagnose and propose these fixes.

The analysis and local reproduction look solid, but I'd really appreciate a careful review since I can't vouch for every nuance of the controller/test behavior. In particular:

Test assertion change — switched from asserting exact phase "Created" to != "Error" for valid names, to avoid the async timing race between Created → Pending. Is this the right level of strictness?
importlib.metadata migration — replaced pkg_resources.iter_entry_points() with importlib.metadata.entry_points(). Should be equivalent but worth verifying.
CI setuptools<81 pin — workaround for setuptools 82 removing pkg_resources, which breaks openapi-spec-validator<0.5.0. Longer term, upgrading k8s-crd-resolver or its deps would be cleaner.
Happy to adjust anything based on feedback!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants