Refactor build wheels pipeline to build wheels and run pytest #301
alekstheod wants to merge 13 commits into master
build:rocm_rbe --host_platform="//platform/linux:tf_linux_gpu"
build:rocm_rbe --extra_execution_platforms="//platform/linux:tf_linux_gpu"
build:rocm_rbe --platforms="//platform/linux:tf_linux_gpu"
build:rocm_rbe --host_platform="@local_config_rocm//rocm:linux_x64"
We should remove the jax_rocm_plugin/platform/linux/BUILD file if we're going to use the platform in rocm/xla. Also, pretty sure that this is going to break PR CI as it works now, unless your change to make the XLA platform's Docker image configurable landed in upstream.
I would like to merge PR #297 first, so that I can test every subsequent one. I will adjust this PR once it is rebased.
Unfortunately, removing the old platform is not possible, so I will convert this PR to adjust the build_wheels pipeline instead.
cancel-in-progress: true

permissions:
  contents: read
Thanks for putting these in there
This only became possible because of two adaptations: a single workspace and an on-the-fly wheel build.
rbe_ci_cert: ${{ secrets.RBE_CI_CERT }}
rbe_ci_key: ${{ secrets.RBE_CI_KEY }}
builder-image: "search"
call-build-docker:
Can you move this to the end of the workflow? We don't want to lose the ability for PR CI to check if the docker image build is busted. It'll use the wheels that you build and upload to artifacts.
Agreed on adding it back
rocm-version: ["7"]
python-version: ["3.11", "3.12", "3.13", "3.14"]
container:
  image: rocm/tensorflow-build@sha256:7fcfbd36b7ac8f6b0805b37c4248e929e31cf5ee3af766c8409dd70d5ab65faa
I've finally got the manylinux build to happen in its own stage. Could we use that here instead? Should be called ghcr.io/rocm/jax-manylinux_2_28-rocm-7.2.0:latest
Let's create a separate PR for that. I think the last time I tried one of our images I got a permission issue.

This PR adjusts the build_wheel pipeline to build the wheels and run an integration test: the built wheel is installed into a venv and exercised with pytest.
Note: I tried using the RBE platform from inside the XLA project. Unfortunately that can't work, because JAX uses different tags and XLA doesn't handle them. XLA uses tags to assign an RBE pool to each test action.
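
For illustration only, here is a minimal sketch of the integration-test step described above: create a venv, install the freshly built wheel together with pytest, and run pytest with the venv's interpreter. The function name `integration_test_commands` and the paths in the usage note are hypothetical and are not taken from this PR's workflow; the sketch also assumes a POSIX venv layout (`bin/python`).

```python
import sys
from pathlib import Path


def integration_test_commands(wheel: Path, venv_dir: Path, tests_dir: Path) -> list[list[str]]:
    """Return the three commands of the sketch, in the order they should run.

    Hypothetical helper: names and layout are assumptions, not the PR's code.
    """
    # Assumes a POSIX venv layout; on Windows the interpreter is Scripts\python.exe.
    venv_python = str(venv_dir / "bin" / "python")
    return [
        # 1. Create an isolated environment with the current interpreter.
        [sys.executable, "-m", "venv", str(venv_dir)],
        # 2. Install the built wheel and pytest into that environment.
        [venv_python, "-m", "pip", "install", str(wheel), "pytest"],
        # 3. Run the tests against the installed wheel, not the source tree.
        [venv_python, "-m", "pytest", str(tests_dir)],
    ]
```

Each returned command could then be executed in order with `subprocess.run(cmd, check=True)`, for example after calling `integration_test_commands(Path("dist/example.whl"), Path(".venv-test"), Path("tests"))`, where all three paths are placeholders.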