Add docs on build.py #218

Draft: charleshofer wants to merge 1 commit into master from update-build-docs

Conversation

charleshofer (Collaborator) commented Dec 11, 2025

Update the docs to describe how to use build.py without devsetup and stack.py.

TODO: Draft PR. Don't merge yet. I still need to add instructions on the options for overriding the rocm/jax and rocm/xla versions, and to test the commands with those combinations to make sure they just work.

Arech8 (Contributor) left a comment

If the clang path is fine, then this looks fine.

i-chaochen left a comment

I have added more relevant reviewers to make sure these JAX-XLA build docs meet their needs. We can merge once everyone is on the same page; otherwise the docs still won't address the real issue.

i-chaochen requested review from gulsumgudukbay, jiagaoxiang, lorri-rao and yaomingamd, and removed the request for cschenjunlin, on December 11, 2025 18:23
jiagaoxiang commented

Hi @i-chaochen, is this only a documentation change? What should we focus on in our review?

Once inside the container,
```shell
python3 stack.py develop --rebuild-makefile
```
Contributor commented:

Everything looks good, but one question: should we add a link here referencing the Local build.py Build? @charleshofer

yaomingamd commented Dec 11, 2025

This commit broke the ability to build jaxlib using jax/build/build.py: ROCm/jax@09c8b8c

    git clone git@github.com:ROCm/jax.git
    cd jax
    python3.11 ./build/build.py build \
        --target_cpu_features=native \
        --use_clang=true \
        --clang_path=/lib/llvm-18/bin/clang-18 \
        --wheels=jaxlib \
        --verbose
    pip install dist/jaxlib* --force-reinstall

https://github.com/ROCm/rocm-jax/blob/update-build-docs/BUILDING.md#2-build--install-jaxlib

Also, please make the version of the built jaxlib wheel the same as upstream jaxlib (do not add .dev.xxxxx to the wheel version) so that it works with upstream jax. Otherwise, Python packages such as MaxText will uninstall the custom-built jaxlib and install the upstream one.
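
The version concern above can be checked before installing. Under PEP 440, a `.devN` release sorts before the corresponding final release, so a requirement such as `jaxlib>=X.Y.Z` is not satisfied by `X.Y.Z.devN`, which is consistent with pip replacing the local wheel. The following is a minimal sketch that flags such wheels; the function names and the default `dist` directory layout are assumptions for illustration, not part of build.py.

```shell
# Return success (0) if a wheel filename carries a ".devN" version segment.
# Hypothetical helper: not part of build.py or any ROCm tooling.
has_dev_suffix() {
    case "$1" in
        *.dev[0-9]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report every jaxlib wheel in a directory (default: dist) with a dev suffix.
report_dev_wheels() {
    dir="${1:-dist}"
    for whl in "$dir"/jaxlib*.whl; do
        [ -e "$whl" ] || continue   # the glob matched nothing
        if has_dev_suffix "$(basename "$whl")"; then
            echo "dev version: $whl"
        fi
    done
}
```

Running `report_dev_wheels dist` after a build prints nothing when the wheel version is a plain release version.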

Comment on lines +14 to +57
## 1. Build & Install Plugin Wheels

```shell
# Set your GFX targets to whatever you've got installed
AMDGPU_TARGETS="$(shell rocminfo | grep -o -m 1 'gfx.*')"

# Clone rocm-jax and run build.py
git clone git@github.com:ROCm/rocm-jax.git
cd jax_rocm_plugin
python3 build/build.py build
--use_clang=true \
--clang_path=/lib/llvm-18/bin/clang-18 \
--wheels=jax-rocm-plugin,jax-rocm-pjrt \
--target_cpu_features=native \
--rocm_path=/opt/rocm \
--rocm_version=7 \
--rocm_amdgpu_targets=${AMDGPU_TARGETS} \
--verbose
pip3 install dist/jax_rocm* --force-reinstall
```

By default, this builds against specific commits of `jax-ml/jax` and
`rocm/xla` that are kept in `jax_rocm_plugin/third_party/xla/workspace.bzl` and
`jax_rocm_plugin/third_party/jax/workspace.bzl`. You can override this and
build with your local JAX and XLA by adding,

```shell
--bazel_options=--override_repository=xla=<path to my XLA>
--bazel_options=--override_repository=jax=<path to my JAX>
```

## 2. Build & Install `jaxlib`

```shell
git clone git@github.com:ROCm/jax.git
cd jax
python3.11 ./build/build.py build \
--target_cpu_features=native \
--use_clang=true \
--clang_path=/lib/llvm-18/bin/clang-18 \
--wheels=jaxlib \
--verbose
pip install dist/jaxlib* --force-reinstall
```

Could you please also include a subsection on how to build these libraries with debug symbols? My team often rebuilds these libraries from source with debug symbols for our work.

Thank you!

yaomingamd commented Dec 12, 2025

jax_rocm_plugin/build/build.py needs to support building a local jaxlib too (just using the jax at rocm_jax/jax), so that it is consistent with the jax_rocm plugin. Also, do not add .dev.xxxxxx to the name/version of the jaxlib wheel. I believe jax_rocm_plugin/build/build.py had this capability previously.
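
For building against local checkouts, the `--bazel_options=--override_repository=...` options quoted in the diff above could be assembled with a small helper. This is only a sketch: `override_flag` is a hypothetical name, and resolving the path to an absolute one is an assumption made so the flag does not depend on the directory Bazel starts in.

```shell
# Print a build.py option that points Bazel at a local checkout.
# $1: repository name (xla or jax), $2: path to the local checkout.
# Hypothetical helper: not part of build.py.
override_flag() {
    repo="$1"
    path="$2"
    if [ ! -d "$path" ]; then
        echo "error: $path is not a directory" >&2
        return 1
    fi
    # Resolve to an absolute path before handing it to Bazel.
    abs="$(cd "$path" && pwd)"
    printf '%s\n' "--bazel_options=--override_repository=${repo}=${abs}"
}
```

The output can then be appended to the build command, for example `python3 build/build.py build ... "$(override_flag xla ../xla)"`, where `../xla` stands in for wherever the local XLA checkout lives.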

i-chaochen commented

> Hi @i-chaochen, is this only a documentation change? What should we focus on in our review?

Hi @jiagaoxiang, thanks for the review. Yes. If anything is unclear about your current jax/xla build and usage scenarios, please ask questions based on these docs.

    # Clone rocm-jax and run build.py
    git clone git@github.com:ROCm/rocm-jax.git
    cd jax_rocm_plugin
    python3 build/build.py build
Contributor commented:

Did you miss a `\` at the end of this line?

Contributor commented:

Also, rocm-jax contains the jax_rocm_plugin directory, but there is no `cd rocm-jax` before `cd jax_rocm_plugin`. Is that just missing?
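
Taken together, the two comments above suggest a command sequence along the following lines. This is a sketch rather than the author's final version: the flags are copied from the diff under review, the `run` dry-run wrapper and the `gfx90a` fallback are assumptions added so the snippet can be read and tried safely, and the HTTPS clone URL is a substitution for the SSH one.

```shell
# Echo commands by default; set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# gfx90a is only a placeholder; set AMDGPU_TARGETS for your hardware.
AMDGPU_TARGETS="${AMDGPU_TARGETS:-gfx90a}"

run git clone https://github.com/ROCm/rocm-jax.git
# Enter the cloned repository first, then the plugin directory inside it.
run cd rocm-jax/jax_rocm_plugin
# Every continued line ends with a backslash, including the first one.
run python3 build/build.py build \
    --use_clang=true \
    --clang_path=/lib/llvm-18/bin/clang-18 \
    --wheels=jax-rocm-plugin,jax-rocm-pjrt \
    --target_cpu_features=native \
    --rocm_path=/opt/rocm \
    --rocm_version=7 \
    --rocm_amdgpu_targets="${AMDGPU_TARGETS}" \
    --verbose
run pip3 install dist/jax_rocm* --force-reinstall
```

With the default `DRY_RUN=1` the script only prints each command prefixed by `+`, which makes it easy to confirm the continuation lines were joined into a single invocation.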

ScXfjiang commented Dec 14, 2025

Hi @charleshofer, thanks for the doc!

I think we need to clarify the goal of the XLA team. Most of the time, we don't need to build Docker images. What we really need is to build JAX and XLA with specific commits in the current container so we can run models in MaxText.

As I understand, we need to build four components:

  1. jax (pure python scripts, device/platform-independent)
  2. jax-plugin (device/platform-dependent)
  3. jax-pjrt (device/platform-dependent)
  4. jaxlib (device/platform-dependent)

Two questions about the doc (https://github.com/ROCm/rocm-jax/blob/update-build-docs/BUILDING.md):

  1. You split the installation of jax-plugin, jax-pjrt, and jaxlib into two steps, which implies users can build them from different jax commits. Is that intentional, to give users more flexibility? Personally, I usually build all of them from the same commit.
  2. You don't mention the installation of jax in the doc. Could you add it? We may also need to update the Python-level interfaces in jax.

Some typical use cases:

  • We have added new features (e.g., the MX datatype) to JAX/XLA 0.9.0, so we need to rebuild both JAX and XLA and verify the new features via MaxText.
  • We need to add logging to the current XLA code for debugging, so we need to rebuild JAX/XLA from source to see those logs.

Here are the use cases collected from our team:
https://loop.cloud.microsoft/p/eyJ1IjoiaHR0cHM6Ly9hbWRjbG91ZC1teS5zaGFyZXBvaW50LmNvbS9wZXJzb25hbC9jY2hlbjEwNF9hbWRfY29tP25hdj1jejBsTWtad1pYSnpiMjVoYkNVeVJtTmphR1Z1TVRBMFgyRnRaRjlqYjIwbVpEMWlKVEl4WnkxTU1IRmhjSEpDUldWaVZqSkxhVkphU0ZOWFJqTnhablpNYlVFek1VbHZRWE15YkVKQ1kzUlZkakJNVVZsdlEyd3hSVkpLTUhwU1RGZEtOVTh4WHlabVBUQXhWMDFCUWtGRVVFbFJWRUUxUnpaYVJqSkdRVnBDVVRkTFJVbFRXakpJTjBFbVl6MGxNa1ltWVQxTWIyOXdRWEJ3Sm5BOUpUUXdabXgxYVdSNEpUSkdiRzl2Y0Mxd1lXZGxMV052Ym5SaGFXNWxjZz09In0%3D

FYI: I think the installation of jax and maxtext can be decoupled, so let's focus on jax only here.
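
Since the four components listed above are built in separate steps, it can help to confirm which versions actually ended up installed side by side. A minimal sketch, assuming `pip list`-style input and that every relevant distribution name begins with `jax`; the exact names of the installed plugin and PJRT distributions are not confirmed by this thread.

```shell
# Keep only lines whose package name starts with "jax" from
# `pip list`-style input on stdin (name, whitespace, version).
# Note: this also matches unrelated packages whose names start with "jax".
jax_packages() {
    grep -i -E '^jax([-_a-z0-9]*)[[:space:]]' || true
}
```

Typical usage would be `pip list | jax_packages`, run in the same environment that MaxText uses.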


```shell
# Set your GFX targets to whatever you've got installed
AMDGPU_TARGETS="$(shell rocminfo | grep -o -m 1 'gfx.*')"
```

`shell` isn't a Linux command, so this line can't be executed directly.
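
The `$(shell ...)` form is GNU Make syntax; in a shell, plain command substitution `$(...)` does the same job. A minimal sketch of one way to write it follows. The `first_gfx_target` helper is a hypothetical name, and its pattern is deliberately narrower than the `'gfx.*'` in the quoted line so that it captures only the target name rather than the rest of the line.

```shell
# Print the first gfx target name found on stdin.
# Hypothetical helper; -m 1 stops at the first matching line, as in the diff.
first_gfx_target() {
    grep -o -m 1 'gfx[0-9a-z]*' | head -n 1
}

# Only query the hardware when rocminfo is actually available.
if command -v rocminfo >/dev/null 2>&1; then
    AMDGPU_TARGETS="$(rocminfo | first_gfx_target)"
    echo "AMDGPU_TARGETS=${AMDGPU_TARGETS}"
fi
```

On a machine with several different GPUs this picks only the first target reported, which matches the intent of the original `-m 1`.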

```shell
git clone git@github.com:ROCm/jax.git
cd jax
python3.11 ./build/build.py build \
```

Is a specific Python version needed here?

Build artifacts are also produced by different commands to the `build/ci_build`
script. This build script does nearly all of its work inside of containers.
It requires that you have an installation of Docker and Python 3.6 or newer.

Shall we add a section on building with debug symbols?
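
One possible starting point for such a section, offered as an untested assumption: `--compilation_mode=dbg` and `--strip=never` are standard Bazel options, and the diff already shows extra Bazel flags being forwarded through `--bazel_options=`. Whether these two flags are sufficient for the plugin and jaxlib builds has not been verified in this thread, and `debug_bazel_options` is a hypothetical helper name.

```shell
# Print build.py options that forward debug-oriented flags to Bazel.
# Untested against this repository; treat as a starting point only.
debug_bazel_options() {
    printf '%s\n' \
        "--bazel_options=--compilation_mode=dbg" \
        "--bazel_options=--strip=never"
}
```

The options could be appended to an existing invocation, for example `python3 build/build.py build ... $(debug_bazel_options)`, leaving the substitution unquoted so each option becomes its own argument.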

cj401-amd requested a review from ammarwa on December 15, 2025 19:28

    AMDGPU_TARGETS="$(shell rocminfo | grep -o -m 1 'gfx.*')"

    # Clone rocm-jax and run build.py
    git clone git@github.com:ROCm/rocm-jax.git

Please use the https: link here; the git one may require authorization.
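
For readers who already have SSH-style URLs in scripts, the mapping to the HTTPS form is mechanical. A small sketch with a hypothetical `to_https` helper; git's own `url.<base>.insteadOf` configuration setting offers a similar rewrite without changing the scripts.

```shell
# Rewrite a GitHub SSH clone URL into its HTTPS equivalent.
# Hypothetical helper for illustration; other URLs pass through unchanged.
to_https() {
    printf '%s\n' "$1" | sed 's#^git@github\.com:#https://github.com/#'
}

to_https git@github.com:ROCm/rocm-jax.git
# prints https://github.com/ROCm/rocm-jax.git
```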
