Add docs for serialization with Nx.serialize #630

aphillipo wants to merge 1 commit into elixir-nx:main
Conversation
You are obviously welcome to rip this apart, but it's a decent start. I might have a go at updating axon_onnx at some point.
```elixir
Mix.install([
  {:axon, ">= 0.8.0"}
])
```

**Suggested change:**

```diff
- {:axon, ">= 0.8.0"}
+ {:axon, "~> 0.8"}
```
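For context on the `~>` suggestion: Elixir's `~>` operator pins the upper bound as well, which `>= 0.8.0` does not. A quick sketch using only the standard-library `Version` module (the version strings here are illustrative):

```elixir
# `~> 0.8` means >= 0.8.0 and < 1.0.0; `>= 0.8.0` has no upper bound.
true = Version.match?("0.8.3", "~> 0.8")
false = Version.match?("1.0.0", "~> 0.8")
true = Version.match?("1.0.0", ">= 0.8.0")
IO.puts("version checks pass")
```

So `~> 0.8` protects the guide from a future breaking 1.0 release.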
* Makes the model structure explicit and version-controlled in code
* Works reliably across processes and deployments

**Suggested change:**

```diff
- The model itself is just code—you define it once and reuse it. Only the learned parameters need to be persisted.
+ The model itself is just code — you define it once and reuse it. Only the learned parameters need to be persisted.
```
```elixir
trained_model_state = Axon.Loop.run(loop, train_data, Axon.ModelState.empty(), epochs: 2, iterations: 100)
```

**Suggested change:**

```diff
- The training loop returns `model_state` by default (from `Axon.Loop.trainer/3`). For inference, we need the parameters—extract the `.data` field from `ModelState`:
+ The training loop returns `model_state` by default (from `Axon.Loop.trainer/3`). For inference, we need the parameters—extract the `data` field from `ModelState`:
```
```elixir
# Extract parameters - ModelState.data contains the nested map of weights
params =
  case trained_model_state do
    %Axon.ModelState{data: data} -> data
    params when is_map(params) -> params
  end
```

**Suggested change:**

```diff
- # Extract parameters - ModelState.data contains the nested map of weights
+ # Extract parameters - trained_model_state.data contains the nested map of weights
```

Is this case really necessary?
```elixir
# Transfer to binary backend for reliable serialization (avoids issues with Nx 0.11+ on EXLA/Torchx)
params = Nx.backend_transfer(params)
```

Which issues would these be? We should not need to transfer it to serialize. If there are issues, that's a bug in Nx 0.11.
```elixir
Axon.Loop.run(loop, train_data, Axon.ModelState.empty(), epochs: 3, iterations: 50)
```

**Suggested change:**

```diff
- Checkpoints are saved to the `checkpoints/` directory. Each file contains the serialized loop state from `Axon.Loop.serialize_state/2`.
+ Checkpoints are saved to the `checkpoints/` directory, as configured above. Each file contains the serialized loop state from `Axon.Loop.serialize_state/2`.
```
## Resuming from a Checkpoint

To resume training from a saved checkpoint:

1. Load the checkpoint with `Axon.Loop.deserialize_state/2`
2. Attach it to your loop with `Axon.Loop.from_state/2`
3. Run the loop as usual

```elixir
# Load the checkpoint (use the path from your checkpoint files)
checkpoint_path = "checkpoints/checkpoint_2_50.ckpt"
serialized = File.read!(checkpoint_path)
state = Axon.Loop.deserialize_state(serialized)

# Resume training
model =
  Axon.input("data")
  |> Axon.dense(8)
  |> Axon.relu()
  |> Axon.dense(1)
```

@seanmor5 I think there's a bit of a dissonance between not having Axon.serialize/deserialize, while checkpoints need their Axon functions. WDYT?
```elixir
# Extract model parameters from step_state
%{model_state: model_state} = state.step_state
params = model_state.data |> Nx.backend_transfer()
```

Same comment about needing backend_transfer.
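Step 2 of the list above (`Axon.Loop.from_state/2`) is not shown in the snippet. A minimal sketch of what resuming might look like, assuming the `model`, `state`, and `train_data` bindings from the snippets above and the same loss/optimizer as the original loop:

```elixir
# Sketch: attach the deserialized checkpoint state, then continue training.
# Assumes `model`, `state`, and `train_data` are bound as in the snippets above.
loop =
  model
  |> Axon.Loop.trainer(:mean_squared_error, :adam)
  |> Axon.Loop.from_state(state)

resumed_model_state =
  Axon.Loop.run(loop, train_data, Axon.ModelState.empty(), epochs: 3, iterations: 50)
```

With `from_state/2` attached, the loop picks up from the checkpointed epoch and iteration counters rather than starting from scratch.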
## Troubleshooting: ArgumentError with Nx.serialize

If you see `(ArgumentError) argument error` or `:erlang.++` errors when calling `Nx.serialize/2` on parameters (common with Nx 0.11+ and EXLA/Torchx backends), transfer tensors to the binary backend first:

```elixir
params = Nx.backend_transfer(params)
params_bytes = Nx.serialize(params)
```

If serialization still fails, you can use `:erlang.term_to_binary/1` once the parameters are on the binary backend (e.g. after `Nx.backend_transfer/1`):

```elixir
params = Nx.backend_transfer(params)
params_bytes = :erlang.term_to_binary(params)
File.write!("model_params.axon", params_bytes)

# To load:
params = File.read!("model_params.axon") |> :erlang.binary_to_term([:safe])
```

We should fix this and release 0.11.1 so we can merge this PR without these bug-related caveats.
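Worth noting for the fallback path: `:erlang.term_to_binary/1` round-trips plain Elixir terms regardless of the Nx version. A minimal self-contained sketch with a stand-in map (hypothetical data; real parameters would contain Nx tensors):

```elixir
# Stand-in parameter map; real params would hold Nx tensors.
params = %{"dense_0" => %{"kernel" => <<0, 0, 128, 63>>, "bias" => <<0, 0, 0, 0>>}}

bytes = :erlang.term_to_binary(params)
restored = :erlang.binary_to_term(bytes, [:safe])

true = restored == params
IO.puts("round-trip ok: #{byte_size(bytes)} bytes")
```

The `[:safe]` option refuses to create new atoms on load, which is why the guide's example uses string keys in the parameter map.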