Conversation
Did you get it working? CPU inference works just fine with the current version on my system.
Yes, I was being dumb and used the default Python 3.12 from uv. Setting it to 3.11 worked just fine, so I closed the issue. Thank you!
Actually, I am reopening this with a few other provisions for the case of older computers like mine (Mac Pro, 2019, Intel x64, no GPU, though maybe integrated graphics that shows up as MPS, e.g., Intel UHD 630). I was having compatibility issues (need PyTorch >2.1, <2.3; NumPy <2.0), and on top of that this specific PyTorch version does not support bfloat16. I hope this helps and doesn't cause issues for anyone else. My initial problem was just getting it installed, which the initial PR was for; this PR is for actual runtime use.
I don't get it. Why should Python 3.12 fail? I use it all the time, and it works fine with CPU inference.
You are correct; I have reverted the Python version constraint. The original problem was with the torch and numpy versions (which have been resolved).
pyproject.toml (outdated)
```diff
   { name = "Philipp Emanuel Weidmann", email = "pew@worldwidemann.com" }
 ]
-requires-python = ">=3.10"
+requires-python = ">=3.10,<3.13" # Supports 3.10-3.12 (verified with Python 3.12)
```
I did not verify 3.13; 3.12 was the highest version I verified.
pyproject.toml (outdated)
```toml
# macOS Intel (x86_64): Use >=2.1.0,<2.3.0 (2.3+ dropped macOS Intel support; last with wheels was 2.2.0)
"torch>=2.1.0,<2.3.0; platform_machine == 'x86_64' and sys_platform == 'darwin'",
# macOS Silicon (ARM64): Can use newer versions with Python 3.12
"torch>=2.0.0; platform_machine == 'arm64' and sys_platform == 'darwin'",
```
Version >= 2.2 is required; 2.0 fails (I don't quite remember what the problem was, some multiplication kernel issue, I think).
Okay, by default 2.2.2 is installed; we can set this constraint as well.
```toml
# Other platforms (Linux, Windows): Can use newer versions
"torch>=2.2.0; sys_platform != 'darwin'",
# NumPy: PyTorch 2.1-2.2 (macOS Intel) require NumPy 1.x (not 2.x)
"numpy<2.0; platform_machine == 'x86_64' and sys_platform == 'darwin'",
```
Do these dependency specs automatically install the right version on all Mac platforms (both Intel and M series)?
I have no experience with Macs so I'm a little out of my depth here.
That should be incidental; it doesn't matter, as they all point to the same version:

```toml
# macOS Silicon (ARM64): Can use newer versions with Python 3.12
"torch>=2.2.0; platform_machine == 'arm64' and sys_platform == 'darwin'",
# Other platforms (Linux, Windows): Can use newer versions
"torch>=2.2.0; sys_platform != 'darwin'",
```
```diff
-for dtype in settings.dtypes:
-    print(f"* Trying dtype [bold]{dtype}[/]... ", end="")
+# Filter dtypes: MPS doesn't support bfloat16, so skip "auto" on MPS
+# (since "auto" typically resolves to bfloat16)
```
Doesn't that happen automatically when the MPS backend is used? It would be strange if a backend tried to load a format it doesn't support.
What happens if you use auto on MPS?
It did not for mine. I believe it is because it technically is using MPS, but it is more of an integrated-graphics device (Intel UHD 630) than an actual Metal backend. On any Apple Silicon Mac I do not believe that error would occur; however, I do not have an M-series chip to test with.
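For anyone who wants to check what their MPS device actually does with bfloat16, a minimal probe along these lines should reproduce the behavior described (a sketch; the exact exception type depends on the torch build):

```python
import torch

# Probe whether the local MPS device can actually run bfloat16 ops.
# On some setups (e.g., integrated Intel graphics exposed as MPS),
# this is expected to fail rather than silently work.
if torch.backends.mps.is_available():
    try:
        x = torch.ones(2, 2, dtype=torch.bfloat16, device="mps")
        y = x @ x  # a matmul is where unsupported kernels tend to surface
        print("bfloat16 on MPS works:", y.dtype)
    except (RuntimeError, TypeError) as e:
        print("bfloat16 on MPS failed:", e)
else:
    print("MPS backend not available")
```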
```python
dtypes_to_try = [dtype for dtype in dtypes_to_try if dtype != "auto"]
if not dtypes_to_try:
    # If only "auto" was specified, default to float16 for MPS
    dtypes_to_try = ["float16"]
```
That's a lot of magic, and might make it difficult for the user to understand why problems happen if they explicitly specified a dtype cascade and the program just does something else.
Fair points. I made changes so this only affects an x86_64 CPU trying to use MPS, i.e., the small subset of users on a non-GPU Mac.
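A sketch of what such a narrowed guard could look like (not necessarily Heretic's actual code; the example cascade and variable names are illustrative, the condition matches the diff below):

```python
import platform

import torch

# Example dtype cascade as a user might configure it.
dtypes_to_try = ["auto", "bfloat16", "float16"]

# Only filter the cascade on the narrow problem case:
# an x86_64 Mac whose integrated graphics is exposed as MPS.
is_intel_x86_64 = platform.machine() == "x86_64"
use_mps = torch.backends.mps.is_available()

if use_mps and is_intel_x86_64:
    dtypes_to_try = [d for d in dtypes_to_try if d != "auto"]
    if not dtypes_to_try:
        # If only "auto" was specified, fall back to float16
        dtypes_to_try = ["float16"]
```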
src/heretic/model.py (outdated)
```python
else:
    self.model = AutoModelForCausalLM.from_pretrained(
        settings.model,
        torch_dtype=torch_dtype,
```
This triggers a deprecation warning with newer Transformers versions. The argument name is just dtype now.
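For clarity, the non-deprecated spelling on recent Transformers versions looks like this (a sketch; the model name is illustrative):

```python
import torch
from transformers import AutoModelForCausalLM

# "dtype" replaces the deprecated "torch_dtype" keyword
# on recent Transformers versions.
model = AutoModelForCausalLM.from_pretrained(
    "gpt2",  # illustrative model name
    dtype=torch.float16,
)
```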
src/heretic/model.py (outdated)
```python
        dtype=dtype,
        device_map=settings.device_map,
    )
# Convert dtype string to torch dtype object
```
Why is this necessary? I'm pretty sure torch accepts both strings and objects as arguments, and performs the conversion automatically.
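If that's right, the two spellings below should be equivalent and the manual conversion is dead code (a sketch of the claim, as I understand recent Transformers; model name illustrative):

```python
import torch
from transformers import AutoModelForCausalLM

# Both should load the model in half precision; Transformers
# converts the string to a torch.dtype internally.
m1 = AutoModelForCausalLM.from_pretrained("gpt2", dtype="float16")
m2 = AutoModelForCausalLM.from_pretrained("gpt2", dtype=torch.float16)
assert m1.dtype == m2.dtype == torch.float16
```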
src/heretic/model.py (outdated)
```python
# Then convert dtype explicitly, then move to MPS
self.model = AutoModelForCausalLM.from_pretrained(
    settings.model,
    torch_dtype=None,  # Load as-is first
```
Argument name is still wrong: torch_dtype => dtype, otherwise a warning is raised on recent Transformers versions. Same thing below.
| dtypes_to_try = ["float16"] | ||
|
|
||
| for dtype in dtypes_to_try: | ||
| print(f"* Trying dtype [bold]{dtype_str}[/]... ", end="") |
| print(f"* Trying dtype [bold]{dtype_str}[/]... ", end="") | |
| print(f"* Trying dtype [bold]{dtype}[/]... ", end="") |
This will raise an error otherwise.
```python
if use_mps and is_intel_x86_64:
    # Load to CPU first without dtype to avoid bfloat16 preservation
    # Then convert dtype explicitly, then move to MPS
    self.model = AutoModelForCausalLM.from_pretrained(
```
The same thing needs to happen in reload_model below, which is called on every trial.
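One way to keep the two paths from drifting apart would be to factor the workaround into a helper used by both call sites (a sketch only; load/reload structure, attribute names, and the helper name are assumptions, not Heretic's actual API):

```python
from transformers import AutoModelForCausalLM


class Model:
    def _load_with_mps_workaround(self, dtype):
        # Shared by the initial load and reload_model, so the
        # Intel-Mac workaround is applied consistently on every trial.
        if self.use_mps and self.is_intel_x86_64:
            model = AutoModelForCausalLM.from_pretrained(
                self.settings.model,
                dtype=None,        # load as-is first
                device_map="cpu",  # then convert and move explicitly
            )
            return model.to(dtype).to("mps")
        return AutoModelForCausalLM.from_pretrained(
            self.settings.model,
            dtype=dtype,
            device_map=self.settings.device_map,
        )
```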
| device_map="cpu", # Load to CPU first | ||
| low_cpu_mem_usage=False, # Ensure full conversion | ||
| ) | ||
| # Convert to desired dtype explicitly (this forces conversion) |
What exactly is the difference between this and just specifying the dtype directly when loading the model?
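For comparison, the two approaches side by side (a sketch; model name illustrative). The open question is whether the cast performed at load time behaves differently from an explicit .to() conversion on this hardware:

```python
import torch
from transformers import AutoModelForCausalLM

name = "gpt2"  # illustrative; any causal LM checkpoint

# Direct: cast to float16 while loading.
direct = AutoModelForCausalLM.from_pretrained(name, dtype=torch.float16)

# Two-step: load first (defaults to float32), then convert explicitly.
two_step = AutoModelForCausalLM.from_pretrained(name, dtype=None)
two_step = two_step.to(torch.float16)

# If the loader's cast behaves the same as .to(), these match;
# the PR's premise is that on Intel-Mac MPS setups they did not.
print(direct.dtype, two_step.dtype)
```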
Please rebase on top of master to run CI.
Add a new --device CLI option that allows users to explicitly select the compute device (auto, cpu, cuda, mps). This enables running Heretic on systems without a GPU.

Changes:
- Add DeviceType enum with AUTO, CPU, CUDA, MPS options
- Add --device setting that derives device_map automatically
- Make bitsandbytes import conditional with graceful fallback
- Exclude bitsandbytes from Intel Macs in pyproject.toml
- Auto-disable quantization on CPU (requires CUDA)
- Prefer float32 dtype on CPU for best compatibility
- Add validation for explicit CUDA/MPS device requests

Closes p-e-w#12
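A sketch of what the described enum and device_map derivation could look like (the enum members come from the commit message; the derivation logic itself is an assumption):

```python
from enum import Enum


class DeviceType(str, Enum):
    AUTO = "auto"
    CPU = "cpu"
    CUDA = "cuda"
    MPS = "mps"


def derive_device_map(device: DeviceType) -> str:
    # "auto" defers placement to Accelerate; any explicit choice
    # pins the whole model to the requested device.
    return "auto" if device is DeviceType.AUTO else device.value
```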
- Make bitsandbytes import conditional with graceful fallback
- Exclude bitsandbytes dependency on Intel Macs where it can't work
- Provide clear error message when quantization requested without bitsandbytes
- Add documentation for CPU-only usage (device_map = "cpu")

This allows Heretic to run on systems without CUDA support by using device_map = "cpu" and quantization = "none".

Closes p-e-w#12
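The conditional import with graceful fallback presumably follows the standard pattern below (a sketch; the flag and function names are illustrative, not necessarily the actual implementation):

```python
try:
    import bitsandbytes  # noqa: F401  (only checking availability)
    HAS_BITSANDBYTES = True
except ImportError:
    HAS_BITSANDBYTES = False


def check_quantization(quantization: str) -> None:
    # Fail with a clear message instead of a raw ImportError when
    # quantization is requested but bitsandbytes is missing.
    if quantization != "none" and not HAS_BITSANDBYTES:
        raise RuntimeError(
            "Quantization requires bitsandbytes, which is not installed "
            '(unavailable on this platform). Use quantization = "none".'
        )
```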
I have an older Mac with an Intel processor on which Heretic was not working; this PR makes slight pyproject.toml changes to pin appropriately older versions of the dependencies, enabling non-GPU use cases (mainly for testing).