
feat: vulkan gpu acceleration#15

Open
structwafel wants to merge 7 commits into sevos:main from structwafel:feature/vulkan-gpu-acceleration

Conversation

@structwafel
Contributor

Summary

This PR adds the ability to build a binary with an extra feature: whisper/vulkan.
Using Vulkan gives AMD cards much faster STT, which makes larger models practical.

I tried using hipBLAS, but it had to compile/preprocess the model on each run; Vulkan starts instantly.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Performance improvement
  • Code refactoring

Changes Made

  • optional build feature ["vulkan"], which adds whisper-rs/vulkan for GPU acceleration
  • WHISPER_USE_GPU optional env variable to enable GPU support
  • WHISPER_GPU_DEVICE for selecting which GPU whisper-rs should use
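A minimal sketch of how the two environment variables might be read (the variable names are from this PR; the parsing rules shown are illustrative assumptions, not the crate's actual logic):

```rust
use std::env;

/// Parse the two env vars added in this PR.
/// NOTE: the accepted truthy values and the default device index 0
/// are assumptions for illustration, not the PR's exact behavior.
fn parse_gpu_settings(use_gpu: Option<&str>, device: Option<&str>) -> (bool, i32) {
    let use_gpu = use_gpu
        .map(|v| matches!(v.to_ascii_lowercase().as_str(), "1" | "true" | "yes"))
        .unwrap_or(false);
    let device = device.and_then(|v| v.parse().ok()).unwrap_or(0);
    (use_gpu, device)
}

fn main() {
    let (use_gpu, device) = parse_gpu_settings(
        env::var("WHISPER_USE_GPU").ok().as_deref(),
        env::var("WHISPER_GPU_DEVICE").ok().as_deref(),
    );
    println!("use_gpu={use_gpu}, device={device}");
}
```

Keeping the parsing in a pure function like this makes it easy to unit-test without mutating process-wide environment state.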

Testing

  • Tests pass locally with cargo test
  • Code follows the project's style guidelines (cargo fmt and cargo clippy)
  • Self-review of the code has been performed
  • Code has been tested manually (if applicable)

Additional Notes

Making the CPU-only binary carry the extra whisper-rs feature seemed unnecessary: on my machine the binary grows from 6 MB to 28 MB, which is why it is behind the feature flag.

The question then becomes whether we should add a second binary target to Cargo.toml, or keep the same binary name waystt for the Vulkan-enabled build.
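For the second-binary option, one possible Cargo.toml sketch (the waystt-vulkan name and paths are hypothetical, not what this PR ships; required-features is a standard Cargo key that skips the target unless the feature is enabled):

```toml
[features]
default = []
vulkan = ["whisper-rs/vulkan"]

[[bin]]
name = "waystt"
path = "src/main.rs"

# Hypothetical second target, only built with `cargo build --features vulkan`
[[bin]]
name = "waystt-vulkan"
path = "src/main.rs"
required-features = ["vulkan"]
```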

aur & CI/CD

I have not made any modifications for the AUR and CI/CD yet. If the feature is accepted, I will make the necessary changes and additions (a waystt-vulkan-bin directory in /aur).

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@structwafel mentioned this pull request on Sep 24, 2025
@structwafel
Contributor Author

Perhaps the added bloat is small enough that we should just integrate everything? I'm not sure how big the CUDA integration would be, or whether size is even a problem, since the models are much, much bigger than the program...

So perhaps enable the cuda/vulkan features by default in case people want to use them, and add a new GPU_BACKEND env variable for deciding which backend to use?

@sevos
Owner

sevos commented Oct 6, 2025

I would not worry about the binary size these days. It would be great if someone could just pick the backend via env var.

@structwafel
Contributor Author

I would not worry about the binary size these days. It would be great if someone could just pick the backend via env var.

That would work, until you also want the whisper-rs/cuda feature. From my understanding, whisper-rs can only enable one backend feature at a time; enabling both breaks the CMake build.

So I'm not actually sure what the best approach is. Two binaries, waystt-vulkan and waystt-cuda, plus a WHISPER_BACKEND env variable for choosing cpu/gpu?

- Remove CUDA backend support (Vulkan works on both AMD and NVIDIA)
- Add Dockerfile.build-vulkan for CI builds using LunarG Vulkan SDK
- Update release workflow to build both CPU-only and Vulkan binaries
- Simplify configuration: WHISPER_BACKEND now supports cpu/vulkan only
@structwafel force-pushed the feature/vulkan-gpu-acceleration branch from 20e7e4e to e206f73 on December 8, 2025
- Enable vulkan feature by default for GPU acceleration
- Quick check CI uses --no-default-features (Vulkan tested in release)
- Simplified release to single binary with GPU support
- Default features = [] (CPU-only, compiles everywhere)
- Release binary built with Vulkan via Dockerfile
- CI checks/tests work without Vulkan SDK
@structwafel
Contributor Author

@sevos
Adding CUDA support seems to be a bit more awkward: CUDA support is mostly meant for a "native" build target, so a prebuilt binary is not the way to go, and I can't really test the CUDA binaries anyway.

But Vulkan should also be good enough for NVIDIA users.

So currently the artifact is built in a Docker container with Vulkan enabled. The user can then decide in the config whether to run with Vulkan or not.

Let me know if this seems right.
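The WHISPER_BACKEND selection described above could look roughly like this (the variable name and the cpu/vulkan values are from this PR; the enum and error handling are illustrative assumptions):

```rust
/// Backends supported after the CUDA removal, per this PR: cpu and vulkan.
#[derive(Debug, PartialEq)]
enum Backend {
    Cpu,
    Vulkan,
}

/// Hypothetical parser for WHISPER_BACKEND. Defaulting to Cpu when the
/// variable is unset is an assumption for illustration.
fn parse_backend(value: Option<&str>) -> Result<Backend, String> {
    match value.map(str::to_ascii_lowercase).as_deref() {
        None | Some("cpu") => Ok(Backend::Cpu),
        Some("vulkan") => Ok(Backend::Vulkan),
        Some(other) => Err(format!("unknown WHISPER_BACKEND: {other}")),
    }
}

fn main() {
    let backend = parse_backend(std::env::var("WHISPER_BACKEND").ok().as_deref());
    println!("{backend:?}");
}
```

Rejecting unknown values with an error (rather than silently falling back to CPU) makes misconfiguration visible to the user.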

@juddey

juddey commented Jan 24, 2026

Just to note that I have installed @structwafel's code and it's working well for me.
