
Conversation


Titus-von-Koeller and others added 30 commits May 16, 2025 08:41
* Test g5g runner

* Switch L4 to L40S runner; swap GitHub Linux T4 runner for AWS g4dn

* Run tests on last 2 pytorch stable releases
* General cleanup & test improvements

* Tests: WA numpy 2 compat issue for torch<2.3

* Tests: update aarch64 cpu min torch version
* Add torch.compile tests

* Tests: WA aarch64 CPU regressions for torch 2.6.0; add Windows torch==2.7.0+cu118 test config

* Tests: skip torch.compile for cuda on windows
* Start cleaning up docs

* Remove page

* Minor update

* correction

* Minor doc revisions

* Update installation.mdx

* Update _toctree.yml
* enable ipex

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix cpu 8bit quantization

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix int8 and nf4 cpu inference

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
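The int8 CPU-inference fixes above revolve around absmax quantization. As a rough illustration only (this is not the bitsandbytes implementation, and the function names here are made up), row-wise int8 quantization can be sketched like this:

```python
# Hypothetical sketch of row-wise absmax int8 quantization, the basic idea
# behind 8-bit inference. Not the actual bitsandbytes kernels.
def quantize_int8(row):
    # Scale so the largest magnitude maps to 127.
    absmax = max(abs(v) for v in row) or 1.0
    scale = absmax / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in row]
    return q, scale

def dequantize_int8(q, scale):
    # Recover an approximation of the original values.
    return [v * scale for v in q]

row = [0.5, -1.25, 2.0, -0.125]
q, scale = quantize_int8(row)
approx = dequantize_int8(q, scale)
```

The round-trip error per element is bounded by the scale, which is why per-row (or per-block) scaling matters for accuracy.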

* add cpu fp4 and rem

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix dequantize nf4 xpu

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix ipex op

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix dequantize nf4 name

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix dequantize nf4 ipex

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix matmul8bitfp

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* enable cpu tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix format

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix quantize blockwise output shape

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix quant_storage bf16 and gemv cpu

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix cpu tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix xpu tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix lib

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* skip xpu dequantize blockwise op check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix matmul8bit

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* skip unused function tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix matmul8bit fp

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* check ipex before MatMul8bitFp

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
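The "check ipex before MatMul8bitFp" commit is an optional-dependency guard: only dispatch to the ipex path when the package imports. A minimal sketch of that pattern, with illustrative names (`HAS_IPEX`, `pick_matmul_path` are not the library's actual identifiers):

```python
# Hypothetical sketch: detect intel_extension_for_pytorch (ipex) before
# dispatching to an ipex-specific kernel. Names here are illustrative.
try:
    import intel_extension_for_pytorch  # noqa: F401  (optional dependency)
    HAS_IPEX = True
except ImportError:
    HAS_IPEX = False

def pick_matmul_path():
    # Fall back to the default int8 path when ipex is unavailable.
    return "MatMul8bitFp (ipex)" if HAS_IPEX else "default int8 matmul"
```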

* update ipex install guide

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update install guide

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix error log

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix error log

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* update comment

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* move torch op to default

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* revert ipex check

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix code table device

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix code table device

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix xpu ops

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* Tests: xfail opcheck for 4bit quantization with floating storage dtypes

* Tests: skip test_gemv_eye_4bit on CPU with bf16 when not supported by torch
* Tests: add linux x64 cpu+ipex to nightly CI workflow

* typo

* Tests: guard linear8bit compile test for ipex cpu issue
* Deprecation cleanup: remove histogram_scatter_add_2d

* Deprecation cleanup: vectorwise_mm_dequant

* Deprecation cleanup: vectorwise_quant

* Remove unused test

* Optimizer test cleanup

* Deprecations: remove estimate_quantiles, create_quantile_map

* Move deprecated test
* support hpu backend in main branch

* Update bitsandbytes/backends/hpu/ops.py

updates the assertion message

Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>

* Update bitsandbytes/backends/hpu/ops.py

Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>

* Update ops.py

Fix lint issue

* Update ops.py

---------

Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
matthewdouglas and others added 18 commits June 6, 2025 16:19
…s-foundation#1629)

* [xpu/triton] Add Triton dequantization kernel

This PR adds an XPU backend and a Triton kernel for dequantizing the nf4 dtype.
Triton is an optional import.
Tests:
	tests/test_functional.py::TestQuantize4BitFunctional (supported nf4/fp4 cases)
	tests/test_functional.py::Test8BitBlockwiseQuantizeFunctional (implements quantize_blockwise with a binary search that runs faster on XPU)
	tests/test_linear4bit.py
Signed-off-by: Dmitrii Makarenko <dmitrii.makarenko@intel.com>
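The binary-search idea mentioned for quantize_blockwise can be illustrated in pure Python: each value in a block is normalized by the block's absmax and mapped to the nearest entry of a sorted code table via `bisect`. This is a hedged sketch only; the code-table values below are made up, and the actual kernel is a Triton implementation for XPU:

```python
import bisect

# Illustrative sorted 16-entry code table (NOT the real NF4 constants).
CODE = sorted([-1.0, -0.7, -0.53, -0.39, -0.28, -0.18, -0.09, 0.0,
               0.08, 0.16, 0.25, 0.34, 0.44, 0.56, 0.72, 1.0])

def quantize_block(values):
    # Normalize by the block absmax, then binary-search the code table.
    absmax = max(abs(v) for v in values) or 1.0
    codes = []
    for v in values:
        x = v / absmax
        i = bisect.bisect_left(CODE, x)
        if i == 0:
            codes.append(0)
        elif i == len(CODE):
            codes.append(len(CODE) - 1)
        else:
            # Pick whichever neighbouring code point is closer.
            lo, hi = CODE[i - 1], CODE[i]
            codes.append(i if hi - x < x - lo else i - 1)
    return codes, absmax

def dequantize_block(codes, absmax):
    return [CODE[c] * absmax for c in codes]
```

The binary search makes the per-element lookup O(log 16) instead of a linear scan over the table, which is where the speedup on XPU comes from.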

* align with ipex code

* enable test for ipex

* test_kbit_backprop: skip no longer needed

* remove unused

---------

Signed-off-by: Dmitrii Makarenko <dmitrii.makarenko@intel.com>
* doc fix signature for 8-bit optim

* required changes

* precommit
* Add clang-format rules

* Update clang-format
* Setup XPU CI

* CI: expand XPU matrix

* test

* test

* test

* test

* test

* test

* test

* test

* test

* test

* skip some fp4 tests on hpu

* skip gemv tests on hpu

* test

* Additional test patches for HPU

* HPU test update

* HPU test update

* HPU test update

* HPU test update

* Format
@pnunna93 pnunna93 self-requested a review June 18, 2025 17:15

@pnunna93 pnunna93 left a comment


LGTM

@pnunna93 pnunna93 merged commit 648ecd2 into ROCm:upstream_main_rocm_enabled Jun 18, 2025
2 checks passed