
[pull] master from ggml-org:master #756

Merged

pull[bot] merged 7 commits into LongLeCE:master from ggml-org:master on Jan 8, 2026
Conversation

pull[bot] commented on Jan 8, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

DocShotgun and others added 7 commits on January 8, 2026
* ggml: add env var GGML_OP_OFFLOAD_MIN_BATCH

  Makes the min_batch_size for triggering op offload configurable via an env var, defaulting to the prior hardcoded value of 32.

  * ggml: read GGML_OP_OFFLOAD_MIN_BATCH once and store it in the device context
  * cann: forward declaration of the device context struct
  * cann: move the offload op check after the device context declaration
  * cuda: fix whitespace

  Co-authored-by: Aman Gupta <amangupta052@gmail.com>
* Add template specialization for kernel_mul_mm_id_map0 with ne20=5 to support models using 5 active experts (e.g., VAETKI)
* vendor : update cpp-httplib to 0.30.0
* common : allow custom headers when downloading
* vulkan: optimize ssm_scan

* fix warp vs subgroup naming

  I added an assert to catch further mismatches, and it found several; those are fixed too.
pull[bot] locked and limited conversation to collaborators on Jan 8, 2026
pull[bot] added the ⤵️ pull label on Jan 8, 2026
pull[bot] merged commit 2524c26 into LongLeCE:master on Jan 8, 2026

6 participants