[pull] master from ggml-org:master by pull[bot] · Pull Request #705 · LongLeCE/llama.cpp

pull · 2025-12-24T14:42:01Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.4)

* CUDA: experimental native mxfp4 support for blackwell
* optimize load_tiles
* optimize quantize_mxfp4
* cleanup
* first pass review: formatting
* use interleaved layout for mma
* mmq: add assert for size
* use __nv_fp4x4_e2m1
* use iter_k as 512, cleanup
* Use 1200 as blackwell instead of 1000
* address review comments
* mmq: fix stride
* quantize.cu: use reference impl of e8m0 scale
* address review comments
* add 120f-virtual + minor fixes

---------

Co-authored-by: Aman Gupta <aman>

yoka and others added 5 commits December 24, 2025 17:19

docs: Fix typos in SYCL documentation (#18269)

1ce0126

CANN : refactor ACL graph cache (#17752)

ce7a6dc

Move the graph property checking code into methods of LRU cache.

Signed-off-by: Wang Weixuan <wangweixvan@gmail.com>

vulkan: fix command buffer corruption in ggml_backend_vk_event_wait (#18302)

2a9ea20

model : support for LlamaBidirectionalModel architecture (#18220)

54132f1

* model: llama-embed-nemotron
* minor: python lint
* changed arch-name
* templated llm_build_llama to be used for both llama and llama-embed arch

pull bot locked and limited conversation to collaborators Dec 24, 2025

pull bot added the ⤵️ pull label Dec 24, 2025

pull bot merged commit c8a2417 into LongLeCE:master Dec 24, 2025

github-actions bot added the documentation, Nvidia GPU, python, ggml, SYCL, Ascend NPU, Vulkan, and model labels Dec 24, 2025

pull[bot] merged 5 commits into LongLeCE:master from ggml-org:master
