[pull] master from ggml-org:master #878

Merged
pull[bot] merged 4 commits into LongLeCE:master from ggml-org:master on Feb 14, 2026

pull[bot] commented on Feb 14, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

iMilnb and others added 4 commits February 14, 2026 09:47
last_graph is only available without OpenMP, but
ggml_graph_compute_thread() is called in both cases.

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* models : optimizing qwen3next graph

* cont

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* cont : remove redundant q, g chunking

* minor

* minor

* avoid passing masks around

* avoid concats during chunking

* naming + shapes

* update names and use prefix to disable CUDA graphs
* nemotron nano v2 vlm support added

* simplified code; addressed reviews

* pre-downsample position embeddings during GGUF conversion for fixed input size
pull[bot] locked and limited conversation to collaborators on Feb 14, 2026
pull[bot] added the ⤵️ pull label on Feb 14, 2026
pull[bot] merged commit 01d8eaa into LongLeCE:master on Feb 14, 2026
13 checks passed
4 participants