jd-opensource / xllm Public

Fork 140
Star 1k

Code
Issues 74
Pull requests 37
Discussions
Actions
Projects
Security
Insights

Pull requests: jd-opensource/xllm

Labels 15 Milestones 0

37 Open 719 Closed

Pull requests list

feat: support index cache transfer in PD separate deployment scenario.

#895 opened Feb 7, 2026 by sunbaosong

feat: adapt for CANN 8.5 and PyTorch 2.7.1 for npu device.

#891 opened Feb 6, 2026 by haimbb000

feat: add beam search for recommendation in cuda env.

#890 opened Feb 6, 2026 by liujinguang0125

bugfix: fix mrope position tensor shape mismatch in NPU graph mode.

#885 opened Feb 4, 2026 by QwertyJack

2 tasks done

feat: add DeepSeek-V3.2 MTP on NPU.

#879 opened Feb 4, 2026 by edison240121

feat: support mm_data rpc transport.

#876 opened Feb 3, 2026 by wly-115

feat: add VMM submitter APIs for non-blocking vmm::map/unmap.

#874 opened Feb 3, 2026 by shifengmin

feat: add piecewise cudagraph for prefill.

#872 opened Feb 3, 2026 by zhang-minchao

docs: add iluvatar model support list.

#871 opened Feb 3, 2026 by laneeeee

feat: add optimization codes for qwen3 moe

#865 opened Feb 3, 2026 by panxua

bugfix: fix early return in prefetch_from_storage.

#862 opened Feb 2, 2026 by shifengmin

feat: dynamic and scalable multi-model support.

#861 opened Feb 2, 2026 by Clement-Wang26

feat: support eagle3 for qwen3 series.

#859 opened Feb 2, 2026 by RobbieLeung

feat: add kernels that support Qwen3 model on musa device.

#856 opened Feb 2, 2026 by FleckyFelix

feat: support LongCat-Image on cuda device.

#849 opened Jan 31, 2026 by Dragonliu2018

refactor: refactor xllm docs arch for better readability.

#845 opened Jan 31, 2026 by XuZhang99 • Draft

feat: implement rpc interface in APIService for xllm service internal usage.

#837 opened Jan 30, 2026 by weizhehuang0827

bugfix: fix MTP k>1 crash by loading embed_tokens weights

#836 opened Jan 29, 2026 by QwertyJack • Draft

3 tasks done

feat: support QwenImageEditPlus pipeline with embedding infer.

#834 opened Jan 29, 2026 by shan-chen-feng

bugfix: fix KV cache memory leak when prefix cache is enabled.

#829 opened Jan 29, 2026 by QwertyJack • Draft

3 tasks

feat: support mtp graph for mlu graph executor.

#825 opened Jan 28, 2026 by a120092009

bugfix: get a suitable token budget allocated for the sequence when MTP is enabled with overlap.

#823 opened Jan 28, 2026 by RobbieLeung

bugfix: fix missing M-RoPE section in GLM-4V model args.

#822 opened Jan 28, 2026 by wly-115

feat: implement bidirectional remote host to local device kv cache transfer and batch offload.

#812 opened Jan 27, 2026 by Kang-Meng • Draft

refactor: centralize ffmpeg resource cleanup.

#810 opened Jan 27, 2026 by xanecdotex
