What's Changed
- doc: fix github link by @mwxely in #65
- fix: Fix llava ov batched image padding issue by @kcz358 in #72
- [test] add test for qwen2.5omni, fix qwen2.5omni example by @ngquangtrung57 in #71
- [feat] support BAGEL training with Liger Kernel by @pufanyi in #74
- [docs] Fix BAGEL model packing status in README by @pufanyi in #79
- Enhance MFU reference document introduction by @kcz358 in #82
- add linux uv sync script with automatic platform detection by @oneScotch in #81
- [feat] Qwen3 MoE EP Support by @Jinghao-Guo in #75
- [docs] Add Docker usage instructions to README by @pangyyyyy in #85
- add llada and dream arch examples for dllm training by @JinjieNi in #84
- [feat] Qwen 3 Omni MOE with EP support by @ngquangtrung57 in #88
- [docs] correct doc: diffusion language model by @KemingWu in #89
- Update README.md by @kcz358 in #90
- [fix]: Remove rank == 0 in all makedirs (#93) by @VietCT04 in #94
- [feat] Qwen 3 VL MOE with EP support by @ngquangtrung57 in #92
- [feat] Qwen3 Training support by @yiyexy in #95
- [feat] SP loss better alignment and patch qwen3 vl conv implementation to linear by @kcz358 in #96
- [fix] Handle router logits in Qwen 3 moe and Qwen 3 omni moe for aux loss by @ngquangtrung57 in #98
- [feat] LLaVA-Video Training support by @nssmd in #97
- [feat] Gradient accumulation by @pufanyi in #103
- Remove linear patch for conv3d for now for precision issue by @kcz358 in #105
- [feat] Allow bagel to output logits and logprobs for sde, fix collator padding for padded images by @kcz358 in #109
- [fix] Update Hydra command for multi-node training by @pufanyi in #108
- [fix] Fix some training mismatch in qwen3 vl and rfc parallel logic by @kcz358 in #106
- Add projects using LMMs-Engine to README by @KemingWu in #111
- Fix badge formatting for LongVT project link by @mwxely in #112
- LLaVAOneVision1_5 Support by @Jinghao-Guo in #101
- [feat] Add map style dataset for qwen3 vl by @kcz358 in #115
- [fix] Align better bagel original eval with option to align with flow-grpo sde settings by @kcz358 in #117
- Update section title and project descriptions in README by @mwxely in #118
- [feat] enable freeze submodules by @gathierry in #119
- [feat] Better imports utils for lmms-engine by @kcz358 in #122
- [feat] add EMA (Exponential Moving Average) support for FSDP2 training by @KemingWu in #120
- [fix] relax overwrite_config typing to support non-string config overrides by @KemingWu in #124
- Add Bagel Trainer and fix config, bagel data processor by @KemingWu in #126
- [fix] Applied different rnd seed in bagel so that the noise would be sample… by @kcz358 in #129
- [fix]: use valid labels for SP loss normalization by @kcz358 in #130
New Contributors
- @oneScotch made their first contribution in #81
- @pangyyyyy made their first contribution in #85
- @JinjieNi made their first contribution in #84
- @KemingWu made their first contribution in #89
- @VietCT04 made their first contribution in #94
- @yiyexy made their first contribution in #95
- @nssmd made their first contribution in #97
- @gathierry made their first contribution in #119
Full Changelog: v0.1.2...v0.1.3