LLaMA-Factory, a project providing a WebUI for fine-tuning over 100 large language models, has seen a notable increase in user-reported issues related to model fine-tuning and memory management, while maintaining a high level of development activity.
Recent issues and pull requests (PRs) reveal a focus on addressing technical challenges such as out-of-memory (OOM) errors during training and inference, particularly with advanced configurations like LoRA and DeepSpeed. The development team is actively working on enhancing hardware compatibility and improving user experience through new templates and documentation updates.
Developer | Branches | PRs | Commits | Files | Changes
---|---|---|---|---|---
hoshi-hiyouga | 1 | 0/0/0 | 33 | 52 | 864
이루리 | 1 | 1/1/0 | 1 | 2 | 452
Иван | 1 | 1/1/0 | 1 | 4 | 132
moontidef | 1 | 2/2/0 | 3 | 9 | 64
Richard Wen | 1 | 2/1/0 | 2 | 4 | 26
codingma | 1 | 3/3/0 | 9 | 7 | 21
khazzz1c | 1 | 1/1/1 | 1 | 2 | 5
liudan | 1 | 0/0/0 | 1 | 2 | 4
Liuww | 1 | 1/1/0 | 1 | 1 | 2
None (noiji) | 0 | 1/0/0 | 0 | 0 | 0
piamo (piamo) | 0 | 0/1/0 | 0 | 0 | 0
None (Zxilly) | 0 | 1/0/0 | 0 | 0 | 0
Sangchun Ha (Patrick) (upskyy) | 0 | 1/0/1 | 0 | 0 | 0
zzc (zzc0430) | 0 | 1/0/0 | 0 | 0 | 0
kang sheng (techkang) | 0 | 1/0/1 | 0 | 0 | 0
Huiyu Chen (chenhuiyu) | 0 | 1/0/0 | 0 | 0 | 0
Ikko Eltociear Ashimine (eltociear) | 0 | 1/0/0 | 0 | 0 | 0
Uminosachi (Uminosachi) | 0 | 1/0/0 | 0 | 0 | 0
None (liu-zichen) | 0 | 1/0/0 | 0 | 0 | 0
Ricardo (Ricardo-L-C) | 0 | 1/0/0 | 0 | 0 | 0
Coding Steven (Truecodeman) | 0 | 1/0/0 | 0 | 0 | 0
WeepingDogel (WeepingDogel) | 0 | 1/0/1 | 0 | 0 | 0
None (LDLINGLINGLING) | 0 | 1/1/0 | 0 | 0 | 0
None (huang-yu-sheng) | 0 | 1/0/1 | 0 | 0 | 0
PRs: counts of PRs created by that developer that were opened/merged/closed-unmerged during the period
Timespan | Opened | Closed | Comments | Labeled | Milestones |
---|---|---|---|---|---|
7 Days | 52 | 14 | 28 | 1 | 1 |
30 Days | 260 | 191 | 374 | 1 | 1 |
90 Days | 311 | 200 | 476 | 1 | 1 |
All Time | 4587 | 4461 | - | - | - |
Like all software activity quantification, these numbers are imperfect but sometimes useful. Comments, Labels, and Milestones refer to those issues opened in the timespan in question.
The LLaMA-Factory project currently has 126 open issues, with recent activity indicating a surge in user inquiries and bug reports. Notably, many issues are related to model fine-tuning, inference errors, and memory management challenges, particularly with large models. A recurring theme is the struggle with out-of-memory (OOM) errors during training and inference, especially when using advanced configurations like LoRA and DeepSpeed.
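Because so many of these OOM reports involve LoRA together with DeepSpeed, it is worth illustrating the memory-relevant knobs in play. The sketch below uses the transformers and peft APIs directly; the hyperparameter values and the `ds_z3_config.json` filename are illustrative assumptions, not project recommendations (LLaMA-Factory itself configures these through its own YAML/CLI interface):

```python
from peft import LoraConfig
from transformers import TrainingArguments

# Adapter config: a small rank keeps the trainable-parameter and
# optimizer-state footprint low. Values here are illustrative only.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=1,   # shrink the per-step batch...
    gradient_accumulation_steps=16,  # ...and recover effective batch size
    gradient_checkpointing=True,     # recompute activations to save VRAM
    bf16=True,                       # half-precision activations and grads
    deepspeed="ds_z3_config.json",   # hypothetical ZeRO-3 config path
)
```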
Several issues exhibit similar characteristics, such as users encountering problems with specific model configurations or training parameters that lead to unexpected behavior. For instance, issues related to the integration of new models or methods often highlight the need for better documentation or examples.
Here are some of the most recently created and updated issues:
Issue #5206: Can vLLM deployment be supported without merging the base and LoRA weights, so that both the base model and the LoRA adapter can be served?
Issue #5205: What causes the following behavior after launching the WebUI? It worked once before but has failed ever since.
Issue #5204: Fine-tuned deepseekcoder-33b-instruct does not stop generating; the deepseekcoder instruction template was used during fine-tuning.
Issue #5203: Error when fine-tuning the LLaVA multimodal model with DPO + LoRA.
Issue #5202: Error when fine-tuning the LLaVA multimodal model with KTO + LoRA.
Issue #5199: Can't do inference on NPU, using the untuned Qwen2-7B-Instruct model
Issue #5195: save_steps is always 5
Issue #5194: What is the use of the config train on prompt, can anyone give me the docs of this extra config?
Issue #5190: Could the optional dependencies be bundled directly into the package?
Issue #5189: LLaMA Factory, version 0.8.4.dev0 on Ascend910B failed for missing Op OnesLike
These issues reflect a mix of user requests for feature enhancements and reports of technical difficulties encountered during model training and deployment.
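Issue #5194 asks what the `train on prompt` option does. In supervised fine-tuning pipelines of this kind, such a flag typically controls whether the prompt tokens contribute to the loss: by default they are masked with the ignore index -100 so that only response tokens are trained on. A minimal sketch of that label-building logic (function and variable names here are illustrative, not LLaMA-Factory's own):

```python
IGNORE_INDEX = -100  # positions with this label are skipped by CrossEntropyLoss

def build_labels(prompt_ids, response_ids, train_on_prompt=False):
    """Concatenate prompt and response, optionally masking the prompt."""
    input_ids = prompt_ids + response_ids
    if train_on_prompt:
        labels = list(input_ids)  # compute loss on every token
    else:
        labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels

# Example: only the response tokens keep real labels by default.
print(build_labels([1, 2, 3], [7, 8]))
# ([1, 2, 3, 7, 8], [-100, -100, -100, 7, 8])
```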
This analysis highlights both the challenges faced by users and the ongoing engagement within the community to address these concerns through collaborative problem-solving and feature requests.
The LLaMA-Factory project has a total of 18 open pull requests (PRs) currently under review. These PRs focus on enhancing model compatibility, improving performance, and adding new features to the framework.
PR #5193: _is_bf16_available judgment supports npu. This PR extends the `is_bf16_available` check to include NPU devices, which is crucial for users working with Ascend hardware, and ensures the framework correctly identifies device capabilities for bf16 support.
PR #5188: fix: report correct device count for intel xpu
PR #5185: Add SailorLLM template
PR #5170: Avoid casting model params to float32 with unsloth
PR #5163: fix lr not change
PR #5156: fix Llama-template's system prompt bug
PR #5118: Support MistralV2 Format
PR #4733: merge easycontext
PR #5019: overwrite training_step for CustomDPOTrainer to clear cuda cache every train step (a sketch of this pattern follows the list)
PR #4957: docs: add Japanese README
Additional PRs (#4877 to #1624) focus on various enhancements including metric updates, documentation improvements, and feature additions that support multiple models and functionalities within the LLaMA-Factory ecosystem.
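PR #5019's title describes overriding the trainer's `training_step` so the CUDA caching allocator is flushed after every step. A minimal sketch of that pattern, subclassing `transformers.Trainer` rather than the project's actual `CustomDPOTrainer`:

```python
import torch
from transformers import Trainer

class CacheClearingTrainer(Trainer):
    """Sketch of PR #5019's pattern: flush the CUDA cache each step."""

    def training_step(self, *args, **kwargs):
        loss = super().training_step(*args, **kwargs)
        # Releasing cached allocator blocks trades a little throughput for
        # headroom against fragmentation-driven OOMs on long runs.
        if torch.cuda.is_available():
            torch.cuda.empty_cache()
        return loss
```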
The current set of open pull requests reflects several key themes in the ongoing development of the LLaMA-Factory project:
A significant number of PRs focus on improving compatibility with various hardware backends (e.g., NPU and Intel XPU). For example, PR #5193 extends bf16 support detection to NPU devices, while PR #5188 ensures an accurate device count is reported for Intel XPUs. These changes let users leverage different accelerators effectively, which matters in environments where the available hardware varies significantly.
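Neither diff is reproduced here, but the shape of such backend checks is straightforward. Below is a hedged sketch of bf16 detection and device counting across CUDA, Ascend NPU, and Intel XPU; note that `torch.npu` only exists after the `torch_npu` plugin is imported, `torch.xpu` requires a recent PyTorch or Intel's extension, and treating any available NPU/XPU as bf16-capable is an assumption:

```python
import torch

def is_bf16_available() -> bool:
    """Backend-aware bf16 check (CUDA, Ascend NPU, Intel XPU)."""
    if torch.cuda.is_available():
        return torch.cuda.is_bf16_supported()
    if hasattr(torch, "npu") and torch.npu.is_available():
        return True  # assumes the NPU generation supports bf16
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return True  # assumes the XPU generation supports bf16
    return False

def accelerator_count() -> int:
    """Report the device count for whichever backend is present."""
    if torch.cuda.is_available():
        return torch.cuda.device_count()
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return torch.xpu.device_count()
    return 0
```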
Several PRs aim to improve user experience through better templates and documentation (e.g., PR #5185 introduces a SailorLLM template). The addition of language-specific templates and translations (like the Japanese README in PR #4957) indicates a strong commitment to making the framework accessible to a diverse user base. This is particularly relevant as LLaMA-Factory aims to cater to global audiences with varying linguistic backgrounds.
There is a clear emphasis on addressing existing issues and bugs within the codebase (e.g., PR #5163 fixes learning rate issues). This proactive approach not only improves stability but also builds trust within the community by demonstrating that contributors are attentive to user feedback and willing to resolve problems promptly.
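PR #5163's title ("fix lr not change") does not state the root cause, so the following is only a generic illustration of the most common way a learning rate gets stuck: the scheduler is created but never stepped. Nothing here is taken from the PR's diff:

```python
import torch

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.5)

for step in range(3):
    optimizer.zero_grad()
    loss = model(torch.randn(2, 4)).sum()
    loss.backward()
    optimizer.step()
    scheduler.step()  # omit this line and the lr stays at 0.1 forever
    print(step, optimizer.param_groups[0]["lr"])
```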
The introduction of new features such as support for additional model formats (e.g., MistralV2 in PR #5118) showcases an ongoing effort to expand the capabilities of LLaMA-Factory. This aligns with the project's goal of supporting a wide range of models and methodologies in fine-tuning large language models.
The active discussion around many PRs indicates robust community engagement. Contributors are not only submitting code but also interacting with each other through comments and suggestions, fostering a collaborative environment that can lead to more innovative solutions and improvements over time.
In conclusion, the current pull requests reflect a dynamic development process characterized by enhancements in compatibility, user experience, bug fixes, and feature expansion, all of which contribute to the robustness and usability of LLaMA-Factory as a leading framework for fine-tuning large language models.
codemayq (codingma)
hiyouga (hoshi-hiyouga)
YeQiuO (Richard Wen)
relic-yuexi (moontidef)
Eruly (이루리)
liudan
HardAndHeavy (Иван)
khazic
liuwwang