MiniCPM is an open-source project by OpenBMB that develops small-scale large language models for edge devices. Its flagship 2.7B parameter model competes with much larger LLMs while enabling real-time inference on mobile phones.
Recent development has focused on expanding deployment options, enhancing fine-tuning capabilities, and improving quantization techniques. The project has seen consistent activity, with notable additions including QLoRA training, AutoAWQ quantization support, and integration with frameworks like LangChain and LLaMA Factory.
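As context for what running MiniCPM typically looks like, here is a minimal sketch of loading a checkpoint through Hugging Face transformers. The checkpoint id and decoding settings are illustrative assumptions, not details taken from this report.

```python
# Minimal sketch: load a MiniCPM checkpoint with Hugging Face transformers.
# The model id below is an assumed example, not taken from this report.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openbmb/MiniCPM-2B-sft-bf16"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # MiniCPM ships custom modeling code
    device_map="auto",
)

inputs = tokenizer("Why run LLMs on edge devices?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```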
Issues and PRs indicate a strong focus on deployment and performance optimization. Users have reported challenges deploying MiniCPM on various platforms, particularly mobile devices (e.g., #149, #104). The team has responded by adding support for frameworks like Ollama, FastLLM, and PowerInfer (PRs #145, #79, #166).
Recent development team activity: LDLINGLINGLING, cyx2000, zh-zheng, ywfang, and SwordFaith.
The project remains competitive with much larger models while enabling edge deployment, demonstrating the potential of efficient small-scale LLMs.
Recent development has heavily focused on quantization and deployment optimization, suggesting a strong commitment to edge AI applications.
The addition of multi-modal (MiniCPM-V) and mixture-of-experts (MiniCPM-MoE-8x2B) variants indicates exploration of advanced model architectures within the small-scale LLM paradigm.
While the project shows active development, the centralized review process (most PRs merged by LDLINGLINGLING) may limit long-term community engagement.
Recurring deployment issues across platforms suggest a need for more robust cross-platform support and documentation.
| Timespan | Opened | Closed | Comments | Labeled | Milestones |
| --- | --- | --- | --- | --- | --- |
| 7 Days | 4 | 0 | 3 | 0 | 1 |
| 30 Days | 12 | 9 | 14 | 3 | 1 |
| 90 Days | 39 | 47 | 61 | 13 | 1 |
| All Time | 148 | 131 | - | - | - |
As with any quantification of software activity, these numbers are imperfect but sometimes useful. The Comments, Labeled, and Milestones counts refer to issues opened within the timespan in question.
| Developer | Branches | PRs | Commits | Files | Changes |
| --- | --- | --- | --- | --- | --- |
| LDLINGLINGLING | 1 | 4/4/0 | 6 | 4 | 62 |
The PRs column counts pull requests created by that developer, shown as opened/merged/closed-unmerged during the period.
Recent Activity Analysis:
The project has seen steady issue activity over the past few months, with 17 open issues currently. Recent issues focus on topics like model deployment, inference errors, and requests for additional features or clarifications.
Some notable issues and themes:
Deployment challenges: Several users have reported issues deploying MiniCPM on different platforms, especially mobile/edge devices. For example:
Issue #188 reports errors when trying to run multi-modal inference on MiniCPM-V.
Reports like these suggest ongoing work is needed to improve cross-platform deployment stability.
Inference and performance: A few issues relate to unexpected inference results or performance:
Issue #163 reports unexpectedly long inference times on a GPU.
Issue #143 describes poor code generation capabilities compared to reported benchmark results.
Feature requests: There are several requests for additional features or model variants:
Issue #91 asks about releasing base model weights (pre-fine-tuning).
Issue #60 suggests adding support for the fastllm inference framework.
Documentation and examples: Some issues request more detailed documentation or examples:
Issue #66 asks for examples of loading fine-tuned LoRA weights.
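Since issue #66 asks specifically about loading fine-tuned LoRA weights, a minimal sketch of the usual peft pattern follows; the base checkpoint id and adapter path are placeholders, not values from the issue.

```python
# Hedged sketch of the pattern issue #66 asks about: attaching fine-tuned
# LoRA weights to a base model with peft. Paths below are placeholders.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "openbmb/MiniCPM-2B-sft-bf16"   # assumed base checkpoint
adapter_path = "./minicpm-lora-adapter"   # hypothetical local adapter directory

tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, trust_remote_code=True, device_map="auto"
)

# Attach the adapter; merge_and_unload() folds the LoRA deltas into the base
# weights so the merged model can be served without peft at inference time.
model = PeftModel.from_pretrained(base, adapter_path)
model = model.merge_and_unload()
```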
Overall, the issues suggest active community engagement with the project, with users attempting deployments across various platforms. The maintainers appear responsive, often providing workarounds or explanations. However, some recurring deployment and inference challenges may warrant further investigation or documentation improvements.
Issue Details:
Most recently created:
#188: "[Bad Case]: Multi-modal MiniCPM-V inference error" (open, created 0 days ago)
#187: "[Bad Case]: Multi-modal MiniCPM-V 2.0 transformers inference error" (open, created 2 days ago)
Most recently updated:
#188: "[Bad Case]: Multi-modal MiniCPM-V inference error" (open, updated 0 days ago)
#187: "[Bad Case]: Multi-modal MiniCPM-V 2.0 transformers inference error" (open, updated 0 days ago)
These recent issues both relate to inference errors with the multi-modal MiniCPM-V model, suggesting this may be an area requiring attention from the development team.
This report analyzes 31 closed pull requests for the OpenBMB/MiniCPM repository, which contains an open-source large language model designed for edge devices.
#183: Added tutorial entry points for MiniCPM in README files (16 days ago)
#180: Added xtuner open source community link (18 days ago)
#177: Added LLaMA-Factory navigation to homepage (18 days ago)
#176: Added QLoRA training method (21 days ago)
#172: Added Langchain demo for multi-file RAG on GPUs with <6GB VRAM (31 days ago)
#170, #169: Added quick navigation, quantization, and LLaMA-Factory content to README (32 days ago)
#166: Added PowerInfer deployment example for MiniCPM-S-1B model (39 days ago)
#111: Added MiniCPM-V support in Hugging Face demo (43 days ago)
#162: Fixed two bugs in MLX code (50 days ago)
#161: Added LLaMA-Factory fine-tuning examples (50 days ago)
#157: Added AutoAWQ support for MiniCPM (52 days ago)
#156: Fixed user token issues for different model sizes (53 days ago)
#145: Added Ollama support for MiniCPM-1B (53 days ago)
#122: Added OpenAI API support (56 days ago)
#106: Added bf16 and fp16 settings for fine-tuning (114 days ago)
#110: Added MLX inference for Mac (127 days ago)
#79: Added FastLLM support (168 days ago)
Earlier PRs (>170 days ago) mainly involved documentation updates, bug fixes, and minor feature additions.
The pull requests for the MiniCPM project demonstrate a consistent focus on improving accessibility, performance, and deployment options for the model. Several key themes emerge from this analysis:
Expanding Deployment Options: Many PRs focused on adding support for various deployment frameworks and platforms. This includes Ollama (#145), FastLLM (#79), MLX for Mac (#110), and PowerInfer (#166). These additions significantly broaden the model's usability across different environments and hardware.
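As one concrete example of what such a deployment looks like in practice, the sketch below queries a locally running Ollama server over its REST API. It assumes the server is on Ollama's default port and that a MiniCPM model tag has already been pulled or created; the tag name is a placeholder.

```python
# Sketch: query a local Ollama server. Assumes `ollama serve` is running on
# the default port; the model tag is a hypothetical placeholder.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "minicpm",  # hypothetical tag; use whatever tag you created
        "prompt": "Summarize MiniCPM in one sentence.",
        "stream": False,     # return a single JSON object instead of a stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```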
Enhancing Fine-tuning Capabilities: PRs #176 and #161 introduced new fine-tuning methods like QLoRA and LLaMA-Factory integration. This demonstrates a commitment to improving the model's adaptability for specific tasks and domains.
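For readers unfamiliar with the technique, the following is a hedged sketch of the standard QLoRA recipe (4-bit NF4 quantization via bitsandbytes plus LoRA adapters via peft), not the project's own training script. The checkpoint id and the LoRA target module names are assumptions; module names vary by architecture.

```python
# Hedged sketch of a generic QLoRA setup, not MiniCPM's own training code.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "openbmb/MiniCPM-2B-sft-bf16",  # assumed checkpoint
    quantization_config=bnb_config,
    trust_remote_code=True,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed names
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapters train; the 4-bit base is frozen

# bf16 mixed precision, echoing the bf16/fp16 options added in PR #106;
# pass these to a Trainer together with your dataset.
args = TrainingArguments(output_dir="./qlora-out", bf16=True, per_device_train_batch_size=1)
```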
Quantization and Efficiency: Several PRs (#157, #169) addressed quantization techniques like AutoAWQ and bitsandbytes (bnb). This aligns with MiniCPM's goal of efficient deployment on edge devices.
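To illustrate the workflow that PR #157 enables, here is a hedged sketch using AutoAWQ's commonly shown quantize/save calls; the paths are placeholders and the quant_config values are the library's usual defaults rather than settings taken from the PR.

```python
# Hedged sketch of AWQ quantization with the AutoAWQ library.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "openbmb/MiniCPM-2B-sft-bf16"  # assumed source checkpoint
quant_path = "./minicpm-awq"                # hypothetical output directory

model = AutoAWQForCausalLM.from_pretrained(model_path, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# 4-bit weights, group size 128, GEMM kernels: the usual AWQ configuration.
model.quantize(tokenizer, quant_config={
    "zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM",
})
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```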
Documentation and Accessibility: Many PRs (#183, #180, #177) focused on improving documentation, adding tutorials, and enhancing navigation in the README files. This indicates a strong emphasis on making the project more accessible to users and contributors.
Bug Fixes and Optimizations: PRs like #162 and #156 addressed specific bugs and optimized the code for different model sizes, showing ongoing maintenance and refinement of the codebase.
Expanding Ecosystem Integration: The addition of OpenAI API support (#122) and Langchain demo (#172) shows efforts to integrate MiniCPM with popular AI development ecosystems.
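An OpenAI-compatible endpoint (per PR #122) means existing client code can talk to MiniCPM unchanged. Below is a sketch using the official openai Python client; the base URL, port, and served-model name are assumptions to adapt to whatever the demo server exposes.

```python
# Sketch: call an OpenAI-compatible MiniCPM endpoint with the openai client.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")  # assumed URL

completion = client.chat.completions.create(
    model="MiniCPM-2B",  # hypothetical served-model name
    messages=[{"role": "user", "content": "Explain AWQ quantization briefly."}],
    temperature=0.7,
)
print(completion.choices[0].message.content)
```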
Multi-modal and Specialized Versions: PR #111 added Hugging Face demo support for the multi-modal MiniCPM-V, indicating development of specialized variants of the model.
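For the multi-modal variant, inference runs through the model's custom code via trust_remote_code. The sketch below is heavily hedged: the checkpoint id and especially the chat() signature are assumptions based on common usage of MiniCPM-V model cards, so consult the model card for the exact interface.

```python
# Heavily hedged sketch of multi-modal inference with MiniCPM-V; the chat()
# signature is an assumption -- verify against the model card.
import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

model_id = "openbmb/MiniCPM-V"  # assumed checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_id, trust_remote_code=True, torch_dtype=torch.bfloat16
).eval().cuda()

image = Image.open("example.jpg").convert("RGB")
msgs = [{"role": "user", "content": "Describe this image."}]
answer, context, _ = model.chat(
    image=image, msgs=msgs, context=None, tokenizer=tokenizer, sampling=True
)
print(answer)
```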
The frequency and nature of these pull requests suggest an active development cycle with contributions from multiple community members. The project appears to be evolving rapidly, with a clear focus on making the model more versatile, efficient, and accessible to a wide range of users and deployment scenarios.
However, it's worth noting that most PRs are being merged by a single user (LDLINGLINGLING), which might indicate a centralized review process. Encouraging more diverse reviewer participation could potentially benefit the project's long-term sustainability and community engagement.
Overall, the pull requests reflect MiniCPM's positioning as a competitive, efficient large language model suitable for edge devices, with ongoing efforts to expand its capabilities and ease of use across various platforms and use cases.
Contributors with recent activity include LDLINGLINGLING, cyx2000, Zhi Zheng (zh-zheng), ywfang, Xiang Long (SwordFaith), zRzRzRzRzRzRzR, and DingDing (ShengdingHu).
Active development: The repository shows consistent activity over the past 30 days, with frequent updates and improvements.
Focus on efficiency and deployment: Recent work has centered on quantization, edge deployment, and support for efficient inference frameworks like MLX and PowerInfer (see the mlx-lm sketch after this list).
Expanding model variants: The team has been adding support for new MiniCPM variants, including MoE and 128k context length versions.
Improving documentation: There's a consistent effort to keep documentation up-to-date in both English and Chinese.
Integration with popular frameworks: Recent work has focused on integrating MiniCPM with frameworks like LangChain, LLaMA Factory, and OpenAI API.
Performance optimization: The team is actively working on quantization techniques (QLoRA, bitsandbytes, AutoAWQ) to improve model efficiency.
Community engagement: The addition of tutorials, examples, and improved navigation suggests an effort to make the project more accessible to users and potential contributors.
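As referenced in the efficiency point above, a minimal mlx-lm sketch for Apple silicon follows; the model path is a placeholder, and mlx-lm expects an MLX-converted checkpoint.

```python
# Hedged sketch: run a converted MiniCPM checkpoint on Apple silicon with mlx-lm.
from mlx_lm import load, generate

model, tokenizer = load("path/to/minicpm-mlx")  # hypothetical converted weights
text = generate(model, tokenizer, prompt="Why run LLMs on-device?", max_tokens=100)
print(text)
```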