MiniCPM-V, a project focused on advanced multimodal large language models for mobile devices, has experienced a slowdown in development activity, with no new commits or pull requests in the past 30 days. The project, known for its high-performance vision-language understanding capabilities, is backed by OpenBMB.
The MiniCPM-V project currently faces 55 open issues, indicating ongoing challenges and active development needs. Recent issues highlight recurring themes such as memory usage during training, LoRA fine-tuning effectiveness, and multi-image input handling. Installation errors and hardware compatibility also present frequent obstacles for users.
LDLINGLINGLING: docs and assets directories
Tianyu Yu (yiranyyu)
Qianyu Chen (qyc-98): finetune directory
tc-mb
Hongji Zhu (iceflame89)
YuzaChongyi
HwwwwwwwH
Cui Junbo (Cuiunbo)
Haoye Zhang (Haoye17)
JamePeng
High Number of Open Issues: With 55 open issues, the project faces significant challenges that may hinder progress if not addressed promptly.
Stagnant Development: No new commits or pull requests in the last month suggest a potential pause in active development efforts.
Documentation Focus: Recent activities heavily emphasize documentation updates, indicating an effort to improve user accessibility and understanding of the project's features.
Community Engagement: Despite being a newer project, MiniCPM-V has quickly gained traction, with 10,782 stars on GitHub reflecting strong community interest.
Hardware Compatibility Concerns: Frequent issues related to installation errors and hardware compatibility highlight ongoing challenges in ensuring smooth deployment across different environments.
Developer | Branches | PRs | Commits | Files | Changes |
---|---|---|---|---|---|
Qianyu Chen | 2 | 1/1/0 | 19 | 8 | 8415 |
Tianyu Yu | 1 | 0/0/0 | 29 | 29 | 4301 |
LDLINGLINGLING | 1 | 1/2/0 | 17 | 10 | 367 |
Alphi | 1 | 2/2/0 | 4 | 3 | 275 |
tc-mb | 1 | 0/0/0 | 7 | 7 | 244 |
YuzaChongyi | 1 | 0/0/0 | 4 | 3 | 87 |
Hongji Zhu | 1 | 0/0/0 | 9 | 4 | 61 |
Cui Junbo | 1 | 0/0/0 | 2 | 1 | 4 |
Haoye Zhang | 1 | 0/0/0 | 1 | 7 | 0 |
None (JamePeng) | 0 | 1/0/0 | 0 | 0 | 0 |
sky (cnsky2016) | 0 | 1/0/0 | 0 | 0 | 0 |
Ikko Eltociear Ashimine (eltociear) | 0 | 1/0/0 | 0 | 0 | 0 |
None (BothSavage) | 0 | 2/0/1 | 0 | 0 | 0 |
Tejas Makode (TejMakode1523) | 0 | 1/0/0 | 0 | 0 | 0 |
Dr. Artificial曾小健 (ArtificialZeng) | 0 | 1/0/0 | 0 | 0 | 0 |
Arpit Pathak (Thepathakarpit) | 0 | 1/0/1 | 0 | 0 | 0 |
PRs: created by that dev and opened/merged/closed-unmerged during the period
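The three-part PRs cell can be read mechanically. The following is a minimal sketch, assuming the cell always holds three non-negative integers separated by slashes; the names `PRCounts` and `parse_pr_cell` are hypothetical and only serve the illustration.

```python
from typing import NamedTuple


class PRCounts(NamedTuple):
    opened: int
    merged: int
    closed_unmerged: int


def parse_pr_cell(cell: str) -> PRCounts:
    """Parse an 'opened/merged/closed-unmerged' cell such as '1/2/0'."""
    parts = cell.strip().split("/")
    if len(parts) != 3:
        raise ValueError(f"expected three '/'-separated counts, got {cell!r}")
    opened, merged, closed_unmerged = (int(p) for p in parts)
    return PRCounts(opened, merged, closed_unmerged)


# Cells copied from the PRs column of the developer table.
cells = {
    "LDLINGLINGLING": "1/2/0",
    "Alphi": "2/2/0",
    "None (BothSavage)": "2/0/1",
}
counts = {dev: parse_pr_cell(cell) for dev, cell in cells.items()}
total_merged = sum(c.merged for c in counts.values())
print(total_merged)
```

Note that the merged count can exceed the opened count (as in 1/2/0), presumably because a PR opened before the period can still be merged during it.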
Timespan | Opened | Closed | Comments | Labeled | Milestones |
---|---|---|---|---|---|
7 Days | 35 | 25 | 57 | 22 | 1 |
14 Days | 95 | 70 | 286 | 62 | 1 |
30 Days | 119 | 95 | 363 | 81 | 1 |
All Time | 421 | 378 | - | - | - |
Like all software activity quantification, these numbers are imperfect but sometimes useful. Comments, Labels, and Milestones refer to those issues opened in the timespan in question.
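As a rough cross-check of the issue table, the sketch below computes the difference between opened and closed issues and the closed-to-opened ratio for each timespan. It assumes the two columns can be compared directly, which is only approximate for the shorter windows, since an issue closed in a window may have been opened earlier. The helper name `summarize` is hypothetical.

```python
# Rows copied from the issue table: (opened, closed) per timespan.
issue_counts = {
    "7 Days": (35, 25),
    "14 Days": (95, 70),
    "30 Days": (119, 95),
    "All Time": (421, 378),
}


def summarize(opened: int, closed: int) -> dict:
    """Return the net change in open issues and the closed/opened ratio."""
    return {
        "net_open": opened - closed,
        "close_ratio": round(closed / opened, 3),
    }


summary = {span: summarize(o, c) for span, (o, c) in issue_counts.items()}
for span, row in summary.items():
    print(f"{span:>8}: net open {row['net_open']:>3}, "
          f"closed/opened {row['close_ratio']:.3f}")
```

The all-time difference, 421 - 378 = 43, matches the 43 open issues cited in the issue summary.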
The MiniCPM-V project has seen a surge in activity, with a total of 43 open issues currently logged on GitHub. Recent submissions include a mix of bug reports, feature requests, and inquiries about model capabilities, particularly focusing on the latest version (2.6) and its performance in various contexts such as fine-tuning and inference.
Notably, there are several recurring themes among the issues, including concerns about memory usage during training, the effectiveness of LoRA fine-tuning, and the model's ability to handle multi-image inputs. Issues related to installation errors and compatibility with different hardware configurations also appear frequently, indicating potential challenges for users trying to deploy the model effectively.
Here are some of the most recent issues created and updated:
Issue #486: [BUG] Data fetch error - typo
Issue #485: [llamacpp] - Is it possible to provide the ability to perform video inference by the server mode?
Issue #483: [BUG] number of image start tokens and image end tokens mismatch
Issue #482: [BUG] Error when running myMinicpmv2.6:latest model directly.
Issue #481: [BUG] Training speed issue.
Issue #480: [BUG] Lack of position_id in MiniCPMVProcessor.
Issue #479: [vllm] Using vllm API server for video inference.
Issue #478: [vllm] KeyError when running with MiniCPM-V-2_6-int4.
Issue #477: [BUG] AVX instruction set error.
Issue #476: [BUG] Multi-card API startup issue.
The issue regarding the mismatch between image start and end tokens (#483) indicates a potential flaw in how the model processes images, which could lead to runtime errors during inference or training.
Two of the ten most recent issues ask about video inference in server deployments (#485 for llama.cpp server mode, #479 for the vLLM API server), suggesting that users are eager for enhanced multimedia processing features within the MiniCPM-V framework.
Training performance remains a concern: #481 reports a training speed problem, in line with the recurring reports about memory usage during training. Issue #480, by contrast, concerns a missing position_id in MiniCPMVProcessor rather than resource usage.
The presence of multiple requests for support with LoRA fine-tuning (#481, #482) highlights a growing interest in efficient training methods that require fewer resources while still achieving high performance.
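To illustrate the kind of mismatch reported in #483, the sketch below checks that image start and end markers in a prompt string are balanced before it is passed on. It is an assumption-laden illustration, not the project's code: the marker strings `<image>` and `</image>` and the function name `count_image_spans` are hypothetical, and the real check may operate on token ids rather than text.

```python
import re


def count_image_spans(text: str,
                      start: str = "<image>",
                      end: str = "</image>") -> int:
    """Count well-formed start/end pairs; raise ValueError on any mismatch.

    The default marker strings are placeholders chosen for this example.
    """
    pattern = re.compile(f"{re.escape(start)}|{re.escape(end)}")
    inside = False
    spans = 0
    for match in pattern.finditer(text):
        if match.group() == start:
            if inside:
                raise ValueError(f"start marker at {match.start()} before previous span closed")
            inside = True
        else:
            if not inside:
                raise ValueError(f"end marker at {match.start()} without a start marker")
            inside = False
            spans += 1
    if inside:
        raise ValueError("last start marker has no matching end marker")
    return spans


print(count_image_spans("Compare <image>a</image> with <image>b</image>."))
```

Failing early with a clear message is usually easier to debug than a shape error deeper in the model, which is the design choice this sketch is meant to show.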
In summary, while MiniCPM-V shows promise with its advanced multimodal capabilities, users are encountering various challenges that need addressing to ensure smoother deployment and optimal performance across different use cases.
The analysis of the pull requests (PRs) from the OpenBMB/MiniCPM-V repository reveals a dynamic development environment focused on enhancing multimodal capabilities, fixing bugs, and improving documentation. The repository currently has 12 open PRs, with a notable emphasis on updates related to the latest version, MiniCPM-V 2.6.
PR #484: Update streamlit implementation for MiniCPM-V 2.6
Created 1 day ago, this PR enhances the Streamlit application to support new multimodal features, including text, images, and video processing. It introduces significant changes to file handling and user interaction.
PR #461: fix mps rely on flash_atten
Created 6 days ago, this PR addresses dependencies related to MPS (Metal Performance Shaders) and flash attention mechanisms, indicating ongoing efforts to optimize performance on Apple hardware.
PR #460: fix
Also created 6 days ago, this PR appears to be a minor fix with limited details provided, reflecting a common practice of addressing small issues as they arise.
PR #435: docs: add Japanese README
Created 10 days ago, this PR expands accessibility by adding a Japanese translation of the README file, demonstrating community engagement and inclusivity.
PR #403: 修复V100无法运行MiniCPM-V-2_6问题 (Fix MiniCPM-V-2_6 failing to run on V100)
Created 12 days ago, this PR addresses compatibility issues with V100 GPUs, showcasing the team's responsiveness to hardware-specific challenges.
PR #383: Fine tuning of MiniCPM-Llama3-V-2_5-int4
Created 13 days ago, this PR focuses on fine-tuning scripts for improved performance in specific model configurations.
PR #281: feat: Added judgment logic to support training with plain text data
Created 62 days ago, this PR introduces logic to handle text-only datasets effectively, addressing previous limitations in data handling.
PR #304: Update requirements.txt for finetuning requirements
Created 53 days ago, this PR updates the requirements file to include additional packages necessary for fine-tuning processes.
PR #301: Clear the torch cuda cache after response
Created 54 days ago, this PR optimizes memory management by clearing CUDA cache after responses to improve performance during model switching.
PR #293: Update inference_on_multiple_gpus.md
Created 58 days ago, this PR enhances documentation regarding multi-GPU inference setups, improving user guidance.
PR #278: fix a bug with web_demo_streamlit_2.5 at text mode
Created 63 days ago, this PR resolves an error in the Streamlit demo when using text mode without image uploads.
PR #36: [Draft] Add minicpmv finetune script
Created 131 days ago, this draft PR proposes a fine-tuning script for MiniCPM-V but has not yet been finalized or merged.
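The idea behind PR #301, clearing the CUDA cache after each response, can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the PR: `release_gpu_memory` and `respond` are hypothetical names, `generate` stands in for whatever callable produces the model's answer, and the PyTorch import is optional so the sketch also runs on machines without it.

```python
import gc

try:
    import torch  # optional: without PyTorch the cleanup is a no-op
except ImportError:
    torch = None


def release_gpu_memory() -> bool:
    """Release unused cached CUDA memory; return True if the cache was cleared."""
    gc.collect()  # collect unreachable Python objects so their tensors can be freed
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()  # return unused cached blocks to the driver
        return True
    return False


def respond(generate, prompt):
    """Call a hypothetical generate(prompt) callable, then clean up."""
    try:
        return generate(prompt)
    finally:
        # Runs even if generation raises, so a failed request does not
        # leave cached memory behind.
        release_gpu_memory()


print(respond(lambda p: p.upper(), "hello"))
```

`torch.cuda.empty_cache()` only releases cached blocks that are no longer in use; memory held by live tensors, such as a loaded model's weights, is unaffected, so switching models still requires dropping references to the old model first.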
The current set of open pull requests reflects several key themes in the ongoing development of the MiniCPM-V project:
Several of the recent PRs target MiniCPM-V 2.6 and its multimodal capabilities. For instance, PR #484 updates the Streamlit integration so that users can upload and process text, images, and video, an essential feature for applications requiring diverse input modalities. This aligns with the project's goal of providing high-performance vision-language understanding.
Several pull requests aim at fixing bugs and optimizing performance across different environments and hardware configurations. For example, PR #461 addresses issues related to MPS on macOS while PR #403 resolves compatibility problems with V100 GPUs. These efforts indicate a proactive approach to ensuring that the software remains robust across various platforms and configurations.
The addition of multilingual documentation (as seen in PR #435) and updates to existing guides (like those in PR #293) demonstrate a commitment to user accessibility and community engagement. Clear documentation is crucial for fostering user adoption and facilitating contributions from developers who may not be fluent in English.
The presence of multiple contributors working on diverse aspects, from bug fixes to feature enhancements, suggests an active community around MiniCPM-V. The discussions within some pull requests also indicate collaborative problem-solving efforts among contributors, which is vital for maintaining momentum in open-source projects.
Despite the positive developments, there are notable anomalies such as the high number of open issues (55), which may reflect ongoing challenges or areas needing further attention from maintainers. Additionally, some pull requests like PR #460 lack detailed descriptions or context about their changes, which could hinder effective review processes and collaboration.
In conclusion, the current landscape of pull requests for MiniCPM-V illustrates a vibrant development effort focused on enhancing functionality while addressing user needs through continuous improvement and community involvement. However, there remains room for improvement in documentation clarity and issue resolution processes to ensure sustained project health and community satisfaction.
The development team is actively enhancing the MiniCPM-V project through collaborative efforts focused on documentation improvements and feature enhancements. The recent activities reflect a commitment to maintaining high-quality resources for users while also addressing technical updates necessary for the evolving capabilities of the software.