The Dispatch

OSS Report: OpenBMB/MiniCPM-V


Surge in Documentation and Feature Enhancements Marks MiniCPM-V's Progress

MiniCPM-V, a multimodal large language model project, has seen significant documentation updates and feature enhancements over the past 30 days, reflecting a strong focus on usability and performance optimization.

The MiniCPM-V project aims to advance vision-language understanding with models capable of processing images and video inputs. It is designed for deployment across various platforms, including mobile devices.

Recent Activity

Recent issues and pull requests (PRs) indicate a focus on improving fine-tuning processes and addressing user-reported bugs. Notable issues include #587, which highlights problems with image token handling, and #585, concerning instability during full model fine-tuning. These issues suggest challenges in model configuration and usability.

Of Note

Overall, the MiniCPM-V project is actively evolving with a strong emphasis on improving both functionality and user experience.

Quantified Reports

Quantify Issues



Recent GitHub Issues Activity

Timespan   Opened   Closed   Comments   Labeled   Milestones
7 Days         19       24         21        17            1
14 Days        30       28         30        25            1
30 Days        98       72        169        84            1
All Time      517      453          -         -            -

Like all software activity quantification, these numbers are imperfect but sometimes useful. Comments, Labels, and Milestones refer to those issues opened in the timespan in question.

Quantify Commits



Quantified Commit Activity Over 30 Days

Developer                      Branches   PRs     Commits   Files   Changes
Haoyu Li                       1          1/1/0   1         69      9971
JamePeng                       1          0/1/0   4         1       182
Tianyu Yu                      1          0/0/0   5         5       79
LDLINGLINGLING                 1          0/0/0   10        5       12
Yu-won Lee (2U1)               0          1/0/0   0         0       0
Cui Junbo                      0          0/0/0   0         0       0
Glenn Fernandes (glenn124f)    0          0/0/1   0         0       0
None (jackyjinjing)            0          1/0/0   0         0       0
Mandlin Sarah (mandlinsarah)   0          1/0/0   0         0       0

PRs: opened/merged/closed-unmerged counts for pull requests created by that developer during the period

Detailed Reports

Report On: Fetch issues



Recent Activity Analysis

The MiniCPM-V project has seen a surge in recent activity, with 64 open issues currently logged. Notably, many of these issues revolve around bugs related to fine-tuning and inference, particularly concerning the handling of image tokens and memory management during training. There is a clear trend of users encountering difficulties with model performance and configuration, suggesting that while the model's capabilities are robust, its usability may require further refinement.

Several issues highlight common problems, such as discrepancies in expected input formats and errors related to tensor dimensions during training and inference. This indicates potential gaps in documentation or implementation that could hinder user experience.

Issue Details

Most Recently Created Issues

  1. Issue #587: [BUG] funetune/dataset.py reports an "image start token != image end tokens" error

    • Priority: High
    • Status: Open
    • Created: 0 days ago
    • Update: N/A
  2. Issue #586: How to use the forward call?

    • Priority: Medium
    • Status: Open
    • Created: 0 days ago
    • Update: N/A
  3. Issue #585: Full fine-tuning of the vision component is unstable

    • Priority: High
    • Status: Open
    • Created: 0 days ago
    • Update: N/A
  4. Issue #584: How not to save the files in global_step1 during training

    • Priority: Medium
    • Status: Open
    • Created: 1 day ago
    • Update: N/A
  5. Issue #582: how to modify the system prompt

    • Priority: Low
    • Status: Open
    • Created: 2 days ago
    • Update: 1 day ago

Most Recently Updated Issues

  1. Issue #578: [BUG] finetune minicpm error

    • Priority: High
    • Status: Open
    • Created: 4 days ago
    • Update: N/A
  2. Issue #577: Finetuning issue

    • Priority: Medium
    • Status: Open
    • Created: 4 days ago
    • Update: 1 day ago
  3. Issue #576: [BUG] problem when awq self-quantization

    • Priority: High
    • Status: Open
    • Created: 4 days ago
    • Update: N/A
  4. Issue #575: [BUG] Screenshot of code. OCR does not result in correct code

    • Priority: Medium
    • Status: Open
    • Created: 5 days ago
    • Update: N/A
  5. Issue #574: Could you share how in-context learning was trained?

    • Priority: Low
    • Status: Open
    • Created: 5 days ago
    • Update: N/A

Analysis of Notable Issues

  • Many recent issues focus on bugs related to image token handling, particularly the mismatch between image start and end tokens (#587). This suggests that users are struggling with the input format required by the model, which could lead to confusion and inefficiencies in fine-tuning processes.

  • The inquiry regarding how to use the forward call (#586) indicates a need for clearer documentation or examples on utilizing model methods effectively.

  • The issue concerning instability during full model fine-tuning (#585) raises concerns about the robustness of training configurations, especially when users report poor loss curves and performance metrics.

  • A recurring theme is the need for better guidance on configuring training parameters and understanding model outputs, as evidenced by multiple requests for clarification on expected behaviors during inference and fine-tuning.
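The mismatch reported in #587 is the kind of error a preprocessing-time check can surface early. A minimal sketch of such a check follows; the marker strings and tokenized-sample format are illustrative assumptions, not the project's actual implementation:

```python
# Hypothetical sketch: IM_START/IM_END and the list-of-strings sample format
# are assumptions for illustration, not MiniCPM-V's real token scheme.
IM_START, IM_END = "<image>", "</image>"

def count_image_tokens(input_tokens):
    """Count image start/end markers in a tokenized sample."""
    starts = sum(1 for t in input_tokens if t == IM_START)
    ends = sum(1 for t in input_tokens if t == IM_END)
    return starts, ends

def validate_sample(input_tokens):
    """Fail fast with a clear message instead of erroring mid-training."""
    starts, ends = count_image_tokens(input_tokens)
    if starts != ends:
        raise ValueError(
            f"image start tokens ({starts}) != image end tokens ({ends}); "
            "check that every image placeholder in the sample is closed"
        )
    return True
```

Running such a validation pass over the whole dataset before launching fine-tuning turns an opaque runtime failure into an actionable per-sample error message.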

Conclusion

The MiniCPM-V project is experiencing active engagement from its user base, with a significant number of issues reflecting both technical challenges and requests for improved usability. Addressing these concerns through enhanced documentation, clearer examples, and potential bug fixes will be crucial for maintaining user satisfaction and fostering continued development within this rapidly evolving project.

Report On: Fetch pull requests



Overview

The analysis of the pull requests (PRs) for the MiniCPM-V project reveals a dynamic and active development environment. The project has seen a significant number of contributions, both in terms of new features and bug fixes, indicating a responsive approach to community feedback and an ongoing effort to enhance the model's capabilities.

Summary of Pull Requests

Open Pull Requests

  • PR #579: Fixes an error in the fine-tuning process of MiniCPM-V. This PR is crucial as it addresses specific issues (#578, #581) that could impact the model's performance during fine-tuning.
  • PR #521: Introduces features for setting different learning rates (vision_lr and resampler_lr) during fine-tuning. This PR is significant as it aims to improve fine-tuning performance based on user experiences and external research.
  • PR #556: Enhances exception messages for better readability and debugging. While minor, this PR improves the developer experience by making error messages clearer.
  • PR #461: Addresses a dependency issue with flash_atten on MPS. This PR is important for users relying on specific hardware configurations.
  • PR #460: A simple fix in the README.md file. While not critical, it contributes to better documentation.
  • PR #435: Adds Japanese translations to the README files, expanding accessibility for non-English speakers.
  • PR #403: Fixes an issue with running MiniCPM-V on V100 GPUs, which is crucial for users with this hardware.
  • PR #383: Adds fine-tuning scripts for a specific model variant (MiniCPM-Llama3-V-2_5-int4), catering to specialized use cases.
  • PR #304: Updates requirements.txt to include additional packages needed for fine-tuning, ensuring users have all necessary dependencies.
  • PR #301: Clears CUDA cache after responses to prevent VRAM issues when switching between different modes (sampling and beam search).
  • PR #281: Adds support for training with plain text data, addressing issues when mixing image-text pair data with text-only data in batches.
  • PR #278: Fixes a bug in web_demo_streamlit_2.5.py related to text mode interactions.
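The separate learning rates introduced in PR #521 follow a standard PyTorch pattern: partition parameters into optimizer groups by module-name prefix. A sketch of that grouping logic is below; the prefixes ("vpm", "resampler") are assumptions for illustration, not necessarily the project's exact module names:

```python
# Illustrative sketch of per-module learning rates (the idea behind PR #521).
# The "vpm"/"resampler" prefixes are assumed names, not confirmed from the repo.
def build_param_groups(named_params, base_lr, vision_lr, resampler_lr):
    """Partition (name, parameter) pairs into optimizer groups by prefix."""
    groups = {"base": [], "vision": [], "resampler": []}
    for name, param in named_params:
        if name.startswith("vpm"):
            groups["vision"].append(param)
        elif name.startswith("resampler"):
            groups["resampler"].append(param)
        else:
            groups["base"].append(param)
    return [
        {"params": groups["base"], "lr": base_lr},
        {"params": groups["vision"], "lr": vision_lr},
        {"params": groups["resampler"], "lr": resampler_lr},
    ]
```

The returned list can be fed directly to a PyTorch optimizer, e.g. `torch.optim.AdamW(build_param_groups(model.named_parameters(), 1e-5, 2e-6, 5e-6))`, which applies each group's learning rate independently.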

Closed Pull Requests

  • PR #484: Updates the Streamlit implementation for MiniCPM-V 2.6, enhancing its multi-modal capabilities and user experience.
  • PR #293: Updates documentation regarding inference on multiple GPUs but was not merged, possibly due to redundancy or changes in documentation strategy.
  • PR #543: Modifies evaluation metrics for MiniCPM-V 2.6, improving benchmarking processes.

Analysis of Pull Requests

The pull requests reflect several key themes in the development of MiniCPM-V:

  1. Active Community Engagement: The number of open and closed PRs indicates a vibrant community contributing to the project. Contributions range from bug fixes and feature additions to documentation improvements, showcasing diverse engagement from users and developers alike.

  2. Focus on Usability and Performance Enhancements: Many PRs aim to enhance usability (e.g., clearer error messages, better documentation) and performance (e.g., fine-tuning improvements, hardware compatibility fixes). This focus suggests a commitment to providing a robust and user-friendly tool.

  3. Rapid Iteration and Responsiveness: The quick turnaround from PR creation to closure (often within days) highlights an efficient review process and responsiveness to emerging issues or feature requests.

  4. Diverse Contributions Addressing Various Aspects of the Project:

    • Technical improvements (e.g., fine-tuning scripts, CUDA cache management).
    • Usability enhancements (e.g., improved error messages, Japanese translations).
    • Documentation updates ensuring clarity and accessibility.
  5. Global Reach and Accessibility Efforts: The addition of Japanese translations in PR #435 reflects efforts to make the project accessible to a broader audience.
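The VRAM fix in PR #301 reflects a common PyTorch idiom: after each response, return cached allocator blocks to the driver so that switching decoding modes (sampling vs. beam search) does not accumulate fragmented memory. A minimal sketch, written so it degrades to a no-op without PyTorch or CUDA:

```python
import gc

try:
    import torch
except ImportError:  # environments without PyTorch installed
    torch = None

def free_cached_vram():
    """Drop Python-level references, then release cached CUDA allocator
    blocks. Returns True if the CUDA cache was actually cleared; safe
    no-op on CPU-only machines."""
    gc.collect()
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()
        return True
    return False
```

Calling this between generation requests trades a small allocation cost on the next call for a lower steady-state VRAM footprint.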

In conclusion, the pull request activity in the MiniCPM-V project demonstrates a well-managed open-source initiative with active community involvement, continuous improvement efforts, and a strong focus on usability and performance optimization. The project's ability to rapidly address issues and incorporate new features is indicative of its healthy development lifecycle.

Report On: Fetch commits



Repo Commits Analysis

Development Team and Recent Activity

Team Members and Recent Contributions

  1. Cui Junbo (Cuiunbo)

    • Recent Activity:
    • Merged pull request #543, which modified eval_mm for MiniCPM-V 2.6 with extensive changes across multiple files (9,971 lines).
    • No commits of his own in the last 30 days; his activity was limited to the merge.
  2. LDLINGLINGLING

    • Recent Activity:
    • Made 10 commits, primarily updating wechat.md and adding images related to WeChat integration.
    • Collaborated with other team members through multiple updates to the same documentation file.
  3. Haoyu Li (lihytotoro)

    • Recent Activity:
    • Contributed a significant commit (9971 lines) related to eval_mm for MiniCPM-V 2.6.
    • Merged pull request indicating collaboration with Cui Junbo.
  4. JamePeng

    • Recent Activity:
    • Made 4 commits focused on updating the Streamlit implementation for MiniCPM-V 2.6, including bug fixes and optimizations.
    • Collaborated with other team members by merging branches and updating files.
  5. Tianyu Yu (yiranyyu)

    • Recent Activity:
    • Contributed 5 commits mainly focused on updating README files and documentation.
    • Engaged in merging branches and ensuring documentation is up-to-date.
  6. Qianyu Chen (qyc-98)

    • Recent Activity:
    • Active in the 2.6-sft branch with multiple updates related to fine-tuning and README modifications.
    • No recent contributions in the default branch.
  7. Hongji Zhu (iceflame89)

    • Recent Activity:
    • Contributed to README updates and merged pull requests, enhancing documentation clarity.
    • No recent contributions in the default branch.
  8. Jacky Jin Jing, Mandlin Sarah, 2U1, Glenn124f

    • Recent Activity:
    • No recent commits; their activity in the period was limited to opening or reviewing pull requests.

Patterns and Themes

  • Documentation Focus: A significant amount of recent activity revolves around updating documentation (README.md, wechat.md), indicating a strong emphasis on user guidance and project clarity.

  • Collaborative Merging: Multiple merges indicate effective collaboration among team members, particularly between Cui Junbo and Haoyu Li regarding the eval_mm modifications.

  • Feature Enhancements: JamePeng's updates reflect ongoing efforts to enhance the Streamlit interface for improved user experience, showcasing a focus on usability alongside backend improvements.

  • Active Branching: The presence of an active branch (2.6-sft) suggests ongoing work on fine-tuning capabilities, which aligns with the project's goals of optimizing performance across various platforms.

Conclusions

The development team is actively engaged in enhancing both functionality and documentation of the MiniCPM-V project. Recent activities highlight a collaborative environment with a focus on improving user experience through comprehensive documentation and feature enhancements. The project appears to be well-positioned for continued growth and adaptation based on user feedback and technological advancements.