OSS Report: OpenBMB/MiniCPM-V

Aug. 19, 2024, 12:30 p.m. UTC. This report was generated by Dispatch AI.

MiniCPM-V Development Stagnates Amidst High Volume of Open Issues

MiniCPM-V, a project focused on advanced multimodal large language models for mobile devices, has experienced a slowdown in development activity, with no new commits or pull requests in the past 30 days. The project, known for its high-performance vision-language understanding capabilities, is backed by OpenBMB.

Recent Activity

The MiniCPM-V project currently faces 55 open issues, indicating ongoing challenges and active development needs. Recent issues highlight recurring themes such as memory usage during training, LoRA fine-tuning effectiveness, and multi-image input handling. Installation errors and hardware compatibility also present frequent obstacles for users.

Development Team and Recent Contributions

LDLINGLINGLING
- Focused on WeChat integration documentation updates.
- Recent activity: 17 commits, primarily in docs and assets.
Tianyu Yu (yiranyyu)
- Updated README files across languages.
- Recent activity: 29 commits, with significant documentation enhancements.
Qianyu Chen (qyc-98)
- Worked on fine-tuning scripts and dataset handling.
- Recent activity: 19 commits in the finetune directory.
tc-mb
- Updated README files and added issue templates.
- Recent activity: 7 commits.
Hongji Zhu (iceflame89)
- Focused on README clarity and accuracy.
- Recent activity: 9 commits.
YuzaChongyi
- Fixed bugs in dataset handling scripts.
- Recent activity: 4 commits.
HwwwwwwwH
- Merged PRs related to README updates.
- Recent activity: 4 commits.
Cui Junbo (Cuiunbo)
- Made minor documentation changes.
- Recent activity: 2 commits.
Haoye Zhang (Haoye17)
- Added image assets.
- Recent activity: 1 commit.
JamePeng
- No recent commits, but has an open PR related to chatbot functionality.

Of Note

High Number of Open Issues: With 55 open issues, the project faces significant challenges that may hinder progress if not addressed promptly.
Stagnant Development: No new commits or pull requests in the last month suggest a potential pause in active development efforts.
Documentation Focus: Recent activities heavily emphasize documentation updates, indicating an effort to improve user accessibility and understanding of the project's features.
Community Engagement: Although it is a relatively new project, MiniCPM-V has already gained 10,782 stars on GitHub, reflecting strong community interest.
Hardware Compatibility Concerns: Frequent issues related to installation errors and hardware compatibility highlight ongoing challenges in ensuring smooth deployment across different environments.

Quantified Reports

Quantify commits

Quantified Commit Activity Over 30 Days

Developer	Branches	PRs	Commits	Files	Changes
qianyu chen	2	1/1/0	19	8	8415
Tianyu Yu	1	0/0/0	29	29	4301
LDLINGLINGLING	1	1/2/0	17	10	367
Alphi	1	2/2/0	4	3	275
tc-mb	1	0/0/0	7	7	244
YuzaChongyi	1	0/0/0	4	3	87
Hongji Zhu	1	0/0/0	9	4	61
Cui Junbo	1	0/0/0	2	1	4
Haoye Zhang	1	0/0/0	1	7	0
None (JamePeng)	0	1/0/0	0	0	0
sky (cnsky2016)	0	1/0/0	0	0	0
Ikko Eltociear Ashimine (eltociear)	0	1/0/0	0	0	0
None (BothSavage)	0	2/0/1	0	0	0
Tejas Makode (TejMakode1523)	0	1/0/0	0	0	0
Dr. Artificial曾小健 (ArtificialZeng)	0	1/0/0	0	0	0
Arpit Pathak (Thepathakarpit)	0	1/0/1	0	0	0

Note: the PRs column counts pull requests created by that developer and opened/merged/closed-unmerged during the period.
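The PRs column packs three counts into one field, which makes period totals easy to misread. The following is a minimal Python sketch, assuming the rows of the table above are transcribed by hand into a list; the names `ROWS`, `parse_prs`, and `summarize` are illustrative and not part of the report's own tooling.

```python
# Rows transcribed from the table above:
# (developer, PRs as "opened/merged/closed-unmerged", commits)
ROWS = [
    ("qianyu chen", "1/1/0", 19),
    ("Tianyu Yu", "0/0/0", 29),
    ("LDLINGLINGLING", "1/2/0", 17),
    ("Alphi", "2/2/0", 4),
    ("tc-mb", "0/0/0", 7),
    ("YuzaChongyi", "0/0/0", 4),
    ("Hongji Zhu", "0/0/0", 9),
    ("Cui Junbo", "0/0/0", 2),
    ("Haoye Zhang", "0/0/0", 1),
    ("JamePeng", "1/0/0", 0),
    ("sky (cnsky2016)", "1/0/0", 0),
    ("Ikko Eltociear Ashimine", "1/0/0", 0),
    ("BothSavage", "2/0/1", 0),
    ("Tejas Makode", "1/0/0", 0),
    ("ArtificialZeng", "1/0/0", 0),
    ("Arpit Pathak", "1/0/1", 0),
]


def parse_prs(field):
    """Split an 'opened/merged/closed-unmerged' field into three integers."""
    opened, merged, closed_unmerged = (int(part) for part in field.split("/"))
    return opened, merged, closed_unmerged


def summarize(rows):
    """Sum commits and each PR count over all developers."""
    totals = {"commits": 0, "opened": 0, "merged": 0, "closed_unmerged": 0}
    for _name, prs, commits in rows:
        opened, merged, closed_unmerged = parse_prs(prs)
        totals["commits"] += commits
        totals["opened"] += opened
        totals["merged"] += merged
        totals["closed_unmerged"] += closed_unmerged
    return totals


print(summarize(ROWS))
```

With the table's values, this gives 92 commits, 12 PRs opened, 5 merged, and 2 closed unmerged across the 16 listed contributors.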

Quantify Issues

Recent GitHub Issues Activity

Timespan	Opened	Closed	Comments	Labeled	Milestones
7 Days	35	25	57	22	1
14 Days	95	70	286	62	1
30 Days	119	95	363	81	1
All Time	421	378	-	-	-

Note: like all software activity quantification, these numbers are imperfect but sometimes useful. Comments, Labels, and Milestones refer to those issues opened in the timespan in question.
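The backlog implied by these figures can be worked out directly. The sketch below is a small Python example, assuming the Opened and Closed columns are transcribed by hand; `ISSUES`, `backlog_change`, and `close_ratio` are illustrative names. The closed/opened ratio is only a rough indicator, since the report does not say whether issues closed in a window were also opened in it.

```python
# Figures transcribed from the table above: timespan -> (opened, closed)
ISSUES = {
    "7 Days": (35, 25),
    "14 Days": (95, 70),
    "30 Days": (119, 95),
    "All Time": (421, 378),
}


def backlog_change(opened, closed):
    """Net growth of the open-issue backlog over the timespan."""
    return opened - closed


def close_ratio(opened, closed):
    """Issues closed per issue opened in the same timespan."""
    return closed / opened


for span, (opened, closed) in ISSUES.items():
    net = backlog_change(opened, closed)
    ratio = close_ratio(opened, closed)
    print(f"{span}: net {net:+d}, closed/opened {ratio:.0%}")
```

The all-time difference, 421 opened minus 378 closed, is 43, which matches the count of open issues given in the issue analysis.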

Detailed Reports

Report On: Fetch issues

Recent Activity Analysis

The MiniCPM-V project has seen a surge in issue activity: 119 of its 421 all-time issues were opened in the past 30 days, and 43 issues are currently open on GitHub. Recent submissions include a mix of bug reports, feature requests, and inquiries about model capabilities, particularly focusing on the latest version (2.6) and its performance in fine-tuning and inference.

Notably, there are several recurring themes among the issues, including concerns about memory usage during training, the effectiveness of LoRA fine-tuning, and the model's ability to handle multi-image inputs. Issues related to installation errors and compatibility with different hardware configurations also appear frequently, indicating potential challenges for users trying to deploy the model effectively.

Issue Details

Here are some of the most recent issues created and updated:

Issue #486: [BUG] Data fetch error - typo
- Priority: High
- Status: Open
- Created: 0 days ago
- Update: N/A
Issue #485: [llamacpp] - Is it possible to provide the ability to perform video inference by the server mode?
- Priority: Medium
- Status: Open
- Created: 0 days ago
- Update: N/A
Issue #483: [BUG] number of image start tokens and image end tokens mismatch
- Priority: High
- Status: Open
- Created: 1 day ago
- Update: N/A
Issue #482: [BUG] Error when running myMinicpmv2.6:latest model directly.
- Priority: High
- Status: Open
- Created: 1 day ago
- Update: Edited 0 days ago
Issue #481: [BUG] Training speed issue.
- Priority: Medium
- Status: Open
- Created: 3 days ago
- Update: N/A
Issue #480: [BUG] Lack of position_id in MiniCPMVProcessor.
- Priority: High
- Status: Open
- Created: 3 days ago
- Update: N/A
Issue #479: [vllm] Using vllm API server for video inference.
- Priority: Medium
- Status: Open
- Created: 3 days ago
- Update: N/A
Issue #478: [vllm] KeyError when running with MiniCPM-V-2_6-int4.
- Priority: Medium
- Status: Open
- Created: 3 days ago
- Update: N/A
Issue #477: [BUG] AVX instruction set error.
- Priority: High
- Status: Open
- Created: 3 days ago
- Update: N/A
Issue #476: [BUG] Multi-card API startup issue.
- Priority: Medium
- Status: Open
- Created: 4 days ago
- Update: N/A

Analysis of Notable Issues

The mismatch between the number of image start and end tokens (#483) indicates a potential flaw in how the model processes images, which could lead to runtime errors during inference or training.
Two issues ask about video inference in server deployments (#485 for llama.cpp server mode, #479 for the vLLM API server), suggesting that users are eager for enhanced multimedia processing features within the MiniCPM-V framework.
Issues on training speed (#481) and on a missing position_id in MiniCPMVProcessor (#480) reflect ongoing concerns about resource efficiency and performance optimization in various environments.
Requests for support with LoRA fine-tuning (#481, #482) highlight a growing interest in training methods that require fewer resources while still achieving high performance.
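A start/end token mismatch of the kind reported in #483 can be caught before a sequence reaches the model. The following Python sketch is illustrative only: the marker strings `<image>` and `</image>` and the function `image_token_spans` are assumptions made for this example, not the project's actual tokens or code, and the report does not state what caused the mismatch in #483.

```python
def image_token_spans(tokens, start="<image>", end="</image>"):
    """Pair each start marker with the next end marker.

    Returns a list of (start_index, end_index) tuples and raises
    ValueError if the markers are unbalanced or out of order.
    The marker strings are hypothetical defaults for illustration.
    """
    spans = []
    open_index = None
    for i, token in enumerate(tokens):
        if token == start:
            if open_index is not None:
                raise ValueError(f"start marker at {i} before previous one closed")
            open_index = i
        elif token == end:
            if open_index is None:
                raise ValueError(f"end marker at {i} without a start marker")
            spans.append((open_index, i))
            open_index = None
    if open_index is not None:
        raise ValueError(f"start marker at {open_index} has no end marker")
    return spans


# A balanced sequence yields one span per image.
print(image_token_spans(["Describe", "<image>", "patch", "</image>", "please"]))

# A sequence cut off after a start marker is rejected.
try:
    image_token_spans(["<image>", "patch"])
except ValueError as err:
    print("rejected:", err)
```

Checking pairing order rather than only comparing counts also catches sequences where the counts agree but an end marker precedes its start.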

In summary, while MiniCPM-V shows promise with its advanced multimodal capabilities, users are encountering various challenges that need addressing to ensure smoother deployment and optimal performance across different use cases.

Report On: Fetch pull requests

Overview

The analysis of the pull requests (PRs) from the OpenBMB/MiniCPM-V repository reveals a dynamic development environment focused on enhancing multimodal capabilities, fixing bugs, and improving documentation. The repository currently has 12 open PRs, with a notable emphasis on updates related to the latest version, MiniCPM-V 2.6.

Summary of Pull Requests

PR #484: Update streamlit implementation for MiniCPM-V 2.6
Created 1 day ago, this PR enhances the Streamlit application to support new multimodal features, including text, images, and video processing. It introduces significant changes to file handling and user interaction.
PR #461: fix mps rely on flash_atten
Created 6 days ago, this PR addresses dependencies related to MPS (Metal Performance Shaders) and flash attention mechanisms, indicating ongoing efforts to optimize performance on Apple hardware.
PR #460: fix
Also created 6 days ago, this PR appears to be a minor fix with limited details provided, reflecting a common practice of addressing small issues as they arise.
PR #435: docs: add Japanese README
Created 10 days ago, this PR expands accessibility by adding a Japanese translation of the README file, demonstrating community engagement and inclusivity.
PR #403: 修复V100无法运行MiniCPM-V-2_6问题 (Fix MiniCPM-V-2_6 failing to run on V100)
Created 12 days ago, this PR addresses compatibility issues with V100 GPUs, showcasing the team's responsiveness to hardware-specific challenges.
PR #383: Fine tuning of MiniCPM-Llama3-V-2_5-int4
Created 13 days ago, this PR focuses on fine-tuning scripts for improved performance in specific model configurations.
PR #304: Update requirements.txt for finetuning requirements
Created 53 days ago, this PR updates the requirements file to include additional packages necessary for fine-tuning processes.
PR #301: Clear the torch cuda cache after response
Created 54 days ago, this PR optimizes memory management by clearing CUDA cache after responses to improve performance during model switching.
PR #293: Update inference_on_multiple_gpus.md
Created 58 days ago, this PR enhances documentation regarding multi-GPU inference setups, improving user guidance.
PR #281: feat: Added judgment logic to support training with plain text data
Created 62 days ago, this PR introduces logic to handle text-only datasets effectively, addressing previous limitations in data handling.
PR #278: fix a bug with web_demo_streamlit_2.5 at text mode
Created 63 days ago, this PR resolves an error in the Streamlit demo when using text mode without image uploads.
PR #36: [Draft] Add minicpmv finetune script
Created 131 days ago, this draft PR proposes a fine-tuning script for MiniCPM-V but has not yet been finalized or merged.
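PR #301 in the list above describes clearing the CUDA cache after each response. A minimal Python sketch of that general idea follows, assuming PyTorch; it is not the PR's actual code, and `release_cuda_cache`, `respond`, and `generate_fn` are hypothetical names introduced for illustration. The import is guarded so the sketch also runs on machines without PyTorch.

```python
import gc

try:
    import torch  # optional, so the sketch still runs without PyTorch installed
except ImportError:
    torch = None


def release_cuda_cache():
    """Release cached, unused GPU memory; return True if a CUDA cache was cleared."""
    gc.collect()  # drop unreachable Python objects first so their tensors can be freed
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()
        return True
    return False


def respond(generate_fn, prompt):
    """Call a hypothetical generate_fn and clear the cache afterwards, even on error."""
    try:
        return generate_fn(prompt)
    finally:
        release_cuda_cache()


# Stand-in generator used only to show the call pattern.
print(respond(lambda p: p.upper(), "hello"))
```

`torch.cuda.empty_cache()` returns cached blocks that are no longer in use to the driver; it does not free memory held by tensors that are still referenced, so a loaded model keeps its memory until it is deleted.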

Analysis of Pull Requests

The current set of open pull requests reflects several key themes in the ongoing development of the MiniCPM-V project:

Focus on Multimodal Capabilities

A significant portion of the recent PRs is dedicated to enhancing the multimodal capabilities introduced in MiniCPM-V 2.6. For instance, PR #484 improves the Streamlit integration so that users can upload and process text, images, and video, an essential feature for applications requiring diverse input modalities. This aligns with the project's goal of providing high-performance vision-language understanding.

Bug Fixes and Performance Optimizations

Several pull requests aim at fixing bugs and optimizing performance across different environments and hardware configurations. For example, PR #461 addresses issues related to MPS on macOS while PR #403 resolves compatibility problems with V100 GPUs. These efforts indicate a proactive approach to ensuring that the software remains robust across various platforms and configurations.

Documentation Improvements

The addition of multilingual documentation (as seen in PR #435) and updates to existing guides (like those in PR #293) demonstrate a commitment to user accessibility and community engagement. Clear documentation is crucial for fostering user adoption and facilitating contributions from developers who may not be fluent in English.

Community Engagement

The presence of multiple contributors working on diverse aspects, from bug fixes to feature enhancements, suggests an active community around MiniCPM-V. The discussions within some pull requests also indicate collaborative problem-solving efforts among contributors, which is vital for maintaining momentum in open-source projects.

Anomalies and Areas for Improvement

Despite the positive developments, there are notable anomalies such as the high number of open issues (55), which may reflect ongoing challenges or areas needing further attention from maintainers. Additionally, some pull requests like PR #460 lack detailed descriptions or context about their changes, which could hinder effective review processes and collaboration.

In conclusion, the current landscape of pull requests for MiniCPM-V illustrates a vibrant development effort focused on enhancing functionality while addressing user needs through continuous improvement and community involvement. However, there remains room for improvement in documentation clarity and issue resolution processes to ensure sustained project health and community satisfaction.

Report On: Fetch commits

Repo Commits Analysis

Development Team and Recent Activity

Team Members and Recent Contributions

LDLINGLINGLING
- Recent Activity: Contributed 17 commits, focusing on updating documentation related to WeChat integration, including images and markdown files. Collaborated with Tianyu Yu on README updates.
- Files Changed: 10 files, primarily in the docs and assets directories.
Tianyu Yu (yiranyyu)
- Recent Activity: Active with 29 commits, significantly updating README files across multiple languages. Involved in merging branches and enhancing documentation for the latest version (2.6).
- Files Changed: 29 files, with substantial changes to README and documentation files.
Qianyu Chen (qyc-98)
- Recent Activity: Made 19 commits, focusing on fine-tuning scripts and README updates for the 2.6 version. Engaged in merging branches and improving dataset handling.
- Files Changed: 8 files, particularly in the finetune directory.
tc-mb
- Recent Activity: Contributed 7 commits, mainly focused on updating README files and adding issue templates.
- Files Changed: 7 files.
Hongji Zhu (iceflame89)
- Recent Activity: Contributed 9 commits, primarily updating README files for clarity and accuracy.
- Files Changed: 4 files.
YuzaChongyi
- Recent Activity: Contributed 4 commits, mainly focused on fixing bugs in dataset handling scripts.
- Files Changed: 3 files.
HwwwwwwwH
- Recent Activity: Contributed 4 commits, focusing on merging pull requests related to README updates.
- Files Changed: 3 files.
Cui Junbo (Cuiunbo)
- Recent Activity: Made 2 commits with minor changes to documentation.
- Files Changed: 1 file.
Haoye Zhang (Haoye17)
- Recent Activity: Contributed a single commit adding image assets.
- Files Changed: 7 files.
JamePeng
- Recent Activity: No recent commits but has an open pull request related to chatbot functionality.

Patterns and Themes

The team is actively engaged in enhancing documentation, particularly around the new features introduced in version 2.6 of MiniCPM-V.
There is a strong focus on collaboration among team members, as seen in merged pull requests and shared contributions to README updates.
The majority of recent activity revolves around improving user accessibility through detailed documentation and visual aids (e.g., images for WeChat integration).
The development team shows a pattern of iterative improvements, especially in fine-tuning capabilities and multi-modal functionalities, indicating a responsive approach to user feedback and project requirements.

Conclusions

The development team is actively enhancing the MiniCPM-V project through collaborative efforts focused on documentation improvements and feature enhancements. The recent activities reflect a commitment to maintaining high-quality resources for users while also addressing technical updates necessary for the evolving capabilities of the software.