RAGFlow is an open-source Retrieval-Augmented Generation (RAG) engine that uses Large Language Models to improve document understanding. Recently reported issues concern embedding models and document parsing, both of which are core functionality.
Recent issues highlight problems with embedding models (#2527, #2506) and with parsing uploaded files (#2519, #2513), which points to potential instability in key features. In addition, #2514 reports the UI rendering in Chinese when the initial language is English, which suggests a problem with how language settings are applied.
Kevin Hu (KevinHuSh)
Alvin Cage (AlvinCage)
Chenbing (muzilib)
Yungongzi (yungongzi)
Fachuan Bai (baifachuan)
Feiue (liuhua)
JobSmithManipulation
Dada Hsueh (dadahsueh)
Michał Kiełtyka (Defozo)
Writinwaters
Guoyuhao2330 (lidp)
Hangters (黄腾)
The team is actively addressing bugs and enhancing features. Kevin Hu is among the most active contributors, with 46 PRs opened and 52 commits in the period.
Timespan | Opened | Closed | Comments | Labeled | Milestones |
---|---|---|---|---|---|
7 Days | 39 | 8 | 44 | 3 | 1 |
14 Days | 96 | 48 | 140 | 6 | 1 |
30 Days | 189 | 90 | 326 | 13 | 1 |
All Time | 1153 | 696 | - | - | - |
Like all software activity quantification, these numbers are imperfect but sometimes useful. Comments, Labels, and Milestones refer to those issues opened in the timespan in question.
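One rough way to read the table is to compare how many issues were closed against how many were opened in each window. The sketch below does that arithmetic on the figures above. It assumes the Opened and Closed columns count events inside the window, so the closed issues are not necessarily the same ones that were opened; the function name `closure_summary` is illustrative only.

```python
# Figures copied from the issue-activity table: (opened, closed) per timespan.
issue_activity = {
    "7 Days": (39, 8),
    "14 Days": (96, 48),
    "30 Days": (189, 90),
    "All Time": (1153, 696),
}


def closure_summary(opened: int, closed: int) -> tuple[int, float]:
    """Return (net growth in open issues, closed-to-opened ratio)."""
    return opened - closed, closed / opened


for span, (opened, closed) in issue_activity.items():
    net, ratio = closure_summary(opened, closed)
    print(f"{span}: net +{net} open issues, closed/opened = {ratio:.0%}")
```

Under that assumption, the 7-day window shows roughly one issue closed for every five opened, while the 14-day, 30-day, and all-time windows sit between about one in two and three in five.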
Developer | Branches | PRs | Commits | Files | Changes |
---|---|---|---|---|---|
balibabu | 1 | 47/47/0 | 54 | 82 | 7952 |
黄腾 | 1 | 14/13/1 | 18 | 38 | 2557 |
JobSmithManipulation | 1 | 11/8/3 | 8 | 16 | 2348 |
Valdanito (Valdanitooooo) | 1 | 4/4/0 | 4 | 19 | 2318 |
liuhua | 1 | 13/11/2 | 12 | 25 | 2269 |
Kevin Hu | 1 | 46/45/1 | 52 | 46 | 965 |
lidp | 1 | 16/15/1 | 19 | 24 | 854 |
Fachuan Bai | 1 | 4/4/0 | 4 | 26 | 615 |
writinwaters | 1 | 9/9/0 | 9 | 15 | 325 |
Wang Baoling | 1 | 2/2/0 | 2 | 2 | 110 |
Zhichang Yu | 1 | 2/1/0 | 1 | 7 | 92 |
Toro | 1 | 2/2/0 | 2 | 2 | 12 |
dependabot[bot] | 1 | 11/4/7 | 4 | 2 | 12 |
yangbo.zhou | 1 | 1/1/0 | 1 | 1 | 6 |
Dada Hsueh | 1 | 1/1/0 | 1 | 1 | 6 |
Michał Kiełtyka | 1 | 1/1/0 | 1 | 3 | 5 |
LIU HAO | 1 | 1/1/0 | 1 | 1 | 5 |
dearjane | 1 | 0/0/0 | 1 | 1 | 4 |
zhuhao | 1 | 0/0/0 | 1 | 1 | 4 |
wwwlll | 1 | 2/1/1 | 1 | 1 | 4 |
_Chenbing | 1 | 2/2/0 | 2 | 2 | 3 |
yungongzi | 1 | 1/1/0 | 1 | 1 | 3 |
Andrey | 1 | 0/0/0 | 1 | 1 | 3 |
Wang | 1 | 1/1/0 | 1 | 1 | 3 |
Vitaliy Groshev | 1 | 1/1/0 | 1 | 1 | 2 |
AlvinCage | 1 | 1/1/0 | 1 | 1 | 2 |
Jia Chen | 1 | 1/1/0 | 1 | 1 | 2 |
Yuhao Tsui (cyhasuka) | 0 | 1/0/1 | 0 | 0 | 0 |
移山搬砖派 (AbbottKilig) | 0 | 2/0/2 | 0 | 0 | 0 |
None (yixiang1120) | 0 | 2/0/2 | 0 | 0 | 0 |
narendra (narendra-bluebash) | 0 | 1/0/1 | 0 | 0 | 0 |
PRs: created by that dev and opened/merged/closed-unmerged during the period
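The opened/merged/closed-unmerged notation can be unpacked mechanically. The following is a minimal sketch, assuming each cell is three slash-separated integers in that order (as in `47/47/0` or `11/4/7`); `PRCounts`, `parse_pr_cell`, and `merge_rate` are hypothetical helper names introduced for illustration, not part of any RAGFlow or reporting tool.

```python
from typing import NamedTuple, Optional


class PRCounts(NamedTuple):
    opened: int
    merged: int
    closed_unmerged: int


def parse_pr_cell(cell: str) -> PRCounts:
    """Parse a cell such as '47/47/0' into its three counts."""
    parts = cell.strip().split("/")
    if len(parts) != 3:
        raise ValueError(f"expected opened/merged/closed-unmerged, got {cell!r}")
    return PRCounts(*(int(p) for p in parts))


def merge_rate(counts: PRCounts) -> Optional[float]:
    """Share of opened PRs that were merged, or None when none were opened."""
    if counts.opened == 0:
        return None
    return counts.merged / counts.opened


# Two cells taken from the developer table.
for name, cell in [("balibabu", "47/47/0"), ("dependabot[bot]", "11/4/7")]:
    counts = parse_pr_cell(cell)
    print(name, counts, merge_rate(counts))
```

A cell whose merged and closed-unmerged counts do not add up to the opened count, such as `2/1/0`, most likely means a PR was still open at the end of the period.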
The GitHub repository for RAGFlow has seen significant recent activity, with a total of 458 open issues. The latest issues span a variety of topics, including bugs, feature requests, and questions about functionality. Notably, there are recurring themes around embedding models, parsing errors, and user interface challenges.
Several issues highlight critical bugs that affect the core functionalities of the application, such as embedding errors and problems with document parsing. Additionally, there is a noticeable concern regarding the integration of various models and APIs, suggesting that users are facing difficulties in leveraging the full capabilities of RAGFlow.
Issue #2528: [Question]: Access interface of @login_required, always get unauthorized error 401
Issue #2527: [Bug]: I can get the result by search, but I can't get the answer by chatting with the same knowledge base
Issue #2523: [Feature Request]: Integrates jina-embeddings-v3-a-frontier-multilingual-embedding-model
Issue #2522: [Question]: How to make overlapping chunking?
Issue #2519: [Question]: Error at parsing files uploaded in demo
Issue #2518: [Question]: Is there an update plan for the open-source deepdoc model on Hugging Face?
Issue #2516: [Feature Request]: Configurable for excel, html table or row based text
Issue #2514: [Bug]: Initial language is English, but the UI is in Chinese
Issue #2513: [Question]: Error at parsing files uploaded in demo
Issue #2506: [Question]: Qwen2-72B-Instruct-GPTQ-Int4 of Xinference not listed in System model settings
Overall, while RAGFlow shows promise with its rich feature set, the current state of unresolved issues suggests a need for focused efforts on stability and user experience improvements.
The analysis of the pull requests (PRs) for the RAGFlow project reveals a dynamic development environment with a focus on continuous improvement, feature enhancement, and community engagement. The project has a significant number of closed PRs, indicating active maintenance and development efforts.
PR #2521: refine xinference
PR #2520: refine retrieval of multi-turn conversation
PR #2517: make excel parsing configurable
PR #2515: refactor(API): Split SDK class to optimize code structure
PR #2511: rm key set in xinference
PR #2510: fix self deployed llm lost
The PRs reflect a robust development process characterized by:
Active Maintenance and Feature Development: The frequency and variety of PRs indicate ongoing efforts to enhance RAGFlow's capabilities. Recent PRs focus on refining existing features, optimizing performance, and introducing new functionalities like configurable Excel parsing and improved multi-turn conversation handling.
Community Engagement: Contributions from various developers suggest a vibrant community involvement. The quick turnaround from PR creation to closure/merging indicates an efficient review process, likely facilitated by active maintainers who are responsive to community contributions.
Focus on Quality and Optimization: Several PRs aim at refactoring code for better structure, readability, and performance. This is evident from PRs like #2515, which splits SDK functionalities for clarity, and PRs addressing specific bugs or optimization opportunities (#2511, #2510).
Adaptation to User Needs: The introduction of configurable options (e.g., Excel parsing) shows responsiveness to user feedback or requirements. This adaptability is crucial for maintaining relevance and usability in diverse application scenarios.
Technical Challenges and Solutions: The presence of bug fixes (#2511, #2510) alongside feature enhancements highlights ongoing technical challenges that the development team is actively addressing. This is a normal part of software evolution but requires diligent effort to ensure stability alongside growth.
In conclusion, RAGFlow's development activity as reflected in these PRs demonstrates a healthy project lifecycle with active contributions aimed at enhancing functionality, optimizing performance, and ensuring quality through rigorous maintenance efforts.
Kevin Hu (KevinHuSh)
Alvin Cage (AlvinCage)
Chenbing (muzilib)
Yungongzi (yungongzi)
Fachuan Bai (baifachuan)
Feiue (liuhua)
JobSmithManipulation
Dada Hsueh (dadahsueh)
Michał Kiełtyka (Defozo)
Writinwaters
Guoyuhao2330 (lidp)
Hangters (黄腾)
Other contributors include Chunshan-Theta, fashioncj, LiuHao-1443, yangboz, dearjane, netandreus, hwzhuhao, and Valdanitooooo, whose contributions were mainly bug fixes and feature enhancements.
Overall, the development team is actively engaged in enhancing the RAGFlow project through collaborative efforts focused on both fixing existing issues and implementing new features.