OSS Report: AI4Finance-Foundation/FinGPT

Aug. 26, 2024, 8:30 p.m. UTC. This report was generated by Dispatch AI.

FinGPT Faces Persistent Challenges with Model Integration and Dataset Generation, Hindering User Experience

FinGPT, an open-source initiative by the AI4Finance Foundation, aims to democratize financial modeling through large language models tailored for financial tasks. The project has encountered significant issues related to model integration and dataset generation, impacting user experience and potentially slowing adoption.

Recent Activity

Recent issues and pull requests indicate ongoing struggles with model loading and compatibility, particularly involving the bitsandbytes and accelerate libraries. Notable issues include the Hugging Face demo, which is non-functional because of a funding shortage (#188), and persistent dataset generation errors (#84). These problems suggest a need for improved documentation and community support to resolve installation challenges.

Development Team Activity

Bruce Yanghy (ByFinTech): Focused on README.md updates, emphasizing documentation clarity.
Likun Lin (llk010502): Worked on file alignment and language support enhancements.
Boyu Zhang (boyuZh): Contributed to financial analysis feature improvements.
Daniel (Neng) Wang (Noir97): Involved in documentation and code restructuring.
Oliver Wang (oliverwang15): Engaged in performance benchmarking for FinGPT v3.3.
Tianyu Zhou (raphaelzhou1): Conducted performance evaluations comparing RAG vs. non-RAG models.

Of Note

Documentation Emphasis: Significant focus on updating README.md suggests prioritization of user guidance.
Collaboration Patterns: Team members frequently collaborate on PRs, indicating a cohesive development environment.
Language Support Expansion: Introduction of FinGPT-Forecaster-Chinese files reflects efforts to increase accessibility.
Performance Evaluation: Ongoing benchmarking activities highlight a commitment to improving model accuracy.
Community Engagement: Active community involvement is evident, though unresolved critical issues may hinder broader adoption.

Quantified Reports

Quantify Issues

Recent GitHub Issues Activity

Timespan	Opened	Closed	Comments	Labeled	Milestones
7 Days	0	0	0	0	0
30 Days	0	0	0	0	0
90 Days	4	0	2	3	1
1 Year	64	19	107	50	1
All Time	105	36	-	-	-

Like all software activity quantification, these numbers are imperfect but sometimes useful. Comments, Labels, and Milestones refer to those issues opened in the timespan in question.
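The table's arithmetic can be checked with a short sketch. The figures are copied from the table above; the helper names `backlog` and `close_rate` are illustrative, and the calculation assumes the Closed column counts closures within each timespan, which the report does not state explicitly.

```python
# Figures copied from the "Recent GitHub Issues Activity" table.
ISSUES = {
    "90 Days": {"opened": 4, "closed": 0},
    "1 Year": {"opened": 64, "closed": 19},
    "All Time": {"opened": 105, "closed": 36},
}


def backlog(opened, closed):
    """Net number of issues left open for a timespan (hypothetical helper)."""
    return opened - closed


def close_rate(opened, closed):
    """Share of opened issues that were closed; 0.0 when nothing was opened."""
    return closed / opened if opened else 0.0


for span, row in ISSUES.items():
    print(f"{span}: net open {backlog(**row)}, close rate {close_rate(**row):.1%}")
```

For the All Time row this gives a backlog of 69, which matches the open-issue count cited in the issues report, and a close rate of about 34%.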

Quantify Commits

Quantified Commit Activity Over 30 Days

Developer	Branches	PRs	Commits	Files	Changes
Ikko Eltociear Ashimine (eltociear)	0	1/0/0	0	0	0

PRs: created by that dev and opened/merged/closed-unmerged during the period.

Detailed Reports

Report On: Fetch issues

Recent Activity Analysis

The GitHub repository for the FinGPT project has a total of 69 open issues. No new issues were opened in the past 30 days, but discussions on existing issues indicate ongoing challenges with model training, dataset generation, and integration with external libraries. Notably, several users have reported errors related to model loading and configuration, particularly when using specific versions of libraries like bitsandbytes and accelerate. There is a recurring theme of users seeking guidance on resolving installation issues and understanding the nuances of the project's architecture.

Several issues highlight critical concerns, such as the Hugging Face demo being non-functional due to funding shortages (#188) and persistent errors in dataset generation (#84). The community appears engaged, with many users providing solutions or workarounds for common problems, yet there remains a backlog of unresolved issues that could hinder user experience and project adoption.

Issue Details

Most Recently Created Issues

Issue #188: Hugging Face demo not working.
- Priority: Help Wanted
- Status: Open
- Created: 32 days ago
- Updated: 25 days ago
Issue #187: Inquiry about the training dataset for FinGPT_Sentiment_Analysis_v1.
- Priority: Question
- Status: Open
- Created: 35 days ago
Issue #186: Error encountered while running the forecaster.
- Priority: Help Wanted
- Status: Open
- Created: 41 days ago
Issue #185: Request for guidance on local model invocation.
- Priority: Help Wanted
- Status: Open
- Created: 53 days ago
Issue #179: Zero rows after running prepare_data.ipynb.
- Priority: Bug
- Status: Open
- Created: 110 days ago

Most Recently Updated Issues

Issue #84: DatasetGenerationError during dataset generation.
- Priority: High
- Status: Open
- Created: 328 days ago
- Updated: 11 days ago
Issue #176: Errors related to deprecated arguments in model loading.
- Priority: Bug
- Status: Open
- Created: 122 days ago
- Updated: 41 days ago
Issue #161: Question regarding Hugging Face app functionality.
- Priority: Bug/Question
- Status: Open
- Created: 196 days ago
- Updated: 190 days ago
Issue #160: Fine-tuning error during model training.
- Priority: Bug
- Status: Open
- Created: 200 days ago
- Updated: 168 days ago
Issue #146: Slow fine-tuning performance reported by users.
- Priority: Help Wanted
- Status: Open
- Created: 234 days ago
- Updated: 196 days ago

Common Themes and Observations

A significant number of issues revolve around installation problems and compatibility with library versions, particularly bitsandbytes and accelerate.
Users frequently express confusion over model configurations and dataset handling, indicating a need for clearer documentation or tutorials.
The community is actively engaged, with many users providing feedback and solutions to others' problems; however, critical issues like funding for demo services may impact project visibility and usability.
There is a noticeable gap in addressing urgent bugs that affect core functionalities, such as dataset generation failures and demo accessibility.

The repository's activity reflects both a vibrant community eager to contribute and a set of persistent challenges that need addressing to enhance user experience and project stability.
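Because so many of these reports hinge on library versions, it may help to attach an environment summary to an issue. The following is a minimal sketch using only the standard library; the package list is an assumption (bitsandbytes, accelerate, and peft are named in this report, while transformers and torch are guesses at likely dependencies), and `installed_versions` is a hypothetical helper, not part of FinGPT.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed list of packages worth reporting; adjust to the actual environment.
PACKAGES = ["bitsandbytes", "accelerate", "peft", "transformers", "torch"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver or 'not installed'}")
```

Pasting this output into a bug report gives maintainers the exact versions involved without importing the libraries themselves, which matters when the import is the thing that fails.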

Report On: Fetch pull requests

Overview

The AI4Finance-Foundation/FinGPT repository has 6 open and 44 closed pull requests (PRs). The open PRs primarily focus on minor updates and fixes, while the closed PRs show a mix of significant contributions, including updates to documentation, bug fixes, and enhancements to functionality.

Summary of Pull Requests

Open Pull Requests

PR #192: chore: update rag.py
Created 0 days ago. This PR corrects the misspelling "Langauge" to "Language" in the rag.py file. It is a minor change but reflects ongoing maintenance efforts.
PR #184: Fix bitsandbytes install
Created 59 days ago. This PR addresses installation issues with the bitsandbytes package in Google Colab by ensuring that version 0.43.0 is used, preventing errors related to previous versions.
PR #174: import error in FinGPT_Training_LoRA_with_ChatGLM2_6B_for_Beginners.ipynb resolved
Created 134 days ago. This PR resolves an import error in a Jupyter notebook, indicating ongoing efforts to improve user experience and functionality.
PR #173: Update README.md
Created 139 days ago. This PR updates badge labels and corrects grammatical errors in the README file, emphasizing the project's commitment to clear communication.
PR #172: Update train_lora.py for kbit training in peft import
Created 150 days ago. This PR replaces deprecated functions with updated ones for kbit training, showcasing adaptation to evolving libraries.
PR #167: Allow benchmarking on CPU
Created 160 days ago. This PR modifies benchmarks to run on CPU, enhancing accessibility for users without GPU resources.
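The CPU fallback that PR #167 describes can be sketched as below. This is not the PR's actual code: it assumes the benchmarks are PyTorch-based, and `pick_device` is a hypothetical helper introduced here for illustration.

```python
def pick_device(prefer_gpu=True):
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    if not prefer_gpu:
        return "cpu"
    try:
        import torch  # assumption: the benchmark code runs on PyTorch
    except ImportError:
        # Without PyTorch there is no GPU path to take.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"benchmarking on {device}")
```

Selecting the device once and passing it through, rather than hard-coding a GPU device, lets the same benchmark script run on machines without a GPU, albeit more slowly.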

Closed Pull Requests

PR #183: align file names
Closed 71 days ago. This PR involved renaming files for consistency, reflecting good project hygiene.
PR #181: Update README.md
Closed 97 days ago. Fixed several links in the README, improving documentation quality.
PR #175: Add FinGPT-Forecaster-Chinese files
Closed 127 days ago. Introduced files for a Chinese version of FinGPT-Forecaster, expanding the project's reach.
PR #169: Updating FinGPT_Training_LoRA_with_ChatGLM2_6B_for_Beginners_v2-2.ipynb
Closed 154 days ago. Removed deprecated imports, indicating active maintenance of educational resources.
PR #166: Allow benchmarking on CPU
Closed without merging. It has the same title as open PR #167, which may mean the change was resubmitted; it may also reflect disagreement over or problems with the implementation.
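PR #169 and open PR #172 both deal with imports that upstream libraries have deprecated. One common way to tolerate such renames, which is not necessarily what either PR does, is to look up the first name that the installed library still exposes. The helper `first_available` below is hypothetical.

```python
import importlib


def first_available(module_name, candidate_names):
    """Return the first attribute in candidate_names that the module exposes.

    Returns None if the module cannot be imported or has none of the names.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    for name in candidate_names:
        attr = getattr(module, name, None)
        if attr is not None:
            return attr
    return None


# Standard-library demonstration: the first name is made up, so "sqrt" is chosen.
sqrt = first_available("math", ["no_such_name", "sqrt"])
print(sqrt(9.0))

# Assumed usage for peft, listing the newer name first. Verify both names
# against the installed peft release before relying on them:
# prepare = first_available(
#     "peft",
#     ["prepare_model_for_kbit_training", "prepare_model_for_int8_training"],
# )
```

Listing the current name first means newer installations never touch the deprecated one, while older installations still work.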

Analysis of Pull Requests

The pull request activity within the AI4Finance-Foundation/FinGPT repository indicates a healthy level of engagement and ongoing development. The open pull requests reflect a focus on minor improvements and bug fixes, which are essential for maintaining software quality and usability. For instance, PR #192 demonstrates attention to detail through simple text corrections that enhance code readability and professionalism.

Notably, several closed pull requests indicate significant contributions that enhance functionality or improve documentation. The addition of Chinese language support (PR #175) is particularly noteworthy as it broadens the project's accessibility to non-English speakers, aligning with the project's goal of democratizing financial modeling tools. Furthermore, the consistent updates to the README files across multiple PRs show a commitment to keeping documentation current and user-friendly.

There are also instances where pull requests were closed without merging (e.g., PR #166). Because open PR #167 carries the same title, this particular case may simply be a resubmission, but it could also suggest disagreement regarding implementation strategies or priorities within the development team. Such cases are worth a closer look before drawing conclusions about collaboration or project direction.

The 44 closed pull requests, compared to only six open ones, suggest that most submitted work has been processed over the project's lifetime, although the commit table above records no commits in the past 30 days. This raises questions about the pace of new feature development versus maintenance work. While maintaining existing features is crucial, there should be a balance between addressing technical debt and innovating new functionalities that keep pace with user needs and competitive offerings in the financial AI landscape.

In conclusion, while the repository shows strong community engagement and active maintenance efforts, it would benefit from a strategic focus on feature development alongside ongoing improvements and fixes. Encouraging contributions that introduce new capabilities or enhancements could help sustain momentum and interest in the project moving forward.

Report On: Fetch commits

Recent Activities of the Development Team

Team Members and Recent Activity

Bruce Yanghy (ByFinTech)

Recent Commits: Multiple updates to the README.md file, including significant changes 40 days ago and several updates in the preceding months.
Collaborations: Worked alongside various team members on documentation improvements and merging pull requests.
Ongoing Work: Continuous updates indicate a focus on maintaining project documentation and ensuring clarity for users.

Likun Lin (llk010502)

Recent Commits: Merged pull requests related to file alignment and added files for the FinGPT-Forecaster-Chinese version.
Collaborations: Collaborated with Bruce Yanghy and other team members on merging contributions.
Ongoing Work: Active in enhancing language support for the project, indicating ongoing development efforts.

Boyu Zhang (boyuZh)

Recent Commits: Updated README.md and made changes related to financial report analysis.
Collaborations: Engaged with Bruce Yanghy on README updates and other documentation tasks.
Ongoing Work: Focused on improving financial analysis features, suggesting ongoing enhancements.

Daniel (Neng) Wang (Noir97)

Recent Commits: Contributed to multiple updates across various files, including utils.py and README.md. Notably involved in adding figures for documentation.
Collaborations: Worked closely with Bruce Yanghy and Likun Lin on several pull requests.
Ongoing Work: Involved in both documentation and code restructuring, indicating a dual focus on usability and functionality.

Oliver Wang (oliverwang15)

Recent Commits: Made extensive updates related to model performance, particularly focusing on FinGPT v3.3.
Collaborations: Collaborated with Tianyu Zhou on performance comparisons involving RAG (Retrieval-Augmented Generation).
Ongoing Work: Engaged in performance benchmarking, suggesting a focus on improving model accuracy.

Tianyu Zhou (raphaelzhou1)

Recent Commits: Conducted experiments comparing RAG vs. non-RAG sentiment classification, updating relevant documentation.
Collaborations: Worked with Oliver Wang to analyze model performance metrics.
Ongoing Work: Actively involved in performance evaluation, indicating a focus on enhancing model capabilities.

Patterns, Themes, and Conclusions

Documentation Focus: A significant portion of recent activity revolves around updating the README.md file and other documentation resources. This indicates an emphasis on improving user experience and accessibility of information.
Collaboration Across Team Members: There is a clear pattern of collaboration among team members, particularly in merging pull requests and contributing to shared documentation. This suggests a cohesive team dynamic aimed at collective improvement of the project.
Feature Enhancements and Language Support: The addition of features such as the FinGPT-Forecaster-Chinese version highlights ongoing efforts to broaden the project's applicability across different languages and regions.
Performance Benchmarking: Several members are focused on evaluating model performance, particularly through RAG comparisons. This indicates a commitment to refining model accuracy and effectiveness in financial applications.
Continuous Integration of Community Contributions: The regular merging of pull requests from various contributors reflects an active engagement with the community, fostering an environment of collaborative development.

Overall, the development team is actively engaged in enhancing both the technical capabilities of FinGPT and its usability through thorough documentation efforts.