The Dispatch

FinGPT Project Sees Focused Development on Model Training Amidst Broader Team Inactivity

FinGPT, an open-source project for financial large language models, aims to democratize access to advanced financial modeling tools. It emphasizes cost-effective model adaptation and fine-tuning.

Recent Activity

Recent issues and pull requests suggest a focus on improving usability and training methodologies. Key issues include the non-functional Hugging Face demo (#188) and challenges with GPU memory management. These indicate ongoing user engagement and areas needing improvement.


Of Note

  1. Hugging Face Demo Issue (#188): A critical user-facing problem affecting accessibility.
  2. Training Methodology Update (PR #172): The shift from int8 to kbit training tracks the PEFT library's updated training-preparation API.
  3. CPU Benchmarking (PR #167): Enhances accessibility for users without GPUs.
  4. Comprehensive Training Tutorial Update (Closed PR #194): Improves user experience with new features and clearer guidance.
  5. Documentation Enhancements (PR #173): Ongoing efforts to maintain clarity and accuracy in project documentation.

Quantified Reports

Quantify Issues



Recent GitHub Issues Activity

Timespan    Opened    Closed    Comments    Labeled    Milestones
7 Days      0         0         0           0          0
30 Days     0         0         0           0          0
90 Days     4         0         2           3          1
1 Year      52        17        66          42         1
All Time    105       36        -           -          -

Like all software activity quantification, these numbers are imperfect but sometimes useful. The Comments, Labeled, and Milestones columns count only activity on issues opened within the timespan in question.

Quantify Commits



Quantified Commit Activity Over 30 Days

Developer              Branches    PRs      Commits    Files    Changes
Yuncong Liu            1           1/1/0    5          3        3083
RAVI GAUTAM (RGIIST)   0           0/0/1    0          0        0
ByFinTech              0           0/0/0    0          0        0

PRs: pull requests created by that developer, shown as opened/merged/closed-unmerged during the period.

Detailed Reports

Report On: Fetch issues



Recent Activity Analysis

The FinGPT project currently has 69 open issues, indicating ongoing user engagement and potential areas for improvement. Notably, several issues are related to critical functionalities, such as the Hugging Face demo not working (#188) and various error messages encountered during model training and inference.

A recurring theme in the issues is the struggle with model compatibility and resource management, particularly concerning GPU memory limitations and software dependencies. This suggests that users may be facing challenges in effectively utilizing the models due to environmental constraints or outdated documentation.

Issue Details

Most Recently Created Issues

  1. Issue #188: The Hugging Face demo does not work.

    • Priority: Help wanted
    • Status: Open
    • Created: 68 days ago
    • Updated: 61 days ago
  2. Issue #187: Inquiry about the training dataset for FinGPT_Sentiment_Analysis_v1.

    • Priority: Normal
    • Status: Open
    • Created: 71 days ago
  3. Issue #186: Error encountered while trying to run the forecaster.

    • Priority: Normal
    • Status: Open
    • Created: 77 days ago
  4. Issue #185: Request for guidance on how to call the model locally.

    • Priority: Help wanted
    • Status: Open
    • Created: 90 days ago
  5. Issue #179: Zero rows after running prepare_data.ipynb.

    • Priority: Normal
    • Status: Open
    • Created: 146 days ago

Most Recently Updated Issues

  1. Issue #188 (Updated): The Hugging Face demo does not work.
  2. Issue #186 (Updated): Error encountered while trying to run the forecaster.
  3. Issue #185 (Updated): Request for guidance on how to call the model locally.
  4. Issue #176 (Updated): Various errors related to model loading and usage.
  5. Issue #161 (Updated): Inquiry about whether the Hugging Face app should work.

Analysis of Notable Anomalies and Themes

  • The issue regarding the non-functional Hugging Face demo (#188) is particularly significant as it directly affects user access to model capabilities, potentially deterring new users from engaging with the project.
  • A commonality among recent issues is confusion regarding model usage, especially concerning local setups and error handling during training processes. This indicates a need for clearer documentation or more robust error messages to guide users through troubleshooting.
  • Several issues relate to GPU memory management, such as out-of-memory errors during training (#136) and slow inference speeds (#107). This suggests that users may require better guidance on optimizing their environments for running resource-intensive models; a hedged sketch of common memory-saving settings follows this list.
  • The presence of multiple requests for help indicates a community actively seeking support, which could be harnessed through improved community engagement strategies or dedicated support channels.
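
The memory-management theme above lends itself to a concrete illustration. Below is a minimal sketch of memory-saving settings commonly used when fine-tuning Hugging Face models; these are general community patterns, not guidance published by the FinGPT project, and the model id is a small placeholder.

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments

# Common mitigations for out-of-memory errors like those reported in #136.
quant_config = BitsAndBytesConfig(load_in_4bit=True)   # quantize weights to 4-bit
model = AutoModelForCausalLM.from_pretrained("gpt2", quantization_config=quant_config)
model.gradient_checkpointing_enable()                  # trade compute for activation memory

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=1,    # smallest per-step batch
    gradient_accumulation_steps=16,   # preserve an effective batch size of 16
    fp16=True,                        # half-precision training
)
```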

Overall, these insights reflect both the strengths and challenges faced by the FinGPT project as it continues to evolve in response to user needs and technological advancements.

Report On: Fetch pull requests



Overview

The data below covers a series of pull requests (PRs) from the AI4Finance-Foundation/FinGPT repository, which develops financial large language models. The PRs span code updates, documentation improvements, and feature enhancements.

Summary of Pull Requests

Open Pull Requests

  • PR #192: A minor update to the rag.py file, correcting a typo in comments and ensuring consistent formatting. This PR is straightforward and improves code readability without affecting functionality.

  • PR #184: Fixes an installation issue with the bitsandbytes package in a Jupyter notebook, ensuring compatibility with Google Colab. This PR addresses a practical issue faced by users and is crucial for maintaining usability across different environments (a sketch of this kind of notebook fix follows this list).

  • PR #173: Updates the README file to correct a Python version badge and fix grammatical errors. This PR enhances documentation accuracy and clarity.

  • PR #172: Modifies train_lora.py to replace int8 training with kbit training in the PEFT import. This change tracks an update in the PEFT library's training-preparation API and could affect training efficiency (see the import sketch after this list).

  • PR #167: Allows benchmarking on CPU by updating benchmark scripts to dynamically move data to the device where the model is located. This PR broadens accessibility for testing on machines without GPUs (a device-resolution sketch follows this list).
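
For PR #184, the fix is of the kind sketched below: installing bitsandbytes from within the notebook itself so that quantized model loading works on a fresh Colab runtime. This is a hypothetical cell; the PR's exact pin and flags may differ.

```python
# Hypothetical Colab cell; the PR's actual install command may pin a version.
!pip install -q bitsandbytes
```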
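
For PR #172, context helps: newer PEFT releases replaced the 8-bit-specific preparation helper with a k-bit variant that also covers 4-bit quantization. A minimal sketch of the updated import, assuming a quantized Hugging Face model (the checkpoint is illustrative, not taken from train_lora.py):

```python
from transformers import AutoModelForCausalLM
from peft import prepare_model_for_kbit_training

# Illustrative base model; FinGPT's train_lora.py loads its own configured checkpoints.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B", load_in_8bit=True, device_map="auto"
)

# Older PEFT: from peft import prepare_model_for_int8_training
# The newer helper generalizes preparation to any k-bit quantization:
model = prepare_model_for_kbit_training(model)
```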
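
The device-handling pattern PR #167 describes, resolving whichever device the model occupies instead of hard-coding CUDA, looks roughly like the following generic Hugging Face sketch (not the PR's actual diff):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Small placeholder model; the benchmark scripts load FinGPT's own checkpoints.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")  # stays on CPU when no GPU is present

# Instead of calling tokens.cuda(), move inputs to the model's device:
device = next(model.parameters()).device
inputs = tokenizer("AAPL earnings beat expectations.", return_tensors="pt").to(device)

with torch.no_grad():
    logits = model(**inputs).logits
```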

Closed Pull Requests

  • PR #194: A comprehensive update to the training tutorial for FinGPT, incorporating new features like model comparison visualization and Google Drive integration for dataset storage. This PR significantly enhances user experience by providing clearer guidance and more robust tools for model training.
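
The Google Drive integration mentioned in PR #194 presumably follows the standard Colab mounting pattern sketched below; the tutorial's actual cells and paths may differ, and this runs only inside a Colab runtime.

```python
# Standard Colab pattern for persisting datasets across sessions.
from google.colab import drive

drive.mount("/content/drive")
dataset_dir = "/content/drive/MyDrive/fingpt_datasets"  # hypothetical path
```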

Analysis of Pull Requests

The analysis of these pull requests reveals several key themes:

  1. Continuous Improvement: The open PRs demonstrate ongoing efforts to improve code quality, usability, and documentation. For instance, PR #192 focuses on code readability, while PR #184 addresses installation issues that could hinder user experience.

  2. Adaptation to New Methodologies: Changes like those in PR #172 indicate an adaptation to evolving methodologies in model training. The shift from int8 to kbit training mirrors the PEFT library's generalization of its quantized-training helpers beyond 8-bit, which could improve training efficiency.

  3. Enhanced Accessibility: The ability to benchmark on CPU (PR #167) reflects a commitment to making the tools accessible to a wider audience, including those without access to high-end hardware.

  4. User-Centric Enhancements: The closed PR (#194) that introduces new features for training tutorials highlights a focus on enhancing user experience through better guidance and tools.

  5. Documentation and Clarity: Regular updates to documentation (e.g., PR #173) are crucial for maintaining clarity as the project evolves. Accurate documentation helps users understand new features or changes in methodology.

Overall, these pull requests illustrate a proactive approach to development within the FinGPT project, emphasizing quality improvement, methodological adaptation, user accessibility, and clear documentation.

Report On: Fetch commits



Repo Commits Analysis

Development Team and Recent Activity

Team Members:

  1. Bruce Yanghy (ByFinTech)

    • Recent Activity: No recent commits or changes.
  2. Yuncong Liu (ycsama0703)

    • Recent Activity: 5 commits in the last day, totaling 3,083 changes across 3 files.
    • Key contributions include:
      • Added FinGPT_ Training with LoRA and Meta-Llama-3-8B.ipynb.
      • Deleted two previous notebook versions (FinGPT_ Training with LoRA and Llama3-7B.ipynb and Copy_of_FinGPT_Training_with_LoRA_and_ChatGLM2-6B.ipynb).
      • Collaborated with Bruce Yanghy on a merge pull request.
  3. Likun Lin (llk010502)

    • Recent Activity: No recent commits or changes.
  4. Daniel (Neng) Wang (Noir97)

    • Recent Activity: No recent commits or changes.
  5. Boyu Zhang (boyuZh)

    • Recent Activity: No recent commits or changes.
  6. Eleni Verteouri

    • Recent Activity: No recent commits or changes.
  7. Gason Bai

    • Recent Activity: No recent commits or changes.
  8. William Gazeley

    • Recent Activity: No recent commits or changes.
  9. Kalyani Mhala

    • Recent Activity: No recent commits or changes.
  10. Alfonso Amayuelas

    • Recent Activity: No recent commits or changes.
  11. Surav Shrestha

    • Recent Activity: No recent commits or changes.
  12. Peter Schofield

    • Recent Activity: No recent commits or changes.
  13. Tianyu Zhou (raphaelzhou1)

    • Recent Activity: No recent commits or changes.
  14. Shivam Singh

    • Recent Activity: No recent commits or changes.
  15. Kris248

    • Recent Activity: No recent commits or changes.
  16. mac

    • Recent Activity: No recent commits or changes.
  17. RGIIST

    • Recent Activity: No recent commits or changes.

Patterns and Themes:

  • The most active contributor is Yuncong Liu, who has made significant updates in the last day, focusing on training notebooks for the FinGPT project.
  • Bruce Yanghy has made no recent commits of his own; his recent involvement appears limited to merging Yuncong Liu's pull request.
  • The remaining team members show no recent activity, suggesting that development is currently concentrated in Yuncong Liu's work.
  • The project appears to be in a phase of refining training methodologies and cleaning up previous notebook versions (a hedged LoRA sketch follows this list).
  • The merge collaboration between Yuncong Liu and Bruce Yanghy suggests that changes are still being reviewed and integrated rather than landed unilaterally.
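
For readers unfamiliar with the LoRA workflow these notebooks refine, the following is an illustrative PEFT setup; the rank, target modules, and checkpoint are guesses for demonstration, not values from Yuncong Liu's notebook.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")

config = LoraConfig(
    r=8,                                  # low-rank dimension of the adapter matrices
    lora_alpha=16,                        # scaling factor applied to adapter output
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # adapters amount to a small fraction of total weights
```

Only the adapter weights train, which is what keeps fine-tuning an 8B-parameter model feasible on modest hardware.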

Conclusions:

The development team shows a concentrated effort from Yuncong Liu, while other members have been inactive recently. This may indicate a focused sprint on specific features related to model training, particularly with LoRA and Meta-Llama models, while other areas remain static for now.