The Dispatch

OSS Report: stanford-oval/storm


STORM Development Sees Increased Activity Amidst Language Support and Citation Challenges

STORM, a knowledge curation system leveraging large language models, continues to evolve with active development focusing on modularity and user experience improvements.

Recent Activity

Recent issues and pull requests (PRs) reflect a dual focus on improving user experience and addressing technical problems. Two open PRs, one adding multiple retriever systems (#155) and one adding a themed frontend (#135), point to a push toward customization and flexibility. However, the project faces open issues with language support (#169) and citation generation (#168), both of which bear directly on the reliability of generated articles.

Development Team Activity

Members with no commits in the last 30 days include Yucheng-Jiang, AMMAS1, fredheir, GuillermoBlasco, haailabs, hdelossantos, kosiew, and songkq.

Of Note

  1. Modular Enhancements: The multiple retriever systems proposed in open PR #155 would make STORM more adaptable to diverse data sources.

  2. User Interface Improvements: The themed frontend proposed in draft PR #135 would improve user interaction by offering light and dark modes.

  3. Language Support Challenges: Arabic language support issues (#169) highlight ongoing challenges with RTL languages.

  4. Citation Inconsistencies: Reports of missing or invalid citations (#168) suggest a need for improved citation handling mechanisms.

  5. Community Engagement: Active discussions around code formatting and feature enhancements indicate strong community involvement but also reveal potential workflow friction points.

Quantified Reports

Quantify Issues



Recent GitHub Issues Activity

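| Timespan | Opened | Closed | Comments | Labeled | Milestones |
|----------|--------|--------|----------|---------|------------|
| 7 Days   | 6      | 1      | 2        | 5       | 1          |
| 30 Days  | 14     | 20     | 12       | 13      | 1          |
| 90 Days  | 65     | 59     | 142      | 57      | 1          |
| All Time | 95     | 72     | -        | -       | -          |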

Like all software activity quantification, these numbers are imperfect but sometimes useful. Comments, Labels, and Milestones refer to those issues opened in the timespan in question.
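As a worked check on the table, the net change in the issue backlog for each timespan is simply opened minus closed. The sketch below uses only the figures from the table; the dictionary and function names are illustrative and not part of any STORM tooling. The all-time difference matches the count of open issues given later in the report.

```python
# Figures copied from the "Recent GitHub Issues Activity" table.
# The names ISSUE_ACTIVITY and net_change are illustrative only.
ISSUE_ACTIVITY = {
    "7 Days": {"opened": 6, "closed": 1},
    "30 Days": {"opened": 14, "closed": 20},
    "90 Days": {"opened": 65, "closed": 59},
    "All Time": {"opened": 95, "closed": 72},
}


def net_change(window: str) -> int:
    """Opened minus closed for one timespan; positive means the backlog grew."""
    row = ISSUE_ACTIVITY[window]
    return row["opened"] - row["closed"]


if __name__ == "__main__":
    for window in ISSUE_ACTIVITY:
        print(f"{window}: {net_change(window):+d}")
```

Over the last 30 days more issues were closed than opened, while the 7-day and 90-day windows both show the backlog growing.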

Quantify Commits



Quantified Commit Activity Over 30 Days

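| Developer                   | Branches | PRs   | Commits | Files | Changes |
|-----------------------------|----------|-------|---------|-------|---------|
| Yijia Shao                  | 1        | 5/3/2 | 13      | 19    | 2451    |
| Abrahan N.                  | 1        | 1/2/0 | 5       | 10    | 255     |
| ndehouche                   | 1        | 0/0/0 | 1       | 3     | 172     |
| sureenheer                  | 1        | 1/1/0 | 2       | 4     | 26      |
| None (AMMAS1)               | 0        | 1/0/0 | 0       | 0     | 0       |
| None (kosiew)               | 0        | 0/0/1 | 0       | 0     | 0       |
| None (songkq)               | 0        | 0/0/1 | 0       | 0     | 0       |
| fredheir (fredheir)         | 0        | 1/0/1 | 0       | 0     | 0       |
| HaAI Labs (haailabs)        | 0        | 0/1/0 | 0       | 0     | 0       |
| None (hdelossantos)         | 0        | 0/1/0 | 0       | 0     | 0       |
| Yucheng-Jiang               | 0        | 0/0/0 | 0       | 0     | 0       |
| Guillermo (GuillermoBlasco) | 0        | 1/0/1 | 0       | 0     | 0       |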

PRs: created by that dev and opened/merged/closed-unmerged during the period
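To make the PRs notation concrete, the following sketch splits a cell such as `5/3/2` into its three counts. The `PRCounts` type and `parse_pr_counts` helper are hypothetical names introduced here for illustration; they are not part of the report's tooling.

```python
from typing import NamedTuple


class PRCounts(NamedTuple):
    """One PRs cell: opened / merged / closed-unmerged during the period."""
    opened: int
    merged: int
    closed_unmerged: int


def parse_pr_counts(cell: str) -> PRCounts:
    """Parse a cell like '5/3/2' into named counts."""
    parts = cell.strip().split("/")
    if len(parts) != 3:
        raise ValueError(f"expected three '/'-separated counts, got {cell!r}")
    opened, merged, closed_unmerged = (int(p) for p in parts)
    return PRCounts(opened, merged, closed_unmerged)


if __name__ == "__main__":
    counts = parse_pr_counts("5/3/2")
    print(counts.opened, counts.merged, counts.closed_unmerged)
```

Read this way, the first row of the table records five PRs opened, three merged, and two closed without merging.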

Detailed Reports

Report On: Fetch issues



Recent Activity Analysis

The STORM project has seen a notable increase in activity, with 23 open issues currently being tracked. Recent issues highlight ongoing challenges with language support, citation generation, and integration of various retrieval models. A significant focus appears to be on enhancing the Arabic language capabilities and addressing bugs related to citation inconsistencies.

Several issues indicate recurring themes, such as the need for better handling of Right-to-Left (RTL) languages, improvements in article quality, and integration of additional retrieval systems like PGVector and GraphRAG. The presence of multiple bug reports suggests that while the project is evolving, it faces stability challenges that could affect user experience.
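One common way a user interface decides whether to render a string right-to-left is the "first strong character" heuristic: scan for the first character with a strong direction and use it. The sketch below applies that heuristic with Python's standard `unicodedata` module. The `is_rtl` helper is a hypothetical illustration and does not reflect how STORM or its frontends handle direction.

```python
import unicodedata

# Bidirectional classes that mark strongly right-to-left characters:
# "R" (e.g. Hebrew letters) and "AL" (Arabic letters).
RTL_CLASSES = {"R", "AL"}


def is_rtl(text: str) -> bool:
    """Return True if the first strongly directional character is RTL.

    Digits, spaces, and punctuation have weak or neutral classes and are
    skipped. Text with no strong character is treated as left-to-right.
    """
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in RTL_CLASSES:
            return True
        if bidi == "L":
            return False
    return False


if __name__ == "__main__":
    for sample in ["STORM", "مرحبا", "123 مرحبا"]:
        print(repr(sample), is_rtl(sample))
```

A frontend could use a check like this to pick a text direction for input fields and generated articles, although mixed-direction content generally needs fuller bidirectional handling than a single flag.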

Issue Details

Most Recently Created Issues

  1. Issue #169: Arabic Language Support

    • Priority: High
    • Status: Open
    • Created: 1 day ago
    • Description: Addresses challenges in supporting Arabic input and UI enhancements for RTL languages.
  2. Issue #168: [BUG] Inconsistent Citations Generation in Storm

    • Priority: High
    • Status: Open
    • Created: 5 days ago
    • Description: Reports issues with missing citations and invalid links in generated articles.
  3. Issue #167: PGVector and GraphRAG support (Retrieval for RAG and GraphRAG)

    • Priority: Medium
    • Status: Open
    • Created: 5 days ago
    • Description: User requests integration support for PGVector and GraphRAG systems.
  4. Issue #166: I wonder how it marks the reference in the article

    • Priority: Low
    • Status: Open
    • Created: 6 days ago
    • Description: Inquires about reference handling in article generation.
  5. Issue #165: Have a look at GPT-Researcher

    • Priority: Low
    • Status: Open
    • Created: 7 days ago
    • Description: Suggests collaboration opportunities with another project covering similar ground.
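The citation problems reported in Issue #168 (missing citations and invalid links) are the kind of defect a post-generation check can surface. The sketch below is a minimal illustration under two assumptions that are not taken from the STORM codebase: inline citations appear as bracketed integers such as `[3]`, and references are available as a mapping from index to URL. The `check_citations` helper is hypothetical.

```python
import re
from urllib.parse import urlparse

# Assumption: inline citations look like "[1]", "[2]", ...
CITATION_RE = re.compile(r"\[(\d+)\]")


def check_citations(article: str, references: dict[int, str]) -> dict[str, list[int]]:
    """Report citation indices that are missing, unused, or have a bad URL."""
    cited = {int(m) for m in CITATION_RE.findall(article)}
    known = set(references)
    invalid = []
    for idx, url in references.items():
        parsed = urlparse(url)
        # Treat anything that is not an absolute http(s) URL as invalid.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            invalid.append(idx)
    return {
        "missing": sorted(cited - known),   # cited in text, no reference entry
        "unused": sorted(known - cited),    # reference entry never cited
        "invalid_url": sorted(invalid),
    }


if __name__ == "__main__":
    text = "STORM writes articles [1]. It cites sources [3]."
    refs = {1: "https://example.com/a", 2: "not-a-url"}
    print(check_citations(text, refs))
```

This only checks that URLs are well formed; confirming that a link actually resolves would need a network request, which is left out of the sketch.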

Most Recently Updated Issues

  1. Issue #160: [BUG] Running locally with Ollama 3.1 does not do research

    • Priority: High
    • Status: Open
    • Last Updated: 11 days ago
    • Description: User reports that running the application locally does not generate an outline as expected.
  2. Issue #161: [BUG] Running with Groq: too many con

    • Priority: Medium
    • Status: Open
    • Last Updated: 12 days ago
    • Description: Bug report regarding execution errors when using Groq.
  3. Issue #154: [BUG] Storm with Claude sonnet did not use up the maximum token (8192) in its output

    • Priority: Medium
    • Status: Closed (but relevant)
    • Last Updated: 18 days ago
    • Description: Discusses limitations on article length despite increased token limits.
  4. Issue #139: [BUG] No Outline Generated

    • Priority: High
    • Status: Closed (but relevant)
    • Last Updated: 17 days ago
    • Description: Issue with outline generation not aligning with user expectations.
  5. Issue #138: Ability to specify high priority references that should be used in addition to references found automatically

    • Priority: Medium
    • Status: Open
    • Last Updated: 36 days ago
    • Description: User requests functionality to prioritize specific references during research.

Summary of Implications

The recent surge in issues indicates a growing user base actively engaging with the STORM project, but also highlights critical areas needing attention, particularly around language support and citation accuracy. The project’s modular design allows for flexibility, yet the integration of new features must be managed carefully to maintain stability and usability. Addressing these concerns will be vital for enhancing user satisfaction and ensuring the project's long-term success.

Report On: Fetch pull requests



Overview

The analysis of the pull requests (PRs) for the STORM project reveals a mix of ongoing enhancements, bug fixes, and feature additions aimed at improving the system's modularity and user experience. The current state includes three open PRs and a significant number of closed PRs, indicating active development and community engagement.

Summary of Pull Requests

Open Pull Requests

  • PR #135: [Demo Enhancement] added storm wiki frontend with themes
    Created by Jaigouk Kim, this draft PR introduces a themed frontend allowing users to switch between light and dark themes. It also adds multiple search engines and options for fallback language models. Notable review comments suggest improvements in configuration flexibility and documentation.

  • PR #155: Multiple retriever systems
    Submitted by AMMAS1, this PR enables the use of multiple retrievers within STORM, enhancing its capability to fetch data from various sources simultaneously. This feature is significant for users requiring diverse data inputs.

  • PR #17: [doc] Add readme-zh for Chinese users
    Created by mahone3297, this PR aims to provide a Chinese version of the README file. It is currently on hold due to potential major updates planned for the repository.
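The idea behind PR #155, querying several retrievers and combining what they return, can be sketched as follows. This is not the PR's implementation: the `Retriever` type, the result shape (a dict with `url` and `snippet` keys), and `retrieve_from_all` are all assumptions made for illustration, and the sketch calls retrievers one after another rather than simultaneously.

```python
from typing import Callable, Iterable

# Assumed shapes, for illustration only.
Result = dict[str, str]                  # e.g. {"url": ..., "snippet": ...}
Retriever = Callable[[str], list[Result]]


def retrieve_from_all(query: str, retrievers: Iterable[Retriever]) -> list[Result]:
    """Query each retriever in turn and merge results, de-duplicating by URL.

    The first occurrence of a URL wins, so earlier retrievers take priority.
    """
    seen: set[str] = set()
    merged: list[Result] = []
    for retrieve in retrievers:
        for result in retrieve(query):
            url = result["url"]
            if url not in seen:
                seen.add(url)
                merged.append(result)
    return merged


if __name__ == "__main__":
    def web_stub(query: str) -> list[Result]:
        return [{"url": "https://example.com/a", "snippet": f"web: {query}"}]

    def local_stub(query: str) -> list[Result]:
        return [
            {"url": "https://example.com/a", "snippet": "duplicate"},
            {"url": "https://example.com/b", "snippet": f"local: {query}"},
        ]

    for item in retrieve_from_all("knowledge curation", [web_stub, local_stub]):
        print(item["url"], "-", item["snippet"])
```

De-duplicating by URL keeps overlapping sources from being counted twice; a real integration would also have to decide how to rank results that come from different backends.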

Closed Pull Requests

  • PR #163: [Enhancement] Support backoff and retry for DuckDuckGoSearchRM
    Closed after merging, this PR enhances the DuckDuckGo search functionality by adding retry logic for rate limits, improving reliability.

  • PR #159: Use black as the python code formatter
    This PR established black as the standard code formatter for the project, ensuring consistent code style across contributions. It was merged after discussions about pre-commit hooks for developers.

  • PR #146: chore: add callback handler to article generation and article polish
    Although not merged, this PR proposed adding a callback handler to enhance article generation processes. The rationale was tied to personal usage needs, indicating a potential gap in current functionality.

  • PR #148: [Bug Fix] Fix VLLMClient to reflect recent updates in vllm
    This fix addressed compatibility issues with the latest VLLM server updates and was successfully merged.
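The retry logic that PR #163 adds for rate limits follows a widely used pattern: retry with exponentially growing delays plus a little random jitter. The sketch below shows that general pattern only; it is not the code from the PR. `RateLimitError` and `with_backoff` are hypothetical names, and the default parameters are arbitrary.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for the error a search backend raises when throttled."""


def with_backoff(
    call: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on RateLimitError with exponential backoff.

    The delay before retry n (0-based) is base_delay * 2**n plus jitter in
    [0, base_delay]. After max_retries failed retries the error is re-raised.
    """
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    raise AssertionError("unreachable when max_retries >= 0")


if __name__ == "__main__":
    attempts = {"n": 0}

    def flaky_search() -> str:
        attempts["n"] += 1
        if attempts["n"] < 3:
            raise RateLimitError("throttled")
        return "results"

    print(with_backoff(flaky_search, base_delay=0.1))
```

Passing `sleep` as a parameter keeps the helper testable without real waiting, and the jitter helps avoid many clients retrying at the same instant.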

Analysis of Pull Requests

The pull requests demonstrate several key themes in the ongoing development of STORM:

  1. User Experience Enhancements: The introduction of theme options in PR #135 highlights an emphasis on user interface improvements. The ability to choose between light and dark modes can significantly enhance user satisfaction, especially for those who spend extended periods interacting with the application. Additionally, allowing users to configure their preferred search engines and language models reflects a commitment to customization.

  2. Modularity and Flexibility: The move towards supporting multiple retrievers (PR #155) indicates a strategic shift towards modularity. By enabling users to run STORM with various data retrieval methods simultaneously, the project is positioning itself as a more versatile tool that can cater to diverse research needs. This aligns with STORM's design philosophy of being adaptable for different use cases.

  3. Community Engagement and Feedback: The active discussion surrounding PRs, particularly those related to code formatting (e.g., PR #159), showcases an engaged community willing to collaborate on improving the project's quality standards. However, it also reveals some friction points; for instance, there are concerns about how changes like adopting black might affect developers' workflows.

  4. Maintenance Challenges: Several older PRs remain open or have been closed without merging due to concerns about their relevance or alignment with future project directions (e.g., PR #17). This highlights a common challenge in open-source projects where contributors may propose changes that do not align with evolving project goals or where significant refactoring is anticipated.

  5. Feature Development vs. Bug Fixing: The balance between adding new features (like multiple retriever systems) and addressing bugs (such as compatibility fixes in PR #148) is crucial for maintaining user trust while also pushing forward innovation. The recent focus on enhancing existing functionalities suggests that the team is aware of the need to stabilize the platform before introducing more complex features.

In conclusion, while STORM is making significant strides in enhancing its capabilities through community contributions, it must continue addressing maintenance challenges and ensuring that new features align with user needs and expectations. The active engagement from contributors indicates a healthy development environment that can adapt over time but also requires careful management of priorities and resources.

Report On: Fetch commits



Repo Commits Analysis

Development Team and Recent Activity

Team Members and Their Recent Activities

  1. Yijia Shao (shaoyijia)

    • Recent Activity:
      • Implemented support for backoff and retry mechanisms in DuckDuckGoSearchRM.
      • Reformatted the entire codebase using black.
      • Merged multiple pull requests including enhancements for new retrieval modules (Tavily, DuckDuckGo).
      • Addressed bugs related to VLLMClient and fixed a typo in README.md.
    • Collaborations: Worked closely with Yucheng-Jiang, zenith110, and other contributors on various pull requests.
    • In Progress: Active in multiple branches with ongoing enhancements and fixes.
  2. Yucheng-Jiang

    • Recent Activity: No recent commits or contributions noted in the last 30 days.
    • Collaborations: Previously collaborated with Yijia Shao on code formatting and documentation updates.
  3. Abrahan N. (zenith110)

    • Recent Activity:
      • Contributed to the addition of new retrieval modules (Tavily, DuckDuckGo).
      • Made several updates to README.md and fixed bugs related to retrieval modules.
    • Collaborations: Collaborated with Yijia Shao on multiple features and bug fixes.
  4. sureenheer

    • Recent Activity:
      • Fixed bugs related to file handling in demo light.
      • Contributed minor changes to the project.
    • Collaborations: Worked with Yijia Shao on merging pull requests.
  5. ndehouche

    • Recent Activity:
      • Made one commit adding a new model (GroqModel), with 172 changes across 3 files.
    • Collaborations: No recent collaborations noted.
  6. AMMAS1

    • Recent Activity: No commits in the last 30 days; one PR opened during the period.
    • Collaborations: Author of the open PR #155 (multiple retriever systems).
  7. fredheir, GuillermoBlasco, haailabs, hdelossantos, kosiew, songkq

    • Recent Activity: No commits in the last 30 days for any of these members, although the commit table records PR activity for each of them during the period.

Patterns, Themes, and Conclusions

  • Active Contributors: Yijia Shao is the most active member, with 13 commits touching 19 files in the last 30 days, spanning both feature development and bug fixes. The focus has been on enhancing retrieval capabilities and ensuring code quality through formatting.
  • Collaboration Dynamics: There is a strong collaborative environment with frequent merges between team members, particularly between Yijia Shao and zenith110.
  • Feature Enhancements vs. Bug Fixes: The recent activities show a balance between implementing new features (e.g., new retrieval modules) and addressing existing bugs, indicating a responsive development approach.
  • Stagnation of Some Members: Several team members made no commits in the past month, suggesting potential disengagement or a focus on other projects, although most of them still show some PR activity in the same period.
  • Documentation Improvements: Continuous updates to documentation reflect an emphasis on user guidance and project clarity, which is crucial for a modular project like STORM.

Overall, the development team appears to be effectively advancing the STORM project through collaborative efforts while maintaining a focus on both feature enhancement and code quality.