The Dispatch

OSS Report: stanford-oval/storm


STORM Project Sees Active Development with Focus on New Features and Bug Fixes

STORM, a knowledge curation system from Stanford OVAL, continues to gain capabilities through new feature integrations and steady bug fixing.

The project has seen significant activity over the past month, with notable progress in expanding retrieval model options and improving user interface customization. Key developments include the merged addition of Brave Search support (PR #134), along with open pull requests that enhance the demo frontend (PR #135) and add a new language model, GroqOpenAIModel (PR #144). Critical bugs remain open, including failures in outline generation (#139) and HTTP 403 errors on Bing search API access (#133).

Recent Activity

Recent issues and pull requests indicate a dual focus on expanding functionality and resolving persistent bugs. Open pull requests for new retrieval models such as DuckDuckGoRM (#145) and SearXNG (#119) suggest a trajectory towards broader search capabilities. Concurrently, bug reports such as #139 (outline generation failures) highlight ongoing challenges in maintaining system reliability.

Quantified Reports

Quantify commits



Quantified Commit Activity Over 30 Days

Developer                        Branches  PRs    Commits  Files  Changes
Abrahan N.                       1         3/2/0  7        22     6304
AMMAS1                           1         2/2/0  11       4      415
Yijia Shao                       1         1/1/0  6        7      366
Ray                              1         1/1/1  2        2      266
Kevin Jiang                      1         1/1/0  2        7      115
Yucheng-Jiang                    1         1/1/0  2        2      27
amrpyt                           1         1/1/0  1        1      6
Paillat                          1         1/1/0  1        1      2
Eduardo Aguilar Pelaez (edu-ap)  0         1/0/1  0        0      0
None (kosiew)                    0         1/0/0  0        0      0
Jaigouk Kim (jaigouk)            0         4/0/3  0        0      0
HaAI Labs (haailabs)             0         1/0/0  0        0      0
None (Dean-98543)                0         1/0/1  0        0      0
None (buerbaumer)                0         1/0/1  0        0      0
None (hdelossantos)              0         1/0/0  0        0      0
Guillermo (GuillermoBlasco)      0         1/0/0  0        0      0

PRs: created by that dev and opened/merged/closed-unmerged during the period
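
The PRs column packs three counts into one field. As a minimal sketch of how that field can be read, the Python below splits each "opened/merged/closed-unmerged" value and totals the column for the 30-day table. The names PRCounts and parse_pr_field are illustrative only and are not part of STORM or of the report tooling.

```python
from typing import NamedTuple


class PRCounts(NamedTuple):
    opened: int
    merged: int
    closed_unmerged: int


def parse_pr_field(field: str) -> PRCounts:
    """Parse an 'opened/merged/closed-unmerged' string such as '3/2/0'."""
    opened, merged, closed_unmerged = (int(part) for part in field.split("/"))
    return PRCounts(opened, merged, closed_unmerged)


# PRs column of the 30-day commit table, in row order.
pr_fields = [
    "3/2/0", "2/2/0", "1/1/0", "1/1/1", "1/1/0", "1/1/0", "1/1/0", "1/1/0",
    "1/0/1", "1/0/0", "4/0/3", "1/0/0", "1/0/1", "1/0/1", "1/0/0", "1/0/0",
]

parsed = [parse_pr_field(field) for field in pr_fields]
totals = PRCounts(
    opened=sum(p.opened for p in parsed),
    merged=sum(p.merged for p in parsed),
    closed_unmerged=sum(p.closed_unmerged for p in parsed),
)
print(totals)  # PRCounts(opened=22, merged=10, closed_unmerged=7)
```

The three counts should not be subtracted to get a number of still-open PRs: a row such as 1/1/1 suggests that a PR merged or closed in the period can have been opened before it.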

Quantify Issues



Recent GitHub Issues Activity

Timespan  Opened  Closed  Comments  Labeled  Milestones
7 Days    7       4       14        7        1
30 Days   34      21      67        27       1
90 Days   55      41      125       48       1
All Time  81      52      -         -        -

Like all software activity quantification, these numbers are imperfect but sometimes useful. Comments, Labels, and Milestones refer to those issues opened in the timespan in question.
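
As a rough illustration of how the table can be read, the sketch below computes a closed-to-opened ratio per timespan and the all-time backlog. The ratio is an assumption about what is useful to track, not a metric the report defines, and issues closed in a window were not necessarily opened in that window.

```python
# (opened, closed) per timespan, copied from the issues table.
issue_stats = {
    "7 Days": (7, 4),
    "30 Days": (34, 21),
    "90 Days": (55, 41),
    "All Time": (81, 52),
}

# Closed issues as a share of opened issues, formatted as a percentage.
ratios = {
    span: f"{closed / opened:.1%}"
    for span, (opened, closed) in issue_stats.items()
}
for span, ratio in ratios.items():
    print(f"{span}: {ratio} closed relative to opened")

# Issues opened but not yet closed over the project's lifetime.
opened_all, closed_all = issue_stats["All Time"]
backlog = opened_all - closed_all
print(f"Open backlog: {backlog}")  # Open backlog: 29
```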

Detailed Reports

Report On: Fetch issues



Recent Activity Analysis

The STORM project currently has 29 open issues (81 opened and 52 closed over the project's lifetime). Several are bug reports, such as errors in generating outlines and retrieving data, that can block normal use of the application. A recurring theme is the need for better error handling and clearer documentation to assist users in troubleshooting.

Several issues exhibit significant user interaction, with multiple comments seeking clarification or providing additional context. This suggests a community-driven approach to problem-solving, though it also highlights potential gaps in the project's documentation and support resources.

Issue Details

Recent Issues

  1. Issue #139: [BUG] No Outline Generated

    • Priority: High
    • Status: Open
    • Created: 6 days ago
    • Updated: 3 days ago
    • Description: Users report intermittent failures in generating article outlines, leading to incomplete outputs. The issue is compounded by unclear error messages and a lack of detailed troubleshooting guidance.
  2. Issue #138: Ability to specify high priority references

    • Priority: Medium
    • Status: Open
    • Created: 6 days ago
    • Description: A feature request for allowing users to specify preferred references during the research phase, which could enhance the quality of generated content.
  3. Issue #137: TopicExpert generate_queries The output is not displayed as described

    • Priority: Medium
    • Status: Open
    • Created: 6 days ago
    • Updated: 5 days ago
    • Description: Users have reported that the output from the generate_queries function includes unnecessary parameters (topic and question), which deviates from expected behavior.
  4. Issue #133: Bing search error

    • Priority: High
    • Status: Open
    • Created: 12 days ago
    • Updated: 2 days ago
    • Description: Users encounter numerous HTTP 403 errors when attempting to retrieve data via Bing search, indicating potential issues with API access or network configurations.
  5. Issue #120: Unable to run streamlit frontend

    • Priority: High
    • Status: Open
    • Created: 17 days ago
    • Updated: 13 days ago
    • Description: Installation conflicts related to package dependencies prevent users from successfully running the Streamlit frontend, highlighting a need for clearer installation instructions.
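
Retrieval failures such as the HTTP 403 errors in Issue #133 are one place where explicit error handling could help. The following is a minimal sketch of a fallback pattern, not STORM's implementation: RetrievalError, retrieve_with_fallback, and the two example retrievers are hypothetical names, and real retriever classes would need to be adapted to this callable shape.

```python
from typing import Callable, List, Sequence


class RetrievalError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed search."""

    def __init__(self, status: int, message: str = ""):
        super().__init__(message or f"HTTP {status}")
        self.status = status


# Assumed shape: a retriever takes a query and returns a list of snippets.
Retriever = Callable[[str], List[str]]


def retrieve_with_fallback(query: str, retrievers: Sequence[Retriever]) -> List[str]:
    """Try each retriever in order and return the first successful result."""
    errors = []
    for retriever in retrievers:
        try:
            return retriever(query)
        except RetrievalError as exc:
            # Record which backend failed so the final message is actionable.
            name = getattr(retriever, "__name__", "retriever")
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all retrievers failed: " + "; ".join(errors))


def blocked_search(query: str) -> List[str]:
    # Stand-in for a backend that rejects the request.
    raise RetrievalError(403)


def backup_search(query: str) -> List[str]:
    # Stand-in for a second backend that succeeds.
    return [f"result for {query}"]


print(retrieve_with_fallback("knowledge curation", [blocked_search, backup_search]))
```

Collecting the per-backend errors before raising keeps the final message specific, which speaks to the unclear error messages users report.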

Important Observations

  • There are multiple instances of users experiencing similar issues with outline generation and API retrieval failures, suggesting systemic problems that may require urgent attention.
  • Feature requests indicate a desire for enhanced customization options, particularly regarding reference management during content generation.
  • The community actively engages in discussions around these issues, but there appears to be a gap in formal documentation or support mechanisms to streamline troubleshooting.

This analysis underscores the importance of both fixing technical bugs and improving the user experience through better documentation and new features.

Report On: Fetch pull requests



Overview

This section covers the open and closed pull requests (PRs) for STORM, a knowledge curation system built on large language models. The repository shows active development, with numerous contributions aimed at enhancing functionality, fixing bugs, and improving user experience.

Summary of Pull Requests

Open Pull Requests

  1. PR #146: chore: add callback handler to article generation and article polish
    Created 0 days ago. This PR introduces a callback handler for article generation and polishing stages, enhancing the system's extensibility by allowing additional information to be passed during these processes.

  2. PR #145: [New RM] Add DuckDuckGoRM
    Created 1 day ago. This PR implements a new retriever model using DuckDuckGo's API, expanding the search capabilities of the system and providing documentation for its usage.

  3. PR #144: [New LM] Added GroqOpenAIModel
    Created 2 days ago. This PR adds support for the Groq model, which is noted for its high performance in LLM inference, thus improving the options available for language model integration.

  4. PR #135: [Demo Enhancement] added storm wiki frontend with themes
    Created 8 days ago. This PR enhances the demo frontend by adding theme options and search engine configurations, significantly improving user interface customization.

  5. PR #119: [New RM] Add support for SearXNG
    Created 17 days ago. This PR integrates SearXNG as a privacy-focused search engine option, further diversifying retrieval methods available to users.

  6. PR #114: [New RM] Add duckduckgo retriever
    Created 19 days ago. This PR introduces a retriever for DuckDuckGo, emphasizing privacy in web searches without requiring an API key.

  7. PR #20: [New RM] Support DuckDuckGoSearchAPI and TavilySearchAPI as Alternatives to You.com
    Created 124 days ago. This PR supports two additional search APIs, enhancing flexibility in retrieval options.

  8. PR #17: [doc] Add readme-zh for Chinese users
    Created 125 days ago. This PR adds a Chinese version of the README file to improve accessibility for non-English speaking users.

Closed Pull Requests

  1. PR #136: Bug fix and enhancements to the files in the eval folder
    Closed 7 days ago without merging due to concerns about changes that could affect reproducibility.

  2. PR #134: [New RM] support brave search
    Merged 9 days ago. This PR added Brave Search as an option for retrieving information.

  3. PR #132: [RM Enhancement] Fixed SerperRM when knowledge graph is none to properly return data
    Merged 13 days ago after resolving issues with data retrieval when no knowledge graph was present.

  4. PR #130: :adhesive_bandage: Add GoogleModel in readme
    Merged 12 days ago to update documentation regarding newly added models.

  5. PR #128: chore: Update Gemini models in STORM Wiki pipeline
    Merged 14 days ago to include new Gemini models in the pipeline.

  6. PR #127: fixed vectorRM requiring embedding model in the example.
    Merged 14 days ago to correct example usage of VectorRM.

  7. PR #105: [New LM] Support Gemini model.
    Merged 14 days ago to integrate Gemini models into the STORM framework.

  8. PR #92: feat(lm): Improved Modularity and Maintainability for LLM Integration
    Closed without merging due to insufficient testing across LLMs; it proposed significant refactoring for better modularity.

Analysis of Pull Requests

The recent activity within the STORM repository reflects a strong focus on enhancing both functionality and user experience through various integrations and improvements. The open pull requests indicate ongoing efforts to expand the capabilities of the system by introducing new retrieval models (RMs) such as DuckDuckGo and SearXNG, which cater to privacy-conscious users while broadening search capabilities.

A notable trend is the emphasis on flexibility within STORM, as seen in PRs #135 and #144, which enhance user interface options and add a new language model respectively. The community appears engaged in refining existing features while also pushing for additions that align with user needs, such as theme customization and improved retrieval methods.

However, there are also signs of caution among contributors regarding changes that could impact system stability or reproducibility, as evidenced by PR #136 being closed despite its potential benefits. This reflects a mature approach towards software development where stability is prioritized alongside feature enhancement.

Moreover, there is a clear commitment to inclusivity through initiatives like adding documentation in multiple languages (e.g., PR #17). Such efforts not only broaden accessibility but also foster community engagement from diverse user bases.

The closed pull requests also highlight challenges faced during development. Two were closed without merging: PR #136 over concerns that its changes could affect reproducibility, and PR #92 because of insufficient testing across LLMs. This indicates an active review process that aims to maintain high standards within the codebase while encouraging contributions that genuinely enhance project capabilities.

In conclusion, STORM's ongoing development shows contributors actively building out a knowledge curation tool while managing the demands of software quality and user support. The balance between innovation and stability remains crucial as the project continues to evolve.

Report On: Fetch commits



Repo Commits Analysis

Development Team and Recent Activity

Team Members and Their Recent Activities

  1. Yijia Shao (shaoyijia)

    • Recent Activity:
      • Fixed issue #140.
      • Bumped up Python package version.
      • Merged multiple pull requests, including support for Brave Search and Gemini model.
    • Collaborations: Worked closely with Kevin Jiang on Brave Search support and with AMMAS1 on various improvements.
  2. Yucheng-Jiang

    • Recent Activity:
      • Updated issue templates.
      • Fixed bugs in demo light and My Articles component.
    • Collaborations: Collaborated with Yijia Shao on bug fixes.
  3. Kevin Jiang (kevindragon)

    • Recent Activity:
      • Contributed to Brave Search support.
      • Merged pull requests related to the same feature.
    • Collaborations: Worked with Yijia Shao.
  4. Abrahan N. (zenith110)

    • Recent Activity:
      • Made significant contributions, including adding support for SerperRM and fixing various bugs.
      • Engaged in extensive refactoring and merging of pull requests.
    • Collaborations: Collaborated with Yijia Shao and AMMAS1 on multiple features.
  5. Paillat (Paillat-dev)

    • Recent Activity:
      • Added GoogleModel documentation in README.
    • Collaborations: Worked with Yucheng-Jiang.
  6. AMMAS1

    • Recent Activity:
      • Made several commits focused on fixing bugs and enhancing VectorRM functionality.
      • Merged pull requests related to improvements in the utility functions.
    • Collaborations: Collaborated with Yijia Shao and Abrahan N.
  7. Ray (rmcc3)

    • Recent Activity:
      • Added support for DeepSeek language models.
    • Collaborations: Worked alongside Yijia Shao.

Patterns, Themes, and Conclusions

  • Active Collaboration: There is a strong collaborative environment, particularly evident in the joint efforts between Yijia Shao and other team members like Kevin Jiang and Abrahan N. This suggests a cohesive team dynamic focused on enhancing project features collectively.

  • Feature Enhancements vs Bug Fixes: The recent activities show a balanced focus on both feature enhancements (e.g., Brave Search, Gemini model) and bug fixes (e.g., issues in demo light). This indicates a proactive approach to maintaining software quality while also expanding functionality.

  • High Volume of Changes by Certain Members: Notably, Abrahan N. has made a significant number of changes (6304 across 22 files), indicating a deep involvement in ongoing development efforts, particularly around major features like SerperRM.

  • Continuous Integration of Community Contributions: The merging of numerous pull requests highlights an active engagement with community contributions, which is essential for maintaining an open-source project’s vitality.

  • Documentation Updates: Regular updates to documentation (e.g., README changes by multiple members) suggest an emphasis on usability and clarity for end-users, which is crucial for adoption and effective use of the software.

Overall, the development team is actively engaged in both enhancing the STORM project’s capabilities and ensuring its stability through diligent bug fixing and documentation efforts.