The STORM project, developed by the Stanford Oval organization, uses large language models (LLMs) to automate knowledge curation and report generation. It offers both an autonomous mode and a collaborative mode, the latter allowing human-AI interaction to refine information synthesis. The project is in a robust state with significant community engagement, evidenced by its high number of stars and forks on GitHub. Its trajectory appears positive, with ongoing development and community contributions.
Yijia Shao (shaoyijia)
Yucheng Jiang
Eminem (zhoucheng89)
Patrick (patrick@cryptolock.ai)
Adam Montgomery (montasaurus)
Hagen Hübel (itinance)
宋小北 (xiaobeicn)
Evidencebp
Ikko Eltociear Ashimine (eltociear): typo corrections in rm.py
Abrahan N. (zenith110)
Hanly De Los Santos (hdelossantos)
Kevin Jiang (kevindragon)
Ray (rmcc3)
Timespan | Opened | Closed | Comments | Labeled | Milestones |
---|---|---|---|---|---|
7 Days | 5 | 2 | 2 | 5 | 1 |
30 Days | 8 | 3 | 4 | 8 | 1 |
90 Days | 36 | 16 | 28 | 36 | 1 |
All Time | 157 | 112 | - | - | - |
Like all software activity quantification, these numbers are imperfect but sometimes useful. Comments, Labels, and Milestones refer to those issues opened in the timespan in question.
Risk | Level (1-5) | Rationale |
---|---|---|
Delivery | 4 | The project faces significant delivery risks due to a growing backlog of unresolved issues, with 45 open issues and a net increase in open issues over recent periods. The lack of milestone usage and prolonged open status of several pull requests, such as PR #155 and PR #192, further exacerbate these risks. Additionally, the introduction of new features and modules without comprehensive testing could lead to unforeseen delays. |
Velocity | 4 | The project's velocity is at risk due to the slow review process for pull requests, such as PR #268 and PR #192, which have been open for extended periods. The increasing backlog of unresolved issues also suggests potential stagnation in addressing critical tasks. Furthermore, limited engagement in issue comments indicates possible communication challenges within the team, affecting overall progress. |
Dependency | 3 | The project exhibits moderate dependency risks due to reliance on external libraries and systems, such as dspy and various language models. Issues like #262 highlight potential integration challenges with external dependencies. However, efforts to manage unreliable sources in the retriever module indicate some proactive measures to mitigate these risks. |
Team | 3 | Team-related risks are present due to low engagement in issue comments and potential communication challenges. The request for open-sourcing frontend code (#267) suggests transparency or collaboration concerns. However, active maintenance and merging of pull requests demonstrate some level of team cohesion. |
Code Quality | 3 | Code quality risks are moderate, with ongoing efforts to improve documentation and address minor errors through pull requests like #264 and #192. However, the presence of uncaught exceptions and iterable errors in issues like #262 indicates areas needing improvement. The lack of inline documentation in some modules may hinder maintainability. |
Technical Debt | 4 | Technical debt is accumulating due to frequent bug reports and unresolved issues indicating underlying codebase problems. The introduction of new features without thorough testing could exacerbate this debt. While there are efforts to refactor code for readability, the ongoing need for bug fixes suggests persistent technical debt concerns. |
Test Coverage | 4 | Test coverage appears insufficient given the recurring bug reports and error descriptions in issues like #262 and #257. The absence of explicit test coverage in key modules raises concerns about the project's ability to catch regressions or handle edge cases effectively. |
Error Handling | 4 | Error handling is inadequate as evidenced by uncaught exceptions reported in issues like #262. While some pull requests aim to address specific error handling improvements, the overall lack of comprehensive error management strategies poses a significant risk to system reliability. |
The recent activity on the GitHub repository for the STORM project shows a moderate level of engagement with 45 open issues. Notably, there is a mix of feature requests, bug reports, and questions from users, indicating active participation from the community. Several issues have been closed recently, demonstrating ongoing maintenance and responsiveness from the development team.
#274: Storm
#272: Integration of an Open source alternative to Open Ai's canvas/ Claude Artifacts
#270: question
#217: Want to run fully locally using OLLAMA and SEARXNG
#262: [BUG] Uncaught Exception
#267: About Plans to Open Source the Frontend Code
Overall, the STORM project is actively maintained with regular updates and community interaction. However, some critical issues may require more immediate attention to ensure smooth functionality and user satisfaction.
The init_openai_model function definition is changed to make its azure_api_key parameter optional, enhancing flexibility in function calls.
Changes affect article_generation.py and storm_dataclass.py.
One change handles the case where valid_url_to_snippets.get(url, {}) returns None.
A GoogleSearch component belongs to the Google Search integration work.
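The valid_url_to_snippets.get(url, {}) case mentioned above can look surprising, since a default is supplied. A minimal sketch of how it can happen, assuming a dictionary whose key is present but mapped to None; the URLs, the "snippets" field, and the snippets_for helper are hypothetical and are not taken from the STORM code:

```python
# Hypothetical data: the second key exists, but its stored value is None.
valid_url_to_snippets = {
    "https://example.com/a": {"snippets": ["first snippet"]},
    "https://example.com/b": None,
}


def snippets_for(url):
    """Return the snippet list for a URL, tolerating missing keys and None values."""
    # .get(url, {}) falls back to {} only when the key is absent.
    # The trailing "or {}" also covers a key that is present but mapped to None.
    entry = valid_url_to_snippets.get(url, {}) or {}
    return entry.get("snippets", [])


raw_lookup = valid_url_to_snippets.get("https://example.com/b", {})  # None, not {}
safe_lookup = snippets_for("https://example.com/b")                  # []
missing_lookup = snippets_for("https://example.com/missing")         # []
```

Whether the actual fix uses this guard or a different one is not stated in the report.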
The STORM project shows active development with several open pull requests addressing both minor enhancements and significant feature additions. Notably, some PRs have been open for extended periods (#192 and #155), which might need prioritization or additional resources to resolve. The recently closed PRs indicate ongoing efforts to integrate new features like Azure AI Search and Google Search while maintaining code quality through linting corrections. The project also demonstrates responsiveness to community contributions, as seen in the quick closure of some PRs after necessary adjustments.
knowledge_storm/rm.py
Structure and Quality:
Retrieval classes build on dspy.Retrieve, ensuring a consistent interface across different retrieval methods.
Using backoff for retrying requests is a good practice for handling transient network issues.
Potential Improvements:
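The retry practice noted for rm.py can be illustrated with a small sketch. The report says the module relies on the backoff package; the version below is a hand-rolled decorator built only on the standard library, so that it runs without extra installs. The names retry_with_backoff and flaky_search, and the choice of ConnectionError, are assumptions made for illustration.

```python
import functools
import time


def retry_with_backoff(max_tries=3, base_delay=0.01, exceptions=(ConnectionError,)):
    """Retry the wrapped function, doubling the wait after each failed attempt."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_tries + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_tries:
                        raise  # out of attempts: surface the original error
                    time.sleep(base_delay * 2 ** (attempt - 1))
        return wrapper
    return decorator


calls = {"count": 0}


@retry_with_backoff(max_tries=4)
def flaky_search(query):
    """Hypothetical retriever call that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return [f"result for {query}"]


results = flaky_search("storm")
```

Restricting the retried exception types keeps programming errors from being retried silently, which matters when uncaught exceptions are already a reported concern.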
knowledge_storm/lm.py
Structure and Quality:
Potential Improvements:
Ensure that external dependencies (anthropic, google.generativeai) are clearly documented in the setup or requirements files.
setup.py
Structure and Quality:
Uses setuptools to define package metadata and dependencies.
Reads from external files (README.md, requirements.txt), which is a good practice for maintainability.
Potential Improvements:
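The pattern described for setup.py, keeping metadata inputs in README.md and requirements.txt, might look roughly like the sketch below. It is not the project's actual setup.py: the package name and the helpers parse_requirements and build_setup_kwargs are hypothetical, and inline comments after a requirement are not handled.

```python
from pathlib import Path


def parse_requirements(text):
    """Return requirement specifiers, skipping blank lines and full-line comments."""
    return [
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.strip().startswith("#")
    ]


def build_setup_kwargs(root="."):
    """Collect keyword arguments for a setup call from files next to setup.py."""
    root = Path(root)
    requirements_text = (root / "requirements.txt").read_text(encoding="utf-8")
    return {
        "name": "example-package",  # hypothetical name, not the real package
        "long_description": (root / "README.md").read_text(encoding="utf-8"),
        "long_description_content_type": "text/markdown",
        "install_requires": parse_requirements(requirements_text),
    }


parsed = parse_requirements("dspy\n\n# pinned for compatibility\nrequests>=2\n")
```

In a real setup.py the resulting dictionary would be passed on as setuptools.setup(**build_setup_kwargs()), so the dependency list lives in one file only.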
requirements.txt
Structure and Quality:
Potential Improvements:
knowledge_storm/storm_wiki/modules/article_generation.py
Structure and Quality:
Uses ThreadPoolExecutor to improve performance when generating sections concurrently.
Potential Improvements:
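The concurrent section generation noted for article_generation.py can be sketched with the standard library as follows. The generate_section function is a hypothetical stand-in for a language model call, and the section titles are made up; the real module's structure may differ.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_section(title):
    """Hypothetical stand-in for an LLM call that drafts one section."""
    return f"# {title}\n(draft text for {title})"


def generate_sections(titles, max_workers=4):
    """Draft all sections in parallel threads and return them in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in the order of the inputs,
        # regardless of which thread finishes first.
        return list(executor.map(generate_section, titles))


sections = generate_sections(["Background", "Method", "Results"])
```

Threads suit this workload because each call mostly waits on network I/O; an exception raised in any worker is re-raised when its result is read, so it still needs to be handled by the caller.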
knowledge_storm/collaborative_storm/modules/co_storm_agents.py
Structure and Quality:
Potential Improvements:
Yijia Shao (shaoyijia)
Yucheng Jiang: work on requirements.txt and fixing typos.
Eminem (zhoucheng89)
Patrick (patrick@cryptolock.ai)
Adam Montgomery (montasaurus)
Hagen Hübel (itinance)
宋小北 (xiaobeicn)
Evidencebp
Ikko Eltociear Ashimine (eltociear): typo corrections in rm.py.
Abrahan N. (zenith110)
Hanly De Los Santos (hdelossantos)
Kevin Jiang (kevindragon)
Ray (rmcc3)