Vanna, an open-source Python framework designed for text-to-SQL generation using Retrieval-Augmented Generation (RAG) techniques, continues to evolve with active development efforts aimed at improving database compatibility and addressing user-reported issues.
Recent issues and pull requests (PRs) highlight ongoing challenges and improvements. Notably, there are recurring reports of SQL syntax errors when interfacing with different databases, such as MSSQL, indicating a need for enhanced compatibility and error handling. Issues like #636, a high-priority bug in the Flask app, and #620, an API connection error with Azure OpenAI, underscore critical areas affecting usability and integration.
The development team has been actively addressing these challenges. Zain Hoda has been instrumental in merging PRs related to Azure AI and FAISS integration, while Mohammed Abbadi focused on FAISS updates. Other contributors like Jaya Maheshwari and Anush008 have also made significant contributions to specific features and bug fixes. The team's recent activities include:
- Changes to pyproject.toml and several Python scripts.
- Work on azuresearch_vector.py.
- A change to n_results for WeaviateDatabase.
- A change involving the ollama_timeout parameter.
Overall, the Vanna project is on a trajectory of steady improvement, with active contributions aimed at enhancing functionality and user experience while addressing critical integration issues.
Timespan | Opened | Closed | Comments | Labeled | Milestones |
---|---|---|---|---|---|
7 Days | 3 | 2 | 0 | 2 | 1 |
30 Days | 13 | 12 | 16 | 9 | 1 |
90 Days | 59 | 30 | 61 | 35 | 1 |
1 Year | 234 | 157 | 425 | 169 | 1 |
All Time | 290 | 191 | - | - | - |
Like all software activity quantification, these numbers are imperfect but sometimes useful. Comments, Labels, and Milestones refer to those issues opened in the timespan in question.
Developer | Branches | PRs | Commits | Files | Changes |
---|---|---|---|---|---|
Mohammed Abbadi | 1 | 1/1/0 | 3 | 2 | 189 |
Zain Hoda | 1 | 1/2/0 | 6 | 5 | 185 |
Sinju P | 1 | 0/0/0 | 1 | 1 | 62 |
Anush | 1 | 1/1/0 | 1 | 1 | 19 |
Nikhil Talreja | 1 | 0/0/0 | 1 | 1 | 5 |
Jaya Maheshwari | 1 | 1/1/0 | 1 | 1 | 4 |
峰峰 | 1 | 1/1/0 | 1 | 1 | 4 |
dusens | 0 | 0/1/0 | 0 | 0 | 0 |
Sinju P (sinjup) | 0 | 2/1/0 | 0 | 0 | 0 |
Mhamed Talhaouy (tal7aouy) | 0 | 1/0/0 | 0 | 0 | 0 |
Skander Hellal (SkanderHellal) | 0 | 2/0/2 | 0 | 0 | 0 |
Nikhil Talreja (talrejanikhil) | 0 | 1/1/0 | 0 | 0 | 0 |
Luca Ordronneau | 0 | 0/1/0 | 0 | 0 | 0 |
PRs: created by that dev and opened/merged/closed-unmerged during the period
The Vanna AI GitHub repository currently has 99 open issues, with recent activity indicating a steady flow of user inquiries and bug reports. Notably, issues related to bugs in the Flask app, integration with various databases, and feature requests for enhanced functionality are prevalent. A recurring theme is the need for better error handling and support for multiple database types, particularly around SQL generation and execution.
Several issues stand out due to their implications for the project's usability and stability. For instance, there are multiple reports of SQL syntax errors when interfacing with different databases (e.g., MSSQL), suggesting a need for improved compatibility and error messaging. Additionally, users express frustration over the lack of features like customizable follow-up questions and the ability to manage training data more effectively.
Issue #638: how to use vanna in java spring ai project
Issue #637: How ro show results in Numbers and with thousand separator along with change the currency to not be $
Issue #636: flask app bug for rewriting
Issue #620: openai.APIConnectionError with AzureOpenAI
Issue #507: How to do multiple vanna.AI with one API
The issue regarding openai.APIConnectionError (#620) highlights a significant integration challenge that could affect users relying on Azure services. The discussions around this issue reveal confusion about configuration settings, which may hinder new users from successfully implementing Vanna.
The ongoing confusion regarding SQL syntax across different databases (e.g., MSSQL expecting TOP where other dialects use LIMIT) indicates a critical area for improvement in documentation and error handling mechanisms within Vanna's SQL generation capabilities.
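To make the LIMIT versus TOP mismatch concrete, here is a minimal Python sketch of a post-processing step that rewrites a trailing `LIMIT n` into `SELECT TOP n` for MSSQL. The helper name `limit_to_top` is hypothetical and is not part of Vanna's API. The regex approach is deliberately naive: it assumes a single plain SELECT statement and leaves anything else (CTEs, `LIMIT ... OFFSET`, subqueries) untouched.

```python
import re

# Matches a trailing "LIMIT <n>" with an optional semicolon at the end of the statement.
_LIMIT_RE = re.compile(r"\s+LIMIT\s+(\d+)\s*;?\s*$", re.IGNORECASE)
# Matches the leading SELECT keyword, optionally followed by DISTINCT.
_SELECT_RE = re.compile(r"^\s*SELECT\s+(DISTINCT\s+)?", re.IGNORECASE)


def limit_to_top(sql: str) -> str:
    """Rewrite 'SELECT ... LIMIT n' as 'SELECT TOP n ...' (hypothetical helper).

    Returns the input unchanged when the statement does not end in LIMIT n
    or does not start with SELECT.
    """
    match = _LIMIT_RE.search(sql)
    if match is None:
        return sql
    row_count = match.group(1)
    body = sql[: match.start()]  # statement without the LIMIT clause

    def _add_top(m: re.Match) -> str:
        distinct = m.group(1) or ""
        # T-SQL places TOP after DISTINCT: SELECT DISTINCT TOP n ...
        return f"SELECT {distinct}TOP {row_count} "

    rewritten, replaced = _SELECT_RE.subn(_add_top, body, count=1)
    return rewritten if replaced else sql


print(limit_to_top("SELECT name FROM customers ORDER BY name LIMIT 10"))
```

A production fix would more likely tell the model which dialect to target, or use a real SQL parser, rather than patch the generated text afterwards.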
Users frequently request enhancements related to training data management, such as the ability to remove duplicates or update existing training data without creating conflicts. This suggests that current functionalities may not adequately support user needs for maintaining clean datasets.
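As an illustration of the deduplication users are asking for, the sketch below removes duplicate question/SQL pairs from an in-memory list. It assumes training data is represented as `(question, sql)` tuples, which is a simplification for this example and not Vanna's actual storage format. The function name `dedupe_training_pairs` is likewise hypothetical.

```python
def dedupe_training_pairs(pairs):
    """Drop duplicate (question, sql) pairs, keeping the first occurrence.

    Two pairs count as duplicates when they match after collapsing
    whitespace and, for the question only, ignoring case.
    """
    seen = set()
    unique = []
    for question, sql in pairs:
        key = (
            " ".join(question.lower().split()),  # case-insensitive question
            " ".join(sql.split()),               # whitespace-normalised SQL
        )
        if key not in seen:
            seen.add(key)
            unique.append((question, sql))
    return unique


pairs = [
    ("How many users?", "SELECT COUNT(*) FROM users"),
    ("how many  users?", "SELECT COUNT(*)  FROM users"),
    ("List users", "SELECT * FROM users"),
]
print(dedupe_training_pairs(pairs))
```

SQL case is preserved in the comparison key because string literals and some identifiers are case-sensitive. A looser comparison would risk merging distinct queries.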
These issues reflect both user experience challenges and technical hurdles that could impact the adoption and effectiveness of Vanna as a tool for text-to-SQL generation.
The analysis of the pull requests (PRs) for the Vanna project reveals a total of 9 open PRs, with a mix of feature additions, bug fixes, and refactoring efforts. The PRs demonstrate ongoing enhancements to the framework's capabilities, particularly in integrating new data sources and improving existing functionalities.
PR #539: feat: openrouter integration, added additional async methods
PR #621: fix: prevent user from deleting the last training data #613
PR #617: REPLACE MAGIC NUMBER WITH NAMED CONSTANT
PR #589: feat: opensearch supports document data update, query by table, embedding, etc.
PR #555: Feature/sqlite duckdb vector support
PR #525: Add timeout to requests calls
PR #463: feat: add database engine and table name to support table ddl update
PR #460: Update base.py
PR #238: Vanna trulens performance metrics
The current state of open pull requests in the Vanna project indicates an active development environment with a focus on enhancing functionality and addressing user experience issues. A notable aspect is the emphasis on asynchronous programming (as seen in PR #539), which aligns with modern software development practices aimed at improving performance and responsiveness in applications that handle I/O operations extensively.
Several PRs are dedicated to fixing bugs or improving existing features, such as PR #621, which addresses a critical usability issue regarding training data management. This reflects a commitment to maintaining user satisfaction and operational integrity within the application.
Moreover, there is a clear trend towards expanding compatibility with various databases and services (e.g., PRs related to SQLite, DuckDB, OpenSearch), which enhances Vanna's versatility as a text-to-SQL generation tool. The addition of timeout settings in HTTP requests (PR #525) further indicates an awareness of potential pitfalls in network communications, aiming to enhance reliability.
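The reliability point behind adding timeouts can be sketched briefly. The example below uses the standard library's `urllib.request` in place of the `requests` package that the PR title refers to, so that it stays self-contained. The function name `fetch_text` and the constant `DEFAULT_TIMEOUT_SECONDS` are assumptions made for illustration and are not taken from the PR.

```python
import urllib.request

# Named constant instead of a bare number scattered through the code.
DEFAULT_TIMEOUT_SECONDS = 10.0


def fetch_text(url, timeout_seconds=DEFAULT_TIMEOUT_SECONDS):
    """Fetch a URL as text, returning None on any network failure.

    Without a timeout, a stalled server can block the caller indefinitely.
    URLError, HTTPError and TimeoutError are all subclasses of OSError.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
            return response.read().decode("utf-8", errors="replace")
    except OSError:
        return None
```

Returning `None` keeps the sketch short. Real code would usually log the failure or raise a more specific error so callers can tell a timeout apart from a refused connection.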
However, some PRs have been noted for their size and complexity (e.g., PR #539), which could hinder timely reviews and merges. The suggestion from reviewers to split larger PRs into smaller ones should be taken seriously to facilitate smoother collaboration among contributors and maintainers.
Additionally, there are older PRs that remain open without significant activity or resolution (e.g., PR #238). This could indicate either a lack of prioritization or resource constraints within the team. Addressing these older requests could help streamline the project’s development process and ensure that valuable contributions do not stagnate.
In conclusion, while the current set of open pull requests showcases a robust effort towards continuous improvement and feature expansion in Vanna, attention must be given to managing the complexity of contributions effectively and ensuring timely resolutions to enhance overall project momentum.
Zain Hoda (zainhoda): changes to the pyproject.toml file and various Python scripts, particularly in the base.py, azuresearch_vector.py, and faiss.py files.
Jaya Maheshwari (Jaya-sys): changes to azuresearch_vector.py.
Mohammed Abbadi (m7mdhka): changes to faiss.py, including integrating FAISS into VannaAI.
Anush008
Nikhil Talreja: a change to n_results for WeaviateDatabase.
Sinju P
Luca Ordronneau (lucaordronneau)
Dusens (dusens)
Dufeng1010 (dufeng1010): the ollama_timeout parameter in a recent commit.
The development team is engaged in active contributions towards improving Vanna's functionalities, focusing on integrations and bug fixes while collaborating effectively among members. The recent activities reflect a commitment to enhancing both performance and user interaction capabilities within the framework.