The Dispatch

OSS Report: vanna-ai/vanna


Vanna Project Sees Steady Development with Focus on Database Integration and Bug Fixes

Vanna, an open-source Python framework designed for text-to-SQL generation using Retrieval-Augmented Generation (RAG) techniques, continues to evolve with active development efforts aimed at improving database compatibility and addressing user-reported issues.

Recent Activity

Recent issues and pull requests (PRs) highlight ongoing challenges and improvements. Notably, there are recurring reports of SQL syntax errors when interfacing with different databases, such as MSSQL, indicating a need for enhanced compatibility and error handling. Issues like #636, a high-priority bug in the Flask app, and #620, an API connection error with Azure OpenAI, underscore critical areas affecting usability and integration.

The development team has been actively addressing these challenges. Zain Hoda has been instrumental in merging PRs related to Azure AI and FAISS integration, while Mohammed Abbadi focused on FAISS updates. Other contributors like Jaya Maheshwari and Anush008 have also made significant contributions to specific features and bug fixes. The team's recent activities include:

  1. Zain Hoda: Merged multiple PRs; updated pyproject.toml and several Python scripts.
  2. Jaya Maheshwari: Added text to the question parameter in azuresearch_vector.py.
  3. Mohammed Abbadi: Integrated FAISS into VannaAI.
  4. Anush008: Refactored Qdrant query API.
  5. Nikhil Talreja: Added n_results for WeaviateDatabase.
  6. Sinju P: Fixed Postgres reconnection issue.
  7. Dufeng1010: Added ollama_timeout parameter.

Of Note

Overall, the Vanna project is on a trajectory of steady improvement, with active contributions aimed at enhancing functionality and user experience while addressing critical integration issues.

Quantified Reports

Quantify Issues



Recent GitHub Issues Activity

Timespan   Opened  Closed  Comments  Labeled  Milestones
7 Days          3       2         0        2           1
30 Days        13      12        16        9           1
90 Days        59      30        61       35           1
1 Year        234     157       425      169           1
All Time      290     191         -        -           -

Like all software activity quantification, these numbers are imperfect but sometimes useful. Comments, Labels, and Milestones refer to those issues opened in the timespan in question.
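
As a quick cross-check on the table above, subtracting Closed from Opened gives the net change in the backlog for each timespan. The sketch below is illustrative only: the dictionary and function names are made up for this example, and for the shorter windows it assumes the two columns are directly comparable, which the report does not state. For the All Time row the difference is simply the number of issues still open.

```python
# (opened, closed) pairs copied from the issues table above.
ISSUE_COUNTS = {
    "7 Days": (3, 2),
    "30 Days": (13, 12),
    "90 Days": (59, 30),
    "1 Year": (234, 157),
    "All Time": (290, 191),
}


def net_open(counts):
    """Return opened minus closed for each timespan."""
    return {span: opened - closed for span, (opened, closed) in counts.items()}


if __name__ == "__main__":
    for span, delta in net_open(ISSUE_COUNTS).items():
        print(f"{span}: {delta:+d}")
```

The All Time difference works out to 99, which agrees with the open-issue count cited in the issues report.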

Quantify commits



Quantified Commit Activity Over 30 Days

Developer                        Branches  PRs    Commits  Files  Changes
Mohammed Abbadi                  1         1/1/0  3        2      189
Zain Hoda                        1         1/2/0  6        5      185
Sinju P                          1         0/0/0  1        1      62
Anush                            1         1/1/0  1        1      19
Nikhil Talreja                   1         0/0/0  1        1      5
Jaya Maheshwari                  1         1/1/0  1        1      4
峰峰                             1         1/1/0  1        1      4
dusens                           0         0/1/0  0        0      0
Sinju P (sinjup)                 0         2/1/0  0        0      0
Mhamed Talhaouy (tal7aouy)       0         1/0/0  0        0      0
Skander Hellal (SkanderHellal)   0         2/0/2  0        0      0
Nikhil Talreja (talrejanikhil)   0         1/1/0  0        0      0
Luca Ordronneau                  0         0/1/0  0        0      0

PRs: created by that dev and opened/merged/closed-unmerged during the period

Detailed Reports

Report On: Fetch issues



Recent Activity Analysis

The Vanna AI GitHub repository currently has 99 open issues, with recent activity indicating a steady flow of user inquiries and bug reports. Notably, issues related to bugs in the Flask app, integration with various databases, and feature requests for enhanced functionality are prevalent. A recurring theme is the need for better error handling and support for multiple database types, particularly around SQL generation and execution.

Several issues stand out due to their implications for the project's usability and stability. For instance, there are multiple reports of SQL syntax errors when interfacing with different databases (e.g., MSSQL), suggesting a need for improved compatibility and error messaging. Additionally, users express frustration over the lack of features like customizable follow-up questions and the ability to manage training data more effectively.

Issue Details

Most Recently Created Issues

  1. Issue #638: how to use vanna in java spring ai project

    • Priority: Low
    • Status: Open
    • Created: 1 day ago
    • Summary: User seeks guidance on integrating Vanna with a Java Spring AI project.
  2. Issue #637: How ro show results in Numbers and with thousand separator along with change the currency to not be $

    • Priority: Low
    • Status: Open
    • Created: 1 day ago
    • Summary: User requests assistance on formatting numerical results in a specific way.
  3. Issue #636: flask app bug for rewriting

    • Priority: High
    • Status: Open
    • Created: 1 day ago
    • Summary: Bug report indicating that the rewrite function in the Flask app does not trigger correctly on first use.
  4. Issue #620: openai.APIConnectionError with AzureOpenAI

    • Priority: High
    • Status: Open
    • Created: 20 days ago; last edited 8 days ago.
    • Summary: User experiences connection errors while using Azure OpenAI with Vanna, despite correct configuration.
  5. Issue #507: How to do multiple vanna.AI with one API

    • Priority: Medium
    • Status: Open
    • Created: 87 days ago; last edited 10 days ago.
    • Summary: User seeks advice on isolating multiple domains when using a single API.

Notable Anomalies and Complications

  • The issue regarding openai.APIConnectionError (#620) highlights a significant integration challenge that could affect users relying on Azure services. The discussions around this issue reveal confusion about configuration settings, which may hinder new users from successfully implementing Vanna.

  • The ongoing confusion regarding SQL syntax across different databases (e.g., MSSQL's LIMIT vs. TOP) indicates a critical area for improvement in documentation and error handling mechanisms within Vanna's SQL generation capabilities.

  • Users frequently request enhancements related to training data management, such as the ability to remove duplicates or update existing training data without creating conflicts. This suggests that current functionalities may not adequately support user needs for maintaining clean datasets.
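
The LIMIT versus TOP mismatch mentioned above can be made concrete with a small sketch. This is not Vanna's code, and `limit_to_top` is a hypothetical helper written for this illustration. It only handles the simplest case, a single trailing `LIMIT n` on a statement that begins with `SELECT`, and leaves anything else (OFFSET clauses, CTEs, subqueries) untouched.

```python
import re

# Trailing "LIMIT n" with an optional semicolon, as many SQL dialects write it.
_LIMIT_RE = re.compile(r"\s+LIMIT\s+(\d+)\s*;?\s*$", re.IGNORECASE)
# Leading "SELECT", optionally followed by DISTINCT.
_SELECT_RE = re.compile(r"^\s*SELECT\s+(DISTINCT\s+)?", re.IGNORECASE)


def limit_to_top(sql: str) -> str:
    """Rewrite 'SELECT ... LIMIT n' as 'SELECT TOP n ...' for SQL Server.

    Hypothetical helper: returns the input unchanged when the statement
    does not match the simple pattern described above.
    """
    limit = _LIMIT_RE.search(sql)
    if limit is None:
        return sql
    body = sql[: limit.start()]
    if _SELECT_RE.match(body) is None:
        return sql
    n = limit.group(1)

    def add_top(select: re.Match) -> str:
        distinct = select.group(1) or ""
        return f"SELECT {distinct}TOP {n} "

    return _SELECT_RE.sub(add_top, body, count=1)


if __name__ == "__main__":
    print(limit_to_top("SELECT name FROM users ORDER BY id LIMIT 5"))
```

A string rewrite like this is fragile compared with telling the SQL generator which dialect to target, so it is best read as an illustration of the incompatibility rather than a recommended fix.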

Important Issues Summary

  • #638: Integration guidance for Java Spring AI.
  • #637: Formatting numerical results.
  • #636: Flask app bug affecting functionality.
  • #620: Connection errors with Azure OpenAI.
  • #507: Managing multiple domains with one API.

These issues reflect both user experience challenges and technical hurdles that could impact the adoption and effectiveness of Vanna as a tool for text-to-SQL generation.

Report On: Fetch pull requests



Overview

The analysis of the pull requests (PRs) for the Vanna project reveals a total of 9 open PRs, with a mix of feature additions, bug fixes, and refactoring efforts. The PRs demonstrate ongoing enhancements to the framework's capabilities, particularly in integrating new data sources and improving existing functionalities.

Summary of Pull Requests

  1. PR #539: feat: openrouter integration, added additional async methods

    • State: Open
    • Created: 71 days ago
    • Introduces asynchronous methods for MySQL and OpenRouter integration. The PR is significant due to its extensive changes (over 800 lines) and has received feedback suggesting it be split into smaller PRs for easier review.
  2. PR #621: fix: prevent user from deleting the last training data #613

    • State: Open
    • Created: 19 days ago
    • Addresses a UI issue where users could delete all training data, after which new data could not be added. This fix matters for maintaining user experience.
  3. PR #617: REPLACE MAGIC NUMBER WITH NAMED CONSTANT

    • State: Open
    • Created: 21 days ago
    • Replaces a hardcoded magic number with a named constant for better code readability and maintainability.
  4. PR #589: feat: opensearch supports document data update, query by table, embedding, etc.

    • State: Open
    • Created: 39 days ago
    • Adds support for document updates and queries in OpenSearch, enhancing the framework's compatibility with various data sources.
  5. PR #555: Feature/sqlite duckdb vector support

    • State: Open
    • Created: 62 days ago
    • Implements support for DuckDB and SQLite as vector stores, significantly expanding the database options available to users.
  6. PR #525: Add timeout to requests calls

    • State: Open
    • Created: 76 days ago
    • Introduces default timeout settings for HTTP requests to prevent indefinite hanging, improving robustness.
  7. PR #463: feat: add database engine and table name to support table ddl update

    • State: Open
    • Created: 108 days ago
    • Enhances the framework's ability to manage database schema changes dynamically.
  8. PR #460: Update base.py

    • State: Open
    • Created: 109 days ago
    • Focuses on improving MySQL connection handling through better resource management practices.
  9. PR #238: Vanna trulens performance metrics

    • State: Open
    • Created: 220 days ago
    • Introduces a performance evaluation script for assessing various components of the Vanna application.
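
Two of the smaller PRs above, the default request timeout (#525) and the named constant (#617), describe patterns that are easy to sketch together. The example below is an assumption-laden illustration, not the content of either PR: it uses the standard library's `urllib.request` rather than the `requests` package so that it runs without extra dependencies, and both `DEFAULT_TIMEOUT_SECONDS` and its value are made up for this sketch.

```python
import urllib.request
from typing import Optional

# Named constant instead of a bare literal scattered through call sites.
# The value is an assumption for this sketch, not taken from the PRs.
DEFAULT_TIMEOUT_SECONDS = 10.0


def fetch(url: str, timeout: Optional[float] = None) -> bytes:
    """Fetch a URL, always passing an explicit timeout to urlopen.

    Without a timeout, a stalled server can block the caller indefinitely.
    """
    effective = DEFAULT_TIMEOUT_SECONDS if timeout is None else timeout
    with urllib.request.urlopen(url, timeout=effective) as response:
        return response.read()


if __name__ == "__main__":
    # A data: URL keeps the demo runnable without network access.
    print(fetch("data:text/plain,hello"))
```

Callers can still override the default per call, for example `fetch(url, timeout=2.5)` for a latency-sensitive endpoint.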

Analysis of Pull Requests

The current state of open pull requests in the Vanna project indicates an active development environment with a focus on enhancing functionality and addressing user experience issues. A notable aspect is the emphasis on asynchronous programming (as seen in PR #539), which aligns with modern software development practices aimed at improving performance and responsiveness in applications that handle I/O operations extensively.

Several PRs are dedicated to fixing bugs or improving existing features, such as PR #621, which addresses a critical usability issue regarding training data management. This reflects a commitment to maintaining user satisfaction and operational integrity within the application.

Moreover, there is a clear trend towards expanding compatibility with various databases and services (e.g., PRs related to SQLite, DuckDB, OpenSearch), which enhances Vanna's versatility as a text-to-SQL generation tool. The addition of timeout settings in HTTP requests (PR #525) further indicates an awareness of potential pitfalls in network communications, aiming to enhance reliability.

However, some PRs are large and complex (e.g., PR #539, at over 800 lines), which could slow reviews and merges. Acting on the reviewers' suggestion to split such PRs into smaller ones would ease collaboration among contributors and maintainers.

Additionally, there are older PRs that remain open without significant activity or resolution (e.g., PR #238). This could indicate either a lack of prioritization or resource constraints within the team. Addressing these older requests could help streamline the project’s development process and ensure that valuable contributions do not stagnate.

In conclusion, while the current set of open pull requests showcases a robust effort towards continuous improvement and feature expansion in Vanna, attention must be given to managing the complexity of contributions effectively and ensuring timely resolutions to enhance overall project momentum.

Report On: Fetch commits



Repo Commits Analysis

Development Team and Recent Activity

Team Members and Recent Contributions

  1. Zain Hoda (zainhoda)

    • Recent Activity:
    • Merged multiple pull requests, including bug fixes and feature enhancements related to Azure AI, FAISS integration, and multi-turn conversations.
    • Made significant updates to the pyproject.toml file and various Python scripts, particularly in the base.py, azuresearch_vector.py, and faiss.py files.
    • Total of 6 commits with 185 changes across 5 files in the last 30 days.
    • Collaborations: Worked closely with Jaya Maheshwari, Mohammed Abbadi, Anush008, Nikhil Talreja, and others on various features and bug fixes.
  2. Jaya Maheshwari (Jaya-sys)

    • Recent Activity:
    • Contributed a commit that added text to the question parameter in azuresearch_vector.py.
    • Total of 1 commit with 4 changes across 1 file in the last 30 days.
    • Collaborations: Collaborated with Zain Hoda on the recent bug fix.
  3. Mohammed Abbadi (m7mdhka)

    • Recent Activity:
    • Contributed multiple updates to faiss.py, including integrating FAISS into VannaAI.
    • Total of 3 commits with 189 changes across 2 files in the last 30 days.
    • Collaborations: Worked with Zain Hoda on FAISS integration.
  4. Anush008

    • Recent Activity:
    • Merged a pull request that refactored the Qdrant query API.
    • Total of 1 commit with 19 changes across 1 file in the last 30 days.
    • Collaborations: Collaborated with Zain Hoda on Qdrant-related updates.
  5. Nikhil Talreja

    • Recent Activity:
    • Contributed a commit that added n_results for WeaviateDatabase.
    • Total of 1 commit with 5 changes across 1 file in the last 30 days.
  6. Sinju P

    • Recent Activity:
    • Fixed an issue related to Postgres not reconnecting after idle time.
    • Total of 1 commit with 62 changes across 1 file in the last 30 days.
  7. Luca Ordronneau (lucaordronneau)

    • Recent Activity:
    • No recent commits but had a merged pull request related to Azure search vector support.
  8. Dusens (dusens)

    • Recent Activity:
    • No recent commits but had a merged pull request related to adding QianwenAI functionality.
  9. Dufeng1010 (dufeng1010)

    • Recent Activity:
    • Added an ollama_timeout parameter in a recent commit.
    • Total of 1 commit with 4 changes across 1 file in the last 30 days.
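
The Postgres fix listed above concerns connections that go stale after idle time. The report does not show the actual change, so the following is only a generic sketch of one common approach, reconnecting and retrying once when an operation fails with a disconnect error. `ReconnectingRunner` and its parameters are hypothetical names, and a real driver would need its own disconnect check.

```python
from typing import Any, Callable, Optional


class ReconnectingRunner:
    """Run operations on a connection, reconnecting once if it has dropped."""

    def __init__(
        self,
        connect: Callable[[], Any],
        is_disconnect: Callable[[Exception], bool],
    ) -> None:
        self._connect = connect              # factory that opens a new connection
        self._is_disconnect = is_disconnect  # decides whether an error means "stale"
        self._conn: Optional[Any] = None

    def run(self, operation: Callable[[Any], Any]) -> Any:
        if self._conn is None:
            self._conn = self._connect()
        try:
            return operation(self._conn)
        except Exception as exc:
            if not self._is_disconnect(exc):
                raise
            # The connection went away (e.g. closed while idle): reopen, retry once.
            self._conn = self._connect()
            return operation(self._conn)


if __name__ == "__main__":
    class FakeConnection:
        def __init__(self, alive: bool) -> None:
            self.alive = alive

        def query(self) -> str:
            if not self.alive:
                raise ConnectionError("server closed the connection")
            return "ok"

    pool = iter([FakeConnection(False), FakeConnection(True)])
    runner = ReconnectingRunner(
        connect=lambda: next(pool),
        is_disconnect=lambda exc: isinstance(exc, ConnectionError),
    )
    print(runner.run(lambda conn: conn.query()))
```

Retrying only once keeps a genuinely unreachable server from causing an endless loop; errors that are not disconnects propagate unchanged.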

Patterns and Themes

  • Active Development: The team is actively merging pull requests and addressing bugs, particularly around integrations with Azure AI and FAISS.
  • Collaboration: There is significant collaboration among team members, particularly between Zain Hoda and other contributors for feature implementations and bug fixes.
  • Focus Areas: Recent activities indicate a strong focus on enhancing database interactions (e.g., Azure, Qdrant, Weaviate) and improving overall system stability (e.g., Postgres reconnections).
  • Feature Expansion: The team is continuously expanding features related to natural language processing capabilities within SQL generation, indicating ongoing enhancements to user experience.

Conclusion

The development team is engaged in active contributions towards improving Vanna's functionalities, focusing on integrations and bug fixes while collaborating effectively among members. The recent activities reflect a commitment to enhancing both performance and user interaction capabilities within the framework.