pg_duckdb, a Postgres extension integrating DuckDB's analytics engine, is seeing active development focused on compatibility with newer Postgres versions and addressing critical bugs. The project aims to enhance high-performance analytics within Postgres environments.
Recent issues and pull requests (PRs) in the duckdb/pg_duckdb
repository indicate a focus on compatibility enhancements and bug fixes. Notable issues include #121 regarding query cancellation failures and #120 about resource leaks causing crashes. These highlight ongoing stability concerns. Enhancement requests for spatial data support (#141) and concurrent access (#134) suggest expanding capabilities.
Jelte Fennema-Nio (JelteF)
make installcheck
efficiency.EXPLAIN ANALYZE
support.Y. (Y--)
mkaruza
Heikki Linnakangas (hlinnaka)
Thijs (Tishj)
Japin Li (japinli)
Jonathan Dance (wuputah)
Mark (Mytherin)
Timespan | Opened | Closed | Comments | Labeled | Milestones |
---|---|---|---|---|---|
7 Days | 7 | 7 | 11 | 5 | 1 |
30 Days | 26 | 12 | 43 | 14 | 2 |
90 Days | 41 | 23 | 68 | 27 | 2 |
All Time | 60 | 33 | - | - | - |
Like all software activity quantification, these numbers are imperfect but sometimes useful. Comments, Labels, and Milestones refer to those issues opened in the timespan in question.
Developer | Avatar | Branches | PRs | Commits | Files | Changes |
---|---|---|---|---|---|---|
Jelte Fennema-Nio | 7 | 22/18/1 | 38 | 122 | 33335 | |
mkaruza | 2 | 5/4/0 | 5 | 88 | 7906 | |
Y. | 6 | 8/8/0 | 26 | 51 | 6851 | |
Thijs | 3 | 4/3/0 | 22 | 28 | 2291 | |
Jonathan Dance (JD) | 1 | 5/5/0 | 5 | 3 | 313 | |
Mark | 1 | 1/1/0 | 1 | 1 | 203 | |
Heikki Linnakangas | 1 | 1/1/0 | 1 | 6 | 17 | |
Japin Li | 1 | 1/1/0 | 1 | 1 | 4 | |
liu shengsong (lss602726449) | 0 | 1/0/0 | 0 | 0 | 0 | |
None (brettgriffin-md) | 0 | 1/0/0 | 0 | 0 | 0 |
PRs: created by that dev and opened/merged/closed-unmerged during the period
The recent activity in the duckdb/pg_duckdb
GitHub repository shows a total of 27 open issues, with notable discussions surrounding compatibility enhancements and bug fixes. A significant focus has emerged on ensuring compatibility with newer versions of Postgres, particularly Postgres 15 and 16, as well as addressing critical bugs related to query execution and resource management. The presence of multiple enhancement requests indicates an active interest in expanding the extension's capabilities, particularly in spatial data support and concurrent access.
Several issues highlight urgent concerns, such as #121 regarding query cancellation not functioning correctly and #120 about resource leaks leading to crashes. These issues suggest that while the project is evolving, there are underlying stability and performance concerns that need to be addressed promptly. Additionally, the discussions reveal a theme of community engagement, with contributors actively seeking to resolve issues through pull requests and collaborative dialogue.
Issue #150: Add compatibility with Postgres 16 in Windows x86-64
Issue #145: Add compatibility with Postgres 15
Issue #141: Spatial extension support
Issue #134: Does this extension support concurrent access to DuckDB from multiple connections?
Issue #121: Query cancellation not working
Issue #120: Resource leak on error, eventual crash on out of memory
Issue #118: Error with extended query protocol
Issue #116: Case insensitivity mismatch between DuckDB and Postgres can cause issues
Issue #113: Use separated libduckdb.so rpm / deb
Issue #106: Misleading error when using read_xxx
functions with invalid URLs
This selection of issues illustrates both ongoing development efforts and critical areas requiring attention, particularly around compatibility and stability as the project integrates more deeply with Postgres functionalities.
The analysis of the pull requests (PRs) for the duckdb/pg_duckdb
repository reveals a total of 12 open PRs, with significant ongoing work aimed at enhancing the integration of DuckDB with PostgreSQL. The changes span various aspects, including architectural improvements, testing frameworks, and feature additions that support advanced query capabilities.
PR #147: Make DuckDB connection a singleton to support extended protocol
This PR introduces a singleton pattern for DuckDB connections per PostgreSQL connection, enabling support for the extended protocol. It addresses issues related to query execution consistency and error handling but raises concerns regarding potential state issues when errors occur.
PR #143: Add pytest-based testing framework
This PR adds a Python-based testing framework using pytest, which allows more flexible application-style testing compared to the existing pgregress
. It aims to enhance test coverage for features like EXPLAIN ANALYZE
.
PR #142: Add support for using http to connect S3
This PR proposes to allow HTTP connections to S3, expanding the flexibility of data access methods. Review comments suggest enhancements in default settings for SSL usage.
PR #140: Vendor in Postgres ruleutils and handle public.read_csv
This PR vendors in Postgres' ruleutils.c
, enabling better query generation for DuckDB compatibility. It aims to address search path issues and improve query handling.
PR #128: feat: update readme
A minor update to the README file, likely improving documentation clarity.
PR #97: Add the PostgresStorageExtension
to DuckDB
This PR introduces a new extension that is still in early development stages but aims to enhance storage capabilities within DuckDB.
PR #88: Local object cache for cached_httpfs extension
This PR implements caching mechanisms for remote objects accessed via HTTPFS, though it raises concerns about concurrency and locking behavior during caching operations.
PR #75: Minimal create table support
A work-in-progress PR that lays the groundwork for creating tables in DuckDB, indicating ongoing efforts to expand functionality.
PR #71: WIP IndexOnlyScan
A draft PR focused on implementing an index-only scan feature, showcasing ongoing performance optimization efforts.
PR #63: Parallel Scan with multiple processes
Another draft PR aimed at enhancing performance through parallel processing capabilities in scanning operations.
PR #57: SHLIB_LINK: prefer new libs to old libs
A minor fix addressing linker errors when older versions of libraries are present during builds.
PR #53: Docker image
This draft PR aims to create a Docker image for pg_duckdb, facilitating easier deployment and testing environments.
A total of 78 closed PRs indicate a robust development cycle with many successful merges addressing various issues, enhancements, and features. Notable closed PRs include those that added essential functionalities like EXPLAIN ANALYZE support (#136), filtering with VARCHAR (#137), and improvements in installation processes (#139).
The current landscape of open pull requests indicates a strong focus on enhancing the integration between DuckDB and PostgreSQL. The introduction of singleton connections (PR #147) is particularly significant as it addresses performance and consistency issues that arise from managing multiple connections. This architectural change not only improves efficiency but also aligns with modern best practices in resource management within database systems.
The addition of a pytest-based testing framework (PR #143) represents a strategic move towards improving code quality and reliability. By allowing more sophisticated application-level tests, this framework can help identify edge cases and ensure that new features do not introduce regressions. The emphasis on testing is further supported by other PRs focusing on specific functionalities such as EXPLAIN ANALYZE and filtering with VARCHAR.
Another notable theme is the ongoing effort to enhance data access capabilities through improved support for cloud storage solutions (e.g., HTTP connections to S3 in PR #142). This aligns with the project's goal of providing seamless integration with modern data architectures, allowing users to leverage cloud storage without complex ETL processes.
However, there are also potential concerns highlighted in several open PRs. For instance, the singleton connection approach may lead to issues if not managed correctly, particularly regarding error states (as noted in PR #147). Similarly, the caching mechanism proposed in PR #88 raises questions about concurrency control during simultaneous access attempts. These points underscore the importance of rigorous testing and validation as new features are integrated into the system.
The presence of multiple draft PRs focused on performance optimizations (e.g., parallel scans and index-only scans) indicates an active pursuit of efficiency improvements. These efforts are crucial as they directly impact user experience and system responsiveness, particularly in data-intensive applications where performance is paramount.
In conclusion, the current set of pull requests reflects a dynamic development environment focused on enhancing functionality, improving performance, and ensuring reliability within the pg_duckdb project. The community's engagement through discussions and reviews further strengthens this collaborative effort towards building a robust PostgreSQL extension that leverages DuckDB's powerful analytics capabilities.
Jelte Fennema-Nio (JelteF)
make installcheck
to utilize the -j
flag, enhancing build efficiency.EXPLAIN ANALYZE
and added tests for various features.Y. (Y--)
mkaruza
Heikki Linnakangas (hlinnaka)
Thijs (Tishj)
Japin Li (japinli)
Jonathan Dance (wuputah)
Mark (Mytherin)
Overall, the development team is actively engaged in enhancing the pg_duckdb project through collaborative efforts that focus on usability, performance, testing, and documentation improvements.