PaperQA2, a Python-based tool designed for answering questions from scientific documents, is experiencing issues with API key access and performance regressions, impacting user experience and functionality.
Recent issues and pull requests (PRs) reveal a focus on addressing critical bugs and enhancing documentation. Notable issues include #412, which highlights the unavailability of the Crossref API key, posing a significant obstacle for users. Issue #397 reports a critical bug related to missing models in the OpenAI API, affecting core functionalities. Additionally, performance concerns are raised in issue #408 regarding document ingestion speed.
The development team has been actively working on these challenges. Key contributions include LitQAv2TaskDataset, fixed BaseModel defaults, and .docx file support (#403), which addresses a previously missing feature and enhances document compatibility.
Timespan | Opened | Closed | Comments | Labeled | Milestones |
---|---|---|---|---|---|
7 Days | 20 | 52 | 43 | 2 | 1 |
30 Days | 24 | 53 | 46 | 6 | 1 |
90 Days | 32 | 57 | 57 | 13 | 1 |
1 Year | 64 | 65 | 126 | 38 | 1 |
All Time | 165 | 128 | - | - | - |
Like all software activity quantification, these numbers are imperfect but sometimes useful. Comments, Labels, and Milestones refer to those issues opened in the timespan in question.
Developer | Branches | PRs | Commits | Files | Changes |
---|---|---|---|---|---|
mskarlin | 6 | 15/12/3 | 29 | 76 | 43458 |
Andrew White | 2 | 12/13/0 | 25 | 61 | 20954 |
James Braza | 3 | 46/40/4 | 40 | 46 | 6763 |
Geemi Wellawatte | 2 | 3/3/0 | 19 | 9 | 626 |
Siddharth Narayanan | 2 | 2/1/1 | 7 | 8 | 145 |
Tyler Nadolski (nadolskit) | 1 | 1/0/0 | 4 | 3 | 99 |
Tabish Mir | 1 | 2/1/0 | 1 | 1 | 30 |
Yusuf (Yusufibin) | 0 | 0/0/1 | 0 | 0 | 0 |
Krish Dholakia (krrishdholakia) | 0 | 0/0/1 | 0 | 0 | 0 |
PRs: created by that dev and opened/merged/closed-unmerged during the period
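As a small illustration of how a cell in the PRs column reads, the sketch below splits a value such as 46/40/4 into its three counts. The `PRCounts` type and the `parse_pr_cell` helper are hypothetical names introduced only for this example; they are not part of any tool mentioned in this report.

```python
from typing import NamedTuple


class PRCounts(NamedTuple):
    """Hypothetical container for one cell of the PRs column."""

    opened: int
    merged: int
    closed_unmerged: int


def parse_pr_cell(cell: str) -> PRCounts:
    """Split an 'opened/merged/closed-unmerged' cell into integers."""
    parts = cell.strip().split("/")
    if len(parts) != 3:
        raise ValueError(f"expected three slash-separated counts, got {cell!r}")
    opened, merged, closed_unmerged = (int(p) for p in parts)
    return PRCounts(opened, merged, closed_unmerged)


# Example using the James Braza row from the developer table.
counts = parse_pr_cell("46/40/4")
print(counts.opened, counts.merged, counts.closed_unmerged)
```

Reading the cell this way makes it clear that the three numbers are independent tallies for the period, so merged can exceed opened when older PRs are merged.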
The GitHub repository for the Future-House/paper-qa project currently has 37 open issues, with recent activity indicating a mix of questions, documentation requests, and bugs. Notably, there are several urgent inquiries regarding API key issues and performance regressions after recent updates. A common theme among the issues is the transition to new versions and the associated challenges users face, particularly with embedding models and API integrations.
Several issues stand out due to their implications for user experience and functionality. For instance, Issue #397 highlights a critical bug related to a missing model in the OpenAI API, which could affect users relying on specific functionalities. Additionally, Issue #381 discusses rate limits encountered during document indexing, suggesting potential scalability concerns as users attempt to process larger datasets. The presence of multiple questions about documentation and usage (e.g., #409, #402) indicates that users may struggle with understanding how to effectively utilize the tool's capabilities.
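Rate limits of the kind described in Issue #381 are often handled on the client side by retrying with exponential backoff. The sketch below shows that generic pattern only; it is not taken from paper-qa, and `RateLimitError` and `call_with_backoff` are hypothetical names standing in for whatever error and wrapper a real client would use.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit error."""


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on RateLimitError with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            # Delay doubles each attempt; jitter spreads out concurrent retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    raise RuntimeError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the retry logic can be exercised without actually waiting. Whether backoff alone is enough for large indexing jobs depends on the provider's quota, which this report does not specify.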
Most Recently Created Issues:
Issue #412: Not possible to get Crossref API Key - This tool is no longer available
Issue #409: Documentation in CONTRIBUTING.md for pytest-recording and VCR cassettes
Issue #408: Bypassing Some Pre-processing and Validation Steps in Pipeline for Faster Document Ingestion
Issue #402: CLI functionality in Python module
Issue #399: Dependency Dashboard
Most Recently Updated Issues:
Issue #397: Error Code 404
Issue #393: Azure Open AI API KEY
Issue #392: Installation problem
Issue #391: How to increase the number of citations in the answers?
Issue #390: How to use Version 5 with LiteLLM and Ollama?
The recent activity in the Future-House/paper-qa GitHub repository reflects a dynamic environment where users are actively engaging with the tool while encountering various challenges related to updates, documentation clarity, and integration with APIs. The issues raised indicate a need for better guidance on using new features and addressing critical bugs that could hinder user experience.
The analysis of the pull requests (PRs) from the Future-House/paper-qa repository reveals a mix of ongoing enhancements, bug fixes, and documentation updates. The repository currently has 9 open PRs and 221 closed PRs, indicating active development with a focus on improving functionality and usability.
PR #411: Broken title search ut
Created by Tyler Nadolski. This PR addresses a failing unit test related to citation counts from different sources. It highlights discrepancies in citation counts based on the order of sources queried. Review comments suggest improvements in regex usage and code style.
PR #410: pytest-recording docs in CONTRIBUTING.md
Created by James Braza. This PR adds documentation for using the pytest-recording plugin, enhancing the contribution guidelines for developers.
PR #407: Promoting agent factories to Settings
Created by James Braza. This enhancement improves encapsulation by moving agent factories into a settings module, streamlining imports.
PR #403: Add support to read docx files
Created by Tabish Mir. This PR introduces functionality to parse .docx files, addressing a previously missing feature while also fixing case sensitivity issues in file handling.
PR #205: Add docx reader
Created by Nish (NISH1001). This older PR proposes adding a reader for .docx files but has not been updated since significant changes were made to the repository.
PR #177: Batch summarisation
Created by Zac Pullar-Strecker. This PR aims to enhance performance for batch summarization tasks but also requires rebasing due to recent changes in the main branch.
PR #131: Implement adversarial prompting
Created by David Brodrick. This PR introduces a method for adversarial prompting intended to enhance answer quality, but it needs updates following major refactoring in the repository.
PR #82: copy from Zotero storage
Created by Gabriel Simmons. This PR aims to improve file handling by copying PDFs from Zotero storage rather than downloading them, but it has not been updated since v5 was released.
PR #81: Better BibTeX citekeys
Created by Gabriel Simmons. This PR adds support for Better BibTeX citekeys but also requires rebasing due to recent changes.
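The case-sensitivity problem mentioned for PR #403 is commonly avoided by normalizing a file's suffix before choosing a parser. The sketch below illustrates that general idea under stated assumptions: `READERS` and `pick_reader` are hypothetical names for this example and do not reflect paper-qa's actual code or the change made in the PR.

```python
from pathlib import Path

# Hypothetical mapping from lowercase suffix to a reader label.
READERS = {
    ".pdf": "pdf",
    ".txt": "text",
    ".docx": "docx",
}


def pick_reader(path: str) -> str:
    """Return the reader label for a file, ignoring suffix case."""
    suffix = Path(path).suffix.lower()  # ".DOCX" and ".docx" are treated alike
    try:
        return READERS[suffix]
    except KeyError:
        raise ValueError(f"unsupported file type: {suffix or '(none)'}") from None


print(pick_reader("Report.DOCX"))
```

Lowercasing once at the dispatch point keeps the mapping small and avoids listing every capitalization of each extension.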
Recently closed PRs focused on type checking (mypy) and enhancing testing reliability.
The current landscape of pull requests in the Future-House/paper-qa repository reflects an active development cycle characterized by both ongoing enhancements and necessary bug fixes. The open pull requests indicate that contributors are focusing on critical areas such as improving test reliability (e.g., PR #411), enhancing documentation (e.g., PR #410), and adding new features like support for .docx files (e.g., PR #403).
One notable aspect is the presence of older pull requests that have not been merged or updated since significant changes were made to the repository (e.g., PR #205 and PR #177). This suggests potential challenges in maintaining alignment with the evolving codebase, which can lead to contributor frustration and hinder project momentum if not addressed promptly.
Moreover, there is an evident emphasis on improving code quality through type checking and linting processes, as seen in multiple closed pull requests that focus on passing mypy checks and integrating tools like pylint. This commitment to maintaining high code quality standards is commendable and essential for long-term project sustainability.
In terms of collaboration dynamics, review comments on several open pull requests show constructive feedback aimed at improving code practices and ensuring adherence to project standards (e.g., regex improvements suggested in PR #411). However, some contributors express difficulty keeping up with recent changes, indicating a need for better communication regarding major updates or changes in direction within the project.
The project's evolution from PaperQA1 to PaperQA2 signifies substantial architectural shifts aimed at enhancing performance and usability. The introduction of features like agentic workflows and a user-friendly CLI demonstrates an intent to cater to researchers' needs more effectively.
In conclusion, while the project exhibits robust activity levels with numerous contributions aimed at enhancing functionality and usability, it faces challenges related to managing older pull requests and ensuring contributors remain aligned with ongoing developments. Addressing these issues will be crucial for maintaining momentum and fostering a collaborative development environment moving forward.
James Braza (jamesbraza): LitQAv2TaskDataset for agent training/evaluation; fixed BaseModel defaults and crashes in chunk_text; factories-in-settings.
Andrew White (whitead): issue-366 branch.
Michael Skarlinski (mskarlin): test-speed.
Geemi Wellawatte (geemi725): issue-366 branch.
Siddharth Narayanan (sidnarayanan)
Tyler Nadolski (nadolskit)
Tabish Mir (taabishm2)
Yusufibin & Krrish Dholakia