The Dispatch

GitHub Repo Analysis: ranaroussi/yfinance


Executive Summary

The yfinance project is an open-source Python library for accessing financial data from Yahoo! Finance. It is maintained by a community of contributors and is actively developed. The project is currently stable but faces challenges related to API changes and data accuracy.

Of Note

  1. Unresolved PRs: Long-standing open PRs like #1879 need attention to prevent backlog accumulation.
  2. Testing Gaps: Some PRs lack comprehensive tests, risking code stability.
  3. Draft PRs Stagnation: Drafts like #1984 remain inactive, suggesting possible resource allocation issues.

Quantified Reports

Quantify issues



Recent GitHub Issues Activity

Timespan | Opened | Closed | Comments | Labeled | Milestones
7 Days   | 2      | 2      | 11       | 2       | 1
30 Days  | 10     | 9      | 25       | 10      | 1
90 Days  | 41     | 24     | 115      | 41      | 1
1 Year   | 180    | 112    | 552      | 180     | 1
All Time | 1343   | 1192   | -        | -       | -

Like all software activity quantification, these numbers are imperfect but sometimes useful. Comments, Labels, and Milestones refer to those issues opened in the timespan in question.
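As a rough cross-check of the closure rate quoted later in this report (around 62%), the ratio of closed to opened issues can be computed per timespan from the table above. This is only a sketch: the issues closed in a window are not necessarily the ones opened in it, so the ratio is an approximation, and the helper name `closure_rate` is invented for illustration.

```python
# Opened/closed counts copied from the issues table above.
issue_activity = {
    "7 Days": (2, 2),
    "30 Days": (10, 9),
    "90 Days": (41, 24),
    "1 Year": (180, 112),
    "All Time": (1343, 1192),
}


def closure_rate(opened: int, closed: int) -> float:
    """Closed issues as a percentage of opened issues (hypothetical helper)."""
    return 100.0 * closed / opened


closure_rates = {
    span: round(closure_rate(opened, closed), 1)
    for span, (opened, closed) in issue_activity.items()
}

for span, rate in closure_rates.items():
    print(f"{span:>8}: {rate:.1f}%")
```

The one-year figure (112 of 180) is the one that matches the roughly 62% closure rate cited in the risk ratings.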

Rate pull requests



2/5
The pull request introduces a minor change by adding a timeout parameter to an existing function call. While this can prevent potential hanging issues, the change is trivial and does not address the suggestion to refactor using `YfData` or target the `dev` branch. Additionally, the timeout value of 5 seconds seems arbitrary without justification. The PR lacks thoroughness and significance.
2/5
The pull request addresses a minor typographical error in the README.md file, changing 'an note' to 'a note'. While this correction improves the readability of the document, it is an insignificant change with minimal impact on the overall project. Such minor fixes are common and do not substantially enhance the documentation's quality or content. Therefore, this PR is rated as needing work due to its trivial nature.
3/5
The pull request adds type annotations to several methods, which is a positive step towards improving code clarity and maintainability. However, the changes are relatively minor and do not introduce any new functionality or significant improvements. The lack of thorough documentation or discussion on the necessity and impact of these changes also limits its significance. Overall, it's an average contribution that aligns with standard coding practices but lacks substantial impact.
3/5
The pull request adds a feature to scrape ETF top holdings, which is a useful addition. However, it lacks thorough documentation and unit tests, as noted in the comments. The commit history is messy, which complicates the review process. While the code appears functional and integrates with existing structures, the lack of clarity and testing reduces its quality. Overall, it's an average contribution that could be improved with better organization and additional testing.
3/5
The pull request addresses a user confusion issue by adding warnings when 'auto_adjust' is not set, which is a useful improvement. However, the changes are relatively minor: 24 changed lines (22 additions and 2 deletions). The PR is still in draft status and has not been updated in over 98 days, indicating potential neglect or lack of urgency. While it may resolve an existing issue (#1982), the significance of the change is modest, and it lacks comprehensive testing or documentation updates. Overall, it's an average PR with room for further enhancement.
3/5
The pull request addresses a specific issue by removing unnecessary warnings in the configuration file and adding conditional checks in the code to handle potential None values. The changes are minor and focused, improving code robustness without introducing new features or significant refactoring. While the PR is well-targeted and resolves the issue at hand, it lacks broader impact or complexity that would warrant a higher rating. The modifications are mostly syntactical and do not introduce new functionality or optimizations.
3/5
The pull request introduces a new method to extract company officers, which is a useful addition. However, the changes are minimal, affecting only a few lines of code. The implementation lacks thorough documentation and testing, which are essential for ensuring robustness and maintainability. Additionally, there is some friction in communication as seen in the comments, indicating potential collaboration issues. Overall, it is a functional but unremarkable update.
3/5
The pull request addresses a specific issue by adding error handling for a 'malformed' database error, which is a reasonable and necessary change. The code modification is minimal, with a clear try-except block to catch the specific error. However, the PR lacks detailed comments or documentation explaining the broader context of the change, and there's no test case added to ensure the fix works as intended. Overall, it's an average PR that solves a problem but could be improved with more thorough documentation and testing.
4/5
The pull request introduces a valuable search utility function that generalizes the existing `isin` search, allowing for more flexible query types. It addresses a gap in the `yfinance` library by enabling ticker searches directly, which was previously not supported. The implementation is clean and integrates well with existing code. The discussion in the comments highlights its usefulness and potential to reduce redundancy with existing functions like `get_news`. However, the lack of detailed documentation or examples in the README slightly detracts from its completeness.
4/5
The pull request addresses a specific issue with reading timezone information from earning dates, which is a significant fix for the functionality of the library. It includes improvements in handling ambiguous timezones and adds new tests to ensure the correctness of the changes. The PR is well-documented with comments and discussions that clarify the problem and solution. However, it lacks a broader review from requested reviewers, which could enhance its robustness. Overall, it's a well-executed and necessary improvement, but not exemplary.

Quantify commits



Quantified Commit Activity Over 14 Days

Developer                           | Branches | PRs   | Commits | Files | Changes
ValueRaider                         | 2        | 4/3/0 | 5       | 5     | 152
Eric Pien                           | 1        | 0/1/0 | 1       | 1     | 74
Yuhong Chen                         | 1        | 1/1/0 | 1       | 1     | 4
Ran Aroussi                         | 1        | 0/0/0 | 1       | 1     | 1
Andrii Shkabrii (shkabrii)          | 0        | 1/0/1 | 0       | 0     | 0
Ikko Eltociear Ashimine (eltociear) | 0        | 1/0/0 | 0       | 0     | 0
Sai Roopesh (Sai-Roopesh)           | 0        | 1/0/1 | 0       | 0     | 0
Nikola Milosevic (nikolamilosevic86)| 0        | 1/0/0 | 0       | 0     | 0

PRs: created by that dev and opened/merged/closed-unmerged during the period

Quantify risks



Project Risk Ratings

Risk | Level (1-5) | Rationale
Delivery | 3 | The project faces a moderate delivery risk due to unresolved critical issues like #2093, which affects Python 3.10 users. The backlog of unresolved issues and the lack of milestones further contribute to this risk. However, the team's structured approach to prioritizing high-priority issues and recent improvements in issue handling suggest some mitigation.
Velocity | 3 | Velocity is moderate, with concentrated contributions from key members like ValueRaider. The closure rate of issues remains around 62%, indicating a backlog. The disparity in contribution levels among team members and unresolved draft PRs like #1984 suggest potential bottlenecks.
Dependency | 4 | Dependency risks are significant due to reliance on external data sources and APIs, as highlighted by issues like #1982 and #1940. Rate limits and HTTP errors (#602) further underscore these risks, which could affect software reliability if not managed effectively.
Team | 3 | Team dynamics show potential risks with concentrated contributions from a few individuals and contributor disagreements noted in PR #2085. This could impact velocity and delivery if key contributors face bottlenecks or conflicts arise.
Code Quality | 2 | Code quality is generally good, with efforts to enhance clarity through type annotations and attention to documentation. However, some PRs lack thorough documentation and testing, which could impact maintainability.
Technical Debt | 3 | Technical debt is moderate, with ongoing efforts to improve code quality and reduce redundancy. However, unresolved PRs and lack of comprehensive tests in some areas indicate challenges in managing technical debt effectively.
Test Coverage | 3 | Test coverage appears moderate, with implied testing through structured code but lacking explicit references to test cases or frameworks. This uncertainty could lead to undetected bugs or regressions.
Error Handling | 3 | Error handling is robust in some areas, with specific exceptions and logging mechanisms. However, the lack of tests or documentation for error handling improvements in PRs like #2088 may hinder effectiveness.

Detailed Reports

Report On: Fetch issues



Recent Activity Analysis

The yfinance project has seen a range of issues related to data accuracy, API changes, and feature requests. Notable anomalies include frequent errors with specific tickers, inconsistencies in data retrieval, and issues with the handling of certain financial metrics. A recurring theme is the challenge of adapting to changes in Yahoo's API and ensuring data consistency across different environments.

Notable Issues

  • Syntax Errors: Issue #2093 highlights a syntax error affecting Python 3.10 users, causing import failures.
  • Data Inconsistencies: Multiple issues report discrepancies in historical data (#1982) and missing or incorrect financial metrics (#1940).
  • API Limitations: Users frequently encounter rate limits and HTTP errors when making multiple requests (#602).
  • Feature Requests: There are ongoing requests for enhanced functionality, such as accessing historical financial metrics and economic calendars (#265, #601).
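For the rate-limit and HTTP-error reports listed above, a common client-side mitigation is to retry with exponential backoff. The sketch below is not part of yfinance: `TransientError`, `retry_with_backoff`, and its parameters are hypothetical names, and the `fetch` callable stands in for whatever request the caller is making.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for a rate-limit or temporary HTTP failure."""


def retry_with_backoff(
    fetch: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), retrying up to `retries` times on TransientError.

    The wait doubles after each failure: base_delay, 2*base_delay, ...
    """
    attempt = 0
    while True:
        try:
            return fetch()
        except TransientError:
            if attempt >= retries:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** attempt)
            attempt += 1
```

Passing `sleep` as a parameter keeps the helper testable without real delays; in normal use the default `time.sleep` applies.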

Common Themes

  • Data Accuracy: Many issues focus on the accuracy and completeness of the data retrieved from Yahoo Finance.
  • API Changes: Adjustments to Yahoo's API often lead to temporary disruptions in data access.
  • User Experience: Users seek improvements in error handling, documentation, and feature availability.

Issue Details

Recently Created Issues

  1. #2093: Syntax error with f-string in Python 3.10.

    • Priority: High
    • Status: Open
    • Created: 2 days ago
    • Updated: 1 day ago
  2. #2086: IndexError when retrieving NASDAQ Composite data.

    • Priority: Medium
    • Status: Open
    • Created: 2 days ago
  3. #2084: Inability to retrieve PE Ratios.

    • Priority: Low
    • Status: Open
    • Created: 10 days ago
    • Updated: Today

Recently Updated Issues

  1. #2046: IndexError due to incorrect timeseries dates.

    • Priority: Medium
    • Status: Closed
    • Created: 48 days ago
    • Updated: 5 days ago
  2. #1940: Error parsing holders JSON data.

    • Priority: Medium
    • Status: Closed
    • Created: 156 days ago
    • Updated: 6 days ago
  3. #1909: Help function not working for yf.download.

    • Priority: Low
    • Status: Closed
    • Created: 187 days ago
    • Updated: 10 days ago

These issues reflect ongoing efforts to maintain compatibility with Yahoo's API and address user-reported bugs and feature requests. The community remains active in reporting problems and suggesting enhancements, contributing to the project's continuous development.

Report On: Fetch pull requests



Analysis of Pull Requests for yfinance

Open Pull Requests

#2097: Minor README Update

  • Details: A trivial fix to correct a typo in the README.
  • Significance: Low impact, can be merged quickly.

#2088: Handle 'malformed' Error in db.connect()

  • Details: Addresses an error related to database connections, potentially fixing issue #1573.
  • Significance: Important for users experiencing this error. Needs review and testing.
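The kind of guard described for #2088 can be sketched as follows. This is an illustration under assumptions, not the PR's code: it uses the standard-library `sqlite3` module directly, the function name `connect_or_reset` is invented, and the recovery strategy (delete and recreate the file) is only reasonable for a cache that can be rebuilt.

```python
import os
import sqlite3


def connect_or_reset(path: str) -> sqlite3.Connection:
    """Open a SQLite cache file, recreating it if the file is unreadable."""
    conn = sqlite3.connect(path)
    try:
        # sqlite3.connect() is lazy, so force a read to surface corruption here.
        conn.execute("SELECT count(*) FROM sqlite_master").fetchone()
        return conn
    except sqlite3.DatabaseError as exc:
        conn.close()
        if isinstance(exc, sqlite3.OperationalError):
            # Locked or unreadable-for-now databases are not corruption.
            raise
        # Typical messages here: "database disk image is malformed",
        # "file is not a database". Assumption: the cache is disposable.
        os.remove(path)
        return sqlite3.connect(path)
```

Re-raising `OperationalError` keeps transient conditions such as a locked database from wiping a healthy cache.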

#2085: Extract Company Officers/Executives

  • Details: Adds functionality to extract company officers or executives.
  • Comments: Some friction between contributors; guidance on Git practices suggested.
  • Significance: Useful feature addition but requires resolution of contributor disagreements.

#2079: Fix "reportOptionalIterable"

  • Details: Fixes code flagged under "reportOptionalIterable", a Pyright type-checker diagnostic for iterating over a value that may be None.
  • Comments: Suggestion to switch branches for code changes.
  • Significance: Minor fix, but necessary for code stability.
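"reportOptionalIterable" is the name of a Pyright check that fires when code iterates over a value typed as possibly None. A minimal sketch of the usual fix follows; the function and its parameter are hypothetical and are not taken from the PR.

```python
from typing import List, Optional


def total_volume(volumes: Optional[List[int]]) -> int:
    """Sum a list of volumes, treating a missing list as empty."""
    # Iterating over None would raise TypeError at runtime, and a type
    # checker flags it statically, so handle the None case explicitly.
    if volumes is None:
        return 0
    return sum(volumes)


print(total_volume(None), total_volume([100, 250]))
```

An explicit `is None` check is preferable to `volumes or []` when an empty list and a missing list should be distinguishable elsewhere in the code.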

#2010: Fix Timezone Reading from Earning Dates

  • Details: Improves handling of timezone data in earnings dates.
  • Comments: Extensive discussion on handling timezones and delisted symbols.
  • Significance: Important for accurate data representation, needs careful testing.

#1984: Price Auto Adjust Confusion

  • Details: Draft PR to address user confusion regarding auto_adjust.
  • Significance: Addresses user experience issues but remains in draft status.
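A warning of the kind described for #1984 is often implemented with a `None` sentinel, so the code can tell "not set" apart from an explicit choice. The sketch below is hypothetical: `resolve_auto_adjust` is an invented helper, and the fallback value of `True` is an assumption made for illustration, not a statement about yfinance's actual default.

```python
import warnings
from typing import Optional

ASSUMED_DEFAULT = True  # assumption for this sketch only


def resolve_auto_adjust(auto_adjust: Optional[bool] = None) -> bool:
    """Return the effective setting, warning if the caller left it unset."""
    if auto_adjust is None:
        warnings.warn(
            f"auto_adjust was not set; using {ASSUMED_DEFAULT}. "
            "Pass auto_adjust explicitly to silence this warning.",
            UserWarning,
            stacklevel=2,  # point the warning at the caller's line
        )
        return ASSUMED_DEFAULT
    return auto_adjust
```

Callers who pass `auto_adjust=True` or `auto_adjust=False` see no warning, so the message only reaches users relying on the implicit behavior.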

#1949: Add Search Utility

  • Details: Generalizes isin search to support new queries.
  • Comments: Discussion on redundancy with existing functions like Ticker.news.
  • Significance: Enhances search capabilities, needs further integration consideration.

Notable Closed Pull Requests

#2094: Fix Malformed f-string in Release 0.2.45

  • Details: Quick fix for a syntax error that broke the release.
  • Significance: Critical fix, promptly merged to restore functionality.
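The report does not show the offending line, so the following is only an illustration of one well-known way an f-string can parse on a newer interpreter and fail on Python 3.10: reusing the outer quote character inside a replacement field, which became legal in Python 3.12 (PEP 701). The source strings are compiled at runtime so the example itself loads on any version.

```python
import sys


def compiles(source: str) -> bool:
    """Return True if `source` parses on the running interpreter."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# Reuses the outer double quote inside the braces: valid only on 3.12+.
nested = 'd = {"k": 1}\nmsg = f"value: {d["k"]}"\n'
# Portable spelling: switch to single quotes inside the braces.
portable = 'd = {"k": 1}\nmsg = f"value: {d[\'k\']}"\n'

print(sys.version_info[:2], compiles(nested), compiles(portable))
```

Running the test suite on the oldest supported interpreter before a release is one way to catch this class of error early.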

#2066: Implement Screener Feature

  • Details: Introduces a new Screener class for filtering securities.
  • Significance: Major feature addition enhancing market screening capabilities.

#2058: Support Sector and Industry Data

  • Details: Adds support for fetching sector and industry data.
  • Significance: Expands the library's data access capabilities significantly.

#2041: Support for Funds Data

  • Details: Adds functionality to fetch funds-related data.
  • Significance: Addresses multiple user requests, broadening the library's scope.

Concerns and Observations

  1. Unmerged Open PRs with Long Duration:

    • PRs like #1879 (Add timeout to session.get()) and others have been open for extended periods without resolution. These may need prioritization or closure if no longer relevant.
  2. Contributor Friction and Guidance Needed:

    • Instances of contributor disagreements (e.g., #2085) suggest a need for clearer contribution guidelines or mediation by maintainers.
  3. Draft PRs Remaining Stagnant:

    • Several draft PRs (e.g., #1984) remain open without progress. Consider closing or prompting updates from authors.
  4. Documentation and Testing Gaps:

    • Some PRs lack sufficient documentation or tests (e.g., #2023). Ensuring these are addressed before merging is crucial for maintaining code quality.
  5. Feature Integration Considerations:

    • New features like the Screener (#2066) and sector/industry support (#2058) require careful integration into existing workflows to avoid redundancy and ensure usability.
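For the timeout change mentioned under #1879, one way to avoid a hard-coded, unexplained value is to expose the timeout as a parameter with a documented default. The sketch below is hypothetical: `fetch` and `SupportsGet` are invented names, and the 5-second default simply mirrors the value quoted in the ratings earlier in this report (assumed here to refer to the same PR). `requests.Session.get` accepts a `timeout` keyword, so a `requests.Session` would satisfy this protocol.

```python
from typing import Any, Protocol


class SupportsGet(Protocol):
    """Anything with a get(url, timeout=...) method, e.g. a requests.Session."""

    def get(self, url: str, *, timeout: float) -> Any: ...


DEFAULT_TIMEOUT = 5.0  # seconds; illustrative default, overridable per call


def fetch(session: SupportsGet, url: str, timeout: float = DEFAULT_TIMEOUT) -> Any:
    """Issue a GET that cannot hang indefinitely."""
    return session.get(url, timeout=timeout)
```

Callers on slow links can then pass a larger `timeout` without patching the library.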

Overall, while the project shows active development with significant feature additions, attention to unresolved PRs, contributor collaboration, and comprehensive testing will enhance stability and community engagement.

Report On: Fetch Files For Assessment



Source Code Assessment

CHANGELOG.rst

  • Content: The changelog provides a comprehensive history of updates, features, fixes, and maintenance changes. It is well-structured with clear versioning.
  • Quality: The entries are detailed, providing references to issues and contributors. This is beneficial for tracking the project's evolution and understanding recent changes.
  • Structure: Organized by version number, with sections for features, fixes, and maintenance. This makes it easy to navigate.
  • Insights: Frequent updates indicate active development. The inclusion of GitHub issue numbers allows for easy cross-referencing.

yfinance/version.py

  • Content: Contains a single line defining the current version of the package.
  • Quality: Simple and effective for tracking the version programmatically.
  • Structure: Direct and minimalistic, which is appropriate for its purpose.

yfinance/screener/screener.py

  • Content: Implements a Screener class for querying Yahoo Finance data using predefined or custom query bodies.
  • Quality:
    • Uses type hints, enhancing readability and maintainability.
    • Includes error handling for invalid keys and missing/extra keys in query bodies.
    • Utilizes properties for encapsulation, which is good practice.
  • Structure: Well-organized with methods logically grouped. Private methods are prefixed with an underscore, following Python conventions.
  • Insights: The use of f-strings and consistent error logging indicates modern Python practices.
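The combination described above (a property that validates a query body and reports missing or extra keys) can be sketched as below. This is not the yfinance implementation: the class name `QueryBody` and the key set in `REQUIRED_KEYS` are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict

# Hypothetical key set; the real Screener's keys are not given in this report.
REQUIRED_KEYS = frozenset({"offset", "size", "query"})


class QueryBody:
    """Holds a query body and validates it whenever it is assigned."""

    def __init__(self, body: Dict[str, Any]) -> None:
        self._body: Dict[str, Any] = {}
        self.body = body  # route through the setter so validation runs

    @property
    def body(self) -> Dict[str, Any]:
        # Return a copy so callers cannot bypass validation by mutation.
        return dict(self._body)

    @body.setter
    def body(self, value: Dict[str, Any]) -> None:
        missing = REQUIRED_KEYS - set(value)
        extra = set(value) - REQUIRED_KEYS
        if missing or extra:
            raise ValueError(
                f"missing keys: {sorted(missing)}; extra keys: {sorted(extra)}"
            )
        self._body = dict(value)
```

Reporting both missing and extra keys in one message lets a user fix the body in a single pass.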

yfinance/scrapers/history.py

  • Content: Handles fetching and processing historical market data from Yahoo Finance.
  • Quality:
    • Extensive use of comments and docstrings improves understandability.
    • Implements robust error handling and logging, crucial for debugging data fetching operations.
    • The code is complex but modularized into methods that handle specific tasks like data fetching, parsing, and repairing.
  • Structure:
    • The class is large but broken down into logical sections. However, further refactoring could improve readability and maintainability.
    • Uses utility functions from other modules to keep the code DRY (Don't Repeat Yourself).
  • Insights: The presence of detailed logging suggests a focus on traceability and debugging.

tests/test_screener.py

  • Content: Unit tests for the Screener class using Python's unittest framework.
  • Quality:
    • Tests cover key functionalities such as setting default bodies, handling predefined bodies, and patching bodies.
    • Use of mocking (unittest.mock) to simulate external dependencies like network requests enhances test reliability.
  • Structure:
    • Tests are well-organized with descriptive method names indicating their purpose.
    • Setup using setUpClass ensures efficient initialization across tests.
  • Insights: The presence of these tests indicates a commitment to ensuring code quality through automated testing.
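The testing pattern described above (`setUpClass` for shared setup, `unittest.mock` to replace network access) looks roughly like the sketch below. `QuoteClient` and its methods are hypothetical and exist only to give the test something to patch; they are not part of yfinance.

```python
import unittest
from unittest.mock import patch


class QuoteClient:
    """Hypothetical client; _fetch stands in for a real network call."""

    def _fetch(self, symbol: str) -> dict:
        raise RuntimeError("network access is not available in tests")

    def last_price(self, symbol: str) -> float:
        return float(self._fetch(symbol)["price"])


class TestQuoteClient(unittest.TestCase):
    @classmethod
    def setUpClass(cls) -> None:
        # Created once and shared by every test in the class.
        cls.client = QuoteClient()

    def test_last_price_uses_fetched_payload(self) -> None:
        # Replace the network call with a canned response.
        with patch.object(
            QuoteClient, "_fetch", return_value={"price": "101.5"}
        ) as fake_fetch:
            self.assertEqual(self.client.last_price("ABC"), 101.5)
        fake_fetch.assert_called_once_with("ABC")
```

Such a test can be run with `python -m unittest` and never touches the network, so it stays deterministic when the upstream service changes.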

Overall, the codebase demonstrates good practices in terms of organization, error handling, and testing. Active development is evident from frequent updates in the changelog. The use of modern Python features like type hints and f-strings enhances code readability and maintainability.

Report On: Fetch commits



Development Team and Recent Activity

Team Members and Their Activities

ValueRaider

  • Commits: 5 commits with 152 changes across 5 files.
  • Recent Work:
    • Released versions 0.2.45 and 0.2.46.
    • Merged several pull requests, including fixes for malformed f-strings.
    • Implemented screener feature and improved dividend repair logic.
    • Collaborated with FX196, ericpien, and others on various features and fixes.

FX196 (Yuhong Chen)

  • Commits: 1 commit with 4 changes in yfinance/screener/screener.py.
  • Recent Work:
    • Fixed a malformed f-string in the release 0.2.45.
  • Collaboration: Worked with ValueRaider on fixing f-string issues.

Eric Pien

  • Commits: 1 commit with 74 changes in README.md.
  • Recent Work:
    • Simplified the README for the screener feature.
  • Collaboration: Contributed to the implementation of the screener feature alongside ValueRaider.

Ran Aroussi

  • Commits: 1 commit with minor changes in README.md.
  • Recent Work:
    • Updated README documentation.

Patterns, Themes, and Conclusions

  • Active Development: The team is actively working on both bug fixes and new features, as evidenced by recent commits and version releases.
  • Collaboration: There is significant collaboration among team members, particularly involving ValueRaider who appears to be leading or coordinating many of the recent efforts.
  • Focus Areas: Recent activities have focused on improving existing functionalities like dividend repair and introducing new features such as the screener module.
  • Documentation Updates: Efforts are being made to keep documentation up-to-date with recent changes, as seen in README updates by multiple contributors.

Overall, the development team is engaged in continuous improvement of the project, addressing both technical debt and expanding functionality.