Firecrawl, an API service by Mendable.ai designed for web crawling and data extraction, has seen a surge in issue reporting and user engagement despite a lack of commits or pull requests in the last 30 days.
The project currently has 68 open issues, with several new ones created in the past few days. These issues primarily focus on bugs related to encoding errors (#547) and scraping failures (#540), as well as feature requests like automatic retries for failed requests (#518). The high volume of issue reporting suggests a growing user base encountering challenges with the tool's current capabilities.
The development team, consisting of members like Nicolas (nickscamara) and Gergő Móricz (mogery), has not made any new commits recently. Their previous work involved significant contributions to API controllers and services, focusing on functionality improvements and bug fixes. The absence of recent activity may indicate a pause in development or a shift in focus to addressing existing issues.
Developer | Branches | PRs | Commits | Files | Changes |
---|---|---|---|---|---|
Eric Ciarla | 2 | 1/1/0 | 7 | 65 | 55526 |
Gergő Móricz | 2 | 3/4/0 | 73 | 66 | 14815 |
Nicolas | 7 | 12/12/0 | 80 | 73 | 6794 |
Rafael Miller | 7 | 17/11/1 | 42 | 116 | 5809 |
Kent (Chia-Hao), Hsu | 1 | 2/2/0 | 3 | 11 | 1267 |
Kevin Swiber | 1 | 2/1/0 | 1 | 1 | 38 |
Thomas Kosmas | 1 | 0/0/0 | 2 | 1 | 33 |
Quan Ming | 1 | 1/1/0 | 3 | 2 | 10 |
tak-s | 1 | 1/1/0 | 2 | 2 | 9 |
Yuki Matsukura | 1 | 2/1/0 | 1 | 1 | 1 |
Alfred Nutile (alnutile) | 0 | 1/0/0 | 0 | 0 | 0 |
Matt Joyce (mattjoyce) | 0 | 0/0/1 | 0 | 0 | 0 |
Cherilyn Buren (NiuBlibing) | 0 | 0/1/0 | 0 | 0 | 0 |
darker (Sanix-Darker) | 0 | 0/0/1 | 0 | 0 | 0 |
Jakob Stadlhuber (JakobStadlhuber) | 0 | 2/1/1 | 0 | 0 | 0 |
None (dependabot[bot]) | 0 | 18/0/18 | 0 | 0 | 0 |
PRs: created by that dev and opened/merged/closed-unmerged during the period
Timespan | Opened | Closed | Comments | Labeled | Milestones |
---|---|---|---|---|---|
7 Days | 14 | 5 | 20 | 1 | 1 |
30 Days | 49 | 36 | 96 | 7 | 1 |
90 Days | 183 | 143 | 414 | 27 | 1 |
All Time | 251 | 183 | - | - | - |
Like all software activity quantification, these numbers are imperfect but sometimes useful. Comments, Labels, and Milestones refer to those issues opened in the timespan in question.
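As a quick sanity check on the issue table, the figures can be turned into a closed-to-opened ratio and a net backlog per timespan. The sketch below only re-uses the numbers from the table; the names `issue_activity`, `close_ratio`, and `net_open` are illustrative and not part of any Firecrawl or GitHub tooling.

```python
# Issue counts copied from the table above: (opened, closed) per timespan.
issue_activity = {
    "7 Days": (14, 5),
    "30 Days": (49, 36),
    "90 Days": (183, 143),
    "All Time": (251, 183),
}


def close_ratio(opened: int, closed: int) -> float:
    """Issues closed in a timespan as a fraction of issues opened in it."""
    return closed / opened


def net_open(opened: int, closed: int) -> int:
    """How much the open-issue backlog grew during the timespan."""
    return opened - closed


for span, (opened, closed) in issue_activity.items():
    ratio = close_ratio(opened, closed)
    backlog = net_open(opened, closed)
    print(f"{span}: closed/opened = {ratio:.0%}, net open = {backlog}")
```

The all-time row gives 251 - 183 = 68, which agrees with the 68 open issues cited in the report. The 7-day ratio is noticeably lower than the 30-day and 90-day ratios, which is consistent with new issues arriving faster than they are being closed in the most recent week.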
The Firecrawl project has seen a significant amount of recent activity, with 68 open issues currently logged. Notably, several issues have been created or updated within the last few days, indicating active engagement from both users and developers. Common themes among these issues include bugs related to encoding and scraping failures, questions about functionality, and feature requests aimed at enhancing the tool's capabilities.
Several issues stand out due to their urgency or complexity. For instance, Issue #540 regarding the failure to scrape content from a specific URL has been marked as high-priority, highlighting the need for immediate attention. Additionally, there are recurring reports of encoding problems, particularly with non-English websites (e.g., Issue #547), which could affect the tool's usability in diverse contexts.
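To illustrate the kind of encoding problem reported in Issue #547, here is a minimal sketch of how a scraper might choose a character set before decoding a page. It is not Firecrawl's implementation; `pick_charset` and `decode_html` are hypothetical helpers, and the strategy (HTTP header first, then a `<meta>` tag, then UTF-8) is an assumption about one reasonable approach.

```python
import re
from typing import Optional


def pick_charset(content_type: Optional[str], body: bytes, default: str = "utf-8") -> str:
    """Guess the charset from the Content-Type header, then from a <meta> tag."""
    if content_type:
        header_match = re.search(r'charset="?([\w-]+)', content_type, re.IGNORECASE)
        if header_match:
            return header_match.group(1).lower()
    # Only inspect the start of the document, where <meta charset> normally lives.
    meta_match = re.search(rb"<meta[^>]+charset=[\"']?([\w-]+)", body[:2048], re.IGNORECASE)
    if meta_match:
        return meta_match.group(1).decode("ascii").lower()
    return default


def decode_html(content_type: Optional[str], body: bytes) -> str:
    """Decode a response body, falling back to lossy UTF-8 if the guess fails."""
    charset = pick_charset(content_type, body)
    try:
        return body.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown codec name or undecodable bytes: keep what can be kept.
        return body.decode("utf-8", errors="replace")
```

A page served as GBK without a charset in its header would be decoded through the `<meta>` branch here. Note that a wrong but permissive charset (for example `latin-1`) decodes without raising, so this sketch cannot detect every mislabeled page.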
Another theme is the discussion around improving user experience by adding features such as automatic retries for failed requests (Issue #518) and better handling of JavaScript-rendered pages (Issue #543). The presence of multiple questions about functionality also suggests that users may require more guidance on how to effectively utilize Firecrawl's features.
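The automatic-retry request in Issue #518 can be pictured with a small retry loop using exponential backoff. This is a generic sketch rather than the behavior proposed in the issue or anything in Firecrawl's codebase; `retry_with_backoff`, its parameters, and the choice to retry on any exception are assumptions made for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fetch: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() up to `attempts` times, doubling the wait after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:
            # Out of attempts: surface the last error to the caller.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

With the defaults, a link that fails twice and then succeeds would wait 0.5 s and then 1.0 s before the third attempt. A real crawler would likely retry only on transient errors (timeouts, 5xx responses) and cap the total delay; the `sleep` parameter exists so the waiting can be stubbed out in tests.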
Issue #548: [BUG] Getting 408 when trying to run firecrawl locally
Issue #547: [BUG] The encoding is not correct for some Chinese language sites
Issue #546: [Question] Do you support crawling pages requires login?
Issue #545: [BUG] Doesn't work on /scrape
Issue #544: [Feat] Send "cancel" to fire-engine on timeout
Issue #540: [BUG] https://www.solvhealth.com/privacy Only main content causing no content to be returned?
Issue #538: Strange behaviors in concurrency
Issue #519: [Feat] What will happen to the links that uses authentication services?
Issue #518: [Feat] Add automatic retries to failed links on crawl
Issue #517: [Feat] Run actions like clicking or scrolling on page before extraction
The recent activity in Firecrawl's GitHub repository reflects a dynamic environment with numerous bugs being reported and addressed, alongside feature requests aimed at improving user experience and functionality. Key issues revolve around encoding problems, scraping failures, and enhancements for crawling capabilities, indicating areas where users seek more robust solutions or clearer documentation.
The analysis of the pull requests (PRs) for the Firecrawl project reveals a total of 16 open PRs, reflecting ongoing development efforts focused on bug fixes, feature enhancements, and integration improvements. The activity indicates a collaborative environment with significant contributions from multiple developers.
PR #542: [Bug] Fixed go sdk workflow
PR #541: [Feat] Added attempts to sdks for db saving time
PR #535: fix docker compose port setting
PR #527: [V1] Release
PR #516: Ensuring USE_DB_AUTHENTICATION is true in single URL scraper
PR #505: Add another Open-Source Integration
PR #493: [Feat] Added llama parser sdk and timeout for scrape
PR #373: [Feat]: Add RUST SDK client for firecrawl API
PR #355: feat: small room optimisation of the apps/api Dockerfile image
PR #438: [Feat] Added rate limit singleton for redis
PR #389: [Feat] Proposal to resolve the redirect url
PR #344: Adds support for npm i firecrawl
PR #343: Adds support for pip install firecrawl
PR #280: Add rendering service to improve scalability
PR #278: Usage billing support for overuse
PR #10: Categorize gitignore items
The recent activity in the Firecrawl repository shows a strong focus on improving both functionality and usability across various aspects of the project. A notable trend is the introduction of new SDKs and integrations, such as the Go SDK (#542) and Rust SDK (#373), which broaden the project's appeal to developers using different programming languages. This aligns with Firecrawl's goal of being an accessible web crawling solution across multiple platforms.
Another significant theme is addressing bugs and enhancing security measures, as seen in PRs like #516 (ensuring proper database authentication) and #541 (optimizing database saving times). These enhancements are critical as they directly impact user trust and application reliability.
There is also an emphasis on optimizing performance: reducing Docker image sizes (#355), implementing rate limiting with Redis (#438), and enhancing logging capabilities (#496). These optimizations are essential in maintaining efficient operations as user demand grows.
The discussions within PR comments reveal an active community engagement where contributors provide feedback and suggestions, fostering collaboration among developers. For instance, discussions around PR #535 regarding Docker Compose settings highlight the importance of clear communication in resolving technical issues collaboratively.
However, there are some closed PRs that were not merged due to overlapping functionalities or because they were superseded by other changes (e.g., PRs #506 and #527). This indicates a need for better coordination among team members to avoid redundancy in efforts and streamline contributions effectively.
Overall, the current state of pull requests reflects a healthy development cycle characterized by active contributions aimed at enhancing functionality, fixing bugs, and improving overall user experience while maintaining robust community engagement practices.
Nicolas (nickscamara)
Gergő Móricz (mogery)
Rafael Miller (rafaelsideguide)
Thomas Kosmas (tomkosm)
Eric Ciarla (ericciarla)
Yuki Matsukura (matsubo)
Kent Hsu (KentHsu)
Kevin Swiber (kevinswiber)
Quan Ming (wahpiangle)
Caleb Peffer (calebpeffer)
Tak-S (tak-s)
Dependabot
… v1-webscraper branch, particularly in enhancing the /map endpoint and adding tests.
… queue-worker, addressing race conditions, logging issues, and job success propagation.
… v1-webscraper branch, including websocket functionality for crawl status.
Overall, the development team is engaged in a productive cycle of feature enhancement, testing, and collaboration aimed at improving the Firecrawl project.