The Dispatch

GitHub Repo Analysis: Skyvern-AI/skyvern


Skyvern Project Technical Report

Introduction

Skyvern is a software project that automates browser-based workflows by combining Large Language Models (LLMs) with computer vision. This report analyzes the project's current state by examining its source code, issues, pull requests, and team activity. The goal is to offer insight into the project's technical health, development pace, and potential areas for improvement.

Source Code Analysis

Detailed File Reviews

skyvern/forge/sdk/services/bitwarden.py

skyvern/forge/sdk/workflow/models/block.py

skyvern/webeye/browser_factory.py

skyvern/webeye/actions/handler.py

skyvern/forge/sdk/api/llm/config_registry.py

skyvern/forge/prompts/skyvern/extract-action.j2

Conclusion on Source Code

The source files demonstrate sophisticated use of various programming techniques, including object-oriented programming, asynchronous operations, and design patterns. While the overall quality is high, there are opportunities for improving documentation, reducing complexity, and enhancing testability.

Analysis of Development Activities

Team Contributions

Recent Commits and Collaborations

Patterns and Insights

The development team exhibits a healthy division of labor with clear specializations in backend logic, frontend development, documentation, and operational setups. Regular collaborations in PR reviews suggest a cohesive team environment. The frequency of commits across different areas indicates an active development phase with ongoing efforts to refine and expand the project's capabilities.

Open Issues Analysis

Critical Issues

Uncertainties and Feature Requests

Recent Trends

The closure of several setup-related issues suggests improvements in installation processes and configurations. Documentation enhancements indicate ongoing efforts to make the project more accessible to new users.

Conclusion

Skyvern is demonstrating robust growth and continuous improvement in automating browser-based workflows using advanced technologies. While there are areas requiring attention such as testing of new features across different environments and clarification on new feature integrations, the project's active maintenance and development indicate a promising trajectory. The team's structured approach to tackling both technical debt and new features is commendable. Continued focus on reducing complexity and enhancing documentation will further solidify Skyvern's position as a leading solution in its domain.

Quantified Commit Activity Over 14 Days

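| Developer | Branches | PRs | Commits | Files | Changes |
|---|---|---|---|---|---|
| Kerem Yilmaz | 1 | 24/21/3 | 21 | 45 | 2993 |
| Salih Altun | 1 | 8/8/0 | 8 | 12 | 181 |
| Shuchang Zheng | 1 | 10/11/0 | 11 | 17 | 170 |
| LawyZheng | 2 | 7/6/0 | 7 | 6 | 136 |
| Suchintan | 1 | 2/2/0 | 2 | 1 | 4 |
| OB42 | 1 | 2/1/0 | 1 | 1 | 2 |
| Shixian Sheng | 1 | 1/1/0 | 1 | 1 | 2 |
| Ikko Eltociear Ashimine | 1 | 1/1/0 | 1 | 1 | 2 |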

PRs: created by that dev and opened/merged/closed-unmerged during the period
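
As a small illustration of how the PRs column reads, the sketch below parses a few of the `opened/merged/closed-unmerged` cells from the table. The `PrCounts` and `parse_pr_counts` names are hypothetical helpers introduced here, not part of the report's tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrCounts:
    opened: int
    merged: int
    closed_unmerged: int


def parse_pr_counts(cell: str) -> PrCounts:
    """Parse an 'opened/merged/closed-unmerged' cell such as '24/21/3'."""
    parts = cell.strip().split("/")
    if len(parts) != 3:
        raise ValueError(f"expected three '/'-separated counts, got {cell!r}")
    opened, merged, closed_unmerged = (int(p) for p in parts)
    return PrCounts(opened, merged, closed_unmerged)


# Values copied from the table above.
rows = {
    "Kerem Yilmaz": "24/21/3",
    "Shuchang Zheng": "10/11/0",
    "OB42": "2/1/0",
}
parsed = {name: parse_pr_counts(cell) for name, cell in rows.items()}
```

Because each count is tallied separately over the period, a row such as 10/11/0 is presumably not an error: a PR opened before the 14-day window can still be merged inside it.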

~~~

Executive Summary: State of the Skyvern Software Project

Overview

Skyvern is a cutting-edge software project that leverages advanced technologies such as Large Language Models (LLMs) and Computer Vision to automate browser-based workflows. This approach allows Skyvern to adapt dynamically to changes in web page layouts, offering a robust alternative to traditional automation tools. The project is under active development by Skyvern-AI and is well-received in the open-source community, as evidenced by its GitHub activity and contributions.

Strategic Insights

Development Pace and Team Efficiency

The Skyvern project exhibits a healthy pace of development with regular updates and contributions from a diverse team. Recent activities suggest that the team is effectively addressing both new feature developments and maintenance issues. This balance is crucial for sustaining long-term growth and stability of the software.

Market Opportunities

By integrating LLMs and Computer Vision, Skyvern positions itself uniquely in the market of automation tools, potentially reducing the dependency on manual updates when web interfaces change. This capability could be particularly appealing to enterprises looking for stable, scalable automation solutions, thereby opening up significant market opportunities.

Strategic Costs vs. Benefits

While the innovative approach provides a competitive edge, it also involves complexities related to maintaining such advanced technologies. The ongoing need to update models and manage sophisticated code can increase operational costs. However, these costs are justified by the high value and unique capabilities provided by Skyvern, which can lead to substantial market penetration and customer satisfaction.

Team Size Optimization

The current team structure, with members specializing in different aspects such as backend logic, frontend development, and documentation, appears to be well-optimized for the project's needs. However, as the project scales, there might be a need to expand the team, particularly adding more expertise in areas like AI model integration and cloud infrastructure to handle increased load and complexity.

Key Issues and Recommendations

Critical Issues Needing Attention

Strategic Recommendations

Conclusion

Skyvern is on a promising trajectory with its innovative use of technology in automating web interactions. The project demonstrates robust development activity and strategic management of resources. Addressing the key issues identified and considering the strategic recommendations will further enhance its position in the market and ensure sustainable growth.



Detailed Reports

Report On: Fetch issues



Analysis of Open Issues for Skyvern-AI/skyvern

Notable Problems and Uncertainties

JSON Parsing and Model Output (#290)

  • Critical: #290 addresses a critical problem: the Anthropic models do not reliably output valid JSON objects, which causes failures in the parse_api_response function. The proposed fix forces the model to output JSON and changes the JSON extraction method. However, it has been tested only with Anthropic's Claude 3 Opus, not with OpenAI models, due to a lack of credits.
  • Uncertainty: The fix needs further testing with OpenAI models to ensure compatibility and reliability across different LLM providers.

Task Execution Failure (#288)

  • Notable Problem: Issue #288 indicates a failure when running run_ui.bash, but the screenshot that could provide more context is missing. This issue is currently blocked due to insufficient information.

Integration with Google Extensions (#261)

  • Uncertainty: Issue #261 suggests integrating Skyvern with Google Chrome extensions, but it's unclear what specific interactions are desired. This feature could greatly expand Skyvern's capabilities but requires clarification.

Integration with Ollama Litellm (#242)

  • Notable Problem: Users are seeking guidance on integrating Ollama and LiteLLM with Skyvern (Issue #242). PR #251 provides an example, but official support or documentation still appears to be missing.

Local Vision Model Replacement (#180)

  • Uncertainty: Issue #180 raises questions about replacing GPT-4 Turbo with a local vision model. There's confusion about where vision models are used within the codebase, and while some guidance has been provided, there may be a need for clearer documentation or tutorials.

Different Ports for Visualizer and Server (#164, #135)

  • TODO: Issues #164 and #135 both address the need to run the visualizer and server on different ports. There's an open PR for this task, but it appears to have errors that need resolution.

Database Connection Issues (#157)

  • Notable Problem: Issue #157 discusses problems connecting to Postgres in Docker on a Windows system. This issue involves platform-specific challenges and may indicate broader compatibility issues with Windows.

Recent Closures and Trends

Recent Fixes

  • Several issues related to setup scripts, Docker configurations, and environment variables have been recently closed (e.g., #287, #286, #285). These closures suggest active maintenance of setup processes and configurations.

Quickstart Documentation Improvements

  • Issues related to improving quickstart documentation have been addressed recently (#284, #283), indicating an ongoing effort to make the project more accessible to new users.

Backward Compatibility

  • Issue #282 was closed after addressing backward compatibility concerns with Anthropic models. This suggests attention to maintaining support for existing configurations as new features are added.

Summary

The open issues indicate that Skyvern is actively being developed and maintained, with particular attention to model integration, setup processes, and user experience improvements. Critical issues such as JSON parsing reliability (#290) and task execution failures (#288) require immediate attention. Uncertainties around feature requests like Chrome extension interactions (#261) and local model integrations (#180) highlight the need for clearer communication or documentation. The recent closure of issues related to setup and configuration suggests that these areas are receiving timely updates, which is positive for user onboarding and overall project health.

Report On: Fetch pull requests



Open Pull Requests Analysis

PR #290: Force Claude3 models to output JSON object and parse it more reliably

  • Summary: This PR aims to address issues with the Claude3 models from Anthropic not consistently outputting valid JSON objects. The proposed changes include forcing the model to output JSON by prefilling the response and improving the JSON extraction method.
  • Notable Points:
    • The PR is very recent (created less than a day before this report) and addresses a significant issue that could affect the stability of the Skyvern system.
    • It has been tested only with Anthropic's Claude 3 Opus model and not with OpenAI due to a lack of credits.
  • Potential Issues:
    • Limited testing scope might mean that the changes could behave differently with other models or under different conditions.
    • The PR should be tested thoroughly with all supported models before merging.
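
The two ideas in this PR, prefilling the response and extracting JSON more tolerantly, can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's actual code: the `PREFILL` constant, the `extract_json_object` helper, and the `actions`/`action_type` field names are all hypothetical. It assumes the assistant turn was prefilled with `{`, so the returned completion lacks that opening brace and may carry trailing chatter.

```python
import json
from typing import Any

PREFILL = "{"  # hypothetical: text placed at the start of the assistant turn


def extract_json_object(completion: str, prefill: str = PREFILL) -> dict[str, Any]:
    """Re-attach the prefill and decode the first JSON object in the text."""
    text = prefill + completion
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever follows it.
            value, _end = decoder.raw_decode(text, start)
            return value
        except json.JSONDecodeError:
            # Not a valid object here; try the next opening brace.
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model output")


# A prefilled completion: the leading "{" is missing and prose trails the object.
completion = '"actions": [{"action_type": "click", "id": 3}]}\nLet me know if you need more.'
result = extract_json_object(completion)
```

Passing `prefill=""` lets the same helper handle output from providers that were not prefilled, which is one way the cross-model testing gap noted above could be exercised.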

PR #289: fix scrolling problem

  • Summary: This PR contains a fix for an unspecified scrolling issue.
  • Notable Points:
    • The PR is also very recent and seems to be a minor fix.
  • Potential Issues:
    • Lack of detailed information about what problem is being fixed and how the fix was tested.

PR #164: Different ports: Update ./run_skyvern.sh to run both the visualizer and the server on different ports

  • Summary: This PR allows running the UI and server on different ports and merges related scripts into a single file.
  • Notable Points:
    • The PR has been open for over a month, which could indicate difficulties in getting it reviewed or passing tests.
    • There's a comment from a bot indicating that the PR is stale, as well as a request for updates due to errors in the PR.
  • Potential Issues:
    • Given its age and comments about errors, this PR might need additional attention or could be at risk of becoming obsolete if not updated soon.

PR #150: Adding Gemini Api Integration

  • Summary: This PR integrates Gemini API into Skyvern.
  • Notable Points:
    • There is an extensive discussion on code style, linter issues, and proper configuration settings for the integration.
    • The conversation indicates active review and iteration on the code changes.
  • Potential Issues:
    • The discussion suggests that there may still be unresolved issues or concerns about code quality and standards.

PR #116: Add a method for chromium-attached browser_key

  • Summary: This draft PR adds support for attaching to an existing Chromium browser instance using Playwright.
  • Notable Points:
    • It's marked as a draft and work in progress, indicating that it's not ready for final review or merging.
    • There are comments discussing technical challenges encountered during development.
  • Potential Issues:
    • As it's still in draft status, it might require significant work before it can be considered for merging.

Closed Pull Requests Analysis

Notable Closed Pull Requests:

  • PR #287, PR #286, PR #285, PR #284, PR #283, and PR #282, among others, were recently merged and closed. They cover fixes and improvements such as handling login states, updating workflows, maintaining backward compatibility, and correcting documentation. Together they indicate active maintenance and incremental improvement in the project.

Closed Without Merge:

  • PR #264: This pull request was closed without being merged. It appears to have been superseded by another pull request (PR #265) which was merged. This suggests that there may have been an issue with the initial changes that required a new approach or additional work.

General Observations:

  • There is active development and maintenance within the project, with several pull requests being merged recently. This indicates good project health.
  • A few pull requests have been closed without merging, but they seem to be handled appropriately either by creating new pull requests with necessary changes or because they were no longer needed.
  • Some older pull requests like PR #164 may require additional attention to ensure they don't become stale or outdated.

Overall, while there are some potential issues with open pull requests needing further testing or updates (such as PR #290), the project seems to be actively maintained with recent pull requests being addressed promptly.

Report On: Fetch Files For Assessment



Detailed Analysis of Source Code Files

1. skyvern/forge/sdk/services/bitwarden.py

- **Purpose**: Manages interactions with the Bitwarden CLI for secure credential management.
- **Structure**:
 - Defines `BitwardenConstants` using `StrEnum` for environment variable keys.
 - `BitwardenService` class provides methods to interact with the Bitwarden CLI, including login, unlock, list items, and logout functionalities.
- **Quality**:
 - Good use of static methods and structured logging.
 - Exception handling is robust, catching specific errors related to Bitwarden operations and raising custom exceptions.
 - Uses subprocesses securely by sanitizing the environment variables and handling command outputs carefully.
- **Improvement Areas**:
 - Could benefit from more detailed docstrings explaining parameters and return types.
 - The method `get_secret_value_from_url` is quite long and handles multiple responsibilities; consider breaking it down into smaller functions for better maintainability.
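
To make the pattern described above concrete, here is a minimal sketch of wrapping a CLI call with a sanitized environment and a custom exception. It is not the code from `bitwarden.py`: `VaultEnvKey`, `VaultCliError`, `build_sanitized_env`, and `run_cli` are hypothetical names, and the enum uses the `str, Enum` mixin rather than `StrEnum` so the example also runs on Python versions before 3.11.

```python
import os
import subprocess
from enum import Enum
from typing import Mapping


class VaultEnvKey(str, Enum):
    """Hypothetical environment variable names, not Skyvern's real constants."""

    CLIENT_ID = "VAULT_CLIENT_ID"
    CLIENT_SECRET = "VAULT_CLIENT_SECRET"


class VaultCliError(Exception):
    """Raised when the wrapped CLI exits with a non-zero status."""


def build_sanitized_env(base: Mapping[str, str], extra: Mapping[str, str]) -> dict[str, str]:
    """Keep only PATH from the parent environment, then add explicit values."""
    env = {key: value for key, value in base.items() if key == "PATH"}
    env.update(extra)
    return env


def run_cli(command: list[str], extra_env: Mapping[str, str], timeout: float = 30.0) -> str:
    """Run a command with a minimal environment and return its stdout."""
    result = subprocess.run(
        command,
        env=build_sanitized_env(os.environ, extra_env),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,
    )
    if result.returncode != 0:
        raise VaultCliError(result.stderr.strip() or f"exit code {result.returncode}")
    return result.stdout


env = build_sanitized_env(
    {"PATH": "/usr/bin", "UNRELATED_SECRET": "do-not-leak"},
    {VaultEnvKey.CLIENT_ID.value: "abc"},
)
```

Splitting environment construction from process execution also keeps the pure part trivially unit-testable, which fits the maintainability suggestion above.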

2. skyvern/forge/sdk/workflow/models/block.py

- **Purpose**: This file likely contains critical logic for managing workflow blocks but was truncated in the provided content.
- **Assessment Based on Available Information**:
 - Expected to handle complex logic given its significance in managing URL load timeouts and workflow operations.
 - Without the full content, specific assessments on code quality and structure cannot be made.

3. skyvern/webeye/browser_factory.py

- **Purpose**: Manages browser contexts and interactions using Playwright.
- **Structure**:
 - Defines a protocol `BrowserContextCreator` and a factory class `BrowserContextFactory` for creating browser contexts with different configurations.
 - Implements functionality to manage browser artifacts like videos and HAR files.
- **Quality**:
 - Well-structured using design patterns like Factory for browser creation which enhances extensibility.
 - Good use of async functions to handle browser operations efficiently.
 - Includes detailed logging which aids in debugging and monitoring.
- **Improvement Areas**:
 - Some methods are quite complex; for instance, `check_and_fix_state` could be refactored to improve readability and maintainability.
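
The protocol-plus-factory structure described above can be sketched in a few lines. This is an illustrative stand-in, not the contents of `browser_factory.py`: `FakeContext`, `ContextCreator`, `ContextFactory`, and `create_headless` are hypothetical, and no Playwright calls are made.

```python
import asyncio
from typing import Awaitable, Dict, Protocol


class FakeContext:
    """Stand-in for a browser context; a real one would come from Playwright."""

    def __init__(self, kind: str, options: dict) -> None:
        self.kind = kind
        self.options = options


class ContextCreator(Protocol):
    """Any async callable that accepts keyword options and returns a context."""

    def __call__(self, **options: object) -> Awaitable[FakeContext]: ...


class ContextFactory:
    _creators: Dict[str, ContextCreator] = {}

    @classmethod
    def register(cls, kind: str, creator: ContextCreator) -> None:
        cls._creators[kind] = creator

    @classmethod
    async def create(cls, kind: str, **options: object) -> FakeContext:
        try:
            creator = cls._creators[kind]
        except KeyError:
            raise ValueError(f"unknown browser type: {kind}") from None
        return await creator(**options)


async def create_headless(**options: object) -> FakeContext:
    return FakeContext("headless", {"headless": True, **options})


ContextFactory.register("headless", create_headless)
context = asyncio.run(ContextFactory.create("headless", record_video=True))
```

New browser configurations are added by registering another creator, so callers of `create` do not change; that is the extensibility benefit the review attributes to the Factory pattern.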

4. skyvern/webeye/actions/handler.py

- **Purpose**: Handles various web actions like clicking, inputting text, uploading files, etc., within a browser context.
- **Structure**:
 - Uses a class-based approach to register and execute actions based on type. Each action type like click, input text, etc., has a corresponding handler function.
- **Quality**:
 - Demonstrates an advanced use of Python features like decorators and asynchronous programming.
 - Robust error handling and logging provide clear insights into action executions and failures.
- **Improvement Areas**:
 - The file is lengthy and handles multiple responsibilities; consider splitting into smaller modules based on action types or functionality.
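
A decorator-based registry of asynchronous handlers, as described above, might look roughly like this. The sketch is an assumption about the general shape only: `ActionRegistry`, the handler names, and the `action_type`/`element_id` keys are hypothetical, and the handlers return strings instead of driving a browser.

```python
import asyncio
from typing import Awaitable, Callable, Dict

Handler = Callable[[dict], Awaitable[str]]


class ActionRegistry:
    _handlers: Dict[str, Handler] = {}

    @classmethod
    def register(cls, action_type: str) -> Callable[[Handler], Handler]:
        """Decorator that maps an action type to its handler function."""

        def decorator(func: Handler) -> Handler:
            cls._handlers[action_type] = func
            return func

        return decorator

    @classmethod
    async def handle(cls, action: dict) -> str:
        handler = cls._handlers.get(action.get("action_type", ""))
        if handler is None:
            raise ValueError(f"unsupported action: {action!r}")
        return await handler(action)


@ActionRegistry.register("click")
async def handle_click(action: dict) -> str:
    return f"clicked element {action['element_id']}"


@ActionRegistry.register("input_text")
async def handle_input_text(action: dict) -> str:
    return f"typed {action['text']!r} into element {action['element_id']}"


outcome = asyncio.run(ActionRegistry.handle({"action_type": "click", "element_id": 7}))
```

Because registration happens at import time through the decorator, handlers for each action type could live in separate modules, which is one way to act on the splitting suggestion above.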

5. skyvern/forge/sdk/api/llm/config_registry.py

- **Purpose**: Manages configurations for different LLM (Large Language Model) providers.
- **Structure**:
 - Contains a registry class that holds configurations for various LLM providers. Configurations are validated and registered dynamically based on the application settings.
- **Quality**:
 - Uses a central registry for managing configurations which simplifies access across the application.
 - Exception handling specific to configuration errors helps in identifying setup issues quickly.
- **Improvement Areas**:
 - Dependency on global settings within the registry might make unit testing challenging; consider using dependency injection for settings.
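
The dependency-injection suggestion above can be illustrated with a registry that receives its settings through the constructor instead of reading a global. This is a sketch under assumptions, not the real `config_registry.py`: `LLMConfig`, `ConfigRegistry`, `MissingSettingsError`, and the `EXAMPLE_*` keys are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Mapping, Tuple


@dataclass(frozen=True)
class LLMConfig:
    model_name: str
    required_env_vars: Tuple[str, ...]
    supports_vision: bool = False


class MissingSettingsError(Exception):
    """Raised when a config is registered without its required settings."""


class ConfigRegistry:
    def __init__(self, settings: Mapping[str, str]) -> None:
        # Settings are injected, so tests can pass a plain dict.
        self._settings = settings
        self._configs: Dict[str, LLMConfig] = {}

    def register(self, key: str, config: LLMConfig) -> None:
        missing = [name for name in config.required_env_vars if not self._settings.get(name)]
        if missing:
            raise MissingSettingsError(f"{key}: missing settings {missing}")
        if key in self._configs:
            raise ValueError(f"duplicate LLM config: {key}")
        self._configs[key] = config

    def get(self, key: str) -> LLMConfig:
        try:
            return self._configs[key]
        except KeyError:
            raise KeyError(f"unknown LLM config: {key}") from None


registry = ConfigRegistry({"EXAMPLE_API_KEY": "test-key"})
registry.register("example-vision", LLMConfig("example-model", ("EXAMPLE_API_KEY",), True))
```

A unit test can then build a registry from any mapping and assert on validation failures without touching process-wide configuration.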

6. skyvern/forge/prompts/skyvern/extract-action.j2

- **Purpose**: Template for generating JSON structured actions based on webpage analysis to drive automated workflows.
- **Structure**:
 - A Jinja2 template that outlines how actions should be structured in JSON format based on the analysis of web page elements and user goals.
- **Quality**:
 - Provides a clear template structure that can be dynamically filled based on various conditions such as user goals, errors, etc., which enhances flexibility in automated decision-making processes.
- **Improvement Areas**:
 - As a template, it's crucial to ensure that all possible edge cases are considered to prevent runtime errors during template rendering.

Conclusion

The reviewed files demonstrate a sophisticated use of Python programming techniques including OOP, asynchronous programming, design patterns like Factory, and effective error handling. However, there are areas where complexity could be reduced or documentation could be enhanced to improve maintainability and understandability.

Report On: Fetch commits



Project Report: Skyvern

Overview

Skyvern is an innovative software project designed to automate browser-based workflows through the use of Large Language Models (LLMs) and Computer Vision. It is developed and maintained by the organization Skyvern-AI. The project aims to provide a robust alternative to traditional automation solutions that often break due to website layout changes. By integrating computer vision and LLMs, Skyvern can interact with web elements in real-time, making it adaptable to new websites and resistant to changes in existing ones. The project is open-source, licensed under the GNU Affero General Public License v3.0, and its repository shows a healthy amount of activity with a significant number of stars (4379), forks (295), and watchers (31). The trajectory of the project appears positive with ongoing contributions and feature development.

Team Members and Recent Activities

Kerem Yilmaz (ykeremy)

  • Recent Commits: 21 commits; involved in various enhancements and bug fixes.
  • Files Worked On: Multiple, including skyvern/forge/sdk/services/bitwarden.py, poetry.lock, pyproject.toml.
  • Collaborations: PR reviews from various team members.
  • Patterns: Frequent contributor across multiple files, focusing on both backend logic and configuration files.

OB42

  • Recent Commits: 1 commit; fixed setup script for Anthropic models.
  • Files Worked On: setup.sh.
  • Collaborations: None observed in the recent commits.
  • Patterns: Single commit suggests a targeted fix rather than ongoing development work.

Shixian Sheng (KPCOFGS)

  • Recent Commits: 1 commit; minor README update.
  • Files Worked On: README.md.
  • Collaborations: None observed in the recent commits.
  • Patterns: Commit indicates involvement in documentation or presentation aspects of the project.

Ikko Eltociear Ashimine (eltociear)

  • Recent Commits: 1 commit; updated quickstart documentation.
  • Files Worked On: docs/quickstart.mdx.
  • Collaborations: None observed in the recent commits.
  • Patterns: Commit suggests a role in maintaining or improving project documentation.

LawyZheng

  • Recent Commits: 7 commits; worked on various features including scrolling issues and action handling.
  • Files Worked On: Includes skyvern/webeye/scraper/domUtils.js and skyvern/forge/agent.py.
  • Collaborations: None observed in the recent commits.
  • Patterns: Active in developing features related to the core functionality of Skyvern.

Shuchang Zheng (wintonzheng)

  • Recent Commits: 11 commits; involved in database optimizations, API key creation, and other backend improvements.
  • Files Worked On: Includes skyvern/forge/sdk/db/client.py and skyvern/forge/api_app.py.
  • Collaborations: Appears to have PRs reviewed by ykeremy.
  • Patterns: Focused on backend optimizations and enhancements.

Salih Altun (msalihaltun)

  • Recent Commits: 8 commits; UI improvements and frontend development tasks.
  • Files Worked On: Multiple frontend files such as skyvern-frontend/src/routes/tasks/list/TaskList.tsx.
  • Collaborations: Reviewed by ykeremy on PRs.
  • Patterns: Active involvement in frontend development, indicating a specialization in UI/UX.

Suchintan

  • Recent Commits: 2 commits; minor updates related to Docker setup and environment variables.
  • Files Worked On: skyvern/webeye/browser_factory.py and related files.
  • Collaborations: Reviewed by ykeremy on PRs.
  • Patterns: Involvement suggests a focus on deployment or DevOps aspects of the project.

Conclusions

The development team behind Skyvern is actively working on improving the software's capabilities, with a clear division of labor among frontend, backend, documentation, and devops tasks. The team members collaborate through PR reviews, indicating a structured workflow. The frequent updates to configuration files and backend logic suggest that the project is still evolving rapidly, with ongoing efforts to enhance stability, performance, and user experience. The addition of new features such as workflow support indicates that Skyvern is expanding its scope to cater to more complex automation needs.