The Dispatch

OSS Watchlist: hatchet-dev/hatchet


Dependency Management and Performance Optimization Highlight Recent Development Efforts

The development team has focused on maintaining up-to-date dependencies and optimizing performance, but critical bugs and potential disputes pose risks to project stability.

Patterns, Themes, and Conclusions

  1. Dependency Management: Dependabot continues to play a crucial role in maintaining up-to-date dependencies, ensuring the project remains secure and compatible with the latest versions of external libraries.
  2. Bug Fixes and Optimizations: Gabe Ruttner focused on optimizing step run queries and fixing related issues, indicating an effort to improve performance and reliability.
  3. Feature Enhancements: New features such as configurable data retention periods were introduced by Gabe Ruttner, enhancing the platform's configurability.
  4. Collaborative Efforts: Alexander Belanger collaborated with Gabe Ruttner on merging branches and fixing expired crons, showing teamwork in addressing ongoing issues.

Risks

Critical Bug Causing Double Execution of Steps When Restarting the Engine

Severity: High

This bug can lead to significant workflow reliability issues, causing unexpected behavior and potentially duplicating actions that should only occur once.

Prolonged Disagreement Among Team Members

Severity: Medium

Disagreements can indicate deeper issues within the team that may affect productivity and project direction.

Multiple Rewrites of the Same Source Code Files in a Short Period

Severity: Medium

Frequent changes to the same files can indicate instability or unclear requirements, which may introduce bugs or inconsistencies.

Of Note

Ambiguous Specifications for Important Functionality

Ambiguity in specifications can lead to misaligned implementations and wasted effort.

Non-Critical PRs Left Open Without Updates

While not urgent, leaving PRs open without updates can indicate potential bottlenecks in the review process.

Overall, while there have been significant improvements in dependency management and performance optimization, addressing critical bugs and ensuring clear communication among team members are essential for maintaining project stability and progress.

Detailed Reports

Report On: Fetch commits



Development Team and Recent Activity

Team Members

  • dependabot[bot]

  • Gabe Ruttner (grutt)

  • Alexander Belanger (abelanger5)

Recent Activity

dependabot[bot]

  • 0 days ago: Bumped google.golang.org/api from 0.187.0 to 0.188.0.
  • 2 days ago: Bumped golang.org/x/crypto from 0.24.0 to 0.25.0.
  • 2 days ago: Bumped github.com/getkin/kin-openapi from 0.125.0 to 0.126.0.
  • 5 days ago: Bumped dependabot/fetch-metadata from 2.1.0 to 2.2.0.
  • 6 days ago: Bumped github.com/go-co-op/gocron/v2 from 2.7.1 to 2.8.0.
  • 6 days ago: Bumped google.golang.org/grpc from 1.64.0 to 1.65.0.
  • 7 days ago: Bumped go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc from 1.27.0 to 1.28.0.
  • 7 days ago: Bumped go.opentelemetry.io/otel/sdk from 1.27.0 to 1.28.0.

Gabe Ruttner (grutt)

  • 3 days ago: Fixed step run queries, including removing unused count, fixing indexes, limiting PollStepRuns, creating concurrently if not exists, and other related fixes.
  • 4 days ago: Introduced configurable data retention period feature.
  • 6 days ago: Implemented better logging in dispatcher.

Alexander Belanger (abelanger5)

  • 3 days ago: Merged main branch into feature branch and fixed expired crons removal.
  • 6 days ago: Fixed indexes on workflow runs and events.

Analysis of Progress Since Last Report

Since the last report, there has been significant activity:

  1. New Features:

    • Configurable data retention period by Gabe Ruttner.
  2. Bug Fixes and Optimizations:

    • Step run queries optimization by Gabe Ruttner.
    • Removal of expired crons by Alexander Belanger.
  3. Dependency Updates:

    • Multiple updates by dependabot[bot], including google.golang.org/api, golang.org/x/crypto, github.com/getkin/kin-openapi, dependabot/fetch-metadata, github.com/go-co-op/gocron/v2, google.golang.org/grpc, go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc, and go.opentelemetry.io/otel/sdk.

Overall, the team has made substantial progress with new features, bug fixes, dependency updates, and optimizations, indicating a productive development cycle aimed at enhancing functionality and improving system performance and reliability.

Report On: Fetch issues



Analysis of Progress Since Last Report

Since the previous analysis 7 days ago, there has been some notable activity in the hatchet-dev/hatchet repository. Here is a detailed breakdown of the changes and their implications:

New Issues

Issue #682: feat: add failure information to the onFailure steps

  • Created: 8 days ago by Alexander Belanger (abelanger5)
  • Significance: This issue proposes enhancing the context for onFailure steps to include failure reasons and details about which step failed. This will significantly improve debugging and error handling capabilities.

Issue #666: feat: Deduplicated enqueue

  • Created: 12 days ago by Ivan Malison (colonelpanic8)
  • Significance: This feature request suggests implementing deduplicated enqueue functionality to avoid redundant workflow executions, which is crucial for long-running and expensive workflows. The discussion around this issue has been active, with multiple comments exploring potential implementation strategies.

Issue #662: show Hatchet version in the web UI

  • Created: 13 days ago by Alexander Belanger (abelanger5)
  • Edited: 8 days ago
  • Comments:
    • Shivankar Sharma (shiv4nk4r) provided a detailed description of changes made to address this issue, including adding a version.tsx file and modifying the navigation bar to display the version.

Issue #659: stream support for concurrent workflow execution

  • Created: 13 days ago by Vaidik Nakrani (vaidik0508)
  • Significance: This issue raises a question about Hatchet's support for stream output in concurrent workflow executions, which could be important for handling simultaneous events efficiently.

Issue #652: Webhook workers: upsert logic is suboptimal

  • Created: 14 days ago by Luca Steeb (steebchen)
  • Significance: This issue highlights a problem with the upsert logic for webhook workers, where secrets are overridden when adding a webhook worker with the same URL. This could lead to security concerns and operational inefficiencies.

Issue #642: Improve step error message on workflow scheduling timeout

  • Created: 14 days ago by pveierland
  • Significance: This issue suggests improving error messages related to workflow scheduling timeouts to make them more informative. Clearer error messages can help users understand and resolve issues more quickly.

Ongoing Discussions

Issue #666: feat: Deduplicated enqueue

  • Discussion Highlights:
    • Ivan Malison proposed using PostgreSQL constraints for deduplication.
    • Alexander Belanger (abelanger5) discussed potential implementation approaches, including using metadata fields or checksums.
    • The conversation also touched on whether this should be implemented at the workflow or step level.
    • The discussion is ongoing with multiple comments exploring various technical solutions.
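
The checksum approach raised in this discussion can be sketched as follows. This is only an illustration, not Hatchet's design, which is still undecided: the names DedupKey and Enqueuer are hypothetical, and the in-memory map stands in for what would be a unique constraint in PostgreSQL.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
)

// DedupKey derives a stable key from a workflow name and its raw input.
// Hypothetical helper: it is not part of Hatchet's API.
func DedupKey(workflow string, input []byte) string {
	h := sha256.New()
	h.Write([]byte(workflow))
	h.Write([]byte{0}) // separator so ("ab","c") and ("a","bc") hash differently
	h.Write(input)
	return hex.EncodeToString(h.Sum(nil))
}

// Enqueuer is an in-memory stand-in for a table with a unique constraint
// on the deduplication key.
type Enqueuer struct {
	mu   sync.Mutex
	seen map[string]struct{}
}

func NewEnqueuer() *Enqueuer {
	return &Enqueuer{seen: make(map[string]struct{})}
}

// Enqueue reports true if the run was accepted and false if an identical
// (workflow, input) pair was already enqueued.
func (e *Enqueuer) Enqueue(workflow string, input []byte) bool {
	key := DedupKey(workflow, input)
	e.mu.Lock()
	defer e.mu.Unlock()
	if _, dup := e.seen[key]; dup {
		return false
	}
	e.seen[key] = struct{}{}
	return true
}

func main() {
	e := NewEnqueuer()
	fmt.Println(e.Enqueue("resize-image", []byte(`{"id":1}`))) // first submission
	fmt.Println(e.Enqueue("resize-image", []byte(`{"id":1}`))) // same input again
	fmt.Println(e.Enqueue("resize-image", []byte(`{"id":2}`))) // different input
}
```

Whether the key should cover the whole workflow or a single step is exactly the open question in the thread; the sketch keys on the workflow only to keep the example short.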

Closed Issues

Issue #689: bug: Pyright complaining about WorkflowMeta argument type

  • Created and Closed: 6 days ago by Wayde Gilliam (waydegg)
  • Significance: This issue was related to a type error in Pyright when registering a workflow in Python. It was promptly addressed and closed, indicating active maintenance and quick resolution of bugs.

Issue #687: feat: Periodically cleanup historical job runs

  • Created: 7 days ago by Ian Clark (evenicoulddoit)
  • Closed: 4 days ago
  • Significance: This issue raised the need for an automated cleanup mechanism for old job runs. It was quickly addressed, showing responsiveness to user needs and concerns about database management and compliance.

Issue #663: feat: manually mark workers as inactive

  • Created: 13 days ago by Alexander Belanger (abelanger5)
  • Closed: 9 days ago
  • Significance: This feature allows manual locking of workers to prevent new step runs from being assigned while allowing them to complete existing tasks. The quick closure indicates efficient implementation of new features.

Observations

  • The repository continues to see active development, with new feature requests and bug reports arriving steadily and several being resolved within days.
  • There is a focus on user experience through clearer error messages (#642) and surfacing the Hatchet version in the web UI (#662).
  • Data management is being addressed: the request for periodic cleanup of historical job runs (#687) was closed three days after it was opened.

Summary

The recent activity in the hatchet-dev/hatchet repository includes critical updates that enhance functionality, security, and user experience. The development team remains proactive in addressing issues promptly, which is a positive indicator for the project's health.

Detailed Breakdown of New Issues

  • Issues #682 and #666 are feature requests aimed at richer failure context for onFailure steps and at avoiding redundant workflow executions.
  • Showing the Hatchet version in the web UI (#662) is a small usability improvement that already has a proposed implementation from a contributor.

Detailed Breakdown of Closed Issues

  • The closure of issues #689, #687, and #663 reflects quick turnaround on a Python typing bug, a data cleanup request, and a worker management feature.

Conclusion

The hatchet-dev/hatchet repository has seen significant progress over the past 7 days with numerous bug fixes, feature enhancements, and dependency updates. The development team remains responsive and proactive in addressing issues, which bodes well for the project's future stability and usability.

Report On: Fetch PR 703 For Assessment



PR #703

Overview

Repository: hatchet-dev/hatchet
State: Open
Created: 0 days ago
Base branch: main
Head branch: fix--reverse-migration-for-indexes

Description

This pull request fixes a critical bug affecting high-throughput write queries. It reverses the migration introduced in PR #696, which added indexes that became problematic under heavy write load. Dropping those indexes is intended to restore write performance.

Type of Change

  • [x] Bug fix (non-breaking change which fixes an issue)

Files Changed

  1. sql/migrations/20240709205134_v0.36.2.sql
    • Added new SQL migration file to drop the problematic indexes.
    • Lines added: 32
  2. sql/migrations/atlas.sum
    • Updated the checksum file to include the new migration.
    • Lines added: 2
    • Lines removed: 1

Code Quality Assessment

SQL Migration File (20240709205134_v0.36.2.sql)

  • Purpose: The SQL file is designed to reverse the creation of several indexes that were identified as problematic under heavy write conditions.
  • Quality:
    • The use of DROP INDEX CONCURRENTLY IF EXISTS ensures that the index removal process is non-blocking and safe for concurrent operations, which is crucial for maintaining database availability during the migration.
    • Each index removal is clearly documented with comments indicating the reverse operation, which aids in understanding and future maintenance.
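    • The file follows common practice for SQL migrations. Because PostgreSQL does not allow DROP INDEX CONCURRENTLY inside a transaction block, the statements cannot be applied atomically as a group; IF EXISTS is what keeps the migration safe to re-run.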

Checksum File (atlas.sum)

  • Purpose: This file maintains checksums for all migration files to ensure integrity and proper sequencing of migrations.
  • Quality:
    • The checksum for the new migration file (20240709205134_v0.36.2.sql) has been correctly added.
    • The update maintains consistency with previous entries, ensuring that the migration system can accurately track and apply migrations.
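
To make the role of a checksum file concrete, here is a minimal sketch of the general idea: record a digest per migration file, then detect files that changed afterwards. It assumes a simple base64-encoded SHA-256 per file and does not reproduce Atlas's actual atlas.sum algorithm or file format; Manifest and Verify are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"sort"
)

// fileSum returns a base64-encoded SHA-256 digest of a migration's contents.
func fileSum(contents []byte) string {
	sum := sha256.Sum256(contents)
	return base64.StdEncoding.EncodeToString(sum[:])
}

// Manifest maps each migration file name to the digest of its contents.
func Manifest(files map[string][]byte) map[string]string {
	out := make(map[string]string, len(files))
	for name, contents := range files {
		out[name] = fileSum(contents)
	}
	return out
}

// Verify returns the sorted names of files whose digest is missing from,
// or differs from, the recorded manifest.
func Verify(manifest map[string]string, files map[string][]byte) []string {
	var bad []string
	for name, contents := range files {
		if manifest[name] != fileSum(contents) {
			bad = append(bad, name)
		}
	}
	sort.Strings(bad)
	return bad
}

func main() {
	files := map[string][]byte{
		"0001_init.sql": []byte("CREATE TABLE t (id int);"),
	}
	manifest := Manifest(files)

	// Simulate someone editing an already-recorded migration.
	files["0001_init.sql"] = []byte("CREATE TABLE t (id bigint);")
	fmt.Println("modified migrations:", Verify(manifest, files))
}
```

This is why the PR touches atlas.sum alongside the new migration: a new file without a matching entry would be flagged just like an edited one.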

Comments and Feedback

  • vercel[bot]: Automated comment providing status updates on deployment previews, indicating that the documentation site is ready for inspection and feedback.

Commits

  1. fix: reverse migration by Gabe Ruttner (grutt)
    • Added the SQL migration file to drop problematic indexes.
  2. chore: hash by Gabe Ruttner (grutt)
    • Updated the checksum file to include the new migration.

Summary

This PR effectively addresses a critical performance issue by reversing previously introduced indexes that negatively impacted high throughput write queries. The changes are well-documented, adhere to best practices, and ensure minimal disruption during deployment.

Recommendations

  • Testing: Ensure thorough testing in a staging environment with high write loads to confirm that the removal of these indexes resolves the performance issues without introducing new problems.
  • Monitoring: After deployment, closely monitor database performance metrics to verify improvements and quickly identify any unforeseen issues.

Overall, this PR demonstrates good code quality and attention to detail in addressing a critical bug fix.

Report On: Fetch pull requests



Analysis of Progress Since Last Report

Since the previous analysis conducted 7 days ago, there has been significant activity in the repository with various pull requests being opened and closed. Here's a detailed report on the changes:

Notable Problems with Open PRs:

  1. PR #703: fix: reverse migration

    • State: Open
    • Created: 0 days ago
    • Description: Reverses the migration from PR #696, dropping the additional indexes that became problematic under heavy write load.
    • Comments: Vercel bot provided deployment updates.
    • Commits: 2 commits focusing on reversing migration for indexes.
    • Files Changed: sql/migrations/20240709205134_v0.36.2.sql, sql/migrations/atlas.sum
  2. PR #702: Fix improved assign

  3. PR #700: fix: resolve unresolved failed steps

  4. PR #697: fix: remove expired crons

    • State: Open
    • Created: 3 days ago
    • Description: Removes expired workflow run and event cleanup queries suspected to cause DB connection issues.
    • Comments: Vercel bot provided deployment updates.
    • Commits: 2 commits focusing on removing expired crons.
    • Files Changed: internal/services/controllers/workflows/controller.go
  5. PR #695: feat: sticky workers

    • State: Open
    • Created: 5 days ago
    • Description: Adds support for defining worker state as key-value pairs and specifying desired state in the step definition.
    • Comments: Vercel bot provided deployment updates.
    • Commits: Multiple commits focusing on adding sticky workers feature, UI updates, and documentation stubs.
    • Files Changed: Multiple files including api-contracts/dispatcher/dispatcher.proto, frontend/app/src/pages/main/workflow-runs/$run/components/step-run-events.tsx, etc.

Recently Closed/Merged PRs of Interest:

  1. PR #704: chore(deps): bump google.golang.org/api from 0.187.0 to 0.188.0

    • State: Closed
    • Created/Closed: 0 days ago
    • Description: Updates the google.golang.org/api dependency to version 0.188.0.
  2. PR #701: fix: batched reassign and requeue

    • State: Closed
    • Created/Closed: 1 day ago
    • Description: Addresses DB load under high throughput.
  3. PR #699: chore(deps): bump github.com/getkin/kin-openapi from 0.125.0 to 0.126.0

    • State: Closed
    • Created/Closed: 2 days ago
    • Description: Updates the kin-openapi dependency to version 0.126.0.
  4. PR #698: chore(deps): bump golang.org/x/crypto from 0.24.0 to 0.25.0

    • State: Closed
    • Created/Closed: 2 days ago
    • Description: Updates the golang.org/x/crypto dependency to version 0.25.0.
  5. PR #696: fix: step run queries

    • State: Closed
    • Created/Closed: 3 days ago
    • Description: Improves performance under high load of wfrs.
  6. PR #694: chore(deps): bump dependabot/fetch-metadata from 2.1.0 to 2.2.0

    • State: Closed
    • Created/Closed: 5 days ago
    • Description: Updates dependabot/fetch-metadata from 2.1.0 to 2.2.0.
  7. PR #693: feat: configurable data retention period

    • State: Closed
    • Created: 5 days ago
    • Edited: 4 days ago
    • Closed: 4 days ago
    • Merged by: Alexander Belanger (abelanger5), 4 days ago
    • Description: Adds support for a configurable data retention period via the environment variable SERVER_LIMITS_DEFAULT_TENANT_RETENTION_PERIOD, which is set to a Go duration string. Fixes #687.
  8. PR #692: chore: better logging in dispatcher

    • Description: Adds better logs for specific errors in the dispatcher.
  9. PR #691: chore(deps): bump github.com/go-co-op/gocron/v2 from 2.7.1 to 2.8.0

    • Description: Bumps the gocron dependency to version 2.8.0.
  10. PR #690: chore(deps): bump google.golang.org/grpc from 1.64.0 to 1.65.0

    • Description: Updates the gRPC dependency to version 1.65.0.
  11. PR #688: fix: indexes on workflow runs, events

    • Description: Adds indices on events and workflow runs for commonly used query params. Fixes an issue with indexing on workflow runs and events.
  12. PR #686: chore(deps): bump go.opentelemetry.io/otel/exporters/otlp/otlptrace from 1.27.0 to 1.28.0

    • Description: Updates the OpenTelemetry OTLP trace exporter dependency to version 1.28.0.
  13. PR #685: chore(deps): bump go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc from 1.27.0 to 1.28.0

    • Description: Updates the OpenTelemetry OTLP gRPC trace exporter dependency to version 1.28.0.
  14. PR #684: chore(deps): bump go.opentelemetry.io/otel from 1.27.0 to 1.28.0

    • Description: Updates the OpenTelemetry core dependency to version 1.28.0.
  15. PR #683: chore(deps): bump go.opentelemetry.io/otel/sdk from 1.27.0 to 1.28.0

    • Description: Updates the OpenTelemetry SDK dependency to version 1.28.0.
  16. PR #679: chore(deps): bump google.golang.org/api from 0.186.0 to 0.187.0

    • Description: Updates the google.golang.org/api dependency to version 0.187.0.
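
PR #693 describes the retention period as a Go duration string read from SERVER_LIMITS_DEFAULT_TENANT_RETENTION_PERIOD. The sketch below shows how such a value is typically parsed with the standard library's time.ParseDuration. The RetentionPeriod helper, the positivity check, and the 720h fallback are assumptions made for illustration and are not taken from Hatchet's code.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// Environment variable name as given in the description of PR #693.
const retentionEnv = "SERVER_LIMITS_DEFAULT_TENANT_RETENTION_PERIOD"

// RetentionPeriod parses a Go duration string such as "720h".
// It returns def when raw is empty. Hypothetical helper, not Hatchet's API.
func RetentionPeriod(raw string, def time.Duration) (time.Duration, error) {
	if raw == "" {
		return def, nil
	}
	d, err := time.ParseDuration(raw)
	if err != nil {
		return 0, fmt.Errorf("invalid %s %q: %w", retentionEnv, raw, err)
	}
	if d <= 0 {
		return 0, fmt.Errorf("%s must be positive, got %s", retentionEnv, d)
	}
	return d, nil
}

func main() {
	// The 720h (30 day) default is an assumption chosen for this example.
	d, err := RetentionPeriod(os.Getenv(retentionEnv), 720*time.Hour)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("retention period:", d)
}
```

One practical consequence of the duration format: time.ParseDuration accepts units up to hours ("h") only, so a 30-day period has to be written as "720h" rather than "30d".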

Report On: Fetch Files For Assessment



Analysis of Source Code Files

1. go.mod

  • Purpose: This file specifies the dependencies and their versions for the Go project.
  • Structure: The file is well-structured, listing both direct and indirect dependencies.
  • Quality:
    • Dependencies: The project uses a wide range of dependencies, including popular libraries like github.com/fatih/color, github.com/gorilla/sessions, and go.opentelemetry.io/otel.
    • Versioning: Dependencies are pinned to specific versions, which helps in maintaining consistency and avoiding breaking changes.
    • Comments: There are no comments explaining why certain dependencies are used, which could be helpful for future maintainers.

2. go.sum

  • Purpose: This file provides checksums for the dependencies listed in go.mod, ensuring their integrity.
  • Structure: The file is auto-generated and lists each dependency along with its version and checksum.
  • Quality:
    • Integrity: The presence of checksums ensures that the exact versions of dependencies are used, which is crucial for reproducibility.
    • Maintenance: The file is quite large (387 lines), which is typical for projects with many dependencies. Regular updates are necessary to keep it in sync with go.mod.

3. pkg/repository/prisma/db/db_gen.go

  • Purpose: This file appears to be auto-generated and related to database operations.
  • Structure: The file content is not provided, but typically such files contain generated code for database models and queries.
  • Quality:
    • Auto-generated: Auto-generated files should not be manually edited. It's important to ensure that the generation process is well-documented.
    • Consistency: Ensure that the generation tool is consistently used across different environments to avoid discrepancies.

4. prisma/schema.prisma

  • Purpose: This file defines the database schema using Prisma.
  • Structure: The file is quite large (1540 lines), indicating a complex schema.
  • Quality:
    • Schema Definition: Prisma schemas are generally well-structured, but it's important to ensure that they are kept in sync with the actual database state.
    • Documentation: Inline comments explaining complex relationships or constraints would be beneficial.

5. sql/schema/schema.sql

  • Purpose: This file contains the SQL schema for the database.
  • Structure: The file is also large (1417 lines), indicating a detailed schema definition.
  • Quality:
    • Schema Definition: Ensure that this schema is kept in sync with the Prisma schema if both are used.
    • Documentation: Inline comments explaining complex SQL constructs or relationships would be helpful.

6. frontend/docs/pages/self-hosting/configuration-options.mdx

  • Purpose: This documentation file outlines configuration options for self-hosting.
  • Structure: The document is structured in sections based on different configuration categories (e.g., Runtime Configuration, Database Configuration).
  • Quality:
    • Clarity: The document is clear and well-organized, making it easy for users to find relevant configuration options.
    • Completeness: It covers a comprehensive list of configuration options, which is good for flexibility.
    • Default Values: Providing default values helps users understand what to expect if they do not override these settings.

7. internal/services/controllers/workflows/queue.go

  • Purpose: This file handles workflow queue management, which is central to recent fixes and feature implementations.
  • Structure: The file is long (842 lines) and contains multiple functions related to workflow handling.
  • Quality:
    • Error Handling: The code includes error handling, but it could benefit from more granular error messages for easier debugging.
    • Logging: There are logging statements, which help in tracing execution flow. Ensure that sensitive information is not logged.
    • Concurrency Management: Functions like runGetGroupKeyRunRequeue and runGetGroupKeyRunReassign indicate careful handling of concurrent tasks using errgroup, which is good practice.
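
For readers unfamiliar with the errgroup pattern mentioned above, the sketch below shows its core idea: run several tasks concurrently, wait for all of them, and surface the first error. To stay dependency-free it uses sync.WaitGroup and sync.Once from the standard library as a stand-in for golang.org/x/sync/errgroup, so it is not the code in queue.go. The runAll helper and the task bodies are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errRequeue is a sentinel error used by the example tasks.
var errRequeue = errors.New("requeue failed")

// runAll runs every task in its own goroutine, waits for all of them,
// and returns the first error recorded (nil if every task succeeded).
func runAll(tasks ...func() error) error {
	var (
		wg       sync.WaitGroup
		once     sync.Once
		firstErr error
	)
	for _, task := range tasks {
		wg.Add(1)
		go func(t func() error) {
			defer wg.Done()
			if err := t(); err != nil {
				// Only the first failing task gets to set firstErr.
				once.Do(func() { firstErr = err })
			}
		}(task)
	}
	wg.Wait()
	return firstErr
}

func main() {
	err := runAll(
		func() error { return nil },        // stand-in for a reassign pass
		func() error { return errRequeue }, // stand-in for a failing requeue pass
	)
	fmt.Println("first error:", err)
}
```

The real errgroup package additionally supports context cancellation and concurrency limits, which this stand-in omits.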

8. .github/workflows/auto-merge.yml

  • Purpose: This GitHub Actions workflow manages automatic merging of PRs from Dependabot.
  • Structure: The workflow consists of a single job with multiple steps triggered on pull request events by Dependabot.
  • Quality:
    • Conditions and Permissions: The workflow correctly checks if the actor is Dependabot and sets appropriate permissions for pull requests, issues, repository projects, and contents.
    • Steps Execution:
      • Fetches metadata using dependabot/fetch-metadata.
      • Enables auto-squash merge if the update type is minor or patch.
      • Approves PRs automatically based on update type conditions.
      • Uses GitHub CLI (gh) commands effectively within the workflow steps.

Summary

The analyzed files show a well-maintained project with clear dependency management (go.mod, go.sum), comprehensive documentation (configuration-options.mdx), detailed database schemas (schema.prisma, schema.sql), robust workflow management (queue.go), and automated CI/CD processes (auto-merge.yml). However, some areas could benefit from additional comments and documentation to aid future maintainers.

Aggregate for risks



Notable Risks

Critical bug causing double execution of steps when restarting the engine

Severity: High (3/3)

Rationale

This bug can lead to significant workflow reliability issues, causing unexpected behavior and potentially duplicating actions that should only occur once.

  • Evidence: Issue #552 highlights a critical bug where restarting the engine causes the same step to execute twice.
  • Reasoning: This issue directly impacts the core functionality of the system, leading to potential data corruption, unintended side effects, and unreliable operation of workflows.

Next Steps

  • Prioritize fixing this bug immediately.
  • Implement a robust testing mechanism to ensure that restarting the engine does not cause duplicate executions in the future.
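
One common way to guard against this class of bug, and to test for it, is to make the transition from pending to running an atomic claim that only one caller can win. The sketch below is an illustration under that assumption only: the root cause of issue #552 is not described in this report, and Store, Claim, and CountClaims are hypothetical names with an in-memory map standing in for the step run table.

```go
package main

import (
	"fmt"
	"sync"
)

// Status is the lifecycle state of a step run in this sketch.
type Status string

const (
	Pending Status = "PENDING"
	Running Status = "RUNNING"
)

// Store is an in-memory stand-in for the step run table.
type Store struct {
	mu     sync.Mutex
	status map[string]Status
}

// NewStore creates a store with the given step run IDs in Pending state.
func NewStore(ids ...string) *Store {
	s := &Store{status: make(map[string]Status)}
	for _, id := range ids {
		s.status[id] = Pending
	}
	return s
}

// Claim moves a step run from Pending to Running and reports whether
// this caller won the transition. Unknown IDs are never claimable.
func (s *Store) Claim(id string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.status[id] != Pending {
		return false
	}
	s.status[id] = Running
	return true
}

// CountClaims simulates n dispatchers racing to claim the same step run
// and returns how many of them succeeded.
func CountClaims(s *Store, id string, n int) int {
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		wins int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if s.Claim(id) {
				mu.Lock()
				wins++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return wins
}

func main() {
	s := NewStore("step-run-1")
	fmt.Println("successful claims:", CountClaims(s, "step-run-1", 50))
}
```

A regression test for the restart scenario could follow the same shape: start a second dispatcher against the same store while a step is pending and assert that the number of successful claims stays at one.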

Prolonged disagreement or argumentative engagement among team members

Severity: Medium (2/3)

Rationale

Disagreements can indicate deeper issues within the team that may affect productivity and project direction.

  • Evidence: PR discussions showing considerable disagreement about code and architectural issues (#524).
  • Reasoning: While healthy debate is part of development, prolonged disagreements can slow down progress and lead to fragmented solutions.

Next Steps

  • Escalate the discussion to a tech lead or technical executive for resolution.
  • Facilitate a meeting to align on architectural decisions and ensure all team members are on the same page.

Multiple rewrites of the same source code files in a short period

Severity: Medium (2/3)

Rationale

Frequent changes to the same files can indicate instability or unclear requirements, which may introduce bugs or inconsistencies.

  • Evidence: Multiple commits by Gabe Ruttner and Alexander Belanger on features like worker semaphore v2 and email alert groups (#540, #547).
  • Reasoning: While these changes are aimed at improving functionality, frequent rewrites can lead to integration issues and potential bugs if not managed carefully.

Next Steps

  • Conduct a thorough review of recent changes to ensure stability.
  • Establish clearer requirements and design specifications before implementing further changes.

Ambiguous specifications or direction for important functionality

Severity: Medium (2/3)

Rationale

Ambiguity in specifications can lead to misaligned implementations and wasted effort.

  • Evidence: Issue #541 requests retry delay parameters but lacks detailed defining criteria.
  • Reasoning: Without clear specifications, developers may implement features that do not meet user needs or project goals, leading to rework and delays.

Next Steps

  • Clarify and document detailed specifications for high-priority features.
  • Ensure all stakeholders review and agree on these specifications before development begins.

Non-critical PRs left open for several days without any updates

Severity: Low (1/3)

Rationale

While not urgent, leaving PRs open without updates can indicate potential bottlenecks in the review process.

  • Evidence: PR #501 has been open for 21 days with unresolved linter issues.
  • Reasoning: Delays in merging non-critical PRs can slow down overall development velocity and introduce merge conflicts over time.

Next Steps

  • Assign reviewers promptly and set clear deadlines for reviewing non-critical PRs.
  • Encourage regular updates on open PRs to keep them moving towards closure.