The Dispatch

OSS Report: Azure/PyRIT


Development Stagnation in Azure/PyRIT Project Raises Concerns Amidst Growing Community Interest

PyRIT, a Python-based risk identification tool for generative AI developed by Microsoft's AI Red Team, aims to automate red teaming tasks to identify risks like hallucinations and bias in AI models. Despite its popularity, evidenced by 1,648 stars and 296 forks, the project has seen no updates since August 16, 2024.

Recent Activity

Recent activity in the Azure/PyRIT project centers on feature enhancements and bug fixes. Notable issues include #327, which requests support for Ollama and better documentation, and #263, which addresses Azure SQL test failures on macOS M1. These issues highlight ongoing efforts to improve usability and compatibility. The development team has been actively collaborating on various features, with Raja Sekhar Rao Dheekonda leading significant contributions such as dependency management and flexible memory labels. Richard Lundeen has focused on maintenance and documentation updates, while Jae Sung Song has developed new features like the image text converter.

Team Members and Recent Contributions (Reverse Chronological)

  1. Gary (dlmgary)

    • Implemented PAIR orchestrator.
  2. Shiven Chawla (shivenchawla)

    • Developed new chat targets for Azure OpenAI.
  3. Safwan Ahmed (SafwanA02)

    • Created Crescendo Orchestrator feature.
  4. Roman Lutz (romanlutz)

    • Updated dependencies for Python 3.12; worked on scoring features.
  5. Nina Chikanov (nina-msft)

    • Added exception handling to Azure TTS Target.
  6. Victor Valbuena (ValbuenaVC)

    • Developed Prompt Shield feature.
  7. Volkan Kutal (KutalVolkan)

    • Introduced xstest dataset feature.
  8. Jae Sung Song (jsong468)

    • Developed add_image_text_converter feature.
  9. Richard Lundeen (rlundeen2)

    • Implemented true_false inverter scorer; updated documentation.
  10. Raja Sekhar Rao Dheekonda (rdheekonda)

    • Fixed dependency management; implemented flexible memory labels.

Quantified Reports

Quantify Commits



Quantified Commit Activity Over 30 Days

Developer                    Branches  PRs      Commits  Files  Changes
rlundeen2                    1         11/11/0  11       140    7279
jsong468                     1         2/2/0    2        66     5202
Volkan Kutal                 1         2/3/0    3        20     2391
Salma Zainana                1         0/1/0    1        16     2287
SafwanA02                    1         0/1/0    1        24     2182
Raja Sekhar Rao Dheekonda    1         4/4/0    4        38     2104
Gary                         1         0/1/0    1        8      1771
Roman Lutz                   1         9/9/0    9        46     1769
Shiven Chawla                1         2/2/0    2        9      1350
Victor Valbuena              1         0/1/0    1        14     943
jbolor21                     1         1/1/0    1        26     810
Nina Chikanov                1         3/3/0    3        14     323
Andrew Elgert (elgertam)     0         1/0/0    0        0      0
Martin Pouliot (mart123p)    0         1/0/1    0        0      0
None (saphirqi7)             0         1/0/0    0        0      0
None (AhmedSalem2)           0         0/0/1    0        0      0

PRs: created by that dev and opened/merged/closed-unmerged during the period
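
For readers who want to process the PRs column programmatically, the following is a minimal sketch of splitting one cell into its three counts. The names `PrCounts` and `parse_pr_counts` are illustrative assumptions, not part of any report tooling; the field order follows the legend above.

```python
from typing import NamedTuple


class PrCounts(NamedTuple):
    """One PRs cell, in the legend's order (hypothetical helper type)."""
    opened: int
    merged: int
    closed_unmerged: int


def parse_pr_counts(cell: str) -> PrCounts:
    # A cell such as "2/3/0" reads as opened/merged/closed-unmerged.
    opened, merged, closed_unmerged = (int(part) for part in cell.split("/"))
    return PrCounts(opened, merged, closed_unmerged)


# Volkan Kutal's row: two PRs opened and three merged during the period.
counts = parse_pr_counts("2/3/0")
print(counts.opened, counts.merged, counts.closed_unmerged)
```

Merged can exceed opened within the window, presumably because a PR opened before the period can be merged during it.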

Quantify Issues



Recent GitHub Issues Activity

Timespan  Opened  Closed  Comments  Labeled  Milestones
7 Days    2       4       6         2        1
30 Days   9       11      20        5        1
90 Days   22      15      39        11       1
All Time  42      28      -         -        -

Like all software activity quantification, these numbers are imperfect but sometimes useful. Comments, Labels, and Milestones refer to those issues opened in the timespan in question.
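
As a rough cross-check, the net change in open issues per timespan can be computed from the Opened and Closed columns. This sketch assumes that "Closed" counts issues closed during the window, which the table does not state explicitly; the variable and function names are illustrative.

```python
# Opened/closed counts copied from the "Recent GitHub Issues Activity" table.
issue_activity = {
    "7 Days": (2, 4),
    "30 Days": (9, 11),
    "90 Days": (22, 15),
    "All Time": (42, 28),
}


def net_change(opened: int, closed: int) -> int:
    # Positive means the backlog grew; negative means it shrank.
    return opened - closed


net_by_timespan = {
    span: net_change(opened, closed)
    for span, (opened, closed) in issue_activity.items()
}

for span, net in net_by_timespan.items():
    print(f"{span}: {net:+d}")
```

Under that assumption, the all-time difference of 14 matches the 14 open issues cited in the issue analysis, while the 7-day and 30-day windows each show the backlog shrinking by two.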

Detailed Reports

Report On: Fetch issues



Recent Activity Analysis

The Azure/PyRIT project has seen a notable uptick in activity, with 14 open issues currently being tracked. The most recent issues reflect ongoing discussions about feature enhancements, bug fixes, and user inquiries regarding the integration of various AI models. Notably, there is a strong emphasis on improving documentation and usability, as well as addressing compatibility issues with different operating systems and Python versions.

Several themes emerge from the recent issues: a focus on enhancing the framework's capabilities (e.g., adding support for new datasets and features), addressing bugs related to specific functionalities (especially concerning macOS compatibility), and user requests for clearer documentation and examples. There are also discussions around integrating new AI models, such as Azure OpenAI GPT-4o, which indicates an active interest in keeping the tool relevant with the latest advancements in AI technology.

Issue Details

Recently Created Issues

  1. Issue #327: Ollama Support and Initial Run Documentation

    • Priority: Feature Request
    • Status: Open
    • Created: 2 days ago
    • Updated: N/A
  2. Issue #291: FEAT add DecodingTrust dataset

    • Priority: Enhancement
    • Status: Open
    • Created: 25 days ago
    • Updated: N/A
  3. Issue #289: Got a new Jailbreak Prompt

    • Priority: Enhancement
    • Status: Open
    • Created: 28 days ago
    • Updated: 25 days ago
  4. Issue #282: FEAT Metadata for datasets should allow fields as string OR list of strings

    • Priority: Enhancement
    • Status: Open
    • Created: 32 days ago
    • Updated: N/A
  5. Issue #270: Add fetch function for datasets from HarmBench

    • Priority: Enhancement
    • Status: Open
    • Created: 45 days ago
    • Updated: N/A

Recently Updated Issues

  1. Issue #263: bug: Azure SQL Tests Fail in MacOS M1

    • Priority: Bug
    • Status: Open
    • Created: 50 days ago
    • Updated: 49 days ago
  2. Issue #290: FEAT add XSTest dataset

    • Priority: Enhancement
    • Status: Closed
    • Created: 25 days ago
    • Updated: 2 days ago
  3. Issue #283: gandalf example error (Failed to add request response to memory)

    • Priority: Bug
    • Status: Closed
    • Created: 32 days ago
    • Updated: 1 day ago
  4. Issue #242: FEAT Leetspeak converter should have a deterministic option

    • Priority: Enhancement
    • Status: Open
    • Created: 65 days ago
    • Updated: N/A
  5. Issue #186: Update WMDP Dataset

    • Priority: Enhancement
    • Status: Open
    • Created: 108 days ago
    • Updated: 25 days ago

Summary of Notable Issues

  • The request for Ollama support (#327) highlights a need for better integration and documentation as users struggle with rapid version changes.
  • The addition of datasets like DecodingTrust (#291) indicates an ongoing effort to enhance the framework's capabilities.
  • The bug related to Azure SQL tests failing on macOS M1 (#263) underscores potential platform-specific issues that may hinder user experience.
  • The ongoing discussions around jailbreak prompts (#289) suggest that security and ethical considerations remain at the forefront of development efforts.

Overall, the recent activity reflects a vibrant community engaged in improving the PyRIT framework while addressing critical usability and functionality concerns.

Report On: Fetch pull requests



Report on Pull Requests

Overview

The Azure/PyRIT repository has a total of 8 open pull requests and 275 closed pull requests, showcasing ongoing development and enhancements to the Python Risk Identification Tool for Generative AI. The recent pull requests focus on a variety of features, bug fixes, and improvements related to orchestrators, scoring systems, and converters.

Summary of Pull Requests

Open Pull Requests

  1. PR #331: [DRAFT] FEAT: Operator-Provided Delays between Requests (in Seconds) for PSO

    • Introduces a request_delay parameter to the PromptSendingOrchestrator (PSO) to manage delays between prompt requests, addressing rate limiting issues. A single test has been added, with further documentation pending.
  2. PR #330: FEAT Add SQL Entra Auth for Azure SQL Server

    • Implements Microsoft Entra authentication for Azure SQL Server, replacing deprecated username/password methods. All tests pass.
  3. PR #329: FEAT: Add deterministic flag and custom substitutions to LeetspeakConverter

    • Adds features to ensure consistent leetspeak conversion through a deterministic flag and allows custom substitutions. Some unit tests are currently failing.
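
The delay idea in PR #331 can be illustrated with a minimal sketch. This is not PyRIT's implementation: `send_with_delay`, the `send` callable, and the injectable `sleep` are hypothetical names chosen for illustration, and only the `request_delay` parameter name comes from the PR description.

```python
import time
from typing import Callable, List, Sequence


def send_with_delay(
    prompts: Sequence[str],
    send: Callable[[str], str],
    request_delay: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Send prompts one at a time, pausing request_delay seconds between them."""
    responses: List[str] = []
    for index, prompt in enumerate(prompts):
        # No pause before the first request; pause before each later one.
        if index > 0 and request_delay > 0:
            sleep(request_delay)
        responses.append(send(prompt))
    return responses


# Usage with a stand-in target and a recording sleep, so the example runs instantly.
recorded_pauses: List[float] = []
replies = send_with_delay(
    ["first", "second", "third"],
    send=lambda prompt: prompt.upper(),
    request_delay=0.5,
    sleep=recorded_pauses.append,
)
print(replies, recorded_pauses)
```

Passing `sleep` as a parameter is a design choice for testability: a unit test can record the pauses instead of actually waiting.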

Closed Pull Requests

  1. PR #334: FIX Move pillow from dev to core dependency

    • Moves the Pillow library from development to core dependencies to support image converters.
  2. PR #333: MAINT: speeding up crescendo tests

    • Optimizes crescendo tests from 35 seconds to 6 seconds, with suggestions for further improvements.
  3. PR #332: DOC: Adding Notebook to document re-sending previous prompts

    • Adds documentation in the form of a notebook for resending previous prompts.
  4. PR #314: FEAT emoji jailbreak

    • Adds an emoji jailbreak feature based on community contributions.
  5. PR #307: FEAT: Add Likert scoring definition and prompt templates for persuasion and deception

    • Introduces new scoring definitions and templates for specific testing scenarios.

Analysis of Pull Requests

The recent pull requests reflect a diverse set of enhancements aimed at improving functionality, security, and usability within the PyRIT framework. Several themes emerge from the analysis:

Feature Enhancements

A significant number of pull requests focus on adding new features or enhancing existing functionalities. For instance, PR #331 introduces a delay mechanism in the PromptSendingOrchestrator to handle rate limiting effectively, which is crucial in real-world applications where API limits can disrupt operations. Similarly, PR #329 enhances the LeetspeakConverter by adding a deterministic flag and custom substitutions, catering to user needs for flexibility in text processing.
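
To make the deterministic flag and custom substitutions concrete, here is a minimal sketch of how such a converter could behave. It is an assumption-laden illustration, not the code from PR #329: the function name, parameter names, and default substitution table are all hypothetical.

```python
import random
from typing import Dict, List, Optional

# Hypothetical default table; the real converter's mappings may differ.
DEFAULT_SUBSTITUTIONS: Dict[str, List[str]] = {
    "a": ["4", "@"],
    "e": ["3"],
    "i": ["1", "!"],
    "o": ["0"],
    "s": ["5", "$"],
    "t": ["7"],
}


def to_leetspeak(
    text: str,
    deterministic: bool = False,
    custom_substitutions: Optional[Dict[str, List[str]]] = None,
    rng: Optional[random.Random] = None,
) -> str:
    # Custom entries override or extend the defaults.
    substitutions = {**DEFAULT_SUBSTITUTIONS, **(custom_substitutions or {})}
    rng = rng or random.Random()
    converted = []
    for char in text:
        options = substitutions.get(char.lower())
        if not options:
            converted.append(char)  # No mapping: keep the character as-is.
        elif deterministic:
            converted.append(options[0])  # Always pick the first option.
        else:
            converted.append(rng.choice(options))  # Pick a random option.
    return "".join(converted)


print(to_leetspeak("test", deterministic=True))  # prints "7357"
print(to_leetspeak("banana", deterministic=True, custom_substitutions={"a": ["^"]}))  # prints "b^n^n^"
```

In deterministic mode the same input always yields the same output, which makes test results reproducible; without it, characters with several options vary from run to run.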

Security Improvements

Security is a recurring theme in many recent PRs. The introduction of Microsoft Entra authentication in PR #330 is a notable step towards ensuring secure access when interacting with Azure SQL Server databases. Additionally, PR #299 adds error handling mechanisms in the AML Chat Target, addressing potential vulnerabilities that could arise from unhandled exceptions during interactions with AI models.

Documentation and Testing

Documentation efforts are evident in several PRs, such as PR #332 which adds notebooks to document new functionalities like re-sending prompts. This is essential for user onboarding and understanding how to leverage new features effectively. Furthermore, there is an emphasis on testing; multiple PRs include unit tests or mention plans for future testing, indicating a commitment to maintaining code quality and reliability as new features are integrated.

Community Engagement

The active engagement from contributors is highlighted by the variety of discussions and suggestions within the pull requests. For example, PR #285 discusses replacing orchestrator IDs with UUIDs due to uniqueness concerns—a critical aspect in distributed systems where ID collisions can lead to significant issues. This level of discourse reflects a collaborative environment focused on improving the robustness of the tool.
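
The UUID approach discussed in PR #285 can be sketched with Python's standard `uuid` module. `ExampleOrchestrator` and its `identifier` attribute are hypothetical stand-ins, not PyRIT classes or fields, and the sketch does not reproduce whatever ID scheme the project used before.

```python
import uuid


class ExampleOrchestrator:
    """Hypothetical stand-in for an orchestrator; not a PyRIT class."""

    def __init__(self) -> None:
        # uuid4 draws 122 random bits, so collisions are vanishingly
        # unlikely even across separate processes and machines.
        self.identifier: uuid.UUID = uuid.uuid4()


first = ExampleOrchestrator()
second = ExampleOrchestrator()
print(first.identifier != second.identifier)
```

Because each identifier is generated independently, no coordination between instances is needed to keep them distinct.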

Anomalies

Most pull requests follow standard practice, but PR #329 stands out: its unit tests were reported as failing, and the contributor outlined no plan to resolve them. This may indicate that contributors unfamiliar with the project's testing frameworks or practices would benefit from additional support or clearer guidelines.

In conclusion, the current state of pull requests in the Azure/PyRIT project illustrates an active development cycle that prioritizes feature enhancement, security improvements, thorough documentation, and community collaboration—all essential elements for building a reliable tool aimed at mitigating risks associated with generative AI technologies.

Report On: Fetch commits



Repo Commits Analysis

Development Team and Recent Activity

Team Members and Recent Contributions

  1. Raja Sekhar Rao Dheekonda (rdheekonda)

    • Recent Activity:
      • Fixed dependency management by moving Pillow from dev to core dependency.
      • Contributed to the implementation of the flexible memory labels and scoring feature in orchestrators.
      • Collaborated on various features including Exception Handling for Azure TTS Target and multi-modal support.
    • Collaboration: Worked with multiple team members including rlundeen2 and Volkan Kutal.
  2. Richard Lundeen (rlundeen2)

    • Recent Activity:
      • Focused on maintenance tasks, speeding up tests, and fixing bugs.
      • Implemented features such as true_false inverter scorer and multi-turn prompt sending orchestrator.
      • Contributed significantly to documentation updates.
    • Collaboration: Frequently collaborated with rdheekonda, jsong468, and others.
  3. Jae Sung Song (jsong468)

    • Recent Activity:
      • Developed the add_image_text_converter feature along with unit tests.
      • Updated documentation for various components.
    • Collaboration: Worked closely with rlundeen2 on documentation improvements.
  4. Volkan Kutal (KutalVolkan)

    • Recent Activity:
      • Introduced the xstest dataset feature and contributed to bias testing frameworks.
    • Collaboration: Co-authored several commits with rdheekonda.
  5. Victor Valbuena (ValbuenaVC)

    • Recent Activity:
      • Developed the Prompt Shield feature, enhancing security measures within the tool.
    • Collaboration: Collaborated with multiple team members including rdheekonda and rlundeen2.
  6. Nina Chikanov (nina-msft)

    • Recent Activity:
      • Added exception handling to Azure TTS Target and AML Chat Target.
    • Collaboration: Worked alongside other developers on various features.
  7. Roman Lutz (romanlutz)

    • Recent Activity:
      • Engaged in both feature development and maintenance tasks, including updating dependencies for Python 3.12.
      • Implemented several scoring features and contributed to documentation improvements.
    • Collaboration: Frequently collaborated with rdheekonda and rlundeen2.
  8. Safwan Ahmed (SafwanA02)

    • Recent Activity:
      • Developed the Crescendo Orchestrator feature, which adds new capabilities to the project.
    • Collaboration: Worked with multiple team members on this feature.
  9. Shiven Chawla (shivenchawla)

    • Recent Activity:
      • Contributed to the implementation of new chat targets for Azure OpenAI.
    • Collaboration: Collaborated with Nina Chikanov on various features.
  10. Gary (dlmgary)

    • Recent Activity:
      • Implemented the PAIR orchestrator, enhancing the project's functionality.
    • Collaboration: Worked alongside Roman Lutz on this feature.

Patterns, Themes, and Conclusions

  • The development team is actively engaged in both feature development and maintenance tasks, demonstrating a balanced approach to improving functionality while ensuring stability.
  • Collaboration is prominent among team members, with many co-authored commits indicating a strong culture of teamwork.
  • Recent activities reflect a focus on enhancing security features within the tool, aligning with the project's goal of risk identification in generative AI systems.
  • The diversity of contributions across various aspects of the project—from core functionality to documentation—suggests a well-rounded team capable of addressing multiple facets of software development effectively.
  • The increase in commits related to testing frameworks indicates an emphasis on maintaining robust quality assurance processes as the project evolves.

Overall, the recent activities showcase a committed team making significant strides in developing a comprehensive risk identification tool for generative AI applications.