The Dispatch

OSS Report: google-deepmind/gemma


Gemma Project Faces Persistent Installation Challenges Amid Active Community Engagement

Gemma, an open-weights large language model (LLM) project from Google DeepMind, continues to draw significant community interest and engagement, as evidenced by ongoing discussions around installation issues and performance metrics. The project provides an inference implementation built on the Flax and JAX frameworks.
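
The typical inference path, as shown in the repository's README, is to load and reshape a checkpoint, infer a TransformerConfig from the params, and wrap the model, params, and a SentencePiece vocab in a Sampler. The sketch below is modeled on that README-era example; the paths are placeholders and argument names such as cache_size are assumptions that may have changed since.

```python
# Minimal sampling sketch modeled on the repository's README; paths are
# placeholders and some argument names are assumptions.
import sentencepiece as spm

from gemma import params as params_lib
from gemma import sampler as sampler_lib
from gemma import transformer as transformer_lib

CKPT_PATH = "/path/to/gemma-2b-it/checkpoint"  # placeholder
VOCAB_PATH = "/path/to/tokenizer.model"        # placeholder

# Load the Flax checkpoint and reshape it into the layout the model expects.
params = params_lib.load_and_format_params(CKPT_PATH)

# Load the SentencePiece tokenizer shipped alongside the weights.
vocab = spm.SentencePieceProcessor()
vocab.Load(VOCAB_PATH)

# Infer the architecture (layers, heads, etc.) from the params themselves.
config = transformer_lib.TransformerConfig.from_params(
    params=params,
    cache_size=1024,  # KV-cache length; name and value are README-era assumptions
)
model = transformer_lib.Transformer(config)

sampler = sampler_lib.Sampler(
    transformer=model,
    vocab=vocab,
    params=params["transformer"],
)
out = sampler(input_strings=["Tell me about JAX."], total_generation_steps=100)
print(out.text)
```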

Recent Activity

Recent issues and pull requests (PRs) indicate a focus on resolving installation difficulties and enhancing user experience. Notable issues include #43, a high-priority pip installation failure, and #23, concerning unit test execution problems. These suggest persistent challenges in setup and compatibility across environments. PRs like #47 aim to improve input validation in scripts, while #41 enhances usability by adding "Open in Colab" buttons to notebooks.

Development Team and Recent Contributions

  1. Gemma Team

    • 48 days ago: Added 2B v2 configuration.
    • 48 days ago: Fixed duplicate BOS token in training input.
    • 56 days ago: Fixed bug from reshape to transpose.
    • 56 days ago: Replaced the reciprocal computation in RMSNorm (see the numerical sketch after this list).
    • 67 days ago: Simplified ffw tests and added docstrings for clarity.
    • 69 days ago: Simplified parameter loading and fixed bugs in tests.
    • 83 days ago: Fixed logits capping for Gemma v2 27b and 9b (soft-capping is also illustrated in the sketch below).
  2. Michelle Casbon (texasmichelle)

    • 54 days ago: Added herself and Ravin to authors.
    • 54 days ago: Removed config override from tests, improving test reliability.
    • 55 days ago: Updated README to include a link to the v2 technical report.
    • 67 days ago: Contributed to adding query-key and attention-value einsums.
  3. Jasper Snoek (JasperSnoek)

    • 83 days ago: Set use_post_attn_norm and use_post_ffw_norm for specific models.
  4. Kathleen Kenealy (kkenealy)

    • 95 days ago: Added post-ffw norm to Gemma 27B flax implementation.
  5. Jasper Uijlings (jrruijli)

    • 100 days ago: Made attention types hashable tuples.
  6. Shreya Pathak

    • 108 days ago: Added sliding window attention to the public gemma repository.
  7. Morgane Rivière (Molugan)

    • 185 days ago: Updated flax dependency to 0.8 or higher.
  8. Alistair Muldal (alimuldal)

    • 208 days ago: Documented how to run unit tests, improving accessibility for new contributors.
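
Two of the commits above touch small numerical details worth making concrete. The sketch below is a generic illustration, not the repository's code: an RMSNorm that uses a fused rsqrt rather than an explicit reciprocal, and Gemma v2-style logit soft-capping via tanh. The epsilon, the (1 + scale) convention, and the cap value are representative assumptions.

```python
import jax
import jax.numpy as jnp

def rms_norm(x, scale, eps=1e-6):
    # Normalize by the root-mean-square of the features. jax.lax.rsqrt
    # computes 1/sqrt in one fused op, avoiding a separate reciprocal.
    var = jnp.mean(jnp.square(x), axis=-1, keepdims=True)
    return x * jax.lax.rsqrt(var + eps) * (1 + scale)

def soft_cap(logits, cap=30.0):
    # Squash logits smoothly into (-cap, cap) instead of hard-clipping,
    # which keeps gradients finite and well-behaved.
    return cap * jnp.tanh(logits / cap)
```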

Of Note

Quantified Reports

Recent GitHub Issues Activity

Timespan    Opened    Closed    Comments    Labeled    Milestones
7 Days      0         1         0           0          0
30 Days     2         1         3           2          1
90 Days     7         4         10          3          1
All Time    31        10        -           -          -

Like all software activity quantification, these numbers are imperfect but sometimes useful. The Comments, Labeled, and Milestones columns count only issues opened within the given timespan.

Quantified Commit Activity Over 30 Days

Developer                       Branches    PRs      Commits    Files    Changes
Mandlin Sarah (mandlinsarah)    0           1/0/0    0          0        0
KumarGitesh2024                 0           1/0/0    0          0        0

PRs: pull requests created by that developer during the period, shown as opened/merged/closed-unmerged counts.

Detailed Reports

Report On: Fetch issues



Recent Activity Analysis

The GitHub repository for the Gemma project by Google DeepMind currently has 21 open issues, with recent activity indicating ongoing engagement from users seeking assistance and reporting bugs. Notably, several issues revolve around installation problems and unit test failures, suggesting potential challenges in the setup process or compatibility with various environments.

A recurring theme among the issues is related to installation errors, particularly on different operating systems (Windows, WSL) and Python versions. Additionally, there are multiple inquiries regarding model performance metrics and discrepancies, which could indicate a need for clearer documentation or guidance on expected results.

Issue Details

Most Recently Created Issues

  1. Issue #48: only using single sample from batch in finetuning example?

    • Priority: Low
    • Status: Open
    • Created: 11 days ago
    • Updated: 6 days ago
  2. Issue #45: NameError found flax repo due to case sensitivity

    • Priority: Medium
    • Status: Open
    • Created: 21 days ago
    • Updated: Not updated since creation
  3. Issue #44: how about ppl on wikitext?

    • Priority: Medium
    • Status: Open
    • Created: 39 days ago
    • Updated: Not updated since creation
  4. Issue #43: pip install fail

    • Priority: High
    • Status: Open
    • Created: 45 days ago
    • Updated: 27 days ago
  5. Issue #42: Reproducing evaluations

    • Priority: Medium
    • Status: Open
    • Created: 54 days ago
    • Updated: 33 days ago

Most Recently Updated Issues

  1. Issue #23: Issue when "Running the unit tests"

    • Priority: High
    • Status: Open
    • Created: 174 days ago
    • Updated: 5 days ago
  2. Issue #7: 'subprocess-exited-with-error' when installing gemma

    • Priority: High
    • Status: Open
    • Created: 205 days ago
    • Updated: 5 days ago
  3. Issue #10: Colabs don't seem to work

    • Priority: Medium
    • Status: Open
    • Created: 204 days ago
    • Updated: 12 days ago
  4. Issue #36: MMLU script require

    • Priority: Low
    • Status: Open
    • Created: 111 days ago
    • Updated: 108 days ago
  5. Issue #32: Issue with unit tests on NVidia V100 (GPU)

    • Priority: High
    • Status: Open
    • Created: 123 days ago
    • Updated: 103 days ago

Summary of Observations

  • The majority of open issues relate to installation difficulties and bugs encountered during testing, indicating that users may struggle with the initial setup or compatibility across different environments.
  • There is a notable interest in performance metrics and evaluation reproducibility, suggesting that users are actively trying to benchmark the model against other frameworks or versions.
  • The presence of multiple unresolved queries about specific functionalities (like PPL evaluations) indicates a potential gap in documentation or user guidance that could be addressed to enhance user experience.

Overall, while the project shows active community engagement, addressing these recurring issues could improve usability and satisfaction among users exploring the capabilities of the Gemma models.

Report On: Fetch pull requests



Overview

The repository google-deepmind/gemma currently has 11 open pull requests (PRs) and 5 closed PRs. The open PRs primarily focus on enhancing documentation, fixing errors, and improving user experience through better input validation and tutorial updates.

Summary of Pull Requests

Open Pull Requests

  • PR #47: Enhance input validation in sampling script
    Created by Mandlin Sarah, this PR introduces input validation for command-line arguments to prevent runtime errors. It remains under review after a reviewer reported an error involving checkpoint file paths.

  • PR #46: Update fine_tuning_tutorial.ipynb
    Submitted by KumarGitesh2024, this PR addresses a NameError in the fine-tuning tutorial. It has received comments requesting access to shared resources.

  • PR #41: Added "Open in Colab" button to each notebook in the colab dir
    Created by Paige Bailey, this PR enhances usability by adding direct links to open notebooks in Google Colab. It has garnered positive feedback and a request for merging.

  • PR #34: Add a .gitignore
    Submitted by Mircea Trofin, this PR adds a .gitignore file to the repository, which is essential for excluding unnecessary files from version control.

  • PR #31: Fix huggingface_hub code snippet
    Created by Omar Sanseviero, this PR updates the README with a corrected code snippet for downloading models from Hugging Face (a representative download snippet appears after this list).

  • PR #28: Fix error in HF code in README
    Submitted by Benjamin Bossan, this PR fixes an error in the Hugging Face download code snippet. However, it requires signing a Contributor License Agreement (CLA).

  • PR #27: Update sampling_tutorial.ipynb
    Created by Anushan Fernando, this PR corrects a typo in the sampling tutorial notebook.

  • PR #25: Update fine_tuning_tutorial.ipynb
    Submitted by Anique, this PR addresses variable naming inconsistencies in the fine-tuning tutorial.

  • PR #24: Fix typo in repo card
    Created by Omar Sanseviero, this PR corrects a typo in the README file related to model download instructions.

  • PR #17: Auto-labels 'Gemma' on 'gemma' issues/PRs
    Submitted by Shivam Mishra, this workflow automates labeling issues and PRs related to Gemma.

  • PR #3: fix pyproject.toml: The Poetry configuration is invalid
    Created by wirthual, this PR resolves an issue with Poetry configuration that prevents successful package installation.
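
For reference, the README code that PRs #31 and #28 correct concerns downloading weights with huggingface_hub. The snippet below is a generic stand-in rather than the README's exact code; the repo id is an assumption, and access to the gated model must first be granted on its Hugging Face page.

```python
from huggingface_hub import snapshot_download

# Fetch every file of a Gemma Flax release into the local HF cache and
# return the directory path. The repo_id here is an assumption; use the
# repository the README actually names.
local_dir = snapshot_download(repo_id="google/gemma-2b-flax")
print("weights in:", local_dir)
```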

Closed Pull Requests

  • PR #39: Fix test when using pytest
    Closed due to resolution of the issue through another commit.

  • PR #19: Correct a typo in the fine-tuning tutorial
    Closed as it was fixed internally before merging.

  • PR #15: Add HF pointers
    Merged after discussions regarding security checks and content verification.

  • PR #12: Fix ReadMe Command
    Closed without specific details provided.

  • PR #2: Update Gemma Technical Report link in README
    Closed without specific details provided.

Analysis of Pull Requests

The current state of pull requests in the google-deepmind/gemma repository reveals several key themes and areas of concern. Firstly, many of the open pull requests focus on improving documentation and user experience. For instance, PRs like #41 (adding "Open in Colab" buttons) and #46 (updating tutorials) highlight an ongoing effort to make the repository more accessible and user-friendly. This is crucial for attracting new users who may not be familiar with the intricacies of using large language models or navigating complex codebases.

However, there are notable challenges reflected in some of these contributions. For example, PR #47 adds input validation precisely because users hit runtime errors when running scripts with incorrect paths, yet review feedback reports that its checkpoint-path handling still fails in some cases. This indicates that while contributors are eager to enhance functionality, there may be gaps in testing or documentation that need addressing before changes are merged into the main branch. The comments on various pull requests suggest that contributors are actively engaging with each other, but they also highlight potential barriers, such as CLA requirements, that could slow down collaboration.
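
To make the PR #47 concern concrete, the guard it proposes looks roughly like the sketch below. This is a hypothetical reconstruction, not the PR's actual diff; the flag names and error messages are invented.

```python
import argparse
import os
import sys

def parse_args():
    parser = argparse.ArgumentParser(description="Sample from a Gemma checkpoint.")
    parser.add_argument("--ckpt_path", required=True, help="Checkpoint directory.")
    parser.add_argument("--vocab_path", required=True, help="SentencePiece model file.")
    args = parser.parse_args()

    # Fail fast with a readable message instead of a deep JAX/Flax traceback
    # when a path is wrong -- the failure mode reported on the PR.
    if not os.path.exists(args.ckpt_path):
        sys.exit(f"error: checkpoint path not found: {args.ckpt_path}")
    if not os.path.isfile(args.vocab_path):
        sys.exit(f"error: tokenizer file not found: {args.vocab_path}")
    return args
```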

Another significant observation is the presence of multiple pull requests addressing similar issues—such as variable naming conventions across different tutorials (#25 and #46). This redundancy may indicate a lack of coordination among contributors or insufficient communication about ongoing work within the community. It could be beneficial for maintainers to implement clearer guidelines or a more structured approach to managing contributions to avoid overlapping efforts and streamline the review process.

Moreover, several older pull requests remain open without recent activity (e.g., PR #34 from 120 days ago). This stagnation could signal either a lack of resources for maintaining the repository or challenges in prioritizing contributions effectively. The absence of merges for these older requests may discourage new contributors from participating if they perceive that their efforts might not be recognized or integrated into the project promptly.

In summary, while there is significant enthusiasm around enhancing the gemma project through various contributions, there are underlying issues regarding coordination, communication, and responsiveness that need addressing. Ensuring timely reviews and merges while fostering an inclusive environment for contributors will be essential for maintaining momentum and encouraging further engagement within this vibrant community.

Report On: Fetch commits



Repo Commits Analysis

Development Team and Recent Activity

Team Members and Recent Contributions

  1. Gemma Team

    • Recent Activity:
    • 48 days ago: Added 2B v2 configuration.
    • 48 days ago: Fixed duplicate BOS token in training input.
    • 56 days ago: Fixed bug from reshape to transpose.
    • 56 days ago: Replaced reciprocal computation in RMSNorm.
    • 67 days ago: Simplified ffw tests and added docstrings for clarity.
    • 69 days ago: Simplified parameter loading and fixed bugs in tests.
    • 83 days ago: Fixed logits capping for Gemma v2 27b and 9b.
  2. Michelle Casbon (texasmichelle)

    • Recent Activity:
    • 54 days ago: Added herself and Ravin to authors.
    • 54 days ago: Removed config override from tests, improving test reliability.
    • 55 days ago: Updated README to include a link to the v2 technical report.
    • 67 days ago: Contributed to adding query-key and attention-value einsums.
  3. Jasper Snoek (JasperSnoek)

    • Recent Activity:
    • 83 days ago: Set use_post_attn_norm and use_post_ffw_norm for specific models.
  4. Kathleen Kenealy (kkenealy)

    • Recent Activity:
    • 95 days ago: Added post-ffw norm to Gemma 27B flax implementation.
  5. Jasper Uijlings (jrruijli)

    • Recent Activity:
    • 100 days ago: Made attention types hashable tuples.
  6. Shreya Pathak

    • Recent Activity:
    • 108 days ago: Added sliding window attention to the public gemma repository (a mask sketch follows this list).
  7. Morgane Rivière (Molugan)

    • Recent Activity:
    • 185 days ago: Updated flax dependency to 0.8 or higher.
  8. Alistair Muldal (alimuldal)

    • Recent Activity:
    • 208 days ago: Documented how to run unit tests, improving accessibility for new contributors.
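
The sliding-window commit above refers to the local-attention variant that Gemma v2 interleaves with global attention. Expressed generically, it restricts the causal mask to a band of recent positions. The sketch below is illustrative, not the repository's implementation; the window size is a placeholder.

```python
import jax.numpy as jnp

def sliding_window_mask(seq_len, window):
    # Position i may attend to positions j with i - window < j <= i:
    # the usual causal constraint (j <= i) plus a locality band.
    i = jnp.arange(seq_len)[:, None]
    j = jnp.arange(seq_len)[None, :]
    return (j <= i) & (j > i - window)

mask = sliding_window_mask(seq_len=8, window=4)  # placeholder sizes
```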

Patterns and Themes

  • The majority of recent commits are concentrated around bug fixes and enhancements related to model configurations, indicating an ongoing effort to stabilize and improve the functionality of the Gemma models.
  • Michelle Casbon is notably active, contributing both documentation updates and code changes, which suggests a leadership role within the team.
  • Collaboration among team members is evident, with multiple contributors working on similar areas such as model configurations and testing improvements.
  • The project appears to be in a phase of refinement, focusing on enhancing usability through documentation and fixing identified issues rather than introducing major new features.
  • Some recent contributors (e.g., mandlinsarah, KumarGitesh2024) appear only through unmerged pull requests, with no commits landed in the last 30 days, which points to review latency rather than core-team development activity.

Overall, the development team's activities reflect a commitment to improving the Gemma project through collaborative efforts in bug fixing, documentation enhancement, and performance optimization.