The Dispatch

GitHub Repo Analysis: Generic


ggerganov/whisper.cpp Project Analysis

Overview

whisper.cpp is a high-performance C/C++ inference implementation of OpenAI's Whisper automatic speech recognition (ASR) model. It runs on a wide range of platforms and architectures, with a strong focus on performance optimization and customization. The project is active, with the latest push on November 3, 2023.

Activity

The project is popular, with 24,500 stars and 2,323 forks. It has 375 open issues and 20 open pull requests. The project has a total of 743 commits across 26 branches.
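Figures like these can be read from a GitHub REST API repository object. The sketch below is illustrative only: it assumes a payload shaped like the response of the `GET /repos/{owner}/{repo}` endpoint, the helper name `summarize_repo` is hypothetical, and the sample values are copied from this report rather than fetched live. Note that GitHub's `open_issues_count` field counts open pull requests together with issues, so the label in the output reflects that.

```python
from typing import Any, Dict


def summarize_repo(payload: Dict[str, Any]) -> str:
    """Format headline activity figures from a GitHub repository payload."""
    return (
        f"{payload['full_name']}: "
        f"{payload['stargazers_count']:,} stars, "
        f"{payload['forks_count']:,} forks, "
        f"{payload['open_issues_count']:,} open issues/PRs"
    )


# Sample values copied from this report (not a live API response).
sample_payload = {
    "full_name": "ggerganov/whisper.cpp",
    "stargazers_count": 24500,
    "forks_count": 2323,
    "open_issues_count": 375,
}

print(summarize_repo(sample_payload))
```

Commit and branch totals are not part of that repository object, so they would have to come from other endpoints.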

Notable Aspects

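The project supports mixed F16 / F32 precision, 4-bit and 5-bit integer quantization, and a broad set of platforms, from macOS and Linux to Android, iOS, WebAssembly, and Raspberry Pi. Its README points to a roadmap for future development.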

Open Pull Requests

The project has a variety of open pull requests, ranging from adding support for new Whisper models (#1424) to fixing encoding issues on Windows (#1313). The oldest open PRs include adding alternate OpenCL support (#891) and adding Bootstrap 5 styling to the application page (#968).

Issues

Recent issues span feature requests, such as Distil-Whisper support (#1423), and platform-specific technical problems, indicating diverse usage contexts. The oldest open issues include requests for Core ML integration samples (#889) and problems with undefined symbols (#892).

Conclusion

The project is actively maintained and widely used across various platforms. It's facing a mix of technical issues and feature requests, indicating a need for continuous improvement and enhancement.

Detailed Reports

Report on issues



The most recently opened issues run from #1384 to #1423. They cover a wide range of topics, including support for new features like Distil-Whisper (#1423), issues with the Android demo app (#1421), problems with CMake on macOS M2 (#1419), and questions about file support (#1416). There are also issues related to WSL 2/Ubuntu (#1413) and Intel Macs (#1411), and a request for a new release (#1409). Some users report problems with running the software at scale and in parallel (#1408) and with backward computation (#1407), or ask about the difference between versions of the software (#1405). Further requests cover generating a static library for iOS (#1403), adding Bazel support (#1402), and generating a .txt file (#1398). Taken together, these issues suggest that the software is used in a variety of contexts and on many platforms, and that users raise a mix of technical problems and feature requests.

The oldest open issues range from #889 to #1004. They include requests for Core ML integration samples (#889), issues with undefined symbols (#892), questions about differences between versions of the software (#894), duplicate words being generated (#896), slow running times on certain devices (#897), and problems deploying the software on SageMaker (#905). The same list also cites several issues whose numbers fall outside that range: problems with the Java binding (#1384), problems with speaker segmentation (#1395), errors when using OpenVINO (#1394), and Metal and ggml-alloc support causing initialization failure on macOS (#1332). Recently closed issues were not available for this report, so they are not summarized. Common themes among the open issues are problems with specific features, requests for new features or improvements, and difficulties running the software on particular platforms or devices.

Report on pull requests



Analysis

Open Pull Requests

The project has a total of 20 open pull requests.

Recent Open Pull Requests:

  1. #1424: This PR is about adding support for new distilled Whisper models. The PR is currently open and has no comments or review comments. The PR has one commit and modifies one file.

  2. #1418: This PR is about creating another server based on stream. The PR is currently open and has no comments or review comments. The PR has three commits and modifies 12 files.

  3. #1382: This PR is about enhancing compatibility with older Android versions using Java. The PR is currently open and has multiple comments and review comments. The PR has multiple commits and modifies 59 files.

  4. #1381: This PR is about fixing the download coreml script with zipfile. The PR is currently open and has no comments or review comments. The PR has one commit and modifies one file.

  5. #1375: This PR is about creating a basic example server for whisper.cpp. The PR is currently open and has no comments or review comments. The PR has four commits and modifies five files.

  6. #1370: This PR is about adding support for Swift Package Manager. The PR is currently open and has no comments or review comments. The PR has one commit and modifies three files.

  7. #1364: This PR is about replacing double quotes (") with single quotes (') so that text in backticks is not executed as code. The PR is currently open and has no comments or review comments. The PR has three commits and modifies one file.

  8. #1313: This PR is about fixing the encoding issues on Windows. The PR is currently open and has multiple comments and review comments. The PR has multiple commits and modifies five files.

  9. #1293: This PR is about adding a context param for disabling GPU in whisper. The PR is currently open and has multiple comments and review comments. The PR has multiple commits and modifies 30 files.

  10. #1164: This PR is about adding buffer overrun detection to ./stream. The PR is currently open and has no comments or review comments. The PR has two commits and modifies three files.

  11. #1118: This PR is about switching to BPE tokenization implementation from openai/tiktoken. The PR is currently open and has no comments or review comments. The PR has one commit and modifies one file.
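The quoting change described for #1364 can be illustrated with a small sketch. The report does not say which file or command the PR touches, so the command strings and the `run_sh` helper below are hypothetical. The sketch assumes a POSIX `sh` is available, and shows only the general shell behavior: backticks inside double quotes trigger command substitution, while inside single quotes they are kept as literal text.

```python
import subprocess


def run_sh(script: str) -> str:
    """Run a script with a POSIX sh (assumed to be on PATH) and return stdout."""
    result = subprocess.run(
        ["sh", "-c", script], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


# Double quotes: the shell executes the backtick expression.
double_quoted = run_sh('echo "value: `echo EXECUTED`"')

# Single quotes: the backticks are passed through untouched.
single_quoted = run_sh("echo 'value: `echo EXECUTED`'")

print(double_quoted)
print(single_quoted)
```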

Oldest Open Pull Requests:

  1. #891: This PR is about adding alternate OpenCL support via the CLBlast Netlib BLAS API. The PR is currently open and has multiple comments and review comments. The PR has multiple commits and modifies one file.

  2. #939: This PR is about adding a Swift Package Manager manifest. The PR is currently open and has multiple comments and review comments. The PR has two commits and modifies three files.

  3. #968: This PR is about adding Bootstrap 5 styling to the application page. The PR is currently open and has no comments or review comments. The PR has two commits and modifies one file.

  4. #971: This PR is about adding support for outputting token-level confidence. The PR is currently open and has multiple comments and review comments. The PR has two commits and modifies one file.

  5. #1003: This PR is about changing ggml.c to make it compatible with C++ compilers. The PR is currently open and has no comments or review comments. The PR has four commits and modifies one file.

  6. #1021: This PR is about adding Asahi Linux Apple Neural Engine (ANE) support. The PR is currently open and has no comments or review comments. The PR has two commits and modifies six files.

  7. #1074: This PR is about adding Metal decoder inference and K-quants, and syncing with the latest GGML. The PR is currently open and has no comments or review comments. The PR has multiple commits and modifies 12 files.

Report on README and metadata



The project, ggerganov/whisper.cpp, is a high-performance C/C++ implementation of inference for OpenAI's Whisper automatic speech recognition (ASR) model. It was created by ggerganov and is licensed under the MIT License. The software is intended for automatic speech recognition and supports a variety of platforms including Mac OS, iOS, Android, Java, Linux, FreeBSD, WebAssembly, Windows, and Raspberry Pi. The project is actively maintained, with the latest push made on November 3, 2023.

The repository is popular and active, with 24,500 stars, 2,323 forks, and 375 open issues. It has 743 total commits and 26 branches. The software is written in C and the repository is 8,556 kB in size. The project uses a variety of technologies including ARM NEON, the Accelerate framework, Metal, Core ML, AVX intrinsics for x86 architectures, and VSX intrinsics for POWER architectures, and it supports mixed F16 / F32 precision. It also offers 4-bit and 5-bit integer quantization, low memory usage, zero memory allocations at runtime, partial GPU support for NVIDIA via cuBLAS, and OpenCL GPU support via CLBlast.

The project has several notable aspects. Its support for many platforms and architectures makes it versatile and widely usable. Its range of precision levels and quantization methods allows a high degree of customization and optimization. Performance is a clear focus, with features such as low memory usage and zero memory allocations at runtime. The README also mentions a roadmap and an FAQ, indicating a clear direction for future development.
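To make the quantization point concrete, here is a simplified sketch of symmetric 4-bit block quantization. It is not the storage format used by whisper.cpp or ggml, which the report does not describe. The function names, the single shared scale per block, and the signed range [-8, 7] are all assumptions chosen for illustration.

```python
from typing import List, Tuple


def quantize_block_4bit(values: List[float]) -> Tuple[float, List[int]]:
    """Map floats to signed 4-bit integers in [-8, 7] sharing one scale.

    Illustrative scheme only; not the ggml on-disk format.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    if max_abs == 0.0:
        return 0.0, [0] * len(values)
    scale = max_abs / 7.0  # largest magnitude maps to +/-7
    quants = [max(-8, min(7, round(v / scale))) for v in values]
    return scale, quants


def dequantize_block(scale: float, quants: List[int]) -> List[float]:
    """Recover approximate floats from the shared scale and 4-bit codes."""
    return [q * scale for q in quants]


block = [7.0, -3.0, 1.2, 0.0]
scale, quants = quantize_block_4bit(block)
restored = dequantize_block(scale, quants)
print(scale, quants, restored)
```

Each value is stored in 4 bits plus a share of the per-block scale, at the cost of a rounding error of at most half the scale per value. That is the kind of memory versus accuracy trade-off that lower-bit quantization involves.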