SGLang has seen a notable uptick in user engagement over the last 30 days and currently has 62 open issues spanning bug reports and feature requests. SGLang is a high-performance serving framework for large language models (LLMs) and vision-language models, designed to improve interaction speed and give users finer control over these models.
The recent activity indicates a vibrant development process, with several newly created issues focusing on bugs related to model performance and compatibility. For example, Issue #1109 reports a critical bug regarding head dimension dispatching, which could significantly impact user experience if unresolved. Additionally, there are ongoing discussions about supporting new models like Phi3V (#1108) and enhancing performance metrics (#1105). This mix of bug reports and feature requests illustrates the community's active engagement and the complexity of using large language models effectively.
Open Issues: 62 total
Recently Updated Issues:
Yineng Zhang (zhyncs)
Ying Sheng (Ying1123)
Lianmin Zheng (merrymercy)
Liangsheng Yin (hnyls2002)
Other Contributors:
This collaborative effort among team members demonstrates a strong focus on both addressing user-reported issues and implementing new features, reflecting a healthy development environment.
Timespan | Opened | Closed | Comments | Labeled | Milestones
---|---|---|---|---|---
7 Days | 32 | 17 | 57 | 25 | 1
30 Days | 93 | 50 | 253 | 66 | 1
90 Days | 103 | 50 | 282 | 70 | 1
All Time | 404 | 342 | - | - | -
Like all software activity quantification, these numbers are imperfect but sometimes useful. Comments, Labels, and Milestones refer to those issues opened in the timespan in question.
Developer | Branches | PRs | Commits | Files | Changes
---|---|---|---|---|---
Ying Sheng | 2 | 43/41/2 | 43 | 117 | 6043
Lianmin Zheng | 2 | 18/18/0 | 19 | 72 | 4988
Liangsheng Yin | 3 | 29/25/3 | 34 | 68 | 2765
Juwan Yoo | 1 | 4/3/1 | 3 | 20 | 2004
Yineng Zhang | 1 | 44/38/5 | 37 | 64 | 1623
yichuan~ | 1 | 8/7/1 | 7 | 9 | 769
Ke Bao | 1 | 3/3/0 | 3 | 11 | 528
Aidan Cooper | 1 | 1/2/0 | 2 | 10 | 478
min-xu-et | 1 | 6/5/1 | 5 | 4 | 410
rainred | 1 | 3/3/0 | 3 | 10 | 319
Lucien | 1 | 1/1/0 | 1 | 3 | 117
liuyhwangyh | 1 | 1/1/0 | 1 | 5 | 84
foszto | 1 | 1/1/0 | 1 | 6 | 20
任嘉 | 1 | 0/0/0 | 1 | 3 | 18
Zhiqiang Xie | 1 | 1/1/0 | 1 | 2 | 6
Roger Wang | 1 | 1/1/0 | 1 | 1 | 3
Kai Fronsdal | 1 | 0/1/0 | 1 | 1 | 3
Meng, Peng | 1 | 1/1/0 | 1 | 1 | 2
Mingyi | 1 | 1/1/0 | 1 | 1 | 2
Li Bo (Luodian) | 0 | 2/0/1 | 0 | 0 | 0
None (81549361) | 0 | 1/0/1 | 0 | 0 | 0
Yonghao Zhuang (ZYHowell) | 0 | 1/0/0 | 0 | 0 | 0
Haichuan (haichuan1221) | 0 | 0/0/1 | 0 | 0 | 0
PRs: counts of pull requests created by that developer that were opened/merged/closed-unmerged during the period.
Activity on the SGLang GitHub repository remains brisk, with 62 issues currently open. Several issues have been created or updated in the last few days, reflecting active user engagement and ongoing troubleshooting efforts.
A significant theme among recent issues is bugs related to model performance and compatibility, particularly concerning specific configurations and model types. For instance, Issue #1109 reports a failure to dispatch head dimension 80 under specific configurations. This suggests that users are encountering challenges with particular model setups, which could hinder adoption if not addressed promptly.
Moreover, there are multiple feature requests and discussions around enhancing existing functionalities, such as support for new models (e.g., Phi3V in #1108) and performance improvements (e.g., Issue #1105 discussing performance-enhancing features). The diversity of issues reflects both the complexity of using large language models and the community's eagerness to enhance the framework's capabilities.
Issue #1109: [Bug] Failure to Dispatch Head Dimension 80 in sglang with Specific Configurations
Issue #1108: [Feature] Do we have any plan for supporting Phi3V?
Issue #1105: [Develop] Performance Improving Feature
Issue #1102: [Bug] Low QPS for 1.2b model
Issue #1100: [Bug] Can't run Qwen2-57B-A14B-Instruct-GPTQ-Int4
Issue #1093: [Bug] Always Watch Dog TimeOut
Issue #1087: [Bug] cuda out of memory when using MQA and input_len=output_len=1024
Issue #1064: [Bug] Could not post to external IP address
The issues reflect various challenges faced by users, particularly concerning model compatibility and performance tuning. The presence of multiple bugs related to memory management and request handling suggests that while SGLang is a powerful tool for deploying large language models, there are still critical areas that require attention to ensure smooth operation across different configurations and environments.
The repository currently has 7 open pull requests (PRs) and 672 closed PRs. The open PRs include a mix of feature additions, bug fixes, and maintenance tasks, indicating ongoing development and improvement of this framework for serving large language models.
PR #1111: chore: bump v0.2.13
PR #1041: Sequence Parallel
PR #1035: [Feat] Add support for optional start len of logprobs
PR #1013: Mixed style of chunked prefill
PR #1011: Move sampler out of ScheduleBatch
PR #1004: [Feat/WIP] add llava-onevision
PR #573: Function calling for OpenAI backend
The current set of open pull requests reflects a strong focus on both performance optimization and feature expansion within the SGLang framework. The introduction of sequence parallelism (#1041) indicates an ongoing effort to enhance model serving capabilities, particularly as model sizes continue to grow. This aligns with industry trends where efficient resource utilization becomes critical for deploying large language models effectively.
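To make the idea behind sequence parallelism concrete, the sketch below shows, in plain NumPy and independent of the actual implementation in #1041, how attention over a long sequence can be sharded: each worker computes attention of the query against its own slice of the keys and values, and the partial results are merged with a log-sum-exp reduction that reproduces the full-sequence output exactly. All function names here are illustrative, not SGLang APIs.

```python
import numpy as np

def partial_attention(q, k_shard, v_shard):
    """Attention of q over one shard of keys/values.

    Returns the shard-local softmax output and the shard's log-sum-exp,
    which is all the merge step needs.
    """
    scores = q @ k_shard.T / np.sqrt(q.shape[-1])   # (1, shard_len)
    m = scores.max()
    w = np.exp(scores - m)
    lse = m + np.log(w.sum())                       # log of the shard's normalizer
    return (w / w.sum()) @ v_shard, lse

def merge_partials(partials):
    """Combine per-shard outputs into exact full-sequence attention."""
    lse_all = np.logaddexp.reduce([lse for _, lse in partials])
    return sum(out * np.exp(lse - lse_all) for out, lse in partials)

rng = np.random.default_rng(0)
d, seq_len, num_shards = 64, 1024, 4
q = rng.normal(size=(1, d))
k = rng.normal(size=(seq_len, d))
v = rng.normal(size=(seq_len, d))

# "Distributed": each shard could be computed independently, e.g. on its own GPU.
shard_ids = np.array_split(np.arange(seq_len), num_shards)
sharded = merge_partials([partial_attention(q, k[i], v[i]) for i in shard_ids])

# Reference: the same computation over the full sequence at once.
reference = merge_partials([partial_attention(q, k, v)])
assert np.allclose(sharded, reference)
```

Because the merge step is associative, the sequence dimension can be split across devices without changing the result, which is what makes this style of parallelism attractive as context lengths outgrow a single GPU's memory.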
Additionally, the PRs related to log probabilities (#1035) and mixed style chunked prefill (#1013) show a commitment to refining the user experience by minimizing latency during inference operations. These enhancements are essential as they directly impact the responsiveness of applications built on top of SGLang.
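For readers curious what the logprobs feature looks like from the client side, here is a minimal sketch against SGLang's HTTP generation endpoint. It assumes a server is already running on localhost:30000; the return_logprob and logprob_start_len field names are inferred from PR #1035's title rather than taken from finalized documentation, so treat them as assumptions.

```python
# Minimal client-side sketch (assumes an SGLang server already running on
# localhost:30000; field names below are inferred from PR #1035 and may differ
# in the released API).
import requests

resp = requests.post(
    "http://localhost:30000/generate",
    json={
        "text": "The capital of France is",
        "sampling_params": {"max_new_tokens": 8, "temperature": 0.0},
        "return_logprob": True,   # ask the server to return token log probabilities
        "logprob_start_len": 0,   # position from which logprobs should be reported
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```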
The presence of maintenance-focused PRs, such as moving the sampler out of ScheduleBatch (#1011) and the routine version bump (#1111), suggests that the maintainers are not only adding features but also keeping the codebase clean and maintainable. This is crucial for long-term sustainability, as it allows new contributors to onboard more easily while reducing technical debt.
Moreover, there is a clear emphasis on community engagement and responsiveness to user needs as seen in PRs like function calling support (#573). This indicates an awareness of evolving user requirements and an effort to keep pace with advancements in AI model interactions.
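As an illustration of the interaction pattern PR #573 targets, the snippet below shows standard OpenAI-style function calling using the plain openai client. It is not SGLang's frontend API and the tool definition is hypothetical; it is only a sketch of the kind of structured tool-call exchange the PR aims to support from SGLang's OpenAI backend.

```python
# Generic OpenAI-style function calling; not SGLang-specific code.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for illustration
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o-mini",  # any tool-calling-capable model works here
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)
# The model replies with a structured tool call instead of free text.
print(response.choices[0].message.tool_calls)
```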
In conclusion, the recent activity within the SGLang repository demonstrates a healthy balance between adding new features and maintaining existing functionality. The focus on performance improvements through innovative techniques like sequence parallelism will likely position SGLang favorably among frameworks designed for large language models. However, continuous monitoring of open issues and community feedback will be essential to ensure that development aligns with user expectations and industry standards.
Yineng Zhang (zhyncs): recent work touched the model_loader, code.nsys, and the usage and PR template.
Ying Sheng (Ying1123): worked on window attention and flashinfer, support for using a .jinja file as a chat template, and stop_token_ids in the SGLang API.
Lianmin Zheng (merrymercy): worked on the cuda_graph_runner.
Liangsheng Yin (hnyls2002): worked on the schedule_batch.py file, optimizing its performance.
Other Contributors:
Overall, the team's recent activities indicate a robust development cycle focused on enhancing SGLang's capabilities while ensuring stability through thorough testing and documentation practices.