The LangChain project is a comprehensive framework designed to build applications powered by large language models (LLMs). It supports various functionalities including core operations, community contributions, experimental features, and extensive documentation. The project demonstrates robust activity with continuous enhancements in functionality, integration capabilities, and user experience. The trajectory of the project is positive, with active community engagement and rapid development cycles.
Recent commits have focused on enhancing features, fixing bugs, improving documentation, and expanding integration capabilities. Key contributors include Erick Friis, ccurme, Michael Schock, and Dristy Srivastava among others. Notable collaborations are seen in areas such as vector store integrations, document loaders, and autonomous agents.
The project plans to continue enhancing its integration capabilities with new vector stores and improving its core functionalities.
LangChain is a dynamically evolving project with significant community involvement and a clear focus on enhancing the capabilities of applications powered by LLMs. While facing challenges related to integration and error handling, the project maintains a strong trajectory towards becoming a more robust and user-friendly platform.
Developer | Branches | PRs | Commits | Files | Changes |
---|---|---|---|---|---|
Eugene Yurtsev | 1 | 24/13/2 | 14 | 41 | 3698 |
vs. last report | = | +8/=/+1 | -2 | +13 | +1615 |
ccurme | 6 | 22/18/2 | 26 | 390 | 2992 |
vs. last report | +1 | +10/+8/+1 | -1 | +299 | +1313 |
shumway743 | 1 | 1/1/0 | 1 | 4 | 1920 |
vs. last report | +1 | =/+1/= | +1 | +4 | +1920 |
Tomaz Bratanic | 1 | 3/3/0 | 6 | 16 | 1685 |
vs. last report | = | =/+2/= | +4 | +14 | +1485 |
junkeon | 1 | 1/1/0 | 1 | 18 | 1490 |
Mateusz Szewczyk | 1 | 1/1/0 | 1 | 9 | 1050 |
Erick Friis | 1 | 17/15/1 | 16 | 34 | 957 |
vs. last report | -3 | -2/=/+1 | -9 | -183 | -6504 |
volodymyr-memsql | 1 | 1/1/0 | 1 | 3 | 932 |
Christophe Bornet | 1 | 4/3/0 | 4 | 8 | 869 |
vs. last report | = | +1/-1/= | = | +4 | +625 |
Jingpan Xiong | 1 | 0/0/0 | 1 | 7 | 859 |
vs. last report | +1 | -1/=/= | +1 | +7 | +859 |
Leonid Ganeline | 1 | 7/5/0 | 9 | 181 | 799 |
vs. last report | = | -4/-1/= | +2 | -79 | -298 |
Sivaudha | 1 | 1/1/0 | 1 | 4 | 786 |
Bagatur | 4 | 13/10/1 | 18 | 69 | 775 |
vs. last report | = | -12/-15/= | -11 | +10 | -3752 |
Aditya | 1 | 0/0/0 | 1 | 1 | 637 |
vs. last report | +1 | -1/=/= | +1 | +1 | +637 |
Martin Kolb | 1 | 1/1/0 | 1 | 3 | 461 |
Shengsheng Huang | 1 | 1/1/0 | 1 | 5 | 426 |
am-kinetica | 1 | 0/0/0 | 1 | 6 | 417 |
Joan Fontanals | 1 | 0/0/0 | 1 | 4 | 385 |
Raghav Dixit | 1 | 1/1/0 | 1 | 5 | 310 |
Pavlo Paliychuk | 1 | 2/2/0 | 2 | 4 | 279 |
Brace Sproul | 1 | 1/0/1 | 5 | 4 | 273 |
Alex Sherstinsky | 1 | 1/1/0 | 1 | 4 | 272 |
vs. last report | = | =/=/= | = | = | +153 |
Matt | 1 | 0/0/0 | 1 | 3 | 271 |
Ethan Yang | 1 | 1/1/0 | 2 | 5 | 241 |
vs. last report | = | +1/+1/= | +1 | +1 | +12 |
Mish Ushakov | 1 | 0/0/0 | 1 | 5 | 203 |
vs. last report | +1 | -1/=/= | +1 | +5 | +203 |
Rahul Triptahi | 1 | 0/0/0 | 1 | 3 | 198 |
vs. last report | = | -1/=/= | -1 | +1 | +142 |
zR | 1 | 1/1/0 | 1 | 2 | 159 |
vs. last report | = | =/=/= | = | = | = |
Nuno Campos | 1 | 2/2/0 | 2 | 7 | 135 |
vs. last report | = | -2/-2/= | -2 | -3 | -159 |
hulitaitai | 1 | 0/0/0 | 1 | 1 | 132 |
vs. last report | = | =/=/= | = | = | = |
Lance Martin | 1 | 1/1/0 | 1 | 4 | 130 |
vs. last report | +1 | -1/+1/-1 | +1 | +4 | +130 |
Lei Zhang | 1 | 2/2/0 | 2 | 4 | 121 |
Dhruv Chawla | 1 | 0/0/0 | 1 | 1 | 120 |
vs. last report | = | -1/-1/= | -1 | -4 | -836 |
Sean | 1 | 1/1/0 | 1 | 4 | 119 |
aditya thomas | 1 | 1/1/0 | 2 | 2 | 94 |
vs. last report | = | -2/-1/= | -1 | -6 | -153 |
Harrison Chase | 1 | 1/1/0 | 1 | 2 | 82 |
vs. last report | +1 | =/+1/= | +1 | +2 | +82 |
William FH | 1 | 3/2/0 | 2 | 3 | 80 |
Mark Needham | 1 | 0/0/0 | 1 | 1 | 71 |
vs. last report | +1 | -1/=/= | +1 | +1 | +71 |
Charlie Holtz | 1 | 1/1/0 | 1 | 2 | 69 |
Massimiliano Pronesti | 1 | 3/2/0 | 2 | 1 | 52 |
vs. last report | = | +2/+1/= | +1 | = | +18 |
Jason_Chen | 1 | 0/0/0 | 1 | 1 | 43 |
vs. last report | +1 | -1/=/= | +1 | +1 | +43 |
Alex Lee | 1 | 2/1/1 | 1 | 1 | 28 |
YISH | 1 | 0/0/0 | 1 | 1 | 27 |
Katarina Supe | 1 | 1/1/0 | 1 | 1 | 26 |
Anish Chakraborty | 1 | 0/0/0 | 1 | 2 | 25 |
vs. last report | +1 | -1/=/= | +1 | +2 | +25 |
JeffKatzy | 1 | 1/1/0 | 1 | 2 | 25 |
Dmitry Tyumentsev | 1 | 1/1/0 | 1 | 2 | 23 |
Oleksandr Yaremchuk | 1 | 1/1/0 | 1 | 3 | 22 |
fzowl | 1 | 1/1/0 | 1 | 2 | 20 |
Congyu | 1 | 0/0/0 | 1 | 1 | 20 |
Andres Algaba | 1 | 1/1/0 | 1 | 2 | 14 |
Aliaksandr Kuzmik | 1 | 1/1/0 | 1 | 1 | 13 |
Nikita Pokidyshev | 1 | 1/1/0 | 1 | 1 | 13 |
Dristy Srivastava | 1 | 0/0/0 | 1 | 1 | 13 |
back2nix | 1 | 1/1/0 | 1 | 1 | 12 |
Ivaylo Bratoev | 1 | 0/0/0 | 1 | 1 | 10 |
vs. last report | +1 | -1/=/= | +1 | +1 | +10 |
Salika Dave | 1 | 1/1/0 | 1 | 1 | 10 |
Nestor Qin | 1 | 1/1/0 | 1 | 1 | 8 |
Michael Schock | 1 | 0/0/0 | 2 | 2 | 6 |
balloonio | 1 | 0/0/0 | 1 | 1 | 6 |
vs. last report | = | -4/-4/= | -3 | -3 | -13 |
GustavoSept | 1 | 0/0/0 | 1 | 1 | 6 |
vs. last report | +1 | -1/=/= | +1 | +1 | +6 |
monke111 | 1 | 1/1/0 | 1 | 1 | 4 |
Ikko Eltociear Ashimine | 1 | 1/1/0 | 1 | 1 | 4 |
vs. last report | = | -1/-1/= | -1 | -1 | -2 |
Saurabh Chalke | 1 | 0/0/0 | 1 | 1 | 4 |
Leonid Kuligin | 1 | 2/1/0 | 1 | 1 | 3 |
vs. last report | = | =/-1/= | -1 | -8 | -59 |
merdan | 1 | 1/1/0 | 1 | 1 | 3 |
Rohit Gupta | 1 | 0/0/0 | 1 | 1 | 3 |
vs. last report | = | -1/-1/= | = | = | = |
hsmtkk | 1 | 1/1/0 | 1 | 1 | 2 |
naaive | 1 | 0/0/0 | 1 | 1 | 2 |
vs. last report | = | -1/-1/= | = | = | = |
Souls-R | 1 | 1/1/0 | 1 | 1 | 2 |
MajorDouble | 1 | 0/0/0 | 1 | 1 | 2 |
vs. last report | = | -1/-1/= | = | = | = |
jtanios | 1 | 1/1/0 | 1 | 1 | 2 |
Guangdong Liu | 1 | 3/0/0 | 1 | 1 | 2 |
vs. last report | = | =/=/-1 | -5 | -8 | -532 |
Boris Djurdjevic | 1 | 1/1/0 | 1 | 1 | 2 |
dpdjvhxm | 1 | 1/1/0 | 1 | 1 | 2 |
A Noor | 1 | 1/1/0 | 1 | 1 | 2 |
Chen94yue | 1 | 1/1/0 | 1 | 1 | 2 |
Tabish Mir | 1 | 1/1/0 | 1 | 1 | 2 |
samanhappy | 1 | 1/1/0 | 1 | 1 | 2 |
Matheus Henrique Raymundo | 1 | 1/1/0 | 1 | 1 | 2 |
Justsosostar | 1 | 1/1/0 | 1 | 1 | 2 |
vs. last report | = | =/=/= | = | = | = |
davidefantiniIntel | 1 | 0/0/0 | 1 | 1 | 2 |
Stefano Ottolenghi | 1 | 1/1/0 | 1 | 1 | 2 |
Steven Kreitzer (buroa) | 0 | 1/0/1 | 0 | 0 | 0 |
vs. last report | = | =/=/+1 | = | = | = |
Vincent JUGE (vjuge) | 0 | 1/0/1 | 0 | 0 | 0 |
Konstantin Krestnikov (Rai220) | 0 | 1/0/1 | 0 | 0 | 0 |
vs. last report | = | =/=/= | = | = | = |
Rajendra Kadam (Raj725) | 0 | 2/0/0 | 0 | 0 | 0 |
chyroc (chyroc) | 0 | 1/0/0 | 0 | 0 | 0 |
None (MacanPN) | 0 | 1/0/0 | 0 | 0 | 0 |
vs. last report | -1 | =/-1/-1 | -1 | -3 | -45 |
Giacomo Berardi (giacbrd) | 0 | 1/0/0 | 0 | 0 | 0 |
None (maang-h) | 0 | 1/0/0 | 0 | 0 | 0 |
Philippe PRADOS (pprados) | 0 | 1/0/0 | 0 | 0 | 0 |
Alejandro Oñate (alexol91) | 0 | 1/0/0 | 0 | 0 | 0 |
None (nadworny) | 0 | 1/0/0 | 0 | 0 | 0 |
Cahid Arda Öz (CahidArda) | 0 | 1/0/0 | 0 | 0 | 0 |
JonZeolla (JonZeolla) | 0 | 1/0/1 | 0 | 0 | 0 |
vs. last report | = | =/=/= | = | = | = |
Carmelo Daniele (c-daniele) | 0 | 1/0/0 | 0 | 0 | 0 |
hmn falahi (hmnfalahi) | 0 | 1/0/0 | 0 | 0 | 0 |
Yutong_Liu (innerNULL) | 0 | 2/0/1 | 0 | 0 | 0 |
None (scaserini) | 0 | 1/0/0 | 0 | 0 | 0 |
Hannah Markfort (xCatalitY) | 0 | 1/0/0 | 0 | 0 | 0 |
Alon Parag (Alonoparag) | 0 | 1/0/0 | 0 | 0 | 0 |
Abhinav Sharma (abhi199250) | 0 | 1/0/0 | 0 | 0 | 0 |
Cheese (cheese-git) | 0 | 1/0/0 | 0 | 0 | 0 |
Dmitrii Ioksha (dimaioksha) | 0 | 1/0/0 | 0 | 0 | 0 |
Dudi (dudizimber) | 0 | 2/0/0 | 0 | 0 | 0 |
None (fubuki8087) | 0 | 1/0/0 | 0 | 0 | 0 |
Jacob Lee (jacoblee93) | 0 | 0/1/0 | 0 | 0 | 0 |
vs. last report | -1 | -2/-1/= | -1 | -2 | -5 |
Mark Cusack (markcusack) | 0 | 1/0/0 | 0 | 0 | 0 |
Asaf Joseph Gardin (Josephasafg) | 0 | 1/0/0 | 0 | 0 | 0 |
Jamie Lemon (jamie-lemon) | 0 | 1/0/0 | 0 | 0 | 0 |
Karim Lalani (lalanikarim) | 0 | 1/0/0 | 0 | 0 | 0 |
Thomas Meike (meikethomas) | 0 | 1/0/1 | 0 | 0 | 0 |
HoangNguyen689 (HoangNguyen689) | 0 | 1/0/0 | 0 | 0 | 0 |
Mayank Solanki (spike-spiegel-21) | 0 | 2/0/1 | 0 | 0 | 0 |
vs. last report | -1 | +1/-1/+1 | -1 | -1 | -2 |
PRs: created by that dev and opened/merged/closed-unmerged during the period
Since the last report 6 days ago, the LangChain project has seen a significant amount of activity across various branches and components. The development team has been focused on enhancing features, fixing bugs, and improving documentation. Below is a detailed analysis of the commits and changes made to the project.
`libs/core/langchain_core/tracers/context.py`
`libs/langchain/langchain/retrievers/self_query/base.py`
`libs/community/langchain_community/document_loaders/pebblo.py`
`libs/experimental/langchain_experimental/autonomous_agents/hugginggpt/task_executor.py`
The following developers have been particularly active, contributing across various aspects of the project:
The recent activities demonstrate a robust effort towards refining LangChain's functionality, enhancing user experience through better documentation, ensuring stability through bug fixes, and expanding the platform's capabilities with new integrations. The continuous development is crucial for maintaining LangChain's relevance and effectiveness in building context-aware reasoning applications.
Since the last report 6 days ago, there has been a significant amount of activity in the LangChain project. Here are the key updates:
Issue #20910: This issue discusses a problem with `SQLDatabase.from_databricks` hanging indefinitely. This is a critical issue as it affects the usability of the database integration feature.
Issue #20909: Reports an "HTTP Error 404: Not Found" error when using `ArxivLoader`. This indicates a potential issue with the document loader or the source API.
Issue #20908: Discusses a bug in `CharacterTextSplitter` where the separator is incorrectly placed at the beginning of each chunk instead of at the end.
Issue #20907: Proposes support for hybrid search with a score threshold in Azure AI Search Retriever, indicating an enhancement in search capabilities.
Issue #20906: Addresses a missing metadata field during initialization in the `duckdb` vector store, which causes failures when connecting to existing tables.
Issue #20902: Adds a first version of the migrate script, suggesting improvements in code migration tooling.
Issue #20895: Discusses a `TypeError` encountered when using the synthetic data generator over vLLM, indicating issues in function compatibility or implementation.
Issue #20893: Proposes centralizing code for handling dynamic imports, which could improve modularity and reduce dependency issues.
Issue #20890: Discusses an issue with function calling where a list of integers doesn't work as expected, indicating potential problems in type handling or function implementation.
Issue #20889: Proposes allowing `run_id` to be passed from config when invoking chains, suggesting enhancements in run management and tracking.
Issue #20884 & #20882: These issues discuss problems with retrievers returning multiple documents and timeout errors respectively, indicating potential issues in retrieval logic or configuration.
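The separator placement described in Issue #20908 can be illustrated without LangChain itself. The sketch below is not the library's implementation: `split_keep_separator` and its `position` argument are hypothetical names, used only to show the difference between attaching the separator to the start of a chunk and attaching it to the end.

```python
def split_keep_separator(text: str, separator: str, position: str = "end") -> list[str]:
    """Split text on a separator and keep the separator attached to each chunk.

    Hypothetical helper for illustration only (not LangChain's API).
    position="end" attaches the separator to the chunk it terminates;
    position="start" attaches it to the chunk that follows it.
    """
    if not separator:
        raise ValueError("separator must be a non-empty string")
    if position not in ("start", "end"):
        raise ValueError(f"position must be 'start' or 'end', got {position!r}")

    parts = text.split(separator)
    chunks = []
    for i, part in enumerate(parts):
        if position == "end":
            # Every part except the last was followed by a separator.
            chunk = part + separator if i < len(parts) - 1 else part
        else:
            # Every part except the first was preceded by a separator.
            chunk = separator + part if i > 0 else part
        if chunk:  # drop empty chunks, e.g. when the text ends with the separator
            chunks.append(chunk)
    return chunks
```

In both modes the chunks concatenate back to the original text; the only difference is which chunk carries the separator, which is the behavior the issue reports as being wrong.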
The LangChain project demonstrates robust activity with quick responses to new issues and continuous improvements in functionality and usability. The community's engagement in proposing features and resolving issues swiftly ensures that the project remains responsive to user needs and technological advancements.
This pull request introduces a new feature to the Azure AI Search retriever within the LangChain framework. It adds support for hybrid search with a score threshold, similar to existing functionality for similarity searches. This enhancement is aimed at improving the precision of search results by filtering out documents that do not meet a specified relevance score threshold.
The changes are localized to the `azuresearch.py` file within the `libs/community/langchain_community/vectorstores` directory. The modifications include:
New Method Implementation:
A new method, `hybrid_search_with_relevance_scores`, has been added. It extends the existing `hybrid_search_with_score` by incorporating a score threshold filter. It accepts a query string and an integer `k` representing the number of top documents to retrieve, along with additional keyword arguments.
The method reads a `score_threshold` parameter from `kwargs`. If no threshold is provided, it returns all results from the hybrid search. If a threshold is specified, it keeps only those documents whose scores are above the threshold.
Class Enhancements:
The `AzureSearchVectorStoreRetriever` class now includes an additional search type, `"hybrid_score_threshold"`, to handle the new threshold-based search.
A class variable, `allowed_search_types`, has been introduced to define valid search types, centralizing the allowable options and making the code easier to maintain and read.
Error Handling:
Validation of `search_type` has been streamlined by using the newly defined `allowed_search_types` class variable. This simplifies the validation logic and improves error messaging for unsupported search types.
Overall, PR #20907 introduces a useful feature that enhances the flexibility and utility of the Azure AI Search retriever in LangChain, with high-quality code additions that adhere to best practices in software development.
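The threshold filtering and search-type validation described for PR #20907 can be sketched in isolation. This is a minimal illustration under stated assumptions, not the code from the PR: `filter_by_score_threshold` and `validate_search_type` are hypothetical helpers, the entries in `ALLOWED_SEARCH_TYPES` other than `"hybrid_score_threshold"` are guesses, and documents are represented as plain strings.

```python
from typing import Optional

# Hypothetical tuple of valid search types. Only "hybrid_score_threshold"
# is named in the PR summary; the other entries are assumptions.
ALLOWED_SEARCH_TYPES = ("similarity", "hybrid", "hybrid_score_threshold")


def filter_by_score_threshold(
    results: list[tuple[str, float]],
    score_threshold: Optional[float] = None,
) -> list[tuple[str, float]]:
    """Keep (document, score) pairs whose score is above the threshold.

    With no threshold, every result is returned unchanged.
    """
    if score_threshold is None:
        return list(results)
    return [(doc, score) for doc, score in results if score > score_threshold]


def validate_search_type(search_type: str) -> str:
    """Reject search types outside the centrally defined allow-list."""
    if search_type not in ALLOWED_SEARCH_TYPES:
        raise ValueError(
            f"search_type of {search_type!r} not allowed. "
            f"Valid values are: {ALLOWED_SEARCH_TYPES}"
        )
    return search_type
```

Keeping the allow-list in one place means a new search type only has to be added once for both validation and the error message to stay in sync.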
Since the previous analysis 6 days ago, there has been significant activity in the `langchain-ai/langchain` repository. Here's a detailed breakdown of the changes:
PR #20907: This PR aims to support hybrid search with a score threshold in Azure AI Search Retriever. It was opened less than a day ago and is still open.
PR #20902: Adds a migration script to the CLI. This PR was also opened less than a day ago.
PR #20893: Proposes centralized code for handling dynamic imports, making `langchain-community` an optional dependency. This PR is still in draft status.
PR #20889: Allows passing `run_id` from config when invoking the chain, enhancing traceability and debugging capabilities.
PR #20881: Implements `bind_tools` for OllamaFunctions, enhancing functionality by allowing it to utilize tools bound to other models or functions.
PR #20863: Aims to remove batch size from LLM start callbacks, suggesting a shift in handling batch operations.
PR #20857: Moves the import of embeddings into local scope as part of ongoing efforts to decouple `langchain` from `community`.
PR #20856: Adds indexing via locality-sensitive hashing to the Yellowbrick vector store, enhancing its capabilities for nearest neighbor searches.
PR #20853: Checks dependencies as part of ongoing development efforts.
PR #20847 and PR #20845: Focus on moving functionalities to the community package, aligning with ongoing efforts to invert dependencies between `langchain` and `langchain-community`.
PR #20620: Removed example VSDX data due to potential security concerns with EMF files.
PR #20613: Fixed an issue with fireworks mapping in core functionalities.
PR #20610: Updated imports in various documentation files.
PR #20609: Added async methods to CassandraLoader, enhancing performance and modernizing the codebase.
PR #20605: Addressed issues in a Zhipuai notebook regarding timeout issues and use case demonstrations.
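The idea behind PR #20893, centralizing dynamic imports so that a package can become an optional dependency, can be sketched with the standard library alone. The helper name `import_optional` and its error message are assumptions made for illustration; the PR's actual helper may look quite different.

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, *, install_hint: str = "") -> ModuleType:
    """Import a module at call time and raise a readable error if it is missing.

    Hypothetical helper: call sites import through this one function, so the
    dependency is only required when the feature that needs it is used.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        hint = f" Install it with: {install_hint}" if install_hint else ""
        raise ImportError(
            f"Optional dependency {module_name!r} is not installed.{hint}"
        ) from exc
```

A call such as `import_optional("langchain_community", install_hint="pip install langchain-community")` would then fail only when that code path runs, rather than when the parent package is imported.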
The repository has seen active development with multiple pull requests opened concerning enhancements, bug fixes, and documentation updates. The successful merging of several PRs highlights ongoing efforts to improve functionality and user guidance. However, several PRs were closed without merging, suggesting that some proposed changes are undergoing further discussion or revision before they can be finalized.
Moving forward, it will be crucial to monitor these discussions and any new implementations that may arise from them. The active management of open and recently closed pull requests suggests a dynamic development environment where enhancements are continuously evaluated and integrated into the project.
This pull request introduces a new migration script to the `langchain-ai/langchain` repository. The script is designed to facilitate version transitions for users, ensuring that their software configurations and dependencies remain compatible and up-to-date.
New Files and Directories: Several new files and directories have been added, specifically under the `libs/cli/langchain_cli/namespaces/migrate` path. This includes Python modules for handling migrations (`migrate.py`, `glob_helpers.py`, `main.py`) and specific codemods (`codemods` directory) that contain the logic for adjusting codebases to new API changes or library versions.
Migration Scripts: The core of this PR is the migration scripts capable of automatically updating user projects to align with newer versions of the LangChain framework. This includes JSON files (`migrations_v0.2.json`, `migrations_v0.2_partner.json`) that likely map old API calls to their new counterparts, facilitating automated code refactoring.
Integration with CLI: Changes in `cli.py` suggest integration of the migration functionality into the existing LangChain CLI, making it accessible via command line interfaces. This integration checks for the presence of the `libcst` library, which is presumably used for codemod transformations.
Unit Tests: New unit tests have been added (`test_glob_helpers.py`, `test_replace_imports.py`), indicating an emphasis on reliability and correctness for the migration tools.
Documentation and Metadata: While not explicitly shown in the diff, additions like `README.md` files in new directories suggest that documentation has been considered. However, details on these documents are not provided in the diff.
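To make the mapping idea concrete, here is a minimal sketch of a mapping-driven import rewrite. It makes several assumptions: the JSON shape (old module path to new module path) and the module names in it are invented, `rewrite_imports` is a hypothetical helper, and a line-based regular expression stands in for the `libcst` syntax-tree approach the PR appears to use.

```python
import json
import re

# Assumed mapping format: old module path -> new module path.
# The real migrations_v0.2.json may be structured differently.
MIGRATIONS_JSON = '{"old_pkg.loaders": "new_pkg.loaders"}'

# Matches single-line "from <module> import <names>" statements.
_IMPORT_FROM = re.compile(r"^(\s*from\s+)([\w.]+)(\s+import\s+.+)$")


def rewrite_imports(source: str, mapping: dict[str, str]) -> str:
    """Rewrite 'from old import X' lines whose module appears in the mapping."""
    rewritten = []
    for line in source.splitlines():
        match = _IMPORT_FROM.match(line)
        if match and match.group(2) in mapping:
            line = match.group(1) + mapping[match.group(2)] + match.group(3)
        rewritten.append(line)
    result = "\n".join(rewritten)
    # splitlines() drops a trailing newline, so restore it if present.
    if source.endswith("\n"):
        result += "\n"
    return result


mapping = json.loads(MIGRATIONS_JSON)
migrated = rewrite_imports("from old_pkg.loaders import Loader\nimport os\n", mapping)
```

A regex cannot handle multi-line or parenthesized imports, which is one reason a real codemod would work on a concrete syntax tree instead.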
The use of `libcst` for codemod operations suggests robustness, as it leverages established tools for syntax tree transformations.
The pull request appears to be a substantial addition to the LangChain project, introducing necessary tools for managing transitions between different software versions. The modular approach, combined with testing and integration into existing CLI tools, reflects thoughtful engineering practices aimed at maintaining high code quality and user satisfaction.
Given the complexity and impact of such migration tools, further review by domain experts (especially on actual migration rules and their implications) would be advisable before merging. Additionally, comprehensive user documentation on how to utilize these migration scripts effectively will be crucial for adoption and utility.
The LangChain repository is a comprehensive framework for building applications powered by large language models (LLMs). It provides extensive support for various components, including core functionalities, community contributions, experimental features, and detailed documentation. The repository is well-organized and follows modern software engineering practices.
`libs/core/langchain_core/tracers/context.py`: the move from `tracing_enabled` to `tracing_v2_enabled` could potentially break existing integrations if not handled properly.
`libs/langchain/langchain/retrievers/self_query/base.py`: the `SelfQueryRetriever` class integrates with various vector stores and translates structured queries into store-specific queries.
`libs/community/tests/unit_tests/vectorstores/test_azure_search.py`
`libs/community/langchain_community/document_loaders/pebblo.py`: builds on `BaseLoader` to add functionality for loading documents with semantic processing.
`libs/experimental/langchain_experimental/autonomous_agents/hugginggpt/task_executor.py`: the `Task` and `TaskExecutor` classes manage task dependencies and execution logic.
`libs/core/langchain_core/output_parsers/list.py`
`libs/experimental/langchain_experimental/autonomous_agents/autogpt/agent.py`
`docs/docs/integrations/llms/ipex_llm.ipynb`
Overall, the LangChain repository exhibits strong software engineering practices with a focus on modularity, extensibility, and maintainability.