The transformative framework that unlocks the true potential of language models like never before. Like a master key, LangChain seamlessly integrates external data sources, such as files, documents, applications, and API data, into powerful AI models. With the finesse of an experienced conductor, it empowers these models to interact with their environment and make informed decisions, offering a level of agency previously unexplored. This ingenious framework, available as a Python and TypeScript Package, allows you to create conversational AI systems beyond the ordinary. Just imagine crafting your personal assistant or a question-answering chatbot akin to the remarkable CHAT-GPT, all fueled by your unique datasets. LangChain is the bridge that brings the extraordinary within your reach.
Why LangChain?
LangChain offers several compelling reasons to be chosen as the framework for developing language model-powered applications. Firstly, it simplifies working with AI models by providing seamless integration capabilities. With LangChain, you can effortlessly incorporate external data sources, including files, documents, applications, and API data, into your large language models. This means that you can leverage your specific data or datasets to enhance the functionality and accuracy of your AI system.
Secondly, LangChain introduces the concept of agency to language models. This means that the language model, such as GPT-4, gains the ability to interact with its environment and make decisions based on the provided information. Using LangChain, you can enable your language model to go beyond passive responses and actively participate in decision-making. This opens up a new realm of possibilities, allowing the model to determine the most appropriate actions based on the context and input. LangChain is offered as a Python or TypeScript Package, making it easily accessible and compatible with popular programming languages. This availability ensures that developers can seamlessly integrate LangChain into their existing development workflows and leverage its features to create conversational AI systems based on their own datasets.
Imagine you are building an AI-powered customer support system for an e-commerce platform using LangChain. By leveraging the integration capabilities of this framework, you can seamlessly bring in various external data sources. For instance, you can incorporate product catalogs, customer reviews, and frequently asked questions as additional context for your language model. With LangChain's agency feature, your language model becomes more than just a passive responder. It gains the ability to interact with its environment and make informed decisions. Suppose a customer submits a query about a specific product's availability. Instead of providing a generic response, LangChain, with its decision-making capability, can analyze real-time inventory data, consider the customer's location, and offer personalized suggestions based on availability and shipping options.
Moreover, LangChain's flexibility allows you to tailor your conversational AI system to your needs. You can create a personal assistant that assists users with tasks like order tracking, account management, and product recommendations. By training the language model on your datasets, you can ensure that it understands your domain-specific terminology, providing accurate and relevant responses. In essence, LangChain empowers you to build sophisticated conversational AI systems that understand the natural language and interact with the world around them. Whether you're developing a customer support chatbot, a virtual shopping assistant, or any other language-powered application, LangChain's Python or TypeScript Package offers the tools you need to unlock the full potential of your data and deliver exceptional user experiences.
Frameworks and Tools supported by LangChain
Langchain allows users to connect with many different APIs and tools, and using them with the functionality of LangChain opens up a whole new world of possibilities. While there are too many APIs and tools to discuss, we will focus here on the most common ones.
- GoogleSerperAPIWrapper()- a tool that provides a simplified interface for accessing the Google Search Engine Results Page (SERP) API. This wrapper allows developers to easily integrate and retrieve search results from Google, enabling them to programmatically obtain relevant information, such as search rankings, snippets, and URLs. By encapsulating the complexities of interacting with the Google SERP API, this wrapper simplifies retrieving search data and enhances the efficiency of building applications that require accessing Google search results.
- WikipediaAPIWrapper()- a convenient tool facilitating interaction with the Wikipedia API. It simplifies the process of programmatically retrieving information from Wikipedia. Developers can utilize this wrapper to query Wikipedia for articles, summaries, historical data, and other types of content available on the platform. By providing an abstraction layer over the Wikipedia API, this wrapper simplifies the retrieval of Wikipedia data, allowing developers to integrate Wikipedia's vast knowledge base more efficiently into their applications.
- YouTubeSearchTool()- is a tool that allows users to perform advanced searches and retrieve relevant video content from YouTube. This tool provides a simplified interface to interact with the YouTube search functionality, enabling users to specify search parameters, such as keywords, duration, upload date, and more. By leveraging the YouTube API, this search tool enables users to discover and access specific videos or curated playlists based on their preferences, enhancing the overall user experience when exploring the vast content available on YouTube.
- GoogleDriveLoader()- -t is designed to facilitate the loading and retrieval of files from Google Drive. This loader simplifies programmatically accessing files stored in a user's Google Drive account. Developers can use this tool to authenticate with Google Drive, browse files and folders, upload and download files, and manage permissions. By abstracting the complexities of the Google Drive API, this loader streamlines file management tasks, enabling seamless integration of Google Drive functionality into applications that require efficient file handling and synchronization.
- DiscordChatLoader() enables developers to interact with the Discord API and retrieve chat data from Discord servers and channels. This loader simplifies the process of fetching messages, user information, channel metadata, and other relevant chat-related data from Discord. By encapsulating the intricacies of interacting with the Discord API, this loader provides an easy-to-use interface for developers to integrate Discord chat functionalities into their applications, such as chat analytics, moderation tools, or chat-based automation.
- UnstructuredEmailLoader()- a tool that allows users to extract and load data from unstructured email sources. With the increasing amount of information shared through email, this loader provides a convenient way to programmatically access and analyze email content. By leveraging APIs or parsing email files, the UnstructuredEmailLoader enables users to retrieve various elements from emails, such as sender information, recipients, subject lines, timestamps, and the actual email body. This tool proves particularly useful in applications that require processing and organizing email data, such as email analytics, automation, or integration with other systems.
- WhatsAppChatLoader()- used to retrieve and handle chat data from WhatsApp Messenger. As one of the most widely used messaging platforms, WhatsApp generates vast textual conversations with valuable insights. This provides an interface to access and analyze chat logs, including text messages, media files, participant information, timestamps, and more. By simplifying the process of extracting WhatsApp chat data, this loader enables developers to build applications that involve chat analysis, sentiment analysis, chat-based automation, or any other functionality that benefits from WhatsApp message data.
- ShellTool()- offers users a command-line interface to interact with the operating system's shell or command prompt. With the ShellTool, users can execute shell commands, run scripts, and perform various system operations directly from their code or application. Acting as a bridge between high-level programming languages and the underlying shell environment, this tool allows developers to leverage the extensive capabilities of the command-line interface within their applications. Whether executing system commands, managing files and directories, automating tasks, or accessing system resources, the ShellTool provides a flexible and efficient means to integrate shell functionality into software projects. By utilizing the ShellTool, developers can harness the power of the command line while enjoying the convenience and flexibility of a higher-level programming language.
Use of Langchain:
LangChain offers a wide range of applications that leverage its integration and agency features to enhance language model-powered applications. Here are a few examples:
- Conversational AI Systems: With LangChain, you can create sophisticated conversational AI systems that utilize your datasets. You can enhance the language model's understanding and response generation capabilities by integrating external data sources such as files, documents, and API data. This allows you to build chatbots, virtual assistants, or customer support systems that provide personalized and accurate conversational experiences.
- Personal Assistants: LangChain enables the development of powerful personal assistant applications. By connecting a language model like GPT-4 to your specific data, the language model gains the ability to understand and respond to user queries and commands. You can build personal assistants that assist users with scheduling appointments, managing tasks, providing recommendations, and more tailored to your dataset and requirements.
- Question-Answering Systems: LangChain is an excellent tool for building question-answering systems. By integrating your data, such as FAQs, knowledge bases, or domain-specific documents, you can create language models that efficiently answer user questions. These systems can be applied in various domains, such as customer support, education, and information retrieval.
- Data-driven Decision Making: LangChain's agency feature allows language models to interact with their environment and make informed decisions. The language model can analyze and refer to an entire dataset by integrating external data sources to inform its actions. This capability opens the door to applications where language models can assist in decision-making processes, such as suggesting product recommendations, predicting trends, or providing insights based on data analysis.
In summary, LangChain's versatile capabilities enable the development of conversational AI systems, personal assistants, question-answering systems, and data-driven decision-making applications. By leveraging the framework's integration and agency features, developers can create powerful and customized language model applications that harness the potential of their datasets.
Definition of chain and how it is created
Chains refer to combining different Language Model (LLM) calls and actions automatically. It involves using the output of one LLM as the input for another, creating a sequential flow of information and actions. Chains are designed to enhance the capabilities and performance of language models by leveraging their outputs in a structured manner. Chains offer several benefits and reasons to be used in language model applications. By implementing chains, you can break down complex tasks into smaller, manageable steps, allowing the LLM to focus on each task. This helps prevent the model from generating irrelevant or incorrect responses by applying a "chain of thought" prompting technique. Chains also enable the model to handle specific use cases more efficiently and effectively by guiding its decision-making process. Chains provide structure, improve the quality of generated outputs, and enhance the model's coherence and relevance. More on this later.
Quick start guide from set up and libraries imported
So to start with LangChain, you should first install the library to your system. This can be done with the following commands:
Using LangChain requires integrations with one or more model providers, data stores, APIs, etc. Below is an example of OpenAI's API that we will use for this guide.
First, you would install the OpenAI SDK – pip install OpenAI
Then we need to configure the environment variable in the command line.
To do this, use the following command: export OPENAI_API_KEY="..."
Alternatively, if you prefer to set the environment variable within a Jupyter Notebook or Python script, you can do it programmatically:
If you want to set the API key dynamically, you can utilize the openai_api_key parameter when initializing the OpenAI class. This allows each user to use their respective API key:
A simple Program to get started with LangChain:
Schema (Text, ChatMessages, Documents, Examples)
The schema serves as the backbone or foundation for working with LLMs, much like the nuts and bolts that hold a structure together. Below mentioned are the basic types of schema that are used in LangChain.
from langchain.schema import 'type_of _schema'
- Text: When working with language models, you primarily interact with them through text. It can be simplified as a "text in, text out" process. As a result, LangChain focuses on text-centric interfaces. Language models rely on textual input and generate textual output. Therefore, the interactions with LangChain revolve around text-based operations, such as providing input text and receiving corresponding model-generated text.
- ChatMessage: End users' primary mode of interaction is through a chat interface. Model providers have even started offering APIs that expect chat messages as input. These messages contain content (usually text) and are associated with a user. Supported users include System, Human, and AI.
SystemChatMessage: A chat message representing information that should be instructions to the AI system.
HumanChatMessage: A chat message representing information from a human interacting with the AI system.
AIChatMessage: A chat message representing information coming from the AI system. Here the AI may or may not have provided a response but it is sometimes used as an additional context to tell the AI 'How to answer' your questions. - Documents: Documents refer to unstructured data that includes page content (text) and metadata (descriptive attributes). The Document Library in LangChain allows you to organize large information repositories and perform operations on filtered documents based on their metadata. Imagine having a collection of documents with metadata, such as author, date, and topic. With LangChain's Document Library, you can filter out specific documents based on their metadata, enabling targeted operations or analysis on subsets of documents. Examples consist of input/output pairs that demonstrate the desired input and the expected output by the model. They play a role in both model training and evaluation.
- Example: Examples act as guiding benchmarks, like reference points that steer the behavior of language models. They comprise input/output pairs, serving a dual purpose in model training and evaluation. Input/output examples help refine (finetune) a model's performance by aligning its output with the desired output. Whether applied to a single model or an entire chain of models, examples aid in assessing the end-to-end system or even training a replacement model for the entire chain.
Models (Language, Chat, Text Embeddings)
LangChain is not a provider of LLMs, but rather provides a standard interface through which you can interact with various LLMs, and Models can be referred to as that interface to the AI Brains. Many different types of models are used in LangChain. Below we will go through the types of models in a bit of detail. from langchain import
- Large Language Model: This model, such as OpenAI, takes a text input and generates a corresponding text output. For example, OpenAI can process a text command or instruction and provide a textual response. To use this model, you can import it using the command: from langchain.llms import 'type of llm' (e.g., OpenAI).
- Chat Model: These models are typically based on a language model but have structured APIs. They accept a list of Chat Messages as input and return a Chat Message as output. Chat Messages can include SystemChatMessage, HumanChatMessage, and sometimes AIChatMessage. Although there is also a ChatMessage type that allows for a role parameter, it is less commonly used. To import and use this model type, you can use the command: from langchain.chat_models import 'type of chat model' (e.g., ChatOpenAI).
- Text Embedding Model: These models convert text input into floating-point vector values. When working with LLMs, they are helpful for tasks like similarity search or text comparison. The Embedding class in LangChain serves as an interface for various embedding providers (e.g., OpenAI, Cohere, Hugging Face), offering a standardized approach. Embeddings create vector representations of text, enabling operations like semantic search to find similar text pieces in the vector space. The Embedding class in LangChain provides two methods: embed_documents and embed_query. These methods have different interfaces; one works with multiple documents, while the other operates on a single document. Additionally, different methods are used because some embedding providers employ different techniques for embedding documents to be searched compared to the queries used for the search.
Prompts (PromptValue, PromptTemplate, Chat Prompt Template, ExampleSelector, OutputParser)
The process of programming models has evolved, and now it revolves around prompts. A "prompt" refers to the input given to the model, which is not fixed but is often created by combining different components. LangChain offers various classes and functions to simplify the construction and handling of prompts.
- Prompt Value: Learn how to use PromptTemplates to prompt Language Models effectively.
- Prompt Template: This class is responsible for creating a dynamic template that can incorporate different PromptValues by modifying specific arguments within the prompt. The PromptValue is not hard coded but generated based on user input, non-static information, and a fixed template string. To create your template, import the PromptTemplate using the code: from langchain import PromptTemplate. Then provide the 'input variables' and 'template' to the method, where input variables contain dynamic values and the template is the prompt string with variables enclosed in curly brackets {}. To assign values to the input variables, you can use the prompt.format method by passing 'variable name' = 'value of the variable'.
- Chat Prompt Template: Chat Models receive prompts through a list of chat messages. Unlike plain text strings in LLM models, each chat message has a specific role, such as AI, human, or system. System chat messages hold more importance for instructions. LangChain provides prompt templates for chat-related tasks to simplify prompt construction and interaction with chat models. Using these chat-related prompt templates instead of the generic PromptTemplate when working with chat models is recommended, as they maximize the potential of chat models and improve performance.
- Example Selectors: In-context learning is often necessary when constructing prompts. Example selectors offer a convenient way to choose from a series of examples, allowing users to incorporate context into the prompt dynamically. They are helpful when a task requires nuance or when numerous examples exist. The primary method exposed by example selectors is select_examples, which takes input variables and returns a list of examples. The specific implementation determines how the examples are selected. An example is the SemanticsSimilarityExampleSelector.
- Output Parser: Language models (including Chat Models) generate text as output, but sometimes you may need more structured information than just plain text. Output parsers handle this by instructing the model on how the output should be formatted and parsing it into the desired structure. Output parsers can even generate output in specific file formats like JSON. They consist of two main components:
. Format instructions: A method that provides a string containing instructions for formatting the language model's output.
. Parser: A method that extracts the model's text output and transforms it into the desired structure, such as a dictionary for JSON format. Since the model only returns a string, parsing is required to achieve the desired format.
. parse_with_prompt (optional): A method that takes the response from a language model (assumed to be a string) and the prompt that generated the response. It parses the response into a structure, and the prompt is provided in case the OutputParser needs to retry or modify the output based on the prompt information.
Indexes (Document Loaders, Splitters, Retrievers, Vector stores)
Indexes are used to organize documents to facilitate interaction with Large Language Models (LLMs). This module provides utility functions for working with documents, different types of indexes, and examples of using indexes in chains. The primary use of indexes in chains is the "retrieval" step, where relevant documents are retrieved based on a user's query. However, indexes can have other purposes, and retrieval can involve logic beyond just an index. The "Retriever" interface serves as a standard interface for most chains. There are four components of Indexes:
- Document Loaders: These enable combining language models with custom text data. The first step is to load the data into "Documents," which are pieces of text. Document loaders provide convenient ways to import data from various sources, such as HNLoader, which loads data from news articles. The loaded data may require reformatting for subsequent tasks, and document loaders can be categorized into three types.
- Transform Loaders: These loaders convert data from specific formats into Document format. For example, there are transformers for CSV and SQL. These loaders primarily process data from files but can also handle data from URLs. The Unstructured Python package significantly transforms various file types (text, PowerPoint, images, HTML, PDF) into text data.
- Public Dataset or Service Loaders: These loaders work with datasets and sources available in the public domain. They utilize queries to search and download the required documents. The mentioned example of HNLoader falls into this category.
- Proprietary Dataset or Service Loaders: These loaders handle datasets and services that are not publicly available. They typically transform data from specific formats used in applications or cloud services, such as Google Drive.
- Splitters are used when dealing with lengthy text to divide it into smaller, meaningful chunks. The goal is to keep semantically related pieces of text together, depending on the type of text. This approach helps generate more accurate output from models efficiently. Text splitters work by:
- Splitting the text into small, semantically meaningful chunks, often sentences.
- Combining these small chunks into larger ones until a specific size is reached (measured by some function).
- Treating the chunk as an independent text and creating a new chunk with some overlap to maintain context between chunks.
- There are two customizable aspects of text splitters:
- How the text is split.
- How the chunk size is determined.
- Retrievers are used to combine documents with Language Models. They store data in a format that a language model can query. The main requirement for this object is to expose a get_relevant_texts method, which takes a string and returns a list of Documents. Various types of retrievers are available, with VectoStoreRetriever being one of the widely supported ones.
- Vector Store is a table with each row representing an embedding and the associated metadata that comes with the embedding. There are two columns: 1st one is for embedding, and 2nd one is for metadata. They are referred to as Databases to store vectors. A crucial part of working with vector stores is creating the vector to put in them, usually created via embeddings. Some popular Vector Stores in LangChain are 'Pinecone' and 'Weaviate'.
Memory (Chat History)
By default, Chains and Agents operate without retaining any information between queries, similar to the underlying LLMs and chat models they utilize. However, in specific applications like chatbots, it is important to remember information from past interactions, both in the short and long term. To address this, LangChain introduces the concept of "Memory." LangChain offers memory components in two primary forms. Firstly, it provides convenient tools for managing and manipulating previous chat messages. These tools are adaptable and beneficial for various applications. Secondly, LangChain seamlessly integrates these tools into chains, allowing for easy incorporation of memory functionality. Different types of memory are available for specific use cases, such as ChatHistory for storing previous chat messages with the chatbot.
Chains (LLM, Simple, Summarize)
Using a single LLM may be sufficient for simple applications, but more complex ones often require chaining LLMs together, either with each other or with other experts. LangChain offers a standard interface for Chains and provides typical implementations of chains for convenience. Chains allow us to combine multiple components to create a cohesive application. For instance, we can create a chain that takes user input, formats it using a PromptTemplate, and then passes the formatted response to an LLM. We can build more intricate chains by combining multiple chains or integrating them with other components.
The flow chart above provides a basic understanding of how chains work. It begins with user input, a prompt, or input for a prompt template. This input is then processed by various LangChain modules, and the output from one module becomes the input for the next module. This process continues until the final output that satisfies the user's needs is generated.
There are different methods for combining chains, and we will discuss a few of them here:
- LLM Chain: The LLMChain is a simple chain that takes a prompt template, formats it with user input, and returns the response from an LLM. It allows us to generate responses from the LLM by utilizing a prompt template incorporating user input.
- Simple Sequential Chain: These chains execute their links in a predefined order. Specifically, we can use the SimpleSequentialChain, the most basic type of sequential chain. In this type of chain, each step has a single input and output, and the output of one step serves as the input to the next step. The order of the steps in the sequential chain is crucial, as the outputs are designed to be inputs for the subsequent modules. This chain type effectively breaks down tasks and enables the LLM to maintain focus. It also aids in applying the "chain of thought" prompting technique to achieve optimal output efficiently while reducing the likelihood of hallucinations.
- Summarization Chain: This type of chain facilitates the summarization of lengthy texts or documents. It involves dividing the large text or document into smaller, manageable chunks and creating summaries for each of these chunks. Finally, a concluding summary is generated using these summarized chunks. Summarization chains are valuable for quickly reviewing extensive documents and producing concise summaries. Different types of summarization chains exist for various document types.
Agents (Tools, Agents, Toolkits, Agent Executors)
In certain situations, there is a need for a flexible chain of interactions with LLMs and other tools, which can vary based on the user's input. An "agent" is employed to address this, equipped with a diverse set of tools. The agent analyzes the user's input and determines whether to utilize specific tools and which tools to employ. For example, when using a template that can summarize a document or provide further details based on user input, the agent decides whether to generate a summary or an expanded response. Essentially, the model is utilized not only for generating output but also for making decisions.
There are two primary types of agents:
- "Action Agents": These agents make decisions on individual actions and execute them step by step. They are commonly used for smaller tasks.
- "Plan-and-Execute Agents": These agents formulate a plan of action and then execute them sequentially. This approach is beneficial for complex or long-running tasks, as it helps maintain long-term objectives and focus. However, it typically involves more calls and higher latency.
These two types of agents are not mutually exclusive. It is often advantageous to have an Action Agent responsible for executing the plan devised by the Plan-and-Execute Agent. There are four different elements of Agents:
- Tools- It includes different types of tools LangChain supports. It can be considered to be the 'capability' of an agent. It provides an abstraction on top of a function that makes it easy for the models to interact with the outside world.
- Agents-This element performs the task of Decision Making. More specifically, an agent takes in an input and returns a response corresponding to its action. Example: conversation agent where the agent does not focus on providing the best response but instead focuses on the conversation setting using memory.
- ToolKits- It refers to the collection of tools that an agent can select from. An agent will have a toolkit of various tools to perform various tasks, and depending on which tool the agent decides to use, the necessary tool is called to and used.
- AgentExecutor- It calls the agent and tools in a loop. Agent executors take an agent and tools and use the agent to decide which tools to call and in what order.
Evaluation
The Challenge
Assessing the effectiveness of LangChain chains and agents is challenging. There are two primary reasons for this:
- Limited Data Availability-Typically, data is scarce to evaluate chains and agents at the initial stages of a project. This is primarily because Large Language Models, which form the core of these chains and agents, are proficient in learning from just a few or even zero examples. Consequently, it is often possible to commence a specific task (such as text-to-SQL or question-answering) without a large dataset of pre-existing examples. This starkly contrasts traditional machine learning, where collecting a substantial amount of data was a prerequisite before utilizing a model.
- Inadequate Evaluation Metrics-Many chains and agents perform tasks that lack well-defined metrics for evaluating their performance. For instance, when generating text, evaluating the quality of the generated output is considerably more complex than assessing the accuracy of a classification or numerical prediction.
The Solution
LangChain aims to address the challenges mentioned above, although it acknowledges that the solutions implemented are still in the early stages and may need improvement. The project highly values community feedback, contributions, integrations, and thoughts to improve its offerings further.
- Lack of data: LangChain has initiated LangChainDatasets, a Community space on Hugging Face. This space is a repository of open-source datasets designed to evaluate standard chains and agents. While LangChain has contributed five datasets, to begin with, the intention is to foster community participation. To contribute to a dataset, individuals need to join the community and gain the ability to upload datasets. Moreover, LangChain aims to facilitate the creation of custom datasets. As an initial step towards this, they have introduced the QAGenerationChain. This chain generates question-answer pairs based on a given document, which can be used for evaluating question-answering tasks in the future.
- Lack of metrics: LangChain offers two solutions to address the lack of metrics. The first solution relies on the visual inspection of results rather than specific metrics. By utilizing the tracing feature, which is a UI-based visualizer of chain and agent runs, users can assess the performance of their chains and agents. LangChain is committed to further developing this feature. The second solution LangChain suggests is employing Language Models to evaluate outputs. They provide several chains and prompts designed to tackle this issue, allowing users to leverage Language Models for evaluating the quality of generated text and other outputs.
STREAMLIT/GRADIO
We first import the necessary dependencies: Epooch_Api_Key, openai, and streamlit. Epooch_Api_Key is a custom module that contains the API key required to authenticate with the OpenAI API. openai is the official OpenAI Python library, and streamlit is a framework for building interactive web applications. The API key is retrieved from the Epooch_Api_Key module and assigned to openai.api_key. get_completion_from_messages takes a list of messages and uses the OpenAI API to generate a response based on those messages. It uses the openai.ChatCompletion.create() method to send the messages to the API and retrieve the response. The response is extracted from the returned object and returned as the function output. collect_messages() is responsible for collecting the user's messages, generating a response, and displaying the conversation in the UI. It retrieves the value of the 'inp' variable, representing the user's query. The user's message is then added to the context list with the role set to "user". The get_completion_from_messages function generates the assistant's response based on the accumulated messages in the context list. Then the assistant's message is added to the context list with the role set to "assistant". The code checks if the 'Let's Chat' button has been clicked using an if statement. If it has been clicked, the collect_messages function is called to execute the conversation and update the UI accordingly.
Acknowledgment: This guide was skillfully crafted with the help of Saud M.