Models in LangChain
LangChain is a framework for working with language models. LangChain is not a language model itself. In this chapter of my series on LangChain, we will look into the capabilities of LangChain with respect to models.
What is a model?
An AI model is a mathematical representation of data, used by an algorithm designed to simulate human(-like) behavior. It is created by training on a large amount of data using machine learning, enabling it to learn patterns, make predictions, or perform specific tasks.
(Large) Language models (LLMs), such as OpenAI’s GPT (Generative Pre-trained Transformer) models, are typically based on deep learning architectures called transformers. Transformers leverage self-attention mechanisms to capture dependencies between words in a sentence or a sequence of text, enabling them to understand the context and meaning of the words.
LangChain is a model-agnostic framework, meaning you can simply swap one of OpenAI's models for a model by Aleph Alpha without having to change your code, as the interface is identical. The results will, of course, depend on the model you use. ;)
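To illustrate that swap, here is a minimal sketch (the Aleph Alpha model name and API key handling are assumptions for illustration; check the respective provider's docs):

from langchain.llms import OpenAI, AlephAlpha

llm = OpenAI()  # reads OPENAI_API_KEY from the environment

# Swapping the provider only changes this one construction line;
# all calling code below stays identical.
# llm = AlephAlpha(model="luminous-base", aleph_alpha_api_key="...")

print(llm("Tell me a fun fact"))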
Which types of models exist?
LangChain allows you to work with two types of models: LLMs and Chat Models. While LLMs basically work as "text in, matching text out", a Chat Model operates on chat messages. You already know one chat model: ChatGPT. Your choice of either one will depend on your use case.
Large language models
LangChain is extended to support new LLM providers on a regular basis. This list gives you a quick overview of a subset of the currently supported providers:
- Aleph Alpha
- Azure OpenAI
- Cohere
- Hugging Face (Hub, Local Pipelines, TextGen Inference)
- OpenAI
For a complete and updated list, see the docs: https://python.langchain.com/en/latest/reference/modules/llms.html.
The most basic options for using an LLM are __call__ and generate. Let's see how this works:
import os
os.environ['OPENAI_API_KEY'] = "..."
from langchain.llms import OpenAI

llm = OpenAI()  # giving no model name will default to text-davinci-003
print(llm("Tell me a fun fact"))
# The average person laughs 15 times a day.

llm_result = llm.generate(["Tell me a fun fact", "Tell me an animal name with 'Z'"] * 5)
len(llm_result.generations)
# 10

print(llm_result.generations)
"""
[[Generation(text='\n\nThe average human body contains enough iron to make a 3 inch nail.', generation_info={'finish_reason': 'stop', 'logprobs': None})],
[Generation(text='\n\nZebra', generation_info={'finish_reason': 'stop', 'logprobs': None})],
[Generation(text='\n\nThe average person laughs 15 times a day!', generation_info={'finish_reason': 'stop', 'logprobs': None})],
[Generation(text='\n\nZebra', generation_info={'finish_reason': 'stop', 'logprobs': None})],
[Generation(text='\n\nThe longest word in the English language is "pneumonoultramicroscopicsilicovolcanoconiosis".', generation_info={'finish_reason': 'stop', 'logprobs': None})],
[Generation(text='\n\nZebra', generation_info={'finish_reason': 'stop', 'logprobs': None})],
[Generation(text='\n\nThe first “selfie” was taken in 1839, long before cell phones and digital cameras were invented!', generation_info={'finish_reason': 'stop', 'logprobs': None})],
[Generation(text='\n\nZebra', generation_info={'finish_reason': 'stop', 'logprobs': None})],
[Generation(text='\n\nThe average human body contains enough iron to make a small nail.', generation_info={'finish_reason': 'stop', 'logprobs': None})],
[Generation(text='\n\nZebra', generation_info={'finish_reason': 'stop', 'logprobs': None})]
]
"""print(llm_result.llm_output)
# {'token_usage': {'completion_tokens': 115, 'total_tokens': 185, 'prompt_tokens': 70}, 'model_name': 'text-davinci-003'}
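Since generate returns one list of Generation objects per prompt, in the same order as the input, you can pair each prompt with its completion. A small sketch building on llm_result from above:

# The prompts passed to generate above, repeated for clarity
prompts = ["Tell me a fun fact", "Tell me an animal name with 'Z'"] * 5

# generations is a list of lists: one inner list per prompt, each holding
# Generation objects whose .text attribute is the completion
for prompt, generations in zip(prompts, llm_result.generations):
    print(f"{prompt} -> {generations[0].text.strip()}")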
Chat Models
The chat messages in a chat model have associated roles:
- AIMessage
- HumanMessage
- SystemMessage
- ChatMessage (can take arbitrary role parameters)
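For example, ChatMessage lets you define the role yourself (a minimal sketch; the role name is made up for illustration):

from langchain.schema import ChatMessage

# Unlike the fixed-role classes, ChatMessage accepts any role string
msg = ChatMessage(role="jester", content="Tell me a fun fact.")
print(msg.role, msg.content)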
And again, we can use both __call__ and generate. Let's see how chat models handle our questions:
from langchain.chat_models import ChatOpenAI
from langchain.schema import (
    AIMessage,
    HumanMessage,
    SystemMessage
)

chat = ChatOpenAI()
chat([HumanMessage(content="Tell me a fun fact.")])
AIMessage(content='A fun fact is that honey never spoils! Archaeologists have found pots of honey in ancient Egyptian tombs that are over 3,000 years old and still perfectly edible.', additional_kwargs={}, example=False)
messages = [
SystemMessage(content="You are a comedian who can tell a funny joke about every topic."),
HumanMessage(content="Programming")
]
chat(messages)
AIMessage(content='Why did the programmer go broke?\n\nBecause he kept trying to find the "root" of all his financial problems!', additional_kwargs={}, example=False)
messages = [
SystemMessage(content="You are a researcher who can name an animal starting with every letter of the alphabet."),
HumanMessage(content="X")
]
chat(messages)
AIMessage(content='Xenopus - a genus of aquatic frogs native to sub-Saharan Africa.', additional_kwargs={}, example=False)
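And generate works for chat models too: pass a list of message lists and you get back an LLMResult, just like with the plain LLM. A small sketch reusing the chat model from above:

batch_result = chat.generate([
    [HumanMessage(content="Tell me a fun fact.")],
    [SystemMessage(content="You are a researcher who can name an animal starting with every letter of the alphabet."),
     HumanMessage(content="X")]
])
len(batch_result.generations)
# 2, one entry per conversation
print(batch_result.llm_output)
# token usage and model name, analogous to the LLM example above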
Summary
Today we learnt the difference between LLMs and Chat Models in LangChain. LangChain provides model-agnostic interfaces, and we compared both __call__ and generate.