Models

LangGraph provides built-in support for LLMs (large language models) via the LangChain library, making it easy to integrate a wide range of models into your agents and workflows.

Initialize a model

Use init_chat_model to initialize models:

OpenAI

pip install -U "langchain[openai]"
import os
from langchain.chat_models import init_chat_model

os.environ["OPENAI_API_KEY"] = "sk-..."

llm = init_chat_model("openai:gpt-4.1")

👉 Read the OpenAI integration docs

Anthropic

pip install -U "langchain[anthropic]"
import os
from langchain.chat_models import init_chat_model

os.environ["ANTHROPIC_API_KEY"] = "sk-..."

llm = init_chat_model("anthropic:claude-3-5-sonnet-latest")

👉 Read the Anthropic integration docs

Azure OpenAI

pip install -U "langchain[openai]"
import os
from langchain.chat_models import init_chat_model

os.environ["AZURE_OPENAI_API_KEY"] = "..."
os.environ["AZURE_OPENAI_ENDPOINT"] = "..."
os.environ["OPENAI_API_VERSION"] = "2025-03-01-preview"

llm = init_chat_model(
    "azure_openai:gpt-4.1",
    azure_deployment=os.environ["AZURE_OPENAI_DEPLOYMENT_NAME"],
)

👉 Read the Azure integration docs

Google GenAI

pip install -U "langchain[google-genai]"
import os
from langchain.chat_models import init_chat_model

os.environ["GOOGLE_API_KEY"] = "..."

llm = init_chat_model("google_genai:gemini-2.0-flash")

👉 Read the Google GenAI integration docs

AWS Bedrock

pip install -U "langchain[aws]"
from langchain.chat_models import init_chat_model

# Follow the steps here to configure your credentials:
# https://docs.aws.amazon.com/bedrock/latest/userguide/getting-started.html

llm = init_chat_model(
    "anthropic.claude-3-5-sonnet-20240620-v1:0",
    model_provider="bedrock_converse",
)

👉 Read the AWS Bedrock integration docs

Instantiate a model directly

If a model provider is not available via init_chat_model, you can instantiate the provider's model class directly. The model must implement the BaseChatModel interface and, if you plan to use tools, support tool calling:

API Reference: ChatAnthropic

# Anthropic is already supported by `init_chat_model`,
# but you can also instantiate it directly.
from langchain_anthropic import ChatAnthropic

model = ChatAnthropic(
    model="claude-3-7-sonnet-latest",
    temperature=0,
    max_tokens=2048,
)

Tool calling support

If you are building an agent or workflow that requires the model to call external tools, ensure that the underlying language model supports tool calling. Compatible models can be found in the LangChain integrations directory.
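
For example, you can confirm tool-calling support by binding a plain Python function as a tool and inspecting the tool calls on the response. A minimal sketch (the get_weather function is a hypothetical example tool):

from langchain.chat_models import init_chat_model

def get_weather(city: str) -> str:
    """Return the current weather for a city."""
    return f"It's always sunny in {city}!"

model = init_chat_model("anthropic:claude-3-5-sonnet-latest")
model_with_tools = model.bind_tools([get_weather])

response = model_with_tools.invoke("What's the weather in Paris?")
print(response.tool_calls)  # e.g. [{'name': 'get_weather', 'args': {'city': 'Paris'}, ...}]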

Use in an agent

When using create_react_agent, you can specify the model by its name string, which is shorthand for initializing the model with init_chat_model. This allows you to use the model without needing to import or instantiate it directly.

from langgraph.prebuilt import create_react_agent

create_react_agent(
    model="anthropic:claude-3-7-sonnet-latest",
    # other parameters
)
Alternatively, instantiate the model yourself and pass the instance:

from langchain_anthropic import ChatAnthropic
from langgraph.prebuilt import create_react_agent

model = ChatAnthropic(
    model="claude-3-7-sonnet-latest",
    temperature=0,
    max_tokens=2048
)
# Alternatively
# model = init_chat_model("anthropic:claude-3-7-sonnet-latest")

agent = create_react_agent(
    model=model,
    # other parameters
)
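
Once created, the agent is invoked with a dict containing a list of messages. A minimal usage sketch (the question is illustrative):

result = agent.invoke(
    {"messages": [{"role": "user", "content": "What is LangGraph?"}]}
)
print(result["messages"][-1].content)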

Advanced model configuration

Disable streaming

To disable streaming of individual LLM tokens, set disable_streaming=True when initializing the model:

from langchain.chat_models import init_chat_model

model = init_chat_model(
    "anthropic:claude-3-7-sonnet-latest",
    disable_streaming=True
)
The same applies when instantiating the model class directly:

from langchain_anthropic import ChatAnthropic

model = ChatAnthropic(
    model="claude-3-7-sonnet-latest",
    disable_streaming=True
)
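
With streaming disabled, calls to stream() defer to invoke() under the hood, so the full response is yielded as a single chunk. A minimal sketch of the resulting behavior:

for chunk in model.stream("Hello"):
    print(chunk.content)  # The entire response arrives as one chunk.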

Refer to the API reference for more information on disable_streaming.

Add model fallbacks

You can add a fallback to a different model or a different LLM provider using model.with_fallbacks([...]):

from langchain.chat_models import init_chat_model

model_with_fallbacks = (
    init_chat_model("anthropic:claude-3-5-haiku-latest")
    .with_fallbacks([
        init_chat_model("openai:gpt-4.1-mini"),
    ])
)
Or, with directly instantiated models:

from langchain_anthropic import ChatAnthropic
from langchain_openai import ChatOpenAI

model_with_fallbacks = (
    ChatAnthropic(model="claude-3-5-haiku-latest")
    .with_fallbacks([
        ChatOpenAI(model="gpt-4.1-mini"),
    ])
)
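
The result behaves like any other model: if the primary Anthropic call raises an error, the OpenAI fallback is tried in its place. A minimal usage sketch:

response = model_with_fallbacks.invoke("Why do parrots talk?")
print(response.content)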

See this guide for more information on model fallbacks.

Use the built-in rate limiter

LangChain includes a built-in in-memory rate limiter. This rate limiter is thread-safe and can be shared by multiple threads in the same process.

API Reference: InMemoryRateLimiter | ChatAnthropic

from langchain_core.rate_limiters import InMemoryRateLimiter
from langchain_anthropic import ChatAnthropic

rate_limiter = InMemoryRateLimiter(
    requests_per_second=0.1,  # Super slow! We can only make a request once every 10 seconds.
    check_every_n_seconds=0.1,  # Wake up every 100 ms to check whether we're allowed to make a request.
    max_bucket_size=10,  # Controls the maximum burst size.
)

model = ChatAnthropic(
    model_name="claude-3-opus-20240229",
    rate_limiter=rate_limiter,
)
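
The limiter is applied transparently: each request blocks until the limiter grants it, so repeated calls are throttled automatically. A minimal usage sketch:

for i in range(3):
    print(model.invoke(f"Request {i}").content)  # Each call waits for the rate limiter.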

See the LangChain docs for more information on how to handle rate limiting.

Bring your own model

If your desired LLM isn't officially supported by LangChain, consider these options:

  1. Implement a custom LangChain chat model: Create a model conforming to the LangChain chat model interface (see the sketch after this list). This enables full compatibility with LangGraph's agents and workflows but requires an understanding of the LangChain framework.

  2. Direct invocation with custom streaming: Use your model directly by adding custom streaming logic with StreamWriter. Refer to the custom streaming documentation for guidance. This approach suits custom workflows where prebuilt agent integration is not necessary.
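
As a rough sketch of option 1, a custom chat model subclasses BaseChatModel and implements at least _generate and _llm_type. The EchoChatModel below is a toy stand-in, assuming the langchain_core interfaces; a real implementation would call your provider's API and, for agent use, also implement tool calling:

from typing import Any, List, Optional

from langchain_core.callbacks import CallbackManagerForLLMRun
from langchain_core.language_models import BaseChatModel
from langchain_core.messages import AIMessage, BaseMessage
from langchain_core.outputs import ChatGeneration, ChatResult

class EchoChatModel(BaseChatModel):
    """Toy chat model that echoes the last message back."""

    @property
    def _llm_type(self) -> str:
        return "echo-chat-model"

    def _generate(
        self,
        messages: List[BaseMessage],
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> ChatResult:
        # Replace this with a call to your model's API.
        message = AIMessage(content=messages[-1].content)
        return ChatResult(generations=[ChatGeneration(message=message)])

An instance can then be invoked like any other chat model, e.g. EchoChatModel().invoke("hello").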

Additional resources