
MCP Overview

LiteLLM Proxy provides an MCP Gateway that allows you to use a fixed endpoint for all MCP tools and control MCP access by Key, Team, or Organization.

LiteLLM MCP Architecture: Use MCP tools with all LiteLLM supported models

Overview​

Feature                        Description
MCP Operations                 List Tools, Call Tools
Supported MCP Transports       Streamable HTTP, SSE, Standard Input/Output (stdio)
LiteLLM Permission Management  By Key, By Team, By Organization

Adding your MCP​

Prerequisites​

To store MCP servers in the database, you need to enable database storage:

Environment Variable:

export STORE_MODEL_IN_DB=True

OR in config.yaml:

general_settings:
  store_model_in_db: true

Fine-grained Database Storage Control​

By default, when store_model_in_db is true, all object types (models, MCPs, guardrails, vector stores, etc.) are stored in the database. If you want to store only specific object types, use the supported_db_objects setting.

Example: Store only MCP servers in the database

config.yaml
general_settings:
  store_model_in_db: true
  supported_db_objects: ["mcp"] # Only store MCP servers in DB

model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: sk-xxxxxxx

See all available object types: Config Settings - supported_db_objects

If supported_db_objects is not set, all object types are loaded from the database (default behavior).

On the LiteLLM UI, navigate to "MCP Servers" and click "Add New MCP Server".

On this form, you should enter your MCP Server URL and the transport you want to use.

LiteLLM supports the following MCP transports:

  • Streamable HTTP
  • SSE (Server-Sent Events)
  • Standard Input/Output (stdio)
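
MCP servers can also be declared in config.yaml instead of the UI, under the mcp_servers key. A minimal sketch, assuming "http" and "sse" as the transport values (the server aliases and URLs below are placeholders, not real servers):

config.yaml
mcp_servers:
  my_http_server:
    url: "https://example.com/mcp"
    transport: "http"   # Streamable HTTP
  my_sse_server:
    url: "https://example.com/sse"
    transport: "sse"    # Server-Sent Events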


Add HTTP MCP Server​

This video walks through adding and using an HTTP MCP server on LiteLLM UI and using it in Cursor IDE.



Add SSE MCP Server​

This video walks through adding and using an SSE MCP server on LiteLLM UI and using it in Cursor IDE.



Add STDIO MCP Server​

For stdio MCP servers, select "Standard Input/Output (stdio)" as the transport type and provide the stdio configuration in JSON format:
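For example, a stdio configuration might look like the following sketch (the command, args, and env values are illustrative placeholders for your own stdio server):

stdio configuration (JSON)
{
  "command": "npx",
  "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/allowed/dir"],
  "env": {}
}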

MCP Tool Filtering​

Control which tools are available from your MCP servers. You can either allow only specific tools or block dangerous ones.

Use allowed_tools to specify exactly which tools users can access. All other tools will be blocked.

config.yaml
mcp_servers:
  github_mcp:
    url: "https://api.githubcopilot.com/mcp"
    auth_type: oauth2
    authorization_url: https://github.com/login/oauth/authorize
    token_url: https://github.com/login/oauth/access_token
    client_id: os.environ/GITHUB_OAUTH_CLIENT_ID
    client_secret: os.environ/GITHUB_OAUTH_CLIENT_SECRET
    scopes: ["public_repo", "user:email"]
    allowed_tools: ["list_tools"] # only list_tools will be available

Use this when:

  • You want strict control over which tools are available
  • You're in a high-security environment
  • You're testing a new MCP server with limited tools

Important Notes​

  • If you specify both allowed_tools and disallowed_tools, the allowed list takes priority
  • Tool names are case-sensitive
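
For the block-list approach, use disallowed_tools to block specific tools while leaving all other tools available. A minimal sketch (the tool name is illustrative, not a confirmed GitHub MCP tool):

config.yaml
mcp_servers:
  github_mcp:
    url: "https://api.githubcopilot.com/mcp"
    disallowed_tools: ["delete_repository"] # blocked; all other tools stay available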

MCP Server Access Control​

LiteLLM Proxy provides two methods for controlling access to specific MCP servers:

  1. URL-based Namespacing - Use URL paths to directly access specific servers or access groups
  2. Header-based Namespacing - Use the x-mcp-servers header to specify which servers to access

Method 1: URL-based Namespacing​

LiteLLM Proxy supports URL-based namespacing for MCP servers using the format /mcp/<servers or access groups>. This allows you to:

  • Direct URL Access: Point MCP clients directly to specific servers or access groups via URL
  • Simplified Configuration: Use URLs instead of headers for server selection
  • Access Group Support: Use access group names in URLs for grouped server access

URL Format​

<your-litellm-proxy-base-url>/mcp/<server_alias_or_access_group>

Examples:

  • /mcp/github - Access tools from the "github" MCP server
  • /mcp/zapier - Access tools from the "zapier" MCP server
  • /mcp/dev_group - Access tools from all servers in the "dev_group" access group
  • /mcp/github,zapier - Access tools from multiple specific servers

Usage Examples​

cURL Example with URL Namespacing
curl --location 'https://api.openai.com/v1/responses' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $OPENAI_API_KEY" \
--data '{
  "model": "gpt-4o",
  "tools": [
    {
      "type": "mcp",
      "server_label": "litellm",
      "server_url": "<your-litellm-proxy-base-url>/mcp/github",
      "require_approval": "never",
      "headers": {
        "x-litellm-api-key": "Bearer YOUR_LITELLM_API_KEY"
      }
    }
  ],
  "input": "Run available tools",
  "tool_choice": "required"
}'

This example uses URL namespacing to access only the "github" MCP server.

Benefits of URL Namespacing​

  • Direct Access: No need for additional headers to specify servers
  • Clean URLs: Self-documenting URLs that clearly indicate which servers are accessible
  • Access Group Support: Use access group names for grouped server access
  • Multiple Servers: Specify multiple servers in a single URL with comma separation
  • Simplified Configuration: Easier setup for MCP clients that prefer URL-based configuration

Method 2: Header-based Namespacing​

You can choose to access specific MCP servers and only list their tools using the x-mcp-servers header. This header allows you to:

  • Limit tool access to one or more specific MCP servers
  • Control which tools are available in different environments or use cases

The header accepts a comma-separated list of server aliases: "alias_1,Server2,Server3"

Notes:

  • If the header is not provided, tools from all available MCP servers will be accessible
  • This method works with the standard LiteLLM MCP endpoint
cURL Example with Header Namespacing
curl --location 'https://api.openai.com/v1/responses' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $OPENAI_API_KEY" \
--data '{
  "model": "gpt-4o",
  "tools": [
    {
      "type": "mcp",
      "server_label": "litellm",
      "server_url": "<your-litellm-proxy-base-url>/mcp/",
      "require_approval": "never",
      "headers": {
        "x-litellm-api-key": "Bearer YOUR_LITELLM_API_KEY",
        "x-mcp-servers": "alias_1"
      }
    }
  ],
  "input": "Run available tools",
  "tool_choice": "required"
}'

In this example, the request will only have access to tools from the "alias_1" MCP server.


Comparison: Header vs URL Namespacing​

Feature           Header Namespacing                URL Namespacing
Method            Uses x-mcp-servers header         Uses URL path /mcp/<servers>
Endpoint          Standard litellm_proxy endpoint   Custom /mcp/<servers> endpoint
Configuration     Requires additional header        Self-contained in URL
Multiple Servers  Comma-separated in header         Comma-separated in URL path
Access Groups     Supported via header              Supported via URL path
Client Support    Works with all MCP clients        Works with URL-aware MCP clients
Use Case          Dynamic server selection          Fixed server configuration

Grouping MCPs (Access Groups)​

MCP Access Groups allow you to group multiple MCP servers together for easier management.

1. Create an Access Group​

A. Creating Access Groups using Config:​
Creating access groups for MCP using the config
mcp_servers:
  "deepwiki_mcp":
    url: https://mcp.deepwiki.com/mcp
    transport: "http"
    auth_type: "none"
    access_groups: ["dev_group"]

When adding mcp_servers via the config:

  • Pass a list of strings in access_groups
  • These groups can then be used to segregate access by keys, teams, and MCP clients (via headers); see the grouping sketch below
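
To group multiple servers, assign the same access group name to each of them. A sketch (the second server alias and URL are hypothetical):

Grouping servers with a shared access group
mcp_servers:
  "deepwiki_mcp":
    url: https://mcp.deepwiki.com/mcp
    transport: "http"
    access_groups: ["dev_group"]
  "internal_docs_mcp":
    url: https://example.com/mcp
    transport: "http"
    access_groups: ["dev_group"] # same group name groups the servers together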
B. Creating Access Groups using UI​

To create an access group:

  • Go to MCP Servers in the LiteLLM UI
  • Click "Add a New MCP Server"
  • Under "MCP Access Groups", create a new group (e.g., "dev_group") by typing it
  • Add the same group name to other servers to group them together

2. Use Access Group in Cursor​

Include the access group name in the x-mcp-servers header:

Cursor Configuration with Access Groups
{
  "mcpServers": {
    "LiteLLM": {
      "url": "litellm_proxy",
      "headers": {
        "x-litellm-api-key": "Bearer $LITELLM_API_KEY",
        "x-mcp-servers": "dev_group"
      }
    }
  }
}

This gives you access to all servers in the "dev_group" access group: the deepwiki server, and any other server with dev_group assigned, becomes available for tool calling.
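
Access groups also work with URL-based namespacing (Method 1 above). A sketch of the equivalent Cursor configuration, pointing at the access group directly via the URL path:

Cursor Configuration with URL Namespacing
{
  "mcpServers": {
    "LiteLLM": {
      "url": "<your-litellm-proxy-base-url>/mcp/dev_group",
      "headers": {
        "x-litellm-api-key": "Bearer $LITELLM_API_KEY"
      }
    }
  }
}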

Advanced: Connecting Access Groups to API Keys​

When creating API keys, you can assign them to specific access groups for permission management:

  • Go to "Keys" in the LiteLLM UI and click "Create Key"
  • Select the desired MCP access groups from the dropdown
  • The key will have access to all MCP servers in those groups
  • This is reflected in the Test Key page
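
If you manage keys via the API instead of the UI, the same assignment can be sketched as follows, assuming your LiteLLM version supports the object_permission field on /key/generate (verify against the API reference for your release):

Create a key with MCP access groups
curl -X POST '<your-litellm-proxy-base-url>/key/generate' \
--header 'Authorization: Bearer $LITELLM_MASTER_KEY' \
--header 'Content-Type: application/json' \
--data '{
  "object_permission": {
    "mcp_access_groups": ["dev_group"]
  }
}'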

Forwarding Custom Headers to MCP Servers​

LiteLLM supports forwarding additional custom headers from MCP clients to backend MCP servers using the extra_headers configuration parameter. This allows you to pass custom authentication tokens, API keys, or other headers that your MCP server requires.

Configuration​

Configure extra_headers in your MCP server configuration to specify which header names should be forwarded:

config.yaml with extra_headers
mcp_servers:
  github_mcp:
    url: "https://api.githubcopilot.com/mcp"
    auth_type: "bearer_token"
    auth_value: "ghp_default_token"
    extra_headers: ["custom_key", "x-custom-header", "Authorization"]
    description: "GitHub MCP server with custom header forwarding"

Client Usage​

When connecting from MCP clients, include the custom headers that match the extra_headers configuration:

FastMCP Client with Custom Headers
from fastmcp import Client
import asyncio

# MCP client configuration with custom headers
config = {
    "mcpServers": {
        "github": {
            "url": "http://localhost:4000/github_mcp/mcp",
            "headers": {
                "x-litellm-api-key": "Bearer sk-1234",
                "Authorization": "Bearer gho_token",
                "custom_key": "custom_value",
                "x-custom-header": "additional_data"
            }
        }
    }
}

# Create a client that connects to the server
client = Client(config)

async def main():
    async with client:
        # List available tools
        tools = await client.list_tools()
        print(f"Available tools: {tools}")

        # Call a tool if available
        if tools:
            result = await client.call_tool(tools[0].name, {})
            print(f"Tool result: {result}")

# Run the client
asyncio.run(main())

How It Works​

  1. Configuration: Define extra_headers in your MCP server config with the header names you want to forward
  2. Client Headers: Include the corresponding headers in your MCP client requests
  3. Header Forwarding: LiteLLM automatically forwards matching headers to the backend MCP server
  4. Authentication: The backend MCP server receives both the configured auth headers and the custom headers

Use Cases​

  • Custom Authentication: Forward custom API keys or tokens required by specific MCP servers
  • Request Context: Pass user identification, session data, or request tracking headers
  • Third-party Integration: Include headers required by external services that your MCP server integrates with
  • Multi-tenant Systems: Forward tenant-specific headers for proper request routing

Security Considerations​

  • Only headers listed in extra_headers are forwarded to maintain security
  • Sensitive headers should be passed through environment variables when possible
  • Consider using server-specific auth headers for better security isolation

MCP OAuth​

LiteLLM v1.77.6 added support for OAuth 2.0 Client Credentials for MCP servers.

This configuration is currently available on the config.yaml, with UI support coming soon.

mcp_servers:
  github_mcp:
    url: "https://api.githubcopilot.com/mcp"
    auth_type: oauth2
    authorization_url: https://github.com/login/oauth/authorize
    token_url: https://github.com/login/oauth/access_token
    client_id: os.environ/GITHUB_OAUTH_CLIENT_ID
    client_secret: os.environ/GITHUB_OAUTH_CLIENT_SECRET
    scopes: ["public_repo", "user:email"]

Using your MCP with client side credentials​

Use this if you want to pass a client-side authentication token to LiteLLM, which then forwards it to your MCP server for authentication.

You can specify MCP auth tokens using server-specific headers in the format x-mcp-{server_alias}-{header_name}. This allows you to use different authentication for different MCP servers.

Benefits:

  • Server-specific authentication: Each MCP server can use different auth methods
  • Better security: No need to share the same auth token across all servers
  • Flexible header names: Support for different auth header types (authorization, x-api-key, etc.)
  • Clean separation: Each server's auth is clearly identified

Legacy Auth Header (Deprecated)​

You can also specify your MCP auth token using the header x-mcp-auth. This will be forwarded to all MCP servers and is deprecated in favor of server-specific headers.

Connect via OpenAI Responses API with Server-Specific Auth​

Use the OpenAI Responses API and include server-specific auth headers:

cURL Example with Server-Specific Auth
curl --location 'https://api.openai.com/v1/responses' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $OPENAI_API_KEY" \
--data '{
  "model": "gpt-4o",
  "tools": [
    {
      "type": "mcp",
      "server_label": "litellm",
      "server_url": "litellm_proxy",
      "require_approval": "never",
      "headers": {
        "x-litellm-api-key": "Bearer YOUR_LITELLM_API_KEY",
        "x-mcp-github-authorization": "Bearer YOUR_GITHUB_TOKEN",
        "x-mcp-zapier-x-api-key": "YOUR_ZAPIER_API_KEY"
      }
    }
  ],
  "input": "Run available tools",
  "tool_choice": "required"
}'

Connect via OpenAI Responses API with Legacy Auth​

Use the OpenAI Responses API and include the x-mcp-auth header for your MCP server authentication:

cURL Example with Legacy MCP Auth
curl --location 'https://api.openai.com/v1/responses' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $OPENAI_API_KEY" \
--data '{
  "model": "gpt-4o",
  "tools": [
    {
      "type": "mcp",
      "server_label": "litellm",
      "server_url": "litellm_proxy",
      "require_approval": "never",
      "headers": {
        "x-litellm-api-key": "Bearer YOUR_LITELLM_API_KEY",
        "x-mcp-auth": "YOUR_MCP_AUTH_TOKEN"
      }
    }
  ],
  "input": "Run available tools",
  "tool_choice": "required"
}'

Customize the MCP Auth Header Name​

By default, LiteLLM uses x-mcp-auth to pass your credentials to MCP servers. You can change this header name in one of the following ways:

  1. Set the LITELLM_MCP_CLIENT_SIDE_AUTH_HEADER_NAME environment variable

Environment Variable
export LITELLM_MCP_CLIENT_SIDE_AUTH_HEADER_NAME="authorization"

  2. Set mcp_client_side_auth_header_name under general_settings in the config.yaml file

config.yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: sk-xxxxxxx

general_settings:
  mcp_client_side_auth_header_name: "authorization"

Using the authorization header​

In this example the authorization header will be passed to the MCP server for authentication.

cURL with authorization header
curl --location '<your-litellm-proxy-base-url>/v1/responses' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer $LITELLM_API_KEY" \
--data '{
  "model": "gpt-4o",
  "tools": [
    {
      "type": "mcp",
      "server_label": "litellm",
      "server_url": "litellm_proxy",
      "require_approval": "never",
      "headers": {
        "x-litellm-api-key": "Bearer YOUR_LITELLM_API_KEY",
        "authorization": "Bearer sk-zapier-token-123"
      }
    }
  ],
  "input": "Run available tools",
  "tool_choice": "required"
}'

LiteLLM Proxy - Walk through MCP Gateway​

LiteLLM exposes an MCP Gateway for admins to add all their MCP servers to LiteLLM. The key benefits of using LiteLLM Proxy with MCP are:

  1. Use a fixed endpoint for all MCP tools
  2. MCP Permission management by Key, Team, or User

This video demonstrates how you can onboard an MCP server to LiteLLM Proxy, use it and set access controls.

LiteLLM Python SDK MCP Bridge​

The LiteLLM Python SDK acts as an MCP bridge, letting you use MCP tools with all LiteLLM supported models. LiteLLM offers the following features for using MCP:

  • List Available MCP Tools: OpenAI clients can view all available MCP tools
    • litellm.experimental_mcp_client.load_mcp_tools to list all available MCP tools
  • Call MCP Tools: OpenAI clients can call MCP tools
    • litellm.experimental_mcp_client.call_openai_tool to call an OpenAI tool on an MCP server

1. List Available MCP Tools​

In this example we'll use litellm.experimental_mcp_client.load_mcp_tools to list all available MCP tools on any MCP server. This method can be used in two ways:

  • format="mcp" - (default) Return MCP tools
    • Returns: mcp.types.Tool
  • format="openai" - Return MCP tools converted to OpenAI API-compatible tools, for use with OpenAI endpoints.
    • Returns: openai.types.chat.ChatCompletionToolParam
MCP Client List Tools
# Create server parameters for stdio connection
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
import json
import os
import litellm
from litellm import experimental_mcp_client


server_params = StdioServerParameters(
    command="python3",
    # Make sure to update to the full absolute path to your mcp_server.py file
    args=["./mcp_server.py"],
)

async with stdio_client(server_params) as (read, write):
    async with ClientSession(read, write) as session:
        # Initialize the connection
        await session.initialize()

        # Get tools
        tools = await experimental_mcp_client.load_mcp_tools(session=session, format="openai")
        print("MCP TOOLS: ", tools)

        messages = [{"role": "user", "content": "what's (3 + 5)"}]
        llm_response = await litellm.acompletion(
            model="gpt-4o",
            api_key=os.getenv("OPENAI_API_KEY"),
            messages=messages,
            tools=tools,
        )
        print("LLM RESPONSE: ", json.dumps(llm_response, indent=4, default=str))

2. List and Call MCP Tools​

In this example we'll use

  • litellm.experimental_mcp_client.load_mcp_tools to list all available MCP tools on any MCP server
  • litellm.experimental_mcp_client.call_openai_tool to call an OpenAI tool on an MCP server

The first llm response returns a list of OpenAI tools. We take the first tool call from the LLM response and pass it to litellm.experimental_mcp_client.call_openai_tool to call the tool on the MCP server.

How litellm.experimental_mcp_client.call_openai_tool works​

  • Accepts an OpenAI Tool Call from the LLM response
  • Converts the OpenAI Tool Call to an MCP Tool
  • Calls the MCP Tool on the MCP server
  • Returns the result of the MCP Tool call
MCP Client List and Call Tools
# Create server parameters for stdio connection
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
import json
import os
import litellm
from litellm import experimental_mcp_client


server_params = StdioServerParameters(
    command="python3",
    # Make sure to update to the full absolute path to your mcp_server.py file
    args=["./mcp_server.py"],
)

async with stdio_client(server_params) as (read, write):
    async with ClientSession(read, write) as session:
        # Initialize the connection
        await session.initialize()

        # Get tools
        tools = await experimental_mcp_client.load_mcp_tools(session=session, format="openai")
        print("MCP TOOLS: ", tools)

        messages = [{"role": "user", "content": "what's (3 + 5)"}]
        llm_response = await litellm.acompletion(
            model="gpt-4o",
            api_key=os.getenv("OPENAI_API_KEY"),
            messages=messages,
            tools=tools,
        )
        print("LLM RESPONSE: ", json.dumps(llm_response, indent=4, default=str))

        openai_tool = llm_response["choices"][0]["message"]["tool_calls"][0]
        # Call the tool using MCP client
        call_result = await experimental_mcp_client.call_openai_tool(
            session=session,
            openai_tool=openai_tool,
        )
        print("MCP TOOL CALL RESULT: ", call_result)

        # send the tool result to the LLM
        messages.append(llm_response["choices"][0]["message"])
        messages.append(
            {
                "role": "tool",
                "content": str(call_result.content[0].text),
                "tool_call_id": openai_tool["id"],
            }
        )
        print("final messages with tool result: ", messages)
        llm_response = await litellm.acompletion(
            model="gpt-4o",
            api_key=os.getenv("OPENAI_API_KEY"),
            messages=messages,
            tools=tools,
        )
        print(
            "FINAL LLM RESPONSE: ", json.dumps(llm_response, indent=4, default=str)
        )