CVE-2026-42271: How a LiteLLM MCP Endpoint Chained with a Starlette Bug Became a CVSS 10.0 AI Gateway Takeover

CVE-2026-42271 in LiteLLM chains with CVE-2026-48710 in Starlette to reach unauthenticated RCE with a CVSS score of 10.0. Upgrade LiteLLM to version 1.83.7 and Starlette to version 1.0.1 immediately.
Sami Malik
Copywriter

On 8 June 2026, CISA added CVE-2026-42271 to its Known Exploited Vulnerabilities catalogue after confirming active exploitation in the wild. The vulnerability is a command injection flaw in LiteLLM, an open-source AI gateway and proxy server used by organisations to route API traffic to large language model providers including OpenAI, Anthropic, Google, and dozens of others. The flaw carries a CVSS score of 8.7 on its own, but when chained with a second vulnerability, CVE-2026-48710, a host-header authentication bypass in the Starlette web framework, the combined attack reaches a CVSS score of 10.0 and allows completely unauthenticated remote code execution from any network-reachable host. A successful attack can steal every AI provider API key stored by the gateway, execute arbitrary commands on the proxy host, and open a lateral movement path into the AI infrastructure connected through the gateway. The fix requires upgrading LiteLLM to version 1.83.7 and Starlette to version 1.0.1.

What LiteLLM Is and Why It Is Deployed Widely

LiteLLM is an open-source Python library and proxy server that allows organisations to send requests to multiple large language model providers through a single OpenAI-compatible API endpoint. Instead of configuring each application to use provider-specific SDKs and API keys, a development team deploys one LiteLLM proxy, configures it with credentials for OpenAI, Anthropic, Azure OpenAI, Google Gemini, Cohere, and any other providers they use, and then routes all LLM calls through the proxy. This architecture simplifies cost tracking, rate limit management, model routing, and fallback logic.

The centralisation that makes LiteLLM operationally convenient is precisely what makes it a high-value target. The LiteLLM proxy is, by design, the single location where every AI provider API key an organisation uses is stored and managed. A compromise of the LiteLLM host is not a compromise of one application's credentials: it is a compromise of the credentials for every AI service the organisation uses, across every application that routes through the gateway. For organisations that spend significantly on commercial AI APIs, the financial exposure from a credential theft is direct and immediate, independent of any ransomware or extortion demand from the attacker.

MCP Test Endpoints: A Debugging Feature That Became an Attack Surface

The vulnerable endpoints are part of LiteLLM's support for the Model Context Protocol (MCP), a standard for integrating external tools and data sources with AI agents. LiteLLM added two test endpoints, POST /mcp-rest/test/connection and POST /mcp-rest/test/tools/list, that let users preview a new MCP server configuration and verify that it is correctly formed and reachable before saving it to the proxy's configuration file.

The implementation of these test endpoints accepted a full MCP server configuration in the request body, including the command, args, and env fields used by the stdio transport, and then launched the supplied command as a subprocess on the proxy host. The subprocess ran with the full privileges of the LiteLLM proxy process. There was no validation of the command against an allowlist, no sandboxing of the executed process, and no administrative role requirement to call these endpoints. The design assumption appears to have been that only authenticated administrators with legitimate access to the LiteLLM configuration interface would call these test endpoints, but in practice any valid proxy API key was accepted, so that assumption did not hold.
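The missing guardrails can be illustrated with a minimal sketch. This is not LiteLLM's code or its actual fix: the function name validate_stdio_config, the ALLOWED_COMMANDS and ALLOWED_ENV_KEYS sets, and the caller_is_admin flag are all hypothetical. Only the command, args, and env field names come from the description above.

```python
from typing import Any, Mapping

# Hypothetical allowlist: executables an operator has explicitly approved
# for stdio-based MCP servers. The entries are illustrative only.
ALLOWED_COMMANDS = frozenset({"npx", "uvx"})

# Hypothetical set of environment variable names a caller may set.
ALLOWED_ENV_KEYS = frozenset({"MCP_LOG_LEVEL"})


class RejectedConfig(ValueError):
    """Raised when a submitted MCP server configuration fails validation."""


def validate_stdio_config(config: Mapping[str, Any], caller_is_admin: bool) -> list[str]:
    """Return the argv to spawn, or raise RejectedConfig.

    Checks the caller's role, the command against an allowlist, and the
    caller-controlled environment before anything is executed.
    """
    if not caller_is_admin:
        raise RejectedConfig("test endpoints require an administrator role")

    command = config.get("command")
    if not isinstance(command, str) or command not in ALLOWED_COMMANDS:
        raise RejectedConfig(f"command not on allowlist: {command!r}")

    args = config.get("args", [])
    if not isinstance(args, list) or not all(isinstance(a, str) for a in args):
        raise RejectedConfig("args must be a list of strings")

    env = config.get("env", {})
    if not isinstance(env, dict) or not set(env) <= ALLOWED_ENV_KEYS:
        raise RejectedConfig("env contains keys that are not permitted")

    # Returned as a list so it can be passed to subprocess.run(argv) without a shell.
    return [command, *args]
```

An allowlist on the command alone is not a sandbox. An approved launcher can still be pointed at arbitrary packages through its arguments, so a real deployment would also need to constrain args and isolate the spawned process.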

CVE-2026-42271: Command Injection with No Guardrails

Horizon3.ai's research describes CVE-2026-42271 as command injection, where user-supplied input is passed to an operating system command execution function without adequate validation. Any caller who can reach the MCP test endpoints with a valid LiteLLM proxy API key can supply a command field containing any executable on the host, with any arguments, and have it run with the privileges of the proxy process. The attacker does not need administrative privileges within LiteLLM itself: any API key that can authenticate to the proxy is sufficient to call the test endpoints.

The practical impact is broad. An attacker who has obtained a single LiteLLM API key, which may be embedded in client application code, transmitted in API requests without TLS, or present in a code repository, can immediately escalate to arbitrary code execution on the underlying host. From that position, the attacker can read the LiteLLM configuration file, which contains all provider API keys in plaintext, enumerate and extract any secrets stored as environment variables in the proxy process, install persistent backdoors or reverse shells, and begin lateral movement into other systems reachable from the proxy host network.

CVE-2026-48710: The Host Header Bypass That Removes Authentication Entirely

CVE-2026-48710 is a vulnerability in Starlette, the ASGI web framework on which LiteLLM's proxy server is built. The vulnerability is known as "BadHost" and concerns how Starlette validates the HTTP Host header for authentication purposes. A crafted Host header value in an HTTP request to a LiteLLM endpoint causes Starlette's authentication middleware to skip its validation checks, treating the request as if it came from a trusted internal source rather than from an external client.

When CVE-2026-48710 is chained with CVE-2026-42271, the authentication requirement that would normally protect the MCP test endpoints is bypassed entirely. An attacker who sends a request with the appropriate crafted Host header does not need a valid LiteLLM API key to call the test endpoints. The entire attack chain from unauthenticated network access to arbitrary code execution on the LiteLLM host collapses into a single HTTP POST request, requiring no prior credentials, no pre-existing knowledge of the target organisation's configuration, and no user interaction on the part of any legitimate user.

Horizon3.ai demonstrated this chain and assigned it a combined CVSS score of 10.0, the maximum possible value. The score reflects the combination of network accessibility, no authentication requirement, no user interaction, and complete compromise of confidentiality, integrity, and availability on the affected host.

What an Attacker Gets: API Keys, Cloud Credentials, and Lateral Access

The immediate consequence of a successful exploit is arbitrary command execution on the LiteLLM proxy host with the privileges of the proxy process. In most deployments, the proxy process runs with sufficient privileges to read its own configuration file, which contains every AI provider API key the organisation has configured. Stealing these keys allows the attacker to make API calls to OpenAI, Anthropic, Azure, or other providers at the organisation's expense, up to and including calls that consume the entirety of the organisation's monthly AI API budget before any billing alert fires.

Beyond API key theft, many LiteLLM deployments are configured with environment variables or configuration files containing cloud provider credentials, database connection strings, or other infrastructure secrets. These credentials may grant access to the same cloud environment where the LiteLLM proxy is running, opening a path from the compromised proxy host to cloud storage, databases, or other services in the same environment. The LiteLLM proxy, because of its role as a centralised gateway for AI operations, often has broad network access to the AI-enabled applications it serves, making it a useful pivot point for lateral movement within the application infrastructure. Monitoring for indicators of compromise on AI infrastructure hosts requires looking beyond the proxy itself to the downstream systems it can reach.

How Horizon3.ai Demonstrated the Full Exploit

Horizon3.ai's public disclosure of the exploit chain is unusually detailed for a CISA KEV-listed vulnerability. The research team provided a step-by-step description of the attack chain, including the specific HTTP request structure for the BadHost bypass and the payload format for the command injection. The research notes that the attack can be demonstrated against a default LiteLLM installation with no custom configuration required beyond what a standard deployment would have, meaning proof-of-concept exploitation is accessible to any attacker who can read the Horizon3.ai blog post.

The combination of CISA's KEV addition, confirmed in-the-wild exploitation, and a public proof-of-concept means that the vulnerability is being actively scanned for and exploited by a range of threat actors beyond those who independently discovered it. Any internet-accessible LiteLLM instance running a version between 1.74.2 and 1.83.6 should be treated as compromised or under active attack until it has been patched and its environment has been reviewed for signs of prior compromise. Understanding your organisation's attack surface for AI infrastructure requires accounting for any externally reachable LiteLLM or similar AI proxy as a high-priority perimeter component.

CISA's Response and Confirmed Exploitation

CISA added CVE-2026-42271 to the KEV catalogue on 8 June 2026, the day after Horizon3.ai's public disclosure. Evidence of in-the-wild exploitation is a prerequisite for KEV inclusion, so a listing this soon after disclosure indicates that CISA already had evidence of active attacks against LiteLLM deployments before the vulnerability became public knowledge. This pre-disclosure exploitation period creates the same risk profile seen in other KEV-listed vulnerabilities: organisations that waited for public disclosure before patching were already exposed during the exploitation window.

The CISA advisory urges all organisations using LiteLLM to upgrade immediately and to review their environments for signs of prior compromise, including reviewing logs for unexpected calls to the MCP test endpoints, checking for new processes spawned by the LiteLLM process, and auditing all AI provider API keys for unauthorised usage. API provider dashboards for OpenAI, Anthropic, and other services typically retain usage logs that can reveal whether API keys were used from IP addresses or at times inconsistent with normal application behaviour, which is one of the fastest ways to determine whether credentials were stolen and used before the compromise was detected.
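The usage audit can be sketched as a simple filter over exported usage records. The record shape (timestamp and source_ip), the expected networks, and the expected-hours window below are assumptions for illustration. What a provider dashboard actually exports varies by provider.

```python
from dataclasses import dataclass
from datetime import datetime
from ipaddress import ip_address, ip_network

# Assumed egress ranges for the organisation's own applications
# (documentation-reserved ranges used as placeholders).
EXPECTED_NETWORKS = [ip_network("203.0.113.0/24"), ip_network("198.51.100.0/24")]
# Assumed hours of the day (06:00 to 21:59) in which legitimate traffic occurs.
EXPECTED_HOURS = range(6, 22)


@dataclass(frozen=True)
class UsageRecord:
    """One row of a hypothetical usage export from a provider dashboard."""
    timestamp: datetime
    source_ip: str


def suspicious_reasons(record: UsageRecord) -> list[str]:
    """Return the reasons a record looks inconsistent with normal behaviour."""
    reasons = []
    addr = ip_address(record.source_ip)
    if not any(addr in net for net in EXPECTED_NETWORKS):
        reasons.append("unexpected source IP")
    if record.timestamp.hour not in EXPECTED_HOURS:
        reasons.append("outside expected hours")
    return reasons


def flag_records(records):
    """Pair each suspicious record with its reasons, preserving input order."""
    flagged = []
    for record in records:
        reasons = suspicious_reasons(record)
        if reasons:
            flagged.append((record, reasons))
    return flagged
```

A record flagged here is a prompt for investigation rather than proof of theft. Legitimate traffic from a new deployment region or a batch job outside normal hours would be flagged too.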

Fixing It: Update Both LiteLLM and Starlette

The fix requires two separate upgrades. LiteLLM must be upgraded to version 1.83.7 or later, which introduces additional authorisation controls on the MCP test endpoints, restricting them to authenticated administrators rather than any API key holder. Starlette must be upgraded to version 1.0.1 or later, which addresses the BadHost host-header validation bypass. Both upgrades are required: patching only LiteLLM leaves the host-header bypass in place, and patching only Starlette leaves the underlying command injection flaw reachable by any authenticated API key holder.
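Because both upgrades are required, it can help to check the two packages together. The following is a minimal sketch using only the Python standard library. The helper names are hypothetical, and the version parsing is deliberately simplified: it ignores pre-release and post-release suffixes.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum fixed versions as stated in the text above.
FIXED = {"litellm": (1, 83, 7), "starlette": (1, 0, 1)}


def parse_release(text: str) -> tuple[int, ...]:
    """Parse the leading numeric 'X.Y.Z' part of a version string.

    Simplified on purpose: suffixes such as 'rc1' or '.post2' are ignored,
    so use a full version parser where exact ordering matters.
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def unpatched(installed: dict[str, str]) -> list[str]:
    """Return the packages in `installed` that are below their fixed version."""
    return [
        name
        for name, minimum in FIXED.items()
        if name in installed and parse_release(installed[name]) < minimum
    ]


def installed_versions() -> dict[str, str]:
    """Look up the versions installed in the current environment, if any."""
    found = {}
    for name in FIXED:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            pass  # package not installed in this environment
    return found
```

Calling unpatched(installed_versions()) inside the environment that runs the proxy returns an empty list only when every installed package in FIXED is at or above its fixed version. Any other result means at least one half of the chain is still present.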

After upgrading, organisations should rotate all AI provider API keys that were stored in the LiteLLM configuration, on the assumption that any instance running a vulnerable version may have had its configuration read by an attacker at some point during the exploitation window. Key rotation is a precautionary measure but is strongly recommended given that the attack provides direct read access to the configuration file. Any application that used the rotated keys should be updated with the new keys, and the old keys should be revoked at the provider dashboard rather than simply replaced in the LiteLLM configuration. Monitoring the risk from leaked credentials stored in AI infrastructure is an ongoing operational requirement, not a one-time remediation step.
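Rotation starts with knowing which credentials the configuration holds. As a rough sketch, the helper below walks an already-parsed configuration (a nested dict, for example loaded from YAML) and lists settings whose names suggest a credential. The function name, the suffix list, and the example layout are assumptions for illustration, not a description of LiteLLM's configuration schema.

```python
# Setting-name suffixes that suggest a credential. This is a heuristic:
# connection strings and oddly named secrets will be missed.
SECRET_SUFFIXES = ("key", "secret", "token", "password")


def find_secret_paths(node, prefix=""):
    """Return dotted paths of every setting whose name suggests a credential."""
    paths = []
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            if isinstance(value, (dict, list)):
                # Descend into nested sections and lists.
                paths.extend(find_secret_paths(value, path))
            elif str(key).lower().endswith(SECRET_SUFFIXES):
                paths.append(path)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            paths.extend(find_secret_paths(item, f"{prefix}[{index}]"))
    return paths


# Hypothetical configuration layout with placeholder values.
example_config = {
    "model_list": [
        {
            "model_name": "chat-default",
            "litellm_params": {"model": "provider/model", "api_key": "REDACTED", "max_tokens": 512},
        }
    ],
    "general_settings": {"master_key": "REDACTED"},
}

for path in find_secret_paths(example_config):
    print(path)
```

The output is a checklist of settings to rotate and revoke. It does not cover secrets passed through environment variables, which need a separate inventory.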

Frequently Asked Questions

Does CVE-2026-42271 affect LiteLLM when deployed as a Python library rather than as a proxy?

The vulnerability is specifically in the LiteLLM proxy server's MCP test endpoints. If your organisation uses LiteLLM as a Python library imported directly into your application code, without running the proxy server component, you are not exposed to this specific vulnerability through that deployment pattern. However, if your application also runs the proxy server, or if you have deployed the proxy server separately in the same environment, both need to be assessed and upgraded.

Can we mitigate without upgrading by disabling MCP endpoints?

Disabling the MCP test endpoints removes the specific attack surface for CVE-2026-42271, but the BadHost vulnerability in Starlette (CVE-2026-48710) affects other LiteLLM endpoints as well. Disabling MCP alone does not address the broader authentication bypass that CVE-2026-48710 enables. The recommended remediation is to upgrade both packages rather than attempting to selectively disable features, as configuration-based mitigations may not address all paths through which these vulnerabilities are exploitable.

How do we know if our LiteLLM instance was compromised before we patched?

Review your LiteLLM proxy logs for any POST requests to /mcp-rest/test/connection or /mcp-rest/test/tools/list with unusual Host header values. Review process spawning logs on the proxy host for any unexpected processes created by the LiteLLM parent process. Check your AI provider dashboards for API usage from unexpected IP addresses or at unexpected times. If you find evidence of unexpected process execution or API key usage, treat the host as compromised and begin an incident response process that includes rotating all credentials stored on or accessible from the host.
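The log review can be sketched as a short scan. The log format here is an assumption: one JSON object per line with method, path, and host keys, which may not match what your proxy or reverse proxy actually writes. EXPECTED_HOSTS is a placeholder for the Host values your legitimate clients send.

```python
import json

# Paths of the MCP test endpoints named above.
TEST_PATHS = {"/mcp-rest/test/connection", "/mcp-rest/test/tools/list"}
# Assumed: the Host values legitimate clients send to this proxy.
EXPECTED_HOSTS = {"llm-gateway.internal.example"}


def scan_log_lines(lines):
    """Yield (line_number, entry, unusual_host) for POSTs to the test endpoints.

    Lines that are not valid JSON objects are skipped rather than treated
    as errors, since real log files often contain mixed content.
    """
    for number, line in enumerate(lines, start=1):
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(entry, dict):
            continue
        if str(entry.get("method", "")).upper() != "POST":
            continue
        # Drop any query string and trailing slash before comparing the path.
        path = str(entry.get("path", "")).split("?", 1)[0].rstrip("/")
        if path in TEST_PATHS:
            unusual_host = entry.get("host") not in EXPECTED_HOSTS
            yield number, entry, unusual_host
```

Every hit deserves a look, since these endpoints are rarely called in normal operation. Hits with an unusual Host value are the ones most consistent with the chained attack.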

Does this affect other AI gateways that use similar MCP support?

CVE-2026-42271 is specific to LiteLLM's implementation of MCP test endpoints. Other AI gateways that implement MCP support, such as LangServe, OpenRouter, or custom proxy implementations, should be reviewed for similar patterns, but they do not share LiteLLM's codebase and are not directly affected by CVE-2026-42271. If your organisation uses multiple AI gateways, each should be assessed independently against its own vendor advisories.

Is the CVSS 10.0 score for the combined chain or for each CVE individually?

The CVSS 10.0 score applies to the combined exploit chain of CVE-2026-42271 and CVE-2026-48710. CVE-2026-42271 individually scores 8.7, which reflects command injection that requires authenticated access. CVE-2026-48710 individually scores lower because a host-header validation bypass does not, by itself, give an attacker code execution. The chained score of 10.0 reflects the combined effect of authentication bypass plus arbitrary command execution, which together constitute a complete, unauthenticated takeover of the affected system.

Should we restrict LiteLLM to internal-only access as an immediate step?

If your LiteLLM proxy is currently accessible from the public internet and you cannot patch immediately, restricting its network access to internal-only traffic is the most effective compensating control available. The attack chain requires network reachability as a prerequisite. An instance that is only accessible from trusted internal networks or through a VPN is significantly harder to attack than one with a public endpoint. However, this should be treated as a temporary measure: apply the patches as soon as possible rather than relying on network-layer controls indefinitely.

What if we use a managed LiteLLM service rather than self-hosting?

If your organisation uses a managed platform built on LiteLLM or that incorporates LiteLLM components, contact the service provider to confirm which version they are running and whether they have applied the CVE-2026-42271 and CVE-2026-48710 fixes. The risk to your API keys and infrastructure depends on whether the provider's deployment is vulnerable and how it is configured. You should also rotate your AI provider API keys as a precaution, since these keys may have been stored in the managed service's environment during any period when it was running a vulnerable version.

AI Infrastructure as an Emerging Attack Surface

CVE-2026-42271 represents a category of vulnerability that is likely to become more common as AI infrastructure matures. AI gateways such as the LiteLLM proxy are a relatively new category of software: they did not exist in their current form three years ago, and the Model Context Protocol that the vulnerable endpoints support was standardised only in the past eighteen months. Security research and threat actor capability development typically lag behind software adoption by a similar interval, which means that AI gateway software is now entering the period where security researchers are conducting serious audits and threat actors are beginning to develop working exploits against it.

The security model of AI gateways is also novel in ways that create risks not immediately obvious from experience with other types of middleware. An AI gateway stores credentials for commercial AI providers whose usage-based billing can reach thousands of dollars per hour at high usage rates. It may have access to the input and output of every AI query made through it, potentially including sensitive business data, customer information, or confidential internal communications that users have included in their AI prompts. And it often has broad network access to the AI-enabled applications it serves, making it a useful lateral movement platform for an attacker who compromises it.

Organisations that have deployed AI infrastructure should apply the same security scrutiny to AI gateways and proxies that they apply to other middleware components that hold sensitive credentials and have broad internal network access. This includes maintaining an inventory of all AI gateway deployments, applying vendor patches on an emergency timeline for critical vulnerabilities, restricting network access to AI gateways to trusted client systems, monitoring API usage for patterns inconsistent with normal application behaviour, and ensuring that AI provider API keys can be quickly rotated across all dependent applications in the event of a suspected compromise. Understanding the full scope of your organisation's AI attack surface is a prerequisite for managing it effectively.

How Defendis Can Help

Incidents like this one rarely announce themselves through official channels first. Indicators of active exploitation, compromised infrastructure, and stolen credentials circulate in closed forums and private channels well before any public advisory reaches your security team. By the time a vulnerability makes it into a published report, organisations without early visibility are already operating behind the curve.

Defendis gives your security team that early visibility. We monitor the dark web, underground forums, and threat actor channels so your team receives relevant intelligence before it becomes breaking news, with context about emerging threats matched against your organisation's exposure, without requiring your analysts to spend time in places they should not have to go.

About the author
Sami Malik is a copywriter passionate about crafting clear, engaging, and impactful content that helps brands connect with their audience through storytelling and strategy.
