A year ago, people were debating whether MCP would become the standard for connecting LLMs to tools. That debate is over — MCP won. What nobody prepared for is that winning the standard means inheriting every scaling problem at once. The 2026 roadmap, released earlier this month, reads less like a victory lap and more like a triage list.

The Stateful Session Problem

MCP was born local-first. The original transport was stdio — a pipe between a client and a server running on the same machine. Beautiful for development, terrible for production.

Streamable HTTP fixed the "runs on localhost" part, but it inherited a core assumption: sessions are stateful. The server remembers who you are between requests. This is fine when you're running one instance. The moment you put it behind a load balancer, everything breaks.

The problem is sticky routing. A stateful session means your load balancer has to pin each client to a specific server instance. Auto-scaling? Now you have orphaned sessions. Rolling deploys? Connection drops. One developer trying to run stateless MCP across multiple Kubernetes pods with Redis reported that the SDK provides no reliable way to map client session IDs to the server's internal event streams. That's not a corner case — it's the default deployment model for any serious backend.

Think about what this means in practice. You've got a team running three MCP server instances behind an nginx ingress. Client A connects, gets routed to instance 2, starts a session. Instance 2 goes down for a rolling deploy. The session state is gone. The client reconnects, lands on instance 3, and has to start from scratch — re-negotiating capabilities, re-sending context, burning tokens. Multiply that by hundreds of concurrent clients and you're looking at cascading reconnection storms every time you push a deployment. The teams that have solved this are all doing it differently: some externalize session state to Redis, some run session-affinity hacks at the ingress level, some just gave up and run single-instance deployments with all the fragility that implies.

The roadmap's answer: evolve Streamable HTTP to run statelessly across multiple server instances. No new transports — just making the existing one actually work behind proxies and load balancers. The right call, but the fix isn't here yet.

Enterprise Auth Is a Mess

Static client secrets. That's MCP's auth story today. No SSO integration in the spec, no PKCE flows, no way for IT admins to control deployments from existing identity consoles. WorkOS has shipped OAuth 2.1 middleware as a stopgap. The roadmap references "paved paths" toward SSO-integrated flows — but every team is inventing their own auth layer in the meantime, which is exactly the fragmentation a protocol standard is supposed to prevent.
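The shape of those DIY layers is fairly consistent: reject static secrets outright, accept only bearer tokens, and verify them against the identity provider. A sketch, where `authorize` and `fake_introspect` are hypothetical names and the introspection callable stands in for an RFC 7662 token-introspection request to the IdP:

```python
import time

def authorize(headers: dict, introspect) -> dict:
    """Hypothetical gateway-side check: accept only OAuth bearer
    tokens, never static client secrets. `introspect` stands in
    for an RFC 7662 token-introspection call to the IdP."""
    auth = headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        raise PermissionError("missing bearer token; static secrets rejected")
    claims = introspect(auth.removeprefix("Bearer "))
    if not claims.get("active") or claims.get("exp", 0) < time.time():
        raise PermissionError("token inactive or expired")
    return claims

def fake_introspect(token):
    # Stand-in for the IdP's introspection endpoint.
    return {"active": token == "good",
            "exp": time.time() + 3600,
            "sub": "user@example.com"}

claims = authorize({"Authorization": "Bearer good"}, fake_introspect)
```

Every team writes some version of this; the roadmap's "paved paths" are essentially a promise to specify it once so nobody has to.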

The Gateway Gap

Production deployments run through intermediaries. Gateways, proxies, WAFs — the usual infrastructure stack. MCP has no specification for how any of them should behave.

Three questions that don't have answers yet:

  1. How does a downstream MCP server know what permissions a client has when the request arrives through a gateway?

  2. How do you maintain session semantics across multiple intermediaries?

  3. What is a gateway allowed to inspect or modify in a tool call payload?

Every enterprise team deploying MCP behind a gateway is answering these questions independently. The results are predictably incompatible.
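To make question 1 concrete, here is one ad-hoc convention teams land on: the gateway attaches the client's permissions as a signed header, and the downstream server verifies the signature before trusting them. Everything here — the header names, the pre-shared key, both function names — is a hypothetical sketch of the pattern, not anything in the MCP spec:

```python
import hmac, hashlib, json, base64

SHARED_KEY = b"gateway-to-server-key"  # assumption: key pre-shared out of band

def gateway_annotate(payload: dict, scopes: list[str]) -> dict:
    """Gateway side: attach the client's scopes as a signed header
    so the downstream server can trust them without re-authenticating."""
    claims = json.dumps({"scopes": scopes}).encode()
    sig = hmac.new(SHARED_KEY, claims, hashlib.sha256).hexdigest()
    return {"X-MCP-Claims": base64.b64encode(claims).decode(),
            "X-MCP-Claims-Sig": sig,
            **payload}

def server_read_scopes(headers: dict) -> list[str]:
    """Server side: verify the gateway's signature, then read scopes."""
    claims = base64.b64decode(headers["X-MCP-Claims"])
    sig = hmac.new(SHARED_KEY, claims, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, headers["X-MCP-Claims-Sig"]):
        raise PermissionError("claims tampered with in transit")
    return json.loads(claims)["scopes"]

headers = gateway_annotate({}, ["tools:read", "tools:call"])
scopes = server_read_scopes(headers)
```

The mechanism is trivial; the incompatibility comes from every team picking different header names, signing schemes, and claim formats — exactly what a spec would pin down.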

Here's where the roadmap stands on the major production gaps:

| Problem | Today | Roadmap Status |
|---|---|---|
| Stateless horizontal scaling | Custom workarounds per team | Active working group |
| SSO / identity integration | Static secrets or DIY middleware | "Paved paths" referenced |
| Gateway & proxy behavior | No spec — each team invents | Enterprise WG forming |
| Audit trails & observability | No standardized mechanism | Planned, no timeline |
| Config portability across clients | Tied to specific client apps | Acknowledged |
| Multi-tenancy isolation | Not addressed | "On the Horizon" |

That rightmost column is the honest part. Most of these are in "acknowledged but not solved" territory.

Knowing When to Skip MCP Entirely

Here's the take the roadmap won't give you: MCP isn't always the right answer, even after it became the standard.

Research has shown MCP can inflate input tokens by 3.25x to 236.5x while reducing accuracy by 9.5% when large tool catalogs get exposed without filtering. A security scan earlier this year found 1,862 exposed MCP servers, and every single one of the 119 manually verified instances allowed unauthenticated access to internal tool listings.
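The token math is easy to underestimate because the catalog cost lands on every request. A back-of-envelope calculation — the ~400-tokens-per-tool-schema figure is my own illustrative assumption, and real schema sizes vary widely by server:

```python
# Illustrative assumption: roughly 400 tokens per tool schema.
TOKENS_PER_TOOL_SCHEMA = 400

def catalog_overhead(n_tools: int) -> int:
    """Schema tokens injected into every request just to
    advertise the tool catalog to the model."""
    return n_tools * TOKENS_PER_TOOL_SCHEMA

unfiltered = catalog_overhead(150)  # large catalog, no filtering
filtered = catalog_overhead(5)      # pre-filtered to relevant tools
```

At these assumed numbers, an unfiltered 150-tool catalog burns 60,000 tokens of overhead per request versus 2,000 for a filtered one — a 30x difference before the model has read a single word of the actual task.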

The pattern I keep seeing in teams that ship successfully:

Use MCP when you're connecting to tools you don't control and need genuine cross-vendor interoperability. That's the problem it was designed to solve. If your agent needs to talk to Slack, a CRM, a monitoring stack, and a deployment pipeline — all from different vendors — MCP saves you from writing and maintaining four bespoke integrations. The tool schema negotiation and capability discovery actually earn their token cost when the alternative is hand-rolling each connector and hoping the vendor doesn't change their API next quarter.

Skip MCP when you own both sides of the integration. A direct API call is simpler, faster, and doesn't carry the token overhead of tool schema serialization. Also skip it when your tool catalog is large and granular — the context window cost compounds quickly.

Some teams are reverting to CLI-based interfaces where stable command-line tools already exist. Others use internal API gateways for integrations they fully own. Eric Holmes made the argument recently that CLIs provide more composable, reliable interfaces in those cases. He's not wrong. A well-structured CLI with --json output gives you deterministic parsing, no schema negotiation overhead, and the ability to pipe results through standard Unix tools before the LLM ever sees them. For internal tooling you control end-to-end, that's often the better trade.
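The CLI pattern is a few lines of glue. A sketch — the simulated CLI here is a stand-in for something real like `kubectl get pods -o json` or an internal tool's `--json` flag, and `run_cli_json` is a name I've made up:

```python
import json
import subprocess
import sys

def run_cli_json(cmd: list[str]) -> dict:
    """Run a CLI that emits JSON and parse its output
    deterministically -- no schema negotiation, no MCP server."""
    out = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(out.stdout)

# Simulated CLI for illustration: a real invocation would be an
# existing tool with stable JSON output.
fake_cli = [sys.executable, "-c",
            "import json; print(json.dumps({'pods': ['a', 'b', 'c']}))"]
result = run_cli_json(fake_cli)

# Filter and shape with ordinary code before the LLM ever sees it.
summary = {"pod_count": len(result["pods"])}
```

The point of the pre-filtering step is exactly the token argument above: the model receives a two-key summary, not the full payload plus a catalog of tool schemas.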

So What Do You Build On Today?

MCP is in its awkward adolescence. It won the standards war at age one and is now discovering everything that comes after. If you're building on it today, budget for workarounds and treat it as one integration pattern among several — not the universal adapter for everything. The working groups are active. The fixes aren't landing this quarter.