Module 34: Deploying an MCP Server to Cloud Run
Theory
The Challenge of State in a Serverless World
In the previous lab, you built a stateful MCP server. Its state (the contents of the shopping carts) was stored in a simple Python dictionary: SESSION_CARTS = {}. This works perfectly as long as the server is a single, long-running process on your local machine.
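As a reminder of what this kind of in-memory state looks like, here is a minimal sketch. Only SESSION_CARTS comes from the lab; the add_item and view_cart helpers and the session ID are hypothetical names chosen for illustration and may not match your lab code.

```python
# Module-level dictionary: it lives only in this process's memory.
SESSION_CARTS = {}  # maps session_id -> list of item names


def add_item(session_id: str, item: str) -> list[str]:
    """Hypothetical helper: append an item to the session's cart."""
    cart = SESSION_CARTS.setdefault(session_id, [])
    cart.append(item)
    return cart


def view_cart(session_id: str) -> list[str]:
    """Hypothetical helper: return a copy of the cart (empty if unknown)."""
    return list(SESSION_CARTS.get(session_id, []))


add_item("session-1", "apples")
add_item("session-1", "bread")
print(view_cart("session-1"))  # ['apples', 'bread']
```

Everything here depends on the process staying alive: once it exits, SESSION_CARTS is gone.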
However, when we move to a cloud platform like Google Cloud Run, this model breaks down. Cloud Run is a stateless and serverless environment. This means:
- Stateless: Each incoming request could be handled by a completely different container instance. An instance that handles a user's "add item" request might be shut down before the user makes a "view cart" request, which would then be routed to a brand new instance with no memory of the previous one. The in-memory SESSION_CARTS dictionary would be lost.
- Serverless: You don't have a single, persistent "server." Cloud Run automatically starts and stops container instances based on traffic. It might run ten instances to handle a spike in traffic and then scale down to zero when idle.
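The failure mode described in these two points can be simulated locally. The sketch below is only an illustration under simplifying assumptions: ContainerInstance is a hypothetical stand-in for a container's private memory, not part of any Cloud Run or MCP API, and the "shutdown" is modeled by discarding the object.

```python
class ContainerInstance:
    """Hypothetical stand-in for one container with its own private memory."""

    def __init__(self, name: str) -> None:
        self.name = name
        # Per-instance state, analogous to a module-level SESSION_CARTS.
        self.session_carts: dict[str, list[str]] = {}

    def add_item(self, session_id: str, item: str) -> None:
        self.session_carts.setdefault(session_id, []).append(item)

    def view_cart(self, session_id: str) -> list[str]:
        return list(self.session_carts.get(session_id, []))


# Request 1 ("add item") lands on instance A.
instance_a = ContainerInstance("A")
instance_a.add_item("session-1", "apples")

# Instance A is shut down when traffic drops; its memory goes with it.
del instance_a

# Request 2 ("view cart") is routed to a freshly started instance B.
instance_b = ContainerInstance("B")
cart_seen_by_user = instance_b.view_cart("session-1")
print(cart_seen_by_user)  # []
```

The user added an item but sees an empty cart, because the second instance never shared memory with the first.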
If we deploy our simple, in-memory MCP server directly to Cloud Run, a user's shopping cart could be lost whenever a request reaches a different or newly started instance. We need a new architecture.