Host AI agents on Cloud Run

This page highlights use cases for hosting AI agents on Cloud Run.

AI agents are autonomous software entities that use LLM-powered systems to perceive, decide, and act to achieve goals. As more autonomous agents are built, their ability to communicate and collaborate becomes crucial.

For an introduction to AI agents, see What is an AI agent.

Use cases for AI agents on Cloud Run

You can implement AI agents as Cloud Run services to orchestrate a set of asynchronous tasks and provide information through multiple request-response interactions.

A Cloud Run service is a scalable API endpoint for your application's core logic. It efficiently manages multiple concurrent users through automatic, on-demand, and rapid scaling of instances.
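As a rough illustration of an agent exposed as an HTTP endpoint, the following minimal sketch uses only the Python standard library. The names `run_agent`, `AgentHandler`, and `make_server`, and the `{"prompt": ...}` request shape, are hypothetical and chosen for this example; a real agent would typically use an agent framework instead. The sketch relies on Cloud Run passing the listening port to the container through the `PORT` environment variable.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from typing import Optional


def run_agent(prompt: str) -> dict:
    """Hypothetical placeholder for the agent's core logic (model calls, tools)."""
    return {"reply": f"received: {prompt}"}


class AgentHandler(BaseHTTPRequestHandler):
    """Handles one request-response interaction per POST request."""

    def do_POST(self):
        # Read the JSON request body; the assumed shape is {"prompt": "..."}.
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            self.send_error(400, "request body must be JSON")
            return
        if not isinstance(payload, dict):
            self.send_error(400, "request body must be a JSON object")
            return

        body = json.dumps(run_agent(payload.get("prompt", ""))).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port: Optional[int] = None) -> ThreadingHTTPServer:
    # Cloud Run provides the port to listen on in the PORT environment variable.
    if port is None:
        port = int(os.environ.get("PORT", "8080"))
    # A threading server lets one instance handle several requests concurrently.
    return ThreadingHTTPServer(("0.0.0.0", port), AgentHandler)
```

To run the sketch as a container entrypoint, call `make_server().serve_forever()`. Keeping the agent logic in a separate function such as `run_agent` makes it possible to test that logic without starting a server.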

AI agent on Cloud Run architecture

A typical AI agent architecture deployed on Cloud Run can involve several components from Google Cloud as well as outside of Google Cloud:

**Figure 1.** Architecture of an AI agent on Cloud Run.

The diagram shows the following:

Hosting platform: Cloud Run hosts and runs your agents, and offers the following benefits:
- Supports running any agent framework to build different types of agents and agentic architectures. Examples of agent frameworks include Agent Development Kit (ADK), Dify, LangGraph, and n8n.
- Provides built-in features for managing your agent. For example, Cloud Run provides a built-in service identity that you can use as the agent identity for calling Google Cloud APIs with secure and automatic credentials.
- Supports connecting your agent framework to other services. You can connect your agent to first-party or third-party tools deployed on Cloud Run. For example, to gain visibility into your agent's tasks and executions, you can deploy and use tools like Langfuse and Arize.
Agent interactions: Cloud Run supports streaming HTTP responses back to the user, and WebSockets for real-time interactions.
GenAI models: The orchestration layer calls models for reasoning capabilities. These models can be hosted on services, such as the following:
- Gemini API for Google's generative AI models.
- Vertex AI endpoints for custom models or other foundation models.
- GPU-enabled Cloud Run service for your own fine-tuned models.
Memory: Agents often need memory to retain context and learn from past interactions. You can use the following services:
- Memorystore for Redis for short-term memory.
- Firestore for long-term memory, such as storing the conversational history or remembering the user's preferences.
Vector database: For Retrieval-Augmented Generation (RAG) or fetching structured data, use a vector database to query specific entity information or perform a vector search over embeddings. Use the pgvector extension with the following services:
- Cloud SQL for PostgreSQL
- AlloyDB for PostgreSQL
Tools: The orchestrator uses tools to perform specific tasks and to interact with external services, APIs, or websites. Tools can include the following:
- Model Context Protocol (MCP): Use this standardized protocol to communicate with external tools that are executed through an MCP server.
- Basic utilities: Precise math calculations, time conversions, or other similar utilities.
- API calling: Make calls to other internal or third-party APIs (read or write access).
- Image or chart generation: Quickly and effectively create visual content.
- Browser and OS automation: Run a headless or full graphical operating system within container instances so that the agent can browse the web, extract information from websites, or perform actions using clicks and keyboard input.
- Code execution: Execute code in a secure environment with multi-layered sandboxing, with minimal or no IAM permissions.
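To make the tool concept concrete, the following minimal sketch shows how an orchestrator might dispatch a model's tool request to a basic utility. It is not taken from any specific agent framework: the `TOOLS` registry, the `tool` decorator, the two example tools, and the `{"name": ..., "args": {...}}` request shape are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from fractions import Fraction
from typing import Callable, Dict

# Hypothetical registry that maps tool names to Python callables.
TOOLS: Dict[str, Callable[..., str]] = {}


def tool(name: str):
    """Register a function under a name that the model can refer to."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register


@tool("add_exact")
def add_exact(a: str, b: str) -> str:
    # Fractions avoid floating-point rounding: "0.1" + "0.2" is exactly 3/10.
    return str(Fraction(a) + Fraction(b))


@tool("utc_to_offset")
def utc_to_offset(timestamp: str, offset_hours: int) -> str:
    # Interpret an ISO 8601 timestamp as UTC and convert it to a fixed offset.
    utc_time = datetime.fromisoformat(timestamp).replace(tzinfo=timezone.utc)
    target = timezone(timedelta(hours=offset_hours))
    return utc_time.astimezone(target).isoformat()


def call_tool(request: dict) -> str:
    """Dispatch a request of the assumed form {"name": ..., "args": {...}}."""
    name = request.get("name")
    if name not in TOOLS:
        return f"error: unknown tool {name!r}"
    try:
        return TOOLS[name](**request.get("args", {}))
    except (TypeError, ValueError) as exc:
        # Return the failure as text so that the model can correct its request.
        return f"error: {exc}"
```

For example, `call_tool({"name": "add_exact", "args": {"a": "0.1", "b": "0.2"}})` returns `"3/10"`. Returning errors as strings rather than raising keeps the orchestration loop running when the model requests an unknown tool or passes malformed arguments.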

What's next

Watch Build AI agents on Cloud Run.
Try the codelab for learning how to build and deploy a LangChain app to Cloud Run.
Learn how to deploy Agent Development Kit (ADK) to Cloud Run.
Try the codelab for using an MCP server on Cloud Run with an ADK agent.
Try the codelab for deploying your ADK agent to Cloud Run with GPU.
Find ready-to-use agent samples in Agent Development Kit (ADK) samples.
Host Model Context Protocol (MCP) servers on Cloud Run.
