Agent Platform Memory Bank quickstart with Agent Development Kit

Agent Platform Memory Bank allows your agents to manage long-term memories across sessions. When used with the Agent Development Kit (ADK), your agent can automatically orchestrate calls to Memory Bank to store and retrieve memories based on user interactions.

This document explains how to create an ADK agent, configure it to use Memory Bank, and interact with it to generate and access memories.

For information on making direct calls to the API without ADK, see the Memory Bank API quickstart.

Manage memories with ADK memory service and Memory Bank

VertexAiMemoryBankService is an ADK wrapper around Memory Bank that is defined by ADK's BaseMemoryService. You can define callbacks and tools that interact with the memory service to read and write memories.

The VertexAiMemoryBankService interface includes:

  • memory_service.add_session_to_memory triggers a GenerateMemories request to Memory Bank using all of the events in the provided adk.Session as the source content. You can orchestrate calls to this method using callback_context.add_session_to_memory in your callbacks.

      from google.adk.agents.callback_context import CallbackContext

      async def add_session_to_memory_callback(callback_context: CallbackContext):
          await callback_context.add_session_to_memory()
          return None

    
  • memory_service.add_events_to_memory triggers a GenerateMemories request to Memory Bank using a subset of events. You can orchestrate calls to this method using callback_context.add_events_to_memory in your callbacks.

      from google.adk.agents.callback_context import CallbackContext

      async def add_events_to_memory_callback(callback_context: CallbackContext):
          await callback_context.add_events_to_memory(
              events=callback_context.session.events[-5:-1])
          return None

    
  • memory_service.search_memory triggers a RetrieveMemories request to Memory Bank to fetch relevant memories for the current user_id and app_name. You can orchestrate calls to this method using built-in memory tools (LoadMemoryTool or PreloadMemoryTool) or a custom tool that invokes tool_context.search_memory.

Before you begin

To complete the steps demonstrated in this tutorial, you must first follow the steps in the getting started section of the Set up Memory Bank page.

Set environment variables

To use ADK, set your environment variables:

    import os

    os.environ["GOOGLE_GENAI_USE_VERTEXAI"] = "TRUE"
    os.environ["GOOGLE_CLOUD_PROJECT"] = "PROJECT_ID"
    os.environ["GOOGLE_CLOUD_LOCATION"] = "LOCATION"


Replace the following:

  • PROJECT_ID : Your project ID.
  • LOCATION : Your region. See the supported regions for Memory Bank.
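
A quick check before you start can catch a missing variable early. The following is a minimal sketch using only the standard library; the helper name missing_env_vars is hypothetical and not part of ADK.

```python
import os

# The three variables that this quickstart sets for ADK.
REQUIRED_VARS = (
    "GOOGLE_GENAI_USE_VERTEXAI",
    "GOOGLE_CLOUD_PROJECT",
    "GOOGLE_CLOUD_LOCATION",
)


def missing_env_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

The check only confirms that the variables are non-empty. It does not verify that the project ID or region is valid.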

Create your ADK agent

To create a memory-enabled agent, set up tools and callbacks that orchestrate calls to your memory service.

Define a memory generation callback

To orchestrate calls for memory generation, create a callback function that triggers memory generation. You can either send a subset of events (with callback_context.add_events_to_memory) or all of the events in a session (with callback_context.add_session_to_memory) to be processed in the background:

    from google.adk.agents.callback_context import CallbackContext

    async def generate_memories_callback(callback_context: CallbackContext):
        # Choose one of the following options.

        # Option 1 (Recommended): Send events to Memory Bank for memory generation,
        # which is ideal for incremental processing of events.
        await callback_context.add_events_to_memory(
            events=callback_context.session.events[-5:-1])

        # Option 2: Send the full session to Memory Bank for memory generation.
        # It's recommended to only call this at the end of a session to minimize
        # how many times a single event is re-processed.
        await callback_context.add_session_to_memory()

        return None
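
The slice events[-5:-1] in Option 1 is ordinary Python list slicing: it selects up to four events and leaves out the most recent one. The sketch below illustrates this with a plain list of strings standing in for callback_context.session.events; the event names are made up for illustration.

```python
# Stand-in for callback_context.session.events: seven events, oldest first.
events = [f"event_{i}" for i in range(1, 8)]

# Same slice as in the callback: start five from the end, stop before the last.
window = events[-5:-1]
print(window)  # ['event_3', 'event_4', 'event_5', 'event_6']

# With fewer than five events, the slice still omits the final one.
short = ["event_1", "event_2"]
print(short[-5:-1])  # ['event_1']
```

If you want the most recent event included, a slice such as events[-5:] would do that. Which window suits your agent depends on when the callback fires and which events you have already sent.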
 

Define a memory retrieval tool

When developing your ADK agent, include a memory tool that controls when the agent retrieves memories and how memories are included in the prompt.

If you use PreloadMemoryTool, your agent retrieves memories at the start of each turn and includes them in the system instruction, which is good for establishing baseline context about the user. If you use LoadMemoryTool, the model calls the tool only when it decides that memories are necessary to answer the user query.

    from google import adk
    from google.adk.tools.load_memory_tool import LoadMemoryTool
    from google.adk.tools.preload_memory_tool import PreloadMemoryTool

    memory_retrieval_tools = [
        # Option 1: Retrieve memories at the start of every turn.
        PreloadMemoryTool(),
        # Option 2: Retrieve memories via tool calls. The model will only call this tool
        # when it decides that memories are necessary to respond to the user query.
        LoadMemoryTool()
    ]

    agent = adk.Agent(
        model="gemini-2.5-flash",
        name='stateful_agent',
        instruction="""You are a Vehicle Voice Agent, designed to assist users with information and in-vehicle actions.

    1.  **Direct Action:** If a user requests a specific vehicle function (e.g., "turn on the AC"), execute it immediately using the corresponding tool. You don't have the outcome of the actual tool execution, so provide a hypothetical tool execution outcome.
    2.  **Information Retrieval:** Respond concisely to general information requests with your own knowledge (e.g., restaurant recommendation).
    3.  **Clarity:** When necessary, try to seek clarification to better understand the user's needs and preference before taking an action.
    4.  **Brevity:** Limit responses to under 30 words.
    """,
        tools=memory_retrieval_tools,
        after_agent_callback=generate_memories_callback
    )


Alternatively, you can create your own custom tool to retrieve memories, which is helpful when you want to give your agent explicit instructions on when to retrieve memories:

    from google import adk
    from google.adk.tools import ToolContext, FunctionTool

    async def search_memories(query: str, tool_context: ToolContext):
        """Query this tool when you need to fetch information about user preferences."""
        return await tool_context.search_memory(query)

    agent = adk.Agent(
        model="gemini-2.5-flash",
        name='stateful_agent',
        instruction="""...""",
        tools=[FunctionTool(func=search_memories)],
        after_agent_callback=generate_memories_callback
    )


Define an ADK Memory Bank memory service and runtime

After you've created your memory-enabled agent, you need to link it to a memory service. How you configure the ADK memory service depends on where your ADK agent runs, because the runtime orchestrates the execution of your agents, tools, and callbacks.

Create an Agent Runtime instance

You first need to create an Agent Runtime instance to use for Memory Bank. This step is optional if you're using Agent Runtime to deploy your agent. For more information on customizing your Memory Bank behavior, see the Configure your Agent Runtime instance for Memory Bank section on the Set up Memory Bank page.

    import vertexai

    client = vertexai.Client(
        project="PROJECT_ID",
        location="LOCATION"
    )

    # If you don't have an Agent Runtime instance already, create an Agent Platform
    # Memory Bank instance using the default configuration.
    agent_engine = client.agent_engines.create()

    # Optionally, print out the resource name. You will need the
    # resource name if you want to interact with your Runtime instance later on.
    print(agent_engine.api_resource.name)

    agent_engine_id = agent_engine.api_resource.name.split("/")[-1]


Replace the following:

  • PROJECT_ID : Your project ID.
  • LOCATION : Your region. See the supported regions for Memory Bank.

Create an ADK runtime

Pass the Agent Runtime ID to the runtime or deployment scripts so that your agent uses Memory Bank as the ADK memory service.

Local runner

adk.Runner is generally used in a local environment, like Colab. In this case, you need to directly create the memory service and runner.

    import asyncio

    from google import adk
    from google.adk.memory import VertexAiMemoryBankService
    from google.adk.sessions import VertexAiSessionService
    from google.genai import types

    memory_service = VertexAiMemoryBankService(
        project="PROJECT_ID",
        location="LOCATION",
        agent_engine_id="AGENT_ENGINE_ID",
    )

    # You can use any ADK session service. This example uses Sessions.
    session_service = VertexAiSessionService(
        project="PROJECT_ID",
        location="LOCATION",
        agent_engine_id="AGENT_ENGINE_ID",
    )

    runner = adk.Runner(
        agent=agent,
        app_name="APP_NAME",
        session_service=session_service,
        memory_service=memory_service
    )

    async def call_agent(query, session_id, user_id):
        content = types.Content(role='user', parts=[types.Part(text=query)])
        events = runner.run_async(
            user_id=user_id, session_id=session_id, new_message=content)

        # Print only the agent's final response for the turn.
        async for event in events:
            if event.is_final_response():
                final_response = event.content.parts[0].text
                print("Agent Response: ", final_response)


Replace the following:

  • PROJECT_ID : Your project ID.
  • LOCATION : Your region. See the supported regions for Memory Bank.
  • APP_NAME : ADK app name. The app name will be included in the generated memories' scope dictionary so that memories are isolated across both users and apps.
  • AGENT_ENGINE_ID : The Agent Runtime ID to use for Memory Bank and Agent Platform Sessions. For example, 456 in projects/my-project/locations/us-central1/reasoningEngines/456.
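
If you only have the full resource name, the ID is its last path segment. The following is a minimal sketch of extracting it with a basic shape check; the helper name agent_engine_id_from_name is hypothetical, and the check assumes the reasoningEngines naming pattern shown in the example above.

```python
def agent_engine_id_from_name(resource_name: str) -> str:
    """Return the trailing ID segment of a reasoningEngines resource name."""
    parts = resource_name.strip("/").split("/")
    # Expect .../reasoningEngines/<id> as the final two segments.
    if len(parts) < 2 or parts[-2] != "reasoningEngines" or not parts[-1]:
        raise ValueError(f"Unexpected resource name: {resource_name!r}")
    return parts[-1]


name = "projects/my-project/locations/us-central1/reasoningEngines/456"
print(agent_engine_id_from_name(name))  # 456
```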

Agent Runtime

The Agent Runtime ADK template (AdkApp) can be used both locally and to deploy an ADK agent to Agent Runtime. When deployed on Agent Platform, the template uses VertexAiMemoryBankService as the default memory service, and Memory Bank is backed by the same Agent Runtime instance that hosts the agent. As a result, you can create your Memory Bank instance and deploy your agent in a single step.

See Configure Agent Runtime for more details on setting up your Agent Runtime instance, including how to customize the behavior of your Memory Bank.

Use the following code to deploy your memory-enabled ADK agent to Agent Runtime:

    import asyncio

    import vertexai
    from vertexai.agent_engines import AdkApp

    client = vertexai.Client(
        project="PROJECT_ID",
        location="LOCATION"
    )

    adk_app = AdkApp(agent=agent)

    # Create a new resource with your agent deployed to Agent Runtime.
    # The Agent Runtime instance will also include an empty Memory Bank.
    agent_engine = client.agent_engines.create(
        agent_engine=adk_app,
        config={
            "staging_bucket": "STAGING_BUCKET",
            "requirements": ["google-cloud-aiplatform[agent_engines,adk]"]
        }
    )

    # Alternatively, update an existing resource to deploy your agent to Agent Platform.
    # Your agent will have access to the Runtime instance's existing memories.
    agent_engine = client.agent_engines.update(
        name=agent_engine.api_resource.name,
        agent_engine=adk_app,
        config={
            "staging_bucket": "STAGING_BUCKET",
            "requirements": ["google-cloud-aiplatform[agent_engines,adk]"]
        }
    )

    async def call_agent(query, session_id, user_id):
        async for event in agent_engine.async_stream_query(
            user_id=user_id,
            session_id=session_id,
            message=query,
        ):
            print(event)


Replace the following:

  • PROJECT_ID : Your project ID.
  • LOCATION : Your region. See the supported regions for Memory Bank.
  • STAGING_BUCKET : Your Cloud Storage bucket to use for staging your Agent Runtime.

When run locally, the ADK template uses InMemoryMemoryService as the default memory service. However, you can override the default memory service to use VertexAiMemoryBankService:

    from google.adk.memory import VertexAiMemoryBankService
    from vertexai.agent_engines import AdkApp

    def memory_bank_service_builder():
        return VertexAiMemoryBankService(
            project="PROJECT_ID",
            location="LOCATION",
            agent_engine_id="AGENT_ENGINE_ID"
        )

    adk_app = AdkApp(
        agent=agent,
        # Override the default memory service.
        memory_service_builder=memory_bank_service_builder
    )

    async def call_agent(query, session_id, user_id):
        # adk_app is a local agent. If you want to deploy it to Agent Runtime,
        # use `client.agent_engines.create(...)` or `client.agent_engines.update(...)`
        # and call the returned Agent Runtime instance instead.
        async for event in adk_app.async_stream_query(
            user_id=user_id,
            session_id=session_id,
            message=query,
        ):
            print(event)


Replace the following:

  • PROJECT_ID : Your project ID.
  • LOCATION : Your region. See the supported regions for Memory Bank.
  • AGENT_ENGINE_ID : The Agent Runtime ID to use for Memory Bank. For example, 456 in projects/my-project/locations/us-central1/reasoningEngines/456.

Cloud Run

To deploy your agent to Cloud Run, follow the Cloud Run deployment instructions in the ADK documentation, and pass the Agent Runtime ID in the --memory_service_uri flag:

    adk deploy cloud_run \
        ... 
        --memory_service_uri=agentengine://AGENT_ENGINE_ID


GKE

To deploy your agent to Google Kubernetes Engine (GKE), follow the GKE deployment instructions in the ADK documentation, and pass the Agent Runtime ID in the --memory_service_uri flag:

    adk deploy gke \
        ... 
        --memory_service_uri=agentengine://AGENT_ENGINE_ID


ADK Web

The ADK web interface lets you test your agents directly in the browser.

    export GOOGLE_CLOUD_PROJECT="PROJECT_ID"
    export GOOGLE_CLOUD_LOCATION="LOCATION"
    adk web --memory_service_uri=agentengine://AGENT_ENGINE_ID


Replace the following:

  • PROJECT_ID : Your project ID.
  • LOCATION : Your region. See the supported regions for Memory Bank.
  • AGENT_ENGINE_ID : The Agent Runtime ID to use for Memory Bank. For example, 456 in projects/my-project/locations/us-central1/reasoningEngines/456.
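
The Cloud Run, GKE, and ADK Web commands all take the same agentengine:// URI. If you script these commands, a small helper can build the URI from the ID. This is a sketch only; the function name adk_web_command is hypothetical, and it covers only the adk web form shown above.

```python
import shlex


def adk_web_command(agent_engine_id: str) -> str:
    """Build the `adk web` command line for a given Agent Runtime ID."""
    uri = f"agentengine://{agent_engine_id}"
    # shlex.join quotes arguments only when the shell would need it.
    return shlex.join(["adk", "web", f"--memory_service_uri={uri}"])


print(adk_web_command("456"))  # adk web --memory_service_uri=agentengine://456
```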

Interact with your agent

After defining your agent and setting up Memory Bank, you can interact with your agent. If you provided a callback to trigger memory generation when initializing your agent, memory generation is triggered every time the agent is invoked.

Memories will be stored using the scope {"user_id": USER_ID, "app_name": APP_NAME} corresponding to the user ID and app name used to execute your agent.
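
To make the effect of the scope concrete, the following toy model stores facts keyed by the full scope dictionary. It is an illustration only, not how Memory Bank is implemented: the class ToyMemoryStore, the user IDs, and the app names are all hypothetical, and it assumes that retrieval returns only memories whose scope matches exactly.

```python
from collections import defaultdict


class ToyMemoryStore:
    """Toy in-process model of scope-keyed storage; not Memory Bank itself."""

    def __init__(self):
        self._by_scope = defaultdict(list)

    @staticmethod
    def _key(scope: dict) -> frozenset:
        # Dicts are unhashable, so freeze the items to use the scope as a key.
        return frozenset(scope.items())

    def add(self, scope: dict, fact: str) -> None:
        self._by_scope[self._key(scope)].append(fact)

    def retrieve(self, scope: dict) -> list:
        # Only an exact scope match returns memories.
        return list(self._by_scope.get(self._key(scope), []))


store = ToyMemoryStore()
alice_car = {"user_id": "alice", "app_name": "vehicle_agent"}
store.add(alice_car, "I like the temperature 71 degrees")

print(store.retrieve(alice_car))  # ['I like the temperature 71 degrees']
print(store.retrieve({"user_id": "bob", "app_name": "vehicle_agent"}))  # []
print(store.retrieve({"user_id": "alice", "app_name": "other_app"}))  # []
```

In this model, changing either the user ID or the app name yields an empty result, which mirrors the isolation across users and apps described above.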

The method of interacting with your agent depends on its execution environment:

Local runner

    # Use `asyncio.run(session_service.create_session(...))` if you're running this
    # code as a standard Python script.
    session = await session_service.create_session(
        app_name="APP_NAME",
        user_id="USER_ID"
    )

    # Use `asyncio.run(call_agent(...))` if you're running this code as a
    # standard Python script.
    await call_agent(
        "Can you fix the temperature?",
        session.id,
        "USER_ID"
    )


Replace the following:

  • APP_NAME : App name for your runner.
  • USER_ID : An identifier for your user. Memories generated from this session are keyed by this opaque identifier, which is stored as the user_id value in the generated memories' scope.
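
Top-level await works in notebooks such as Colab but not in a standard Python script. One way to adapt the snippet is to move both awaits into a single coroutine and run it with asyncio.run. The sketch below shows that pattern with hypothetical stand-in coroutines (create_session_stub and call_agent_stub) so that it runs without ADK or cloud access; in your script, you would call session_service.create_session and call_agent instead.

```python
import asyncio


# Hypothetical stand-ins so the sketch runs without ADK or cloud access.
async def create_session_stub(app_name: str, user_id: str) -> dict:
    return {"id": "session-1", "app_name": app_name, "user_id": user_id}


async def call_agent_stub(query: str, session_id: str, user_id: str) -> str:
    return f"[{user_id}/{session_id}] received: {query}"


async def main() -> str:
    # Inside a coroutine, `await` works the same as in a notebook cell.
    session = await create_session_stub(app_name="my_app", user_id="user-1")
    return await call_agent_stub(
        "Can you fix the temperature?", session["id"], "user-1")


if __name__ == "__main__":
    # One asyncio.run call drives the whole conversation in a single event loop.
    print(asyncio.run(main()))
```

Grouping the calls under one asyncio.run keeps them on the same event loop, rather than starting a new loop for each call.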

Agent Runtime

When using the ADK template, you can call your Agent Runtime to interact with memory and sessions.

    # Use `asyncio.run(agent_engine.async_create_session(...))` if you're
    # running this code as a standard Python script.
    session = await agent_engine.async_create_session(user_id="USER_ID")

    # Use `asyncio.run(call_agent(...))` if you're running this code as a
    # standard Python script.
    await call_agent(
        "Can you fix the temperature?",
        session.get("id"),
        "USER_ID"
    )


Replace the following:

  • USER_ID : An identifier for your user. Memories generated from this session are keyed by this opaque identifier, which is stored as the user_id value in the generated memories' scope.

Cloud Run

Refer to the Testing your agent section of the ADK Cloud Run deployment documentation.

GKE

Refer to the Testing your agent section of the ADK GKE deployment documentation.

ADK Web

To use ADK Web, navigate to the local server at http://localhost:8000.

By default, ADK Web sets the user ID to user. To override the default user ID, include userId in the query parameters, like http://localhost:8000?userId=YOUR_USER_ID.
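
If your user IDs can contain characters that are not URL-safe, encode the query parameter rather than concatenating strings. The following sketch uses the standard library for this; the helper name adk_web_url is hypothetical, and the base URL assumes the default local server address mentioned above.

```python
from urllib.parse import urlencode


def adk_web_url(user_id: str, base: str = "http://localhost:8000") -> str:
    """Return the ADK Web URL with the userId query parameter set."""
    return f"{base}?{urlencode({'userId': user_id})}"


print(adk_web_url("alice"))  # http://localhost:8000?userId=alice
print(adk_web_url("alice@example.com"))  # http://localhost:8000?userId=alice%40example.com
```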

For more information, refer to the ADK Web page in the ADK documentation.

Example interaction

First session

If you used the PreloadMemoryTool, the agent tries to retrieve memories at the beginning of each turn to access preferences that the user previously communicated. During the agent's first interaction with the user, there are no memories to retrieve, so the agent doesn't know any user preferences, such as their preferred temperature, as shown in the following example:

  1. First turn:

    • User: "Can you fix the temperature?"

    • (Tool Call): ADK tries to fetch memories; no memories are available.

    • Model: "What temperature do you prefer?"

    • (Callback): ADK triggers memory generation. No memories are extracted.

  2. Second turn:

    • User: I'm comfortable at 71 degrees.

    • (Tool Call): ADK tries to fetch memories; no memories are available.

    • Model: Ok, I've updated the temperature to 71 degrees.

    • (Callback): ADK triggers memory generation. Memory "I like the temperature 71 degrees" is created.

Second session

The extracted memory will be available in the next session for the same app name and user ID. If the user provides similar or contradictory information to existing memories, the new information will be consolidated with the existing memories.

  1. First turn:

    • User: Fix the temperature. It's so uncomfortable!

    • (Tool Call): ADK tries to fetch memories. Memory "I like the temperature 71 degrees" is retrieved.

    • Model: Ok, I've updated the temperature to 71 degrees.

    • (Callback): ADK triggers memory generation. No memories are extracted, because the user didn't share anything meaningful to persist.

  2. Second turn:

    • User: Actually, I prefer it to be warmer in the mornings.

    • (Tool Call): ADK tries to fetch memories. Memory "I like the temperature 71 degrees" is retrieved.

    • Model: Ok, I've made the temperature warmer.

    • (Callback): ADK triggers memory generation. The existing memory "I like the temperature 71 degrees" is updated to "I generally like the temperature to be 71 degrees, but I like it to be warmer in the mornings".

Clean up

To clean up all resources used in this project, you can delete the Google Cloud project you used for the quickstart.

Otherwise, you can delete the individual resources you created in this tutorial, as follows:

  1. Use the following code sample to delete the Agent Runtime instance, which also deletes any sessions or memories belonging to that runtime.

      agent_engine.delete(force=True)

    
  2. Delete any locally created files.

What's next

Quickstart

Get started with the Memory Bank API to manage long-term memories.
