Set up Agent Platform Memory Bank

To use Agent Platform Memory Bank, you must first create and configure a Gemini Enterprise Agent Platform instance. This instance manages your memories and can be integrated with your agents across various runtimes.

This document explains how to set up your Google Cloud project, install the required libraries, and create or update an instance with custom configurations like topics and TTL.

Get started

Before you work with Memory Bank, you must set up your environment.

Set up your Google Cloud project

Every project can be identified in two ways: the project number or the project ID. The PROJECT_NUMBER is generated automatically when the project is created, whereas the PROJECT_ID is chosen by you, or whoever creates the project. To set up a project:

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role; you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.

    Go to project selector

  3. Verify that billing is enabled for your Google Cloud project.

  4. Enable the Agent Platform API.

    Roles required to enable APIs

    To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

    Enable the API


Get the required roles

To get the permissions that you need to use Memory Bank, ask your administrator to grant you the following IAM roles on your project:

For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

If you're making requests to Memory Bank from an agent deployed on Google Kubernetes Engine or Cloud Run, make sure that your service account has the necessary permissions. The Reasoning Engine Service Agent already has the necessary permissions to read and write memories, so outbound requests from Agent Runtime should already have permission to access Memory Bank.

Install libraries

This section assumes that you have set up a Python development environment, or are using a runtime that provides one (such as Colab).

Install the Agent Platform SDK:

    pip install "google-cloud-aiplatform>=1.111.0"

Authentication

Follow the instructions at Authenticate to Vertex AI.

Set up an Agent Platform SDK client

Run the following code to set up an Agent Platform SDK client:

Agent Platform SDK

    import vertexai

    client = vertexai.Client(
        project="PROJECT_ID",
        location="LOCATION",
    )

Replace the following:

  • PROJECT_ID: Your Google Cloud project ID.
  • LOCATION: The location of your Agent Platform instance. See the supported regions for Memory Bank.

Create or update an Agent Platform instance

To get started with Memory Bank, you first need an Agent Platform instance. If you don't already have an instance, you can create it using the default configuration:

    agent_engine = client.agent_engines.create()

    # Optionally, print out the Agent Platform resource name. You will need the
    # resource name to interact with your Agent Platform instance later on.
    print(agent_engine.api_resource.name)
 

If you want to customize the behavior of your new or existing Memory Bank instance, refer to Configure your Agent Platform instance for Memory Bank. For example, you can specify what information Memory Bank considers meaningful to persist.

Your Agent Platform instance supports Sessions and Memory Bank out-of-the-box. No agent is deployed when you create the instance. To use Agent Runtime, you must provide the agent that should be deployed when creating or updating your Agent Platform instance.

Once you have an Agent Platform instance, you can use the name of the instance to read or write memories. For example:

    # Generate memories using your Memory Bank instance.
    client.agent_engines.memories.generate(
        # `name` should have the format `projects/.../locations/.../reasoningEngines/...`.
        name=agent_engine.api_resource.name,
        ...
    )
 

Use with Agent Runtime

Memory Bank can be used in any runtime, but pairing it with Agent Runtime lets your deployed agent read and write memories directly.

To deploy an agent with Memory Bank on Agent Platform, first set up your environment for Agent Runtime. Then, prepare your agent to be deployed on Agent Runtime with memory integration. Your deployed agent should make calls to read and write memories as needed.

AdkApp

If you're using the Agent Platform Agent Development Kit template, the agent uses the VertexAiMemoryBankService by default when deployed to Agent Platform. This means that the ADK Memory tools read memories from Memory Bank.

    from google.adk.agents import Agent
    from vertexai.preview.reasoning_engines import AdkApp

    # Develop an agent using the ADK template.
    adk_agent = Agent(...)
    adk_app = AdkApp(agent=adk_agent, ...)

    # Deploy the agent to Agent Runtime.
    agent_engine = client.agent_engines.create(
        agent_engine=adk_app,
        config={
            "staging_bucket": "STAGING_BUCKET",
            "requirements": ["google-cloud-aiplatform[agent_engines,adk]"],
            # Optional.
            **context_spec
        }
    )

    # Update an existing Agent Runtime to add or modify the Runtime.
    agent_engine = client.agent_engines.update(
        name=agent_engine.api_resource.name,
        agent=adk_app,
        config={
            "staging_bucket": "STAGING_BUCKET",
            "requirements": ["google-cloud-aiplatform[agent_engines,adk]"],
            # Optional.
            **context_spec
        }
    )

Replace the following:

  • STAGING_BUCKET: Your Cloud Storage bucket to use for staging your Agent Runtime.

For more information about using Memory Bank with ADK, refer to the Quickstart with Agent Development Kit.

Custom agent

You can use Memory Bank with your custom agent deployed on Agent Runtime. In this case, your agent should orchestrate calls to Memory Bank to trigger memory generation and memory retrieval calls.

Your application deployed to Agent Runtime can read the environment variables GOOGLE_CLOUD_PROJECT, GOOGLE_CLOUD_LOCATION, and GOOGLE_CLOUD_AGENT_ENGINE_ID to infer the Agent Runtime name from the environment:

    import os

    project = os.environ.get("GOOGLE_CLOUD_PROJECT")
    location = os.environ.get("GOOGLE_CLOUD_LOCATION")
    agent_engine_id = os.environ.get("GOOGLE_CLOUD_AGENT_ENGINE_ID")

    agent_engine_name = f"projects/{project}/locations/{location}/reasoningEngines/{agent_engine_id}"
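As a convenience, the name construction above can be wrapped in a small helper. This is a minimal sketch; the function name and the idea of passing an explicit mapping are our own illustration, not part of the SDK:

```python
import os

def agent_engine_name_from_env(env=os.environ):
    """Build the Agent Engine resource name from the runtime-provided
    environment variables. Raises KeyError if any variable is missing."""
    return (
        f"projects/{env['GOOGLE_CLOUD_PROJECT']}"
        f"/locations/{env['GOOGLE_CLOUD_LOCATION']}"
        f"/reasoningEngines/{env['GOOGLE_CLOUD_AGENT_ENGINE_ID']}"
    )

# Example with explicit values instead of the real environment:
name = agent_engine_name_from_env({
    "GOOGLE_CLOUD_PROJECT": "my-project",
    "GOOGLE_CLOUD_LOCATION": "us-central1",
    "GOOGLE_CLOUD_AGENT_ENGINE_ID": "1234567890",
})
print(name)  # projects/my-project/locations/us-central1/reasoningEngines/1234567890
```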

If you're using the default service agent for your agent on Agent Runtime, your agent already has permission to read and write memories. If you're using a custom service account, you need to grant your service account permissions to read and write memories. The required permissions depend on what operations your agent should be able to perform. If you only want your agent to retrieve and generate memories, aiplatform.memories.generate and aiplatform.memories.retrieve are sufficient.

Use in all other runtimes

If you want to use Memory Bank in a different environment, like Cloud Run or Colab, create an Agent Runtime without providing an agent. If you don't provide a configuration, Memory Bank is created with the default settings for managing memory generation and retrieval.

    agent_engine = client.agent_engines.create()

If you've used Agent Platform before, creating a new Agent Platform instance without a runtime should only take a few seconds. If this is the first time you're using Agent Platform, it may take longer (1-2 minutes).

If you want to configure behavior, provide a Memory Bank configuration:

Create

    agent_engine = client.agent_engines.create(
        config={
            "context_spec": {
                "memory_bank_config": ...
            }
        }
    )
 

Update

If you want to change your Memory Bank configuration, you can update your Agent Platform instance.

    agent_engine = client.agent_engines.update(
        # You can access the name using `agent_engine.api_resource.name` for an AgentEngine object.
        name="AGENT_ENGINE_NAME",
        config={
            "context_spec": {
                "memory_bank_config": ...
            }
        }
    )
 

Replace the following:

  • AGENT_ENGINE_NAME: The name of your Agent Platform instance, in the format projects/.../locations/.../reasoningEngines/.... See the supported regions for Memory Bank.

You can use Memory Bank in any environment that has permission to read and write memories. For example, to use Memory Bank with Cloud Run, grant permissions to the Cloud Run service identity to read and write memories. The required permissions depend on what operations your agent should be able to perform. If you only want your agent to retrieve and generate memories, aiplatform.memories.generate and aiplatform.memories.retrieve are sufficient.

Configure your Agent Platform instance for Memory Bank

You can configure your Memory Bank to customize how memories are generated and managed. If you don't provide the configuration, then Memory Bank uses the default settings for each type of configuration.

You can configure the following Memory Bank settings for your instance:

  • Customization configuration: Configures how memories are extracted from source data and consolidated with existing memories.
  • Similarity search configuration: Specifies which embedding model Memory Bank uses for similarity search. Defaults to text-embedding-005.
  • Generation configuration: Configures which LLM Memory Bank uses for memory generation. Defaults to gemini-2.5-flash.
  • TTL configuration: Configures how a TTL is automatically set for created or updated memories. Defaults to no TTL.
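TTL values are expressed as duration strings in whole seconds, as in the default sample's f"{365 * 24 * 60 * 60}s". A small helper, shown here only for illustration (the function is our own, not part of the SDK), makes the arithmetic explicit:

```python
def ttl_duration(days: int) -> str:
    """Convert a number of days into the seconds-based duration string
    used by the TTL configuration (e.g. 365 days -> "31536000s")."""
    return f"{days * 24 * 60 * 60}s"

ttl_config = {"memory_revision_default_ttl": ttl_duration(365)}
print(ttl_config["memory_revision_default_ttl"])  # 31536000s
```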

The following sample shows the default Memory Bank:

Dictionary

    memory_bank_config = {
        "generation_config": {
            # `gemini-2.5-flash` will be used to extract and consolidate memories.
            # Note: The global endpoint will be used for regions that don't have a
            # regional endpoint available.
            "model": "projects/{PROJECT}/locations/{LOCATION}/publishers/google/models/gemini-2.5-flash"
        },
        "similarity_search_config": {
            # `text-embedding-005` will be used for similarity search, including
            # during consolidation. Consolidation uses similarity search to find
            # candidate memories that may be updated with new information.
            "embedding_model": "projects/{PROJECT}/locations/{LOCATION}/publishers/google/models/text-embedding-005"
        },
        "ttl_config": {
            # Default TTL for memory revisions is 365 days.
            "memory_revision_default_ttl": f"{365 * 24 * 60 * 60}s"
        },
        "customization_configs": [
            {
                # Extract user information, preferences, key conversation details,
                # and information that the user explicitly asked to be remembered.
                "memory_topics": [
                    {"managed_memory_topic": "USER_PERSONAL_INFO"},
                    {"managed_memory_topic": "USER_PREFERENCES"},
                    {"managed_memory_topic": "KEY_CONVERSATION_DETAILS"},
                    {"managed_memory_topic": "EXPLICIT_INSTRUCTIONS"}
                ],
                "consolidation_config": {
                    # Only use the latest memory revision of each candidate memory
                    # during consolidation.
                    "revisions_per_candidate_count": 1
                },
                # Only use the pre-defined set of examples.
                "generate_memories_examples": [],
                # Generate memories in the first person.
                "enable_third_person_memories": False
            }
        ],
        # Memory revisions will be persisted. This can be overridden on a request-level.
        "disable_memory_revisions": False
    }
 

Class-based

    from vertexai.types import MemoryBankCustomizationConfig as CustomizationConfig
    from vertexai.types import MemoryBankCustomizationConfigConsolidationConfig as ConsolidationConfig
    from vertexai.types import MemoryBankCustomizationConfigMemoryTopic as MemoryTopic
    from vertexai.types import MemoryBankCustomizationConfigMemoryTopicManagedMemoryTopic as ManagedMemoryTopic
    from vertexai.types import ManagedTopicEnum
    from vertexai.types import ReasoningEngineContextSpecMemoryBankConfig as MemoryBankConfig
    from vertexai.types import ReasoningEngineContextSpecMemoryBankConfigGenerationConfig as GenerationConfig
    from vertexai.types import ReasoningEngineContextSpecMemoryBankConfigSimilaritySearchConfig as SimilaritySearchConfig
    from vertexai.types import ReasoningEngineContextSpecMemoryBankConfigTtlConfig as TtlConfig

    memory_bank_config = MemoryBankConfig(
        generation_config=GenerationConfig(
            # `gemini-2.5-flash` will be used to extract and consolidate memories.
            # Note: The global endpoint will be used for regions that don't have a
            # regional endpoint available.
            model="projects/{PROJECT}/locations/{LOCATION}/publishers/google/models/gemini-2.5-flash"
        ),
        similarity_search_config=SimilaritySearchConfig(
            # `text-embedding-005` will be used for similarity search, including
            # during consolidation. Consolidation uses similarity search to find
            # candidate memories that may be updated with new information.
            embedding_model="projects/{PROJECT}/locations/{LOCATION}/publishers/google/models/text-embedding-005"
        ),
        ttl_config=TtlConfig(
            # Default TTL for memory revisions is 365 days.
            memory_revision_default_ttl=f"{365 * 24 * 60 * 60}s"
        ),
        customization_configs=[
            CustomizationConfig(
                # Extract personal information, preferences, key conversation details,
                # and information that the user explicitly asked to be remembered.
                memory_topics=[
                    MemoryTopic(managed_memory_topic=ManagedMemoryTopic(
                        managed_topic_enum=ManagedTopicEnum.USER_PERSONAL_INFO)),
                    MemoryTopic(managed_memory_topic=ManagedMemoryTopic(
                        managed_topic_enum=ManagedTopicEnum.USER_PREFERENCES)),
                    MemoryTopic(managed_memory_topic=ManagedMemoryTopic(
                        managed_topic_enum=ManagedTopicEnum.KEY_CONVERSATION_DETAILS)),
                    MemoryTopic(managed_memory_topic=ManagedMemoryTopic(
                        managed_topic_enum=ManagedTopicEnum.EXPLICIT_INSTRUCTIONS))
                ],
                # Only use the pre-defined set of examples.
                generate_memories_examples=[],
                consolidation_config=ConsolidationConfig(
                    # Only use the latest memory revision of each candidate memory
                    # during consolidation.
                    revisions_per_candidate_count=1
                ),
                # Generate memories in the first person.
                enable_third_person_memories=False,
            )
        ],
        # Memory revisions will be persisted. This can be overridden on a request-level.
        disable_memory_revisions=False
    )
 

You can adjust the Memory Bank configuration when you create or update your Agent Platform instance. The following example demonstrates how to create or update an instance with a specific Memory Bank configuration.

    client.agent_engines.create(
        ...,
        config={
            "context_spec": {
                "memory_bank_config": memory_bank_config
            }
        }
    )

    # Alternatively, update an existing Agent Platform instance's Memory Bank config.
    agent_engine = client.agent_engines.update(
        name=agent_engine.api_resource.name,
        config={
            "context_spec": {
                "memory_bank_config": memory_bank_config
            }
        }
    )
 

Natural language memory customization configuration

To customize how Memory Bank extracts natural language memories, configure the extraction behavior when you set up your instance. Use the following options to customize the behavior:

  • Configuring memory topics: Define the type of information that Memory Bank should consider meaningful to persist. Only information that fits one of these memory topics is persisted by Memory Bank.
  • Providing few-shot examples: Demonstrate expected memory extraction behavior to Memory Bank.
  • Configuring the memory perspective: Choose whether memories are generated in the first person (the default) or the third person.
  • Configuring consolidation: Control how many memory revisions Memory Bank considers when consolidating each memory candidate.

You can think of customizing your Memory Bank's extraction behavior in two steps: Telling and Showing. Memory Topics tell Memory Bank what information to persist. Few-shots show Memory Bank what kind of information should result in a specific memory, helping it learn the patterns, nuance, and phrasing that you expect it to understand.

You can optionally configure different behavior for different scope levels. For example, topics that are meaningful for session-level memories may not be meaningful for user-level memories (which span multiple sessions). To configure behavior for a certain subset of memories, set the scope keys of the customization configuration. Only GenerateMemories requests that include exactly those scope keys use that configuration. You can also configure default behavior (applying to all sets of scope keys) by omitting the scope_keys field. The default configuration applies to all requests whose scope keys don't exactly match those of another customization configuration.

For example, user_level_config applies only to GenerateMemories requests whose scope is exactly the user_id key (that is, scope={"user_id": "123"} with no additional keys). default_config applies to all other requests:

Dictionary

    user_level_config = {
        "scope_keys": ["user_id"],
        "memory_topics": [...],
        "generate_memories_examples": [...]
    }

    default_config = {
        "memory_topics": [...],
        "generate_memories_examples": [...]
    }

    memory_bank_config = {
        "customization_configs": [user_level_config, default_config]
    }
 

Class-based

    from vertexai.types import MemoryBankCustomizationConfig as CustomizationConfig

    user_level_config = CustomizationConfig(
        scope_keys=["user_id"],
        memory_topics=[...],
        generate_memories_examples=[...]
    )
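To make the exact-match rule concrete, here is a toy sketch of the selection logic described above. It is our own illustration of the documented behavior, not the service's implementation:

```python
def select_customization_config(scope, configs):
    """Toy model of scope-key matching: a config with scope_keys applies only
    when the request's scope keys are exactly that set; otherwise fall back
    to a config without scope_keys, if one exists."""
    for cfg in configs:
        if "scope_keys" in cfg and set(cfg["scope_keys"]) == set(scope):
            return cfg
    for cfg in configs:
        if "scope_keys" not in cfg:
            return cfg
    return None

configs = [
    {"name": "user_level", "scope_keys": ["user_id"]},
    {"name": "default"},
]
print(select_customization_config({"user_id": "123"}, configs)["name"])  # user_level
print(select_customization_config({"user_id": "1", "session_id": "2"}, configs)["name"])  # default
```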
 

Configuring memory topics

"Memory topics" identify what information Memory Bank considers to be meaningful and should thus be persisted as generated memories . Memory Bank supports two types of memory topics:

  • Managed topics: Label and instructions are defined by Memory Bank. You only need to provide the name of the managed topic. For example,

    Dictionary

      memory_topic = {
          "managed_memory_topic": {
              "managed_topic_enum": "USER_PERSONAL_INFO"
          }
      }
    

    Class-based

      from vertexai.types import ManagedTopicEnum
      from vertexai.types import MemoryBankCustomizationConfigMemoryTopic as MemoryTopic
      from vertexai.types import MemoryBankCustomizationConfigMemoryTopicManagedMemoryTopic as ManagedMemoryTopic

      memory_topic = MemoryTopic(
          managed_memory_topic=ManagedMemoryTopic(
              managed_topic_enum=ManagedTopicEnum.USER_PERSONAL_INFO
          )
      )
    

    The following managed topics are supported by Memory Bank:

    • User information (USER_PERSONAL_INFO): Significant information about the user, like names, relationships, hobbies, and important dates. For example, "I work at Google" or "My wedding anniversary is on December 31".
    • User preferences (USER_PREFERENCES): Stated or implied likes, dislikes, preferred styles, or patterns. For example, "I prefer the middle seat."
    • Key conversation events and task outcomes (KEY_CONVERSATION_DETAILS): Important milestones or conclusions within the dialogue. For example, "I booked plane tickets for a round trip between JFK and SFO. I leave on June 1, 2025 and return on June 7, 2025."
    • Explicit remember/forget instructions (EXPLICIT_INSTRUCTIONS): Information that the user explicitly asks the agent to remember or forget. For example, if the user says "Remember that I primarily use Python," Memory Bank generates a memory such as "I primarily use Python."
  • Custom topics: Label and instructions are defined by you when setting up your Memory Bank instance. They will be used in the prompt for Memory Bank's extraction step. For example,

    Dictionary

      memory_topic = {
          "custom_memory_topic": {
              "label": "business_feedback",
              "description": """Specific user feedback about their experience at
      the coffee shop. This includes opinions on drinks, food, pastries, ambiance,
      staff friendliness, service speed, cleanliness, and any suggestions for
      improvement."""
          }
      }
     
    

    Class-based

      from vertexai.types import MemoryBankCustomizationConfigMemoryTopic as MemoryTopic
      from vertexai.types import MemoryBankCustomizationConfigMemoryTopicCustomMemoryTopic as CustomMemoryTopic

      memory_topic = MemoryTopic(
          custom_memory_topic=CustomMemoryTopic(
              label="business_feedback",
              description="""Specific user feedback about their experience at
      the coffee shop. This includes opinions on drinks, food, pastries, ambiance,
      staff friendliness, service speed, cleanliness, and any suggestions for
      improvement."""
          )
      )
     
    

When using custom topics, we recommend also providing few-shot examples that demonstrate how memories should be extracted from your conversations.

With customization, you can use any combination of memory topics. For example, you can use a subset of the available managed memory topics:

Dictionary

    customization_config = {
        "memory_topics": [
            {"managed_memory_topic": {"managed_topic_enum": "USER_PERSONAL_INFO"}},
            {"managed_memory_topic": {"managed_topic_enum": "USER_PREFERENCES"}}
        ]
    }
 

Class-based

    from vertexai.types import MemoryBankCustomizationConfig as CustomizationConfig
    from vertexai.types import MemoryBankCustomizationConfigMemoryTopic as MemoryTopic
    from vertexai.types import MemoryBankCustomizationConfigMemoryTopicManagedMemoryTopic as ManagedMemoryTopic
    from vertexai.types import ManagedTopicEnum

    customization_config = CustomizationConfig(
        memory_topics=[
            MemoryTopic(
                managed_memory_topic=ManagedMemoryTopic(
                    managed_topic_enum=ManagedTopicEnum.USER_PERSONAL_INFO
                )
            ),
            MemoryTopic(
                managed_memory_topic=ManagedMemoryTopic(
                    managed_topic_enum=ManagedTopicEnum.USER_PREFERENCES
                )
            ),
        ]
    )
 

You can also use a combination of managed and custom topics (or only use custom topics):

Dictionary

    customization_config = {
        "memory_topics": [
            {"managed_memory_topic": {"managed_topic_enum": "USER_PERSONAL_INFO"}},
            {
                "custom_memory_topic": {
                    "label": "business_feedback",
                    "description": """Specific user feedback about their experience at
    the coffee shop. This includes opinions on drinks, food, pastries, ambiance,
    staff friendliness, service speed, cleanliness, and any suggestions for
    improvement."""
                }
            }
        ]
    }
 

Class-based

    from vertexai.types import MemoryBankCustomizationConfig as CustomizationConfig
    from vertexai.types import MemoryBankCustomizationConfigMemoryTopic as MemoryTopic
    from vertexai.types import MemoryBankCustomizationConfigMemoryTopicCustomMemoryTopic as CustomMemoryTopic
    from vertexai.types import MemoryBankCustomizationConfigMemoryTopicManagedMemoryTopic as ManagedMemoryTopic
    from vertexai.types import ManagedTopicEnum

    customization_config = CustomizationConfig(
        memory_topics=[
            MemoryTopic(
                managed_memory_topic=ManagedMemoryTopic(
                    managed_topic_enum=ManagedTopicEnum.USER_PERSONAL_INFO
                )
            ),
            MemoryTopic(
                custom_memory_topic=CustomMemoryTopic(
                    label="business_feedback",
                    description="""Specific user feedback about their experience at
    the coffee shop. This includes opinions on drinks, food, pastries, ambiance,
    staff friendliness, service speed, cleanliness, and any suggestions for
    improvement."""
                )
            )
        ]
    )
 

Few-shot examples

Few-shot examples allow you to demonstrate expected memory extraction behavior to Memory Bank. For example, you can provide a sample input conversation and the memories that are expected to be extracted from that conversation.

We recommend always using few-shots with custom topics so that Memory Bank can learn the intended behavior. Few-shots are optional when using managed topics since Memory Bank defines examples for each topic. Demonstrate conversations that are not expected to result in memories by providing an empty generated_memories list.

For example, you can provide few-shot examples that demonstrate how to extract feedback about your business from customer messages:

Dictionary

    example = {
        "conversationSource": {
            "events": [
                {
                    "content": {
                        "role": "model",
                        "parts": [{"text": "Welcome back to The Daily Grind! We'd love to hear your feedback on your visit."}]
                    }
                },
                {
                    "content": {
                        "role": "user",
                        "parts": [{"text": "Hey. The drip coffee was a bit lukewarm today, which was a bummer. Also, the music was way too loud, I could barely hear my friend."}]
                    }
                }
            ]
        },
        "generatedMemories": [
            {"fact": "The user reported that the drip coffee was lukewarm."},
            {"fact": "The user felt the music in the shop was too loud."}
        ]
    }
 

Class-based

    from google.genai.types import Content, Part
    from vertexai.types import MemoryBankCustomizationConfigGenerateMemoriesExample as GenerateMemoriesExample
    from vertexai.types import MemoryBankCustomizationConfigGenerateMemoriesExampleConversationSource as ConversationSource
    from vertexai.types import MemoryBankCustomizationConfigGenerateMemoriesExampleConversationSourceEvent as ConversationSourceEvent
    from vertexai.types import MemoryBankCustomizationConfigGenerateMemoriesExampleGeneratedMemory as ExampleGeneratedMemory

    example = GenerateMemoriesExample(
        conversation_source=ConversationSource(
            events=[
                ConversationSourceEvent(
                    content=Content(
                        role="model",
                        parts=[Part(text="Welcome back to The Daily Grind! We'd love to hear your feedback on your visit.")]
                    )
                ),
                ConversationSourceEvent(
                    content=Content(
                        role="user",
                        parts=[Part(text="Hey. The drip coffee was a bit lukewarm today, which was a bummer. Also, the music was way too loud, I could barely hear my friend.")]
                    )
                )
            ]
        ),
        generated_memories=[
            ExampleGeneratedMemory(fact="The user reported that the drip coffee was lukewarm."),
            ExampleGeneratedMemory(fact="The user felt the music in the shop was too loud.")
        ]
    )
 

You can also provide examples of conversations that shouldn't result in any generated memories by providing an empty list for the expected output (generated_memories):

Dictionary

    example = {
        "conversationSource": {
            "events": [
                {
                    "content": {
                        "role": "model",
                        "parts": [{"text": "Good morning! What can I get for you at The Daily Grind?"}]
                    }
                },
                {
                    "content": {
                        "role": "user",
                        "parts": [{"text": "Thanks for the coffee."}]
                    }
                }
            ]
        },
        "generatedMemories": []
    }
 

Class-based

    from google.genai.types import Content, Part
    from vertexai.types import MemoryBankCustomizationConfigGenerateMemoriesExample as GenerateMemoriesExample
    from vertexai.types import MemoryBankCustomizationConfigGenerateMemoriesExampleConversationSource as ConversationSource
    from vertexai.types import MemoryBankCustomizationConfigGenerateMemoriesExampleConversationSourceEvent as ConversationSourceEvent

    example = GenerateMemoriesExample(
        conversation_source=ConversationSource(
            events=[
                ConversationSourceEvent(
                    content=Content(
                        role="model",
                        parts=[Part(text="Welcome back to The Daily Grind! We'd love to hear your feedback on your visit.")]
                    )
                ),
                ConversationSourceEvent(
                    content=Content(
                        role="user",
                        parts=[Part(text="Thanks for the coffee!")]
                    )
                )
            ]
        ),
        generated_memories=[]
    )
 

Memory perspective

By default, memories are generated in the first person (for example, "I use Memory Bank for memory management."). You can configure Memory Bank to generate memories in the third person (for example, "The user uses Memory Bank for memory management.") by setting the enable_third_person_memories parameter.

Dictionary

    customization_config = {
        "enable_third_person_memories": True
    }
 

Class-based

    from vertexai.types import MemoryBankCustomizationConfig as CustomizationConfig

    customization_config = CustomizationConfig(
        enable_third_person_memories=True
    )
 

Consolidation customization

During consolidation , Memory Bank determines how to integrate newly acquired information into your existing memory set. Memory Bank evaluates whether to ADD new memories, UPDATE existing memories with additional context, or DELETE obsolete memories.

To ensure high-quality, corroborated memories, Memory Bank can optionally analyze a memory's history to distinguish long-term trends from one-time outliers.

By default, Memory Bank only compares new information to the most recent snapshot of a candidate memory (a "memory revision"). To increase the depth of this analysis, configure the revisions_per_candidate_count parameter. This parameter defines how many previous revisions of each "candidate memory" (the specific record being evaluated for an update) Memory Bank considers during consolidation.

Dictionary

customization_config = {
    "consolidation_customization": {
        "revisions_per_candidate_count": 10
    }
}
 

Class-based

from vertexai.types import MemoryBankCustomizationConfig as CustomizationConfig
from vertexai.types import MemoryBankCustomizationConfigConsolidationConfig as ConsolidationConfig

customization_config = CustomizationConfig(
    consolidation_customization=ConsolidationConfig(
        revisions_per_candidate_count=10
    )
)
 

Increasing revisions_per_candidate_count results in more consistent and corroborated memories by accounting for the repetition of ingested information. However, a higher count increases token consumption during the consolidation process.

Similarity search configuration

The similarity search configuration controls which embedding model is used by your instance for similarity search. Similarity search is used for identifying which memories should be candidates for consolidation and for similarity search-based memory retrieval. If this configuration is not provided, Memory Bank uses text-embedding-005 as the default model.

If you expect user conversations to be in non-English languages, use a model that supports multiple languages, such as gemini-embedding-001 or text-multilingual-embedding-002, to improve retrieval quality.

Dictionary

memory_bank_config = {
    "similarity_search_config": {
        "embedding_model": "EMBEDDING_MODEL",
    }
}
 

Class-based

from vertexai.types import ReasoningEngineContextSpecMemoryBankConfig as MemoryBankConfig
from vertexai.types import ReasoningEngineContextSpecMemoryBankConfigSimilaritySearchConfig as SimilaritySearchConfig

memory_bank_config = MemoryBankConfig(
    similarity_search_config=SimilaritySearchConfig(
        embedding_model="EMBEDDING_MODEL"
    )
)
 

Replace the following:

  • EMBEDDING_MODEL: The Google text embedding model to use for similarity search, in the format projects/{project}/locations/{location}/publishers/google/models/{model}.
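The same full-resource-name format applies wherever a model is referenced in these configurations. As a minimal sketch, the path can be assembled from its parts (the project, region, and model ID below are hypothetical placeholders, not defaults):

```python
# Hypothetical values for illustration only; substitute your own.
project = "my-project"
location = "us-central1"
model_id = "text-multilingual-embedding-002"

# Full resource name in the format the configuration expects.
EMBEDDING_MODEL = (
    f"projects/{project}/locations/{location}"
    f"/publishers/google/models/{model_id}"
)
# → "projects/my-project/locations/us-central1/publishers/google/models/text-multilingual-embedding-002"
```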

Generation configuration

The generation configuration controls which LLM is used for generating memories, including extracting memories and consolidating new memories with existing memories.

Memory Bank uses gemini-2.5-flash as the default model. For regions that don't have regional Gemini availability, the global endpoint is used.

Dictionary

memory_bank_config = {
    "generation_config": {
        "model": "LLM_MODEL",
    }
}
 

Class-based

from vertexai.types import ReasoningEngineContextSpecMemoryBankConfig as MemoryBankConfig
from vertexai.types import ReasoningEngineContextSpecMemoryBankConfigGenerationConfig as GenerationConfig

memory_bank_config = MemoryBankConfig(
    generation_config=GenerationConfig(
        model="LLM_MODEL"
    )
)
 

Replace the following:

  • LLM_MODEL: The Google LLM to use for extracting and consolidating memories, in the format projects/{project}/locations/{location}/publishers/google/models/{model}.

Time to live (TTL) configuration

The TTL configuration controls how Memory Bank should dynamically set memories' expiration time. After their expiration time elapses, memories won't be available for retrieval and will be deleted.

If the configuration is not provided, expiration time won't be dynamically set for created or updated memories, so memories won't expire unless their expiration time is manually set.

There are two options for the TTL configuration:

  • Default TTL: The TTL will be applied to all operations that create or update a memory, including UpdateMemory, CreateMemory, and GenerateMemories.

    Dictionary

    memory_bank_config = {
        "ttl_config": {
            "default_ttl": f"TTLs"
        }
    }
     
    

    Class-based

    from vertexai.types import ReasoningEngineContextSpecMemoryBankConfig as MemoryBankConfig
    from vertexai.types import ReasoningEngineContextSpecMemoryBankConfigTtlConfig as TtlConfig

    memory_bank_config = MemoryBankConfig(
        ttl_config=TtlConfig(
            default_ttl=f"TTLs"
        )
    )
     
    

    Replace the following:

    • TTL: The duration in seconds for the TTL. For updated memories, the newly calculated expiration time (now + TTL) will overwrite the Memory's previous expiration time.
  • Granular (per-operation) TTL: The TTL is calculated based on which operation created or updated the Memory. If not set for a given operation, then the operation won't update the Memory's expiration time.

    Dictionary

    memory_bank_config = {
        "ttl_config": {
            "granular_ttl": {
                "create_ttl": f"CREATE_TTLs",
                "generate_created_ttl": f"GENERATE_CREATED_TTLs",
                "generate_updated_ttl": f"GENERATE_UPDATED_TTLs"
            }
        }
    }
     
    

    Class-based

    from vertexai.types import ReasoningEngineContextSpecMemoryBankConfig as MemoryBankConfig
    from vertexai.types import ReasoningEngineContextSpecMemoryBankConfigTtlConfig as TtlConfig
    from vertexai.types import ReasoningEngineContextSpecMemoryBankConfigTtlConfigGranularTtlConfig as GranularTtlConfig

    memory_bank_config = MemoryBankConfig(
        ttl_config=TtlConfig(
            granular_ttl_config=GranularTtlConfig(
                create_ttl=f"CREATE_TTLs",
                generate_created_ttl=f"GENERATE_CREATED_TTLs",
                generate_updated_ttl=f"GENERATE_UPDATED_TTLs",
            )
        )
    )
     
    

    Replace the following:

    • CREATE_TTL: The duration in seconds for the TTL for memories created using CreateMemory.
    • GENERATE_CREATED_TTL: The duration in seconds for the TTL for memories created using GenerateMemories.
    • GENERATE_UPDATED_TTL: The duration in seconds for the TTL for memories updated using GenerateMemories. The newly calculated expiration time (now + TTL) will overwrite the Memory's previous expiration time.
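Each TTL value above is a whole number of seconds followed by an "s" suffix. As a minimal sketch (the helper name is ours, not part of the API), a Python timedelta can be converted into that format:

```python
from datetime import timedelta

def to_ttl_string(delta: timedelta) -> str:
    # TTL durations are expressed as whole seconds followed by "s".
    return f"{int(delta.total_seconds())}s"

# For example, expire memories generated from conversations after 30 days:
generate_created_ttl = to_ttl_string(timedelta(days=30))  # "2592000s"
```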

What's next

Quickstart

Get started with the Memory Bank API to manage long-term memories.

Quickstart

Get started with the Agent Development Kit (ADK).
