The latest Gemini models, like Gemini 3.5 Flash, are available to use with Firebase AI Logic! Learn more.

Gemini 2.0 Flash and Flash-Lite models were shut down on June 1, 2026. To avoid service disruption, update to a newer model like gemini-3.1-flash-lite. Learn more.

All Imagen models will shut down on June 24, 2026. Learn about migrating your apps to use Nano Banana.

Thinking

Gemini 2.5 and later models can use an internal "thinking process" that significantly improves their reasoning and multi-step planning abilities, making them highly effective for complex tasks such as coding, advanced mathematics, and data analysis.

Thinking models offer the following configurations and options:

Control the amount of thinking
You can configure how much "thinking" a model can do. This configuration is particularly important if reducing latency or cost is a priority. Also, review the comparison of task difficulties to decide how much a model might need its thinking capability.

Control this configuration either with thinking levels (Gemini 3.x and later models) or with thinking budgets (Gemini 2.5 models).
Get thought summaries
You can enable thought summaries to be included with the generated response. These summaries are synthesized versions of the model's raw thoughts and offer insights into the model's internal reasoning process.
Handle thought signatures
The Firebase AI Logic SDKs automatically handle thought signatures for you, which ensures that the model has access to the thought context from previous turns, specifically when using function calling.

Make sure to review the best practices and prompting guidance for using thinking models.

Use a thinking model

Use a thinking model just like you'd use any other Gemini model.

To get the most out of thinking models, review Best practices & prompting guidance for using thinking models later on this page.

Models that support this capability

Only Gemini 3.x and Gemini 2.5 models support this capability.

gemini-3.1-pro-preview
gemini-3.5-flash
gemini-3.1-flash-lite
gemini-3-pro-image-preview (aka "Nano Banana Pro")
gemini-3.1-flash-image-preview (aka "Nano Banana 2")
gemini-2.5-pro
gemini-2.5-flash
gemini-2.5-flash-lite

Best practices & prompting guidance for using thinking models

We recommend testing your prompt in Google AI Studio or Vertex AI Studio where you can view the full thinking process. You can identify any areas where the model may have gone astray so that you can refine your prompts to get more consistent and accurate responses.

Begin with a general prompt that describes the desired outcome, and observe the model's initial thoughts on how it determines its response. If the response isn't as expected, help the model generate a better response by using any of the following prompting techniques:

Provide step-by-step instructions
Provide several examples of input-output pairs
Provide guidance for how the output and responses should be phrased and formatted
Provide specific verification steps
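
As a rough illustration of how these techniques can be combined, the following TypeScript sketch assembles a single prompt string from optional parts. The `PromptParts` interface and the `buildPrompt` helper are hypothetical names invented for this example; they are not part of the Firebase AI Logic SDKs, and the section labels in the output are arbitrary choices.

```typescript
// Hypothetical helper for this sketch; not part of any Firebase SDK.
interface PromptParts {
  task: string; // general description of the desired outcome
  steps?: string[]; // step-by-step instructions
  examples?: { input: string; output: string }[]; // input-output pairs
  format?: string; // guidance on phrasing and formatting
  checks?: string[]; // verification steps
}

function buildPrompt(parts: PromptParts): string {
  const steps = parts.steps ?? [];
  const examples = parts.examples ?? [];
  const checks = parts.checks ?? [];

  // Always start with the general task description.
  const sections: string[] = [parts.task];

  if (steps.length > 0) {
    const numbered = steps.map((s, i) => `${i + 1}. ${s}`).join("\n");
    sections.push("Follow these steps:\n" + numbered);
  }
  if (examples.length > 0) {
    const pairs = examples
      .map((e) => `Input: ${e.input}\nOutput: ${e.output}`)
      .join("\n\n");
    sections.push("Examples:\n" + pairs);
  }
  if (parts.format) {
    sections.push("Output format: " + parts.format);
  }
  if (checks.length > 0) {
    sections.push("Before answering, verify:\n" + checks.map((c) => `- ${c}`).join("\n"));
  }

  // Separate the sections with blank lines.
  return sections.join("\n\n");
}
```

Starting with only `task` set and adding the other parts one at a time makes it easier to see which technique actually changes the model's response.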

In addition to prompting, consider using these recommendations:

Set system instructions, which are like a "preamble" that you add before the model gets exposed to any further instructions from the prompt or end user. They let you steer the behavior of the model based on your specific needs and use cases.
Set a thinking level (or a thinking budget for Gemini 2.5 models) to control how much thinking the model can do. A higher setting lets the model think more, if needed. A lower setting keeps the model from "overthinking" its response, reserves more of the total output token limit for the actual response, and can help reduce latency and cost.
Enable AI monitoring in the Firebase console to monitor the count of thinking tokens and the latency of your requests that have thinking enabled. If you have thought summaries enabled, they are displayed in the console, where you can inspect the model's detailed reasoning to help you debug and refine your prompts.

Control the amount of thinking

You can configure how much "thinking" and reasoning a model can do before it returns a response. This configuration is particularly important if reducing latency or cost is a priority.

Make sure to review the comparison of task difficulties to decide how much a model might need its thinking capability. Here's some high-level guidance:

Set a lower thinking value for less complex tasks or if reducing latency or cost is a priority for you.
Set a higher thinking value for more complex tasks.

Control this configuration either with thinking levels (Gemini 3.x and later models) or with thinking budgets (Gemini 2.5 models).

Thinking levels (Gemini 3.x and later models)

To control how much thinking a Gemini 3.x or later model can do to generate its response, you can specify a thinking level for the amount of thinking tokens that it's allowed to use.

Set the thinking level

Set the thinking level in a GenerationConfig as part of creating the GenerativeModel instance. The configuration is maintained for the lifetime of the instance. If you want to use different thinking levels for different requests, then create GenerativeModel instances configured with each level.

Learn about supported values for thinking level later in this section.
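
Because the thinking level is fixed for the lifetime of a model instance, one way to serve requests that need different levels is to keep one instance per level and reuse it. The TypeScript sketch below shows that idea only; `ModelsByLevel` and its `create` callback are hypothetical names, and the `Level` type is a local stand-in for the SDK's thinking level values rather than an import from the SDK.

```typescript
// Local stand-in for the SDK's thinking level values (assumption for this sketch).
type Level = "MINIMAL" | "LOW" | "MEDIUM" | "HIGH";

// Hypothetical cache: creates one model instance per thinking level and reuses it.
// `create` is a factory you supply; in a real app it would wrap the SDK call that
// builds a GenerativeModel configured with the given level.
class ModelsByLevel<M> {
  private readonly cache = new Map<Level, M>();
  private readonly create: (level: Level) => M;

  constructor(create: (level: Level) => M) {
    this.create = create;
  }

  get(level: Level): M {
    const existing = this.cache.get(level);
    if (existing !== undefined) {
      return existing;
    }
    // First request for this level: build the instance once and remember it.
    const model = this.create(level);
    this.cache.set(level, model);
    return model;
  }
}
```

In a web app, the factory passed to `ModelsByLevel` could wrap a `getGenerativeModel` call like the one in the Web sample on this page, setting `thinkingLevel` from the `level` argument.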

Swift

Set the thinking level in a GenerationConfig as part of creating a GenerativeModel instance.

// ...
// Set the thinking configuration
// Use a thinking level value appropriate for your model (example value shown here)
let generationConfig = GenerationConfig(
  thinkingConfig: ThinkingConfig(thinkingLevel: .low)
)
// Specify the config as part of creating the `GenerativeModel` instance
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "GEMINI_3_MODEL_NAME",
  generationConfig: generationConfig
)
// ...

Kotlin

Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.

// ...
// Set the thinking configuration
// Use a thinking level value appropriate for your model (example value shown here)
val generationConfig = generationConfig {
  thinkingConfig = thinkingConfig {
    thinkingLevel = ThinkingLevel.LOW
  }
}
// Specify the config as part of creating the `GenerativeModel` instance
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
  modelName = "GEMINI_3_MODEL_NAME",
  generationConfig,
)
// ...

Java

Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.

// ...
// Set the thinking configuration
// Use a thinking level value appropriate for your model (example value shown here)
ThinkingConfig thinkingConfig = new ThinkingConfig.Builder()
    .setThinkingLevel(ThinkingLevel.LOW)
    .build();
GenerationConfig generationConfig = GenerationConfig.builder()
    .setThinkingConfig(thinkingConfig)
    .build();
// Specify the config as part of creating the `GenerativeModel` instance
GenerativeModelFutures model = GenerativeModelFutures.from(
    FirebaseAI.getInstance(GenerativeBackend.googleAI())
        .generativeModel(
            /* modelName */ "GEMINI_3_MODEL_NAME",
            /* generationConfig */ generationConfig
        )
);
// ...

Web

Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.

// ...
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });
// Set the thinking configuration
// Use a thinking level value appropriate for your model (example value shown here)
const generationConfig = {
  thinkingConfig: {
    thinkingLevel: ThinkingLevel.LOW
  }
};
// Specify the config as part of creating the `GenerativeModel` instance
const model = getGenerativeModel(ai, { model: "GEMINI_3_MODEL_NAME", generationConfig });
// ...

Dart

Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.

// ...
// Set the thinking configuration
// Use a thinking level value appropriate for your model (example value shown here)
final thinkingConfig = ThinkingConfig.withThinkingLevel(ThinkingLevel.low);
final generationConfig = GenerationConfig(
  thinkingConfig: thinkingConfig
);
// Specify the config as part of creating the `GenerativeModel` instance
final model = FirebaseAI.googleAI().generativeModel(
  model: 'GEMINI_3_MODEL_NAME',
  config: generationConfig,
);
// ...

Unity

Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.

// ...
// Set the thinking configuration
// Use a thinking level value appropriate for your model (example value shown here)
var thinkingConfig = new ThinkingConfig(thinkingLevel: ThinkingLevel.Low);
var generationConfig = new GenerationConfig(
  thinkingConfig: thinkingConfig
);
// Specify the config as part of creating the `GenerativeModel` instance
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "GEMINI_3_MODEL_NAME",
  generationConfig: generationConfig
);
// ...

Supported thinking level values

The following tables describe the thinking level values that you can set by configuring the model's thinkingLevel, and the default value for each model.

Thinking level	Token usage	Suited for
`MINIMAL`	Model uses as few tokens as possible; close to no thinking	Low-complexity tasks
`LOW`	Model uses fewer tokens; minimizes latency & cost	Simple tasks and high-throughput tasks
`MEDIUM`	Model uses a balanced approach	Moderate complexity tasks
`HIGH`	Model uses tokens up to its maximum level	Complex prompts that require deep reasoning

Model	Default thinking level
`gemini-3.1-pro-preview`	`HIGH`
`gemini-3.5-flash`	`MEDIUM`
`gemini-3.1-flash-lite`	`MINIMAL`
`gemini-3-pro-image-preview` ("Nano Banana Pro")	`HIGH`
`gemini-3.1-flash-image-preview` ("Nano Banana 2")	`HIGH`
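
To make the defaults easier to work with in code, the TypeScript sketch below records the default levels from the table in a lookup and checks whether a requested level actually differs from the model's default. `DEFAULT_LEVELS`, `defaultLevelFor`, and `overridesDefault` are hypothetical names for this example, and the `Level` type is a local stand-in for the SDK's thinking level values.

```typescript
// Local stand-in for the SDK's thinking level values (assumption for this sketch).
type Level = "MINIMAL" | "LOW" | "MEDIUM" | "HIGH";

// Default thinking levels, copied from the table on this page.
const DEFAULT_LEVELS = new Map<string, Level>([
  ["gemini-3.1-pro-preview", "HIGH"],
  ["gemini-3.5-flash", "MEDIUM"],
  ["gemini-3.1-flash-lite", "MINIMAL"],
  ["gemini-3-pro-image-preview", "HIGH"],
  ["gemini-3.1-flash-image-preview", "HIGH"],
]);

// Returns the level a model uses when thinkingLevel is not configured,
// or undefined when the model is not listed in the table.
function defaultLevelFor(modelName: string): Level | undefined {
  return DEFAULT_LEVELS.get(modelName);
}

// True when the requested level differs from the model's default,
// which is the only case where thinkingLevel needs to be set explicitly.
function overridesDefault(modelName: string, requested: Level): boolean {
  const fallback = defaultLevelFor(modelName);
  return fallback !== undefined && fallback !== requested;
}
```

This lookup only records defaults; it does not say which levels each model accepts, so treat a requested level as something to confirm against the supported values for your model.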

Thinking budgets (Gemini 2.5 models)

To control how much thinking a Gemini 2.5 model can do to generate its response, you can specify a thinking budget for the amount of thinking tokens that it's allowed to use.

Set the thinking budget

Set the thinking budget in a GenerationConfig as part of creating the GenerativeModel instance for a Gemini 2.5 model. The configuration is maintained for the lifetime of the instance. If you want to use different thinking budgets for different requests, then create GenerativeModel instances configured with each budget.

Learn about supported values for thinking budget later in this section.

Swift

Set the thinking budget in a GenerationConfig as part of creating a GenerativeModel instance.

// ...
// Set the thinking configuration
// Use a thinking budget value appropriate for your model (example value shown here)
let generationConfig = GenerationConfig(
  thinkingConfig: ThinkingConfig(thinkingBudget: 1024)
)
// Specify the config as part of creating the `GenerativeModel` instance
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "GEMINI_2.5_MODEL_NAME",
  generationConfig: generationConfig
)
// ...

Kotlin

Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.

// ...
// Set the thinking configuration
// Use a thinking budget value appropriate for your model (example value shown here)
val generationConfig = generationConfig {
  thinkingConfig = thinkingConfig {
    thinkingBudget = 1024
  }
}
// Specify the config as part of creating the `GenerativeModel` instance
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
  modelName = "GEMINI_2.5_MODEL_NAME",
  generationConfig,
)
// ...

Java

Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.

// ...
// Set the thinking configuration
// Use a thinking budget value appropriate for your model (example value shown here)
ThinkingConfig thinkingConfig = new ThinkingConfig.Builder()
    .setThinkingBudget(1024)
    .build();
GenerationConfig generationConfig = GenerationConfig.builder()
    .setThinkingConfig(thinkingConfig)
    .build();
// Specify the config as part of creating the `GenerativeModel` instance
GenerativeModelFutures model = GenerativeModelFutures.from(
    FirebaseAI.getInstance(GenerativeBackend.googleAI())
        .generativeModel(
            /* modelName */ "GEMINI_2.5_MODEL_NAME",
            /* generationConfig */ generationConfig
        )
);
// ...

Web

Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.

// ...
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });
// Set the thinking configuration
// Use a thinking budget value appropriate for your model (example value shown here)
const generationConfig = {
  thinkingConfig: {
    thinkingBudget: 1024
  }
};
// Specify the config as part of creating the `GenerativeModel` instance
const model = getGenerativeModel(ai, { model: "GEMINI_2.5_MODEL_NAME", generationConfig });
// ...

Dart

Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.

// ...
// Set the thinking configuration
// Use a thinking budget value appropriate for your model (example value shown here)
final thinkingConfig = ThinkingConfig.withThinkingBudget(1024);
final generationConfig = GenerationConfig(
  thinkingConfig: thinkingConfig
);
// Specify the config as part of creating the `GenerativeModel` instance
final model = FirebaseAI.googleAI().generativeModel(
  model: 'GEMINI_2.5_MODEL_NAME',
  config: generationConfig,
);
// ...

Unity

Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.

// ...
// Set the thinking configuration
// Use a thinking budget value appropriate for your model (example value shown here)
var thinkingConfig = new ThinkingConfig(thinkingBudget: 1024);
var generationConfig = new GenerationConfig(
  thinkingConfig: thinkingConfig
);
// Specify the config as part of creating the `GenerativeModel` instance
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "GEMINI_2.5_MODEL_NAME",
  generationConfig: generationConfig
);
// ...

Supported thinking budget values

The following table lists the thinking budget values that you can set for each model by configuring the model's thinkingBudget.

Model	Default value	Minimum value	Maximum value	Value to disable thinking	Value to enable dynamic thinking
Gemini 2.5 Pro	`8,192`	`128`	`32,768`	cannot be disabled	`-1`
Gemini 2.5 Flash	`8,192`	`1`	`24,576`	`0`	`-1`
Gemini 2.5 Flash‑Lite	`0` (thinking is disabled by default)	`512`	`24,576`	`0` (or don't configure thinking budget at all)	`-1`
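
The ranges in this table can be checked before a request is sent. The TypeScript sketch below is one possible client-side check, assuming the table's rows correspond to the model names `gemini-2.5-pro`, `gemini-2.5-flash`, and `gemini-2.5-flash-lite`. `BudgetRule`, `BUDGET_RULES`, and `isValidBudget` are hypothetical names for this example and are not part of the Firebase AI Logic SDKs.

```typescript
// Hypothetical client-side check; the SDKs do not expose this helper.
interface BudgetRule {
  min: number; // minimum thinking budget
  max: number; // maximum thinking budget
  canDisable: boolean; // whether a budget of 0 (thinking off) is accepted
}

type Gemini25Model = "gemini-2.5-pro" | "gemini-2.5-flash" | "gemini-2.5-flash-lite";

// Values copied from the table on this page.
const BUDGET_RULES: Record<Gemini25Model, BudgetRule> = {
  "gemini-2.5-pro": { min: 128, max: 32768, canDisable: false },
  "gemini-2.5-flash": { min: 1, max: 24576, canDisable: true },
  "gemini-2.5-flash-lite": { min: 512, max: 24576, canDisable: true },
};

function isValidBudget(model: Gemini25Model, budget: number): boolean {
  const rule = BUDGET_RULES[model];
  if (!Number.isInteger(budget)) {
    return false; // budgets are whole token counts
  }
  if (budget === -1) {
    return true; // -1 enables dynamic thinking on every listed model
  }
  if (budget === 0) {
    return rule.canDisable; // 0 disables thinking where that is allowed
  }
  return budget >= rule.min && budget <= rule.max;
}
```

For example, `isValidBudget("gemini-2.5-pro", 0)` is false because thinking cannot be disabled for that model, while `isValidBudget("gemini-2.5-flash", 0)` is true.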

Disable thinking for Gemini 2.5 models

For some easier tasks, the thinking capability isn't as necessary, and traditional inference is sufficient. Also, if reducing latency or cost is a priority, you may not want the model to take any more time or cost more than necessary to generate a response.

In these situations, you can disable (or turn off) thinking for some models:

Gemini 2.5 Pro: thinking cannot be disabled
Gemini 2.5 Flash: disable thinking by setting thinkingBudget to 0 tokens
Gemini 2.5 Flash‑Lite: thinking is disabled by default (so either don't set thinkingBudget explicitly or set it to 0)

Note that for all Gemini 3.x models, thinking cannot be disabled.
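
The per-model rules above can be captured in a small helper that returns a generation config with thinking turned off, or fails loudly when that isn't possible. This TypeScript sketch is illustrative only: `generationConfigWithoutThinking` and `GenerationConfigSketch` are hypothetical names, and the returned object simply mirrors the plain-object config shape used in the Web sample on this page.

```typescript
// Minimal local type mirroring the config shape used in the Web sample;
// it is not the SDK's own GenerationConfig type.
interface GenerationConfigSketch {
  thinkingConfig?: { thinkingBudget: number };
}

// Hypothetical helper: returns a config that turns thinking off,
// or throws when the model does not allow that.
function generationConfigWithoutThinking(modelName: string): GenerationConfigSketch {
  if (modelName === "gemini-2.5-pro" || modelName.startsWith("gemini-3")) {
    throw new Error(`Thinking cannot be disabled for ${modelName}`);
  }
  switch (modelName) {
    case "gemini-2.5-flash":
      // A budget of 0 tokens disables thinking.
      return { thinkingConfig: { thinkingBudget: 0 } };
    case "gemini-2.5-flash-lite":
      // Thinking is already off by default, so no thinking config is needed.
      return {};
    default:
      throw new Error(`This sketch has no rule for ${modelName}`);
  }
}
```

Throwing for models that can't disable thinking surfaces the mismatch during development instead of silently sending a budget that the model won't honor.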
