The latest Gemini models, like Gemini 3.1 Flash Image (Nano Banana 2), are available to use with Firebase AI Logic on all platforms!
Gemini 2.0 Flash and Flash-Lite models will be retired on June 1, 2026. To avoid service disruption, update to a newer model like gemini-2.5-flash-lite. Also, Gemini 3 Pro Preview (gemini-3-pro-preview) will be retired on March 9, 2026 (update to Gemini 3.1 Pro Preview: gemini-3.1-pro-preview). Learn more.
Thinking
Gemini 2.5 and later models can use an internal "thinking process" that
significantly improves their reasoning and multi-step planning abilities, making
them highly effective for complex tasks such as coding, advanced mathematics,
and data analysis.
Thinking models offer the following configurations and options:
Control the amount of thinking
You can configure how much "thinking" a model can do. This configuration
is particularly important if reducing latency or cost is a priority. Also,
review the comparison of task difficulties to decide how
much a model might need its thinking capability.
Get thought summaries
You can enable thought summaries to be included with
the generated response. These summaries are synthesized versions of the
model's raw thoughts and offer insights into the model's internal reasoning
process.
Handle thought signatures
The Firebase AI Logic SDKs automatically handle thought signatures for you, which ensures that
the model has access to the thought context from previous turns, specifically
when using function calling.
Best practices & prompting guidance for using thinking models
We recommend testing your prompt in Google AI Studio or Vertex AI Studio, where you can view the full thinking process. You can identify any areas where
the model may have gone astray so that you can refine your prompts to get more
consistent and accurate responses.
Begin with a general prompt that describes the desired outcome, and observe the
model's initial thoughts on how it determines its response. If the response
isn't as expected, help the model generate a better response by using any of the
following prompting techniques:
Provide step-by-step instructions
Provide several examples of input-output pairs
Provide guidance for how the output and responses should be phrased and
formatted
Provide specific verification steps
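As an illustration, the techniques above can be combined into a single prompt. This is a hypothetical sketch: the classification task, the example pairs, and the buildClassificationPrompt helper are all invented for illustration, not part of any SDK.

```javascript
// Hypothetical prompt builder combining step-by-step instructions, few-shot
// input-output pairs, output-format guidance, and a verification step.
function buildClassificationPrompt(emailText) {
  return [
    "Classify the email as MEETING_REQUEST or INFORMATIONAL.",
    "Steps: 1) Read the email. 2) Look for a proposed time or a request to meet.",
    "3) Decide the label. 4) Verify: a label without any meeting intent must be INFORMATIONAL.",
    "Example input: 'Can we sync Tuesday at 3pm?' -> MEETING_REQUEST",
    "Example input: 'FYI, the report shipped.' -> INFORMATIONAL",
    "Respond with the label only, in uppercase.",
    `Email: ${emailText}`,
  ].join("\n");
}

const prompt = buildClassificationPrompt("Can we meet Friday?");
console.log(prompt.includes("MEETING_REQUEST")); // → true
```

The point is structural: each line of the prompt maps to one of the techniques in the list above, so you can drop or tighten individual techniques while iterating in AI Studio.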
In addition to prompting, consider using these recommendations:
Set system instructions,
which are like a "preamble" that you add before the model gets exposed to
any further instructions from the prompt or end user. They let you steer
the behavior of the model based on your specific needs and use cases.
Set a thinking level (or thinking budget for Gemini 2.5 models)
to control how much thinking the model can do. If you set it high, then the
model can think more, if needed. If you set it lower, then the model won't
"overthink" its response; this also reserves more of the total output token
limit for the actual response, which can help reduce latency and cost.
Enable AI monitoring in the Firebase console to monitor the count of thinking tokens and the latency of your requests
that have thinking enabled. If you have thought summaries enabled, they display in the
console, where you can inspect the model's detailed reasoning to help you
debug and refine your prompts.
Control the amount of thinking
You can configure how much "thinking" and reasoning a model can do before
it returns a response. This configuration is particularly important if reducing
latency or cost is a priority.
Make sure to review the comparison of task difficulties to decide how much a model
might need its thinking capability. Here's some high-level guidance:
Set a lower thinking value for less complex tasks or if reducing latency or
cost is a priority for you.
Set a higher thinking value for more complex tasks.
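As a sketch, the guidance above could be encoded in app code that maps a task-complexity label to a thinking setting. The chooseThinkingLevel helper and the complexity labels are hypothetical, and the returned strings are illustrative stand-ins for the SDK's ThinkingLevel values, not official enum names.

```javascript
// Hypothetical mapping from a task-complexity label to a thinking level.
// The returned strings stand in for the SDK's ThinkingLevel values.
function chooseThinkingLevel(taskComplexity) {
  switch (taskComplexity) {
    case "simple":   return "low";    // prioritize latency and cost
    case "moderate": return "medium"; // balanced approach
    case "complex":  return "high";   // allow deep reasoning
    default:
      throw new Error(`Unknown task complexity: ${taskComplexity}`);
  }
}

console.log(chooseThinkingLevel("simple"));  // → low
console.log(chooseThinkingLevel("complex")); // → high
```

Centralizing the choice like this keeps the latency/cost trade-off in one place if you serve several request types from the same model.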
To control how much thinking a Gemini 3 or later model can do to
generate its response, you can specify a thinking level for the amount of
thinking tokens that it's allowed to use.
Set the thinking level
Click your Gemini API provider to view provider-specific content
and code on this page.
Set the thinking level in a GenerationConfig as part of creating the GenerativeModel instance. The configuration is maintained for the lifetime of
the instance. If you want to use different thinking levels for different
requests, then create GenerativeModel instances configured with each level.
Swift
Set the thinking level in a GenerationConfig as part of creating a GenerativeModel instance.
// ...

// Set the thinking configuration
// Use a thinking level value appropriate for your model (example value shown here)
let generationConfig = GenerationConfig(
  thinkingConfig: ThinkingConfig(thinkingLevel: .low)
)

// Specify the config as part of creating the `GenerativeModel` instance
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "GEMINI_3_MODEL_NAME",
  generationConfig: generationConfig
)

// ...
Kotlin
Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.
// ...

// Set the thinking configuration
// Use a thinking level value appropriate for your model (example value shown here)
val generationConfig = generationConfig {
  thinkingConfig = thinkingConfig {
    thinkingLevel = ThinkingLevel.LOW
  }
}

// Specify the config as part of creating the `GenerativeModel` instance
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
  modelName = "GEMINI_3_MODEL_NAME",
  generationConfig,
)

// ...
Java
Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.
// ...

// Set the thinking configuration
// Use a thinking level value appropriate for your model (example value shown here)
ThinkingConfig thinkingConfig = new ThinkingConfig.Builder()
    .setThinkingLevel(ThinkingLevel.LOW)
    .build();
GenerationConfig generationConfig = GenerationConfig.builder()
    .setThinkingConfig(thinkingConfig)
    .build();

// Specify the config as part of creating the `GenerativeModel` instance
GenerativeModelFutures model = GenerativeModelFutures.from(
    FirebaseAI.getInstance(GenerativeBackend.googleAI())
        .generativeModel(
            /* modelName */ "GEMINI_3_MODEL_NAME",
            /* generationConfig */ generationConfig));

// ...
Web
Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.
// ...

const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Set the thinking configuration
// Use a thinking level value appropriate for your model (example value shown here)
const generationConfig = {
  thinkingConfig: {
    thinkingLevel: ThinkingLevel.LOW
  }
};

// Specify the config as part of creating the `GenerativeModel` instance
const model = getGenerativeModel(ai, { model: "GEMINI_3_MODEL_NAME", generationConfig });

// ...
Dart
Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.
// ...

// Set the thinking configuration
// Use a thinking level value appropriate for your model (example value shown here)
final thinkingConfig = ThinkingConfig.withThinkingLevel(ThinkingLevel.low);
final generationConfig = GenerationConfig(thinkingConfig: thinkingConfig);

// Specify the config as part of creating the `GenerativeModel` instance
final model = FirebaseAI.googleAI().generativeModel(
  model: 'GEMINI_3_MODEL_NAME',
  config: generationConfig,
);

// ...
Unity
Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.
// ...

// Set the thinking configuration
// Use a thinking level value appropriate for your model (example value shown here)
var thinkingConfig = new ThinkingConfig(thinkingLevel: ThinkingLevel.Low);
var generationConfig = new GenerationConfig(thinkingConfig: thinkingConfig);

// Specify the config as part of creating the `GenerativeModel` instance
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "GEMINI_3_MODEL_NAME",
  generationConfig: generationConfig
);

// ...
The thinking levels, from lowest to highest:

Model uses as few tokens as possible; close to no thinking. Best for low-complexity tasks.
Model uses fewer tokens; minimizes latency & cost. Best for simple tasks and high-throughput tasks.
Model uses a balanced approach. Best for moderate-complexity tasks.
Model uses tokens up to its maximum level. Best for complex prompts that require deep reasoning.

The following models support thinking levels, and each uses a default level when none is set:

gemini-3.1-pro-preview
gemini-3-flash-preview
gemini-3.1-flash-lite-preview
gemini-3-pro-image-preview ("Nano Banana Pro")
gemini-3.1-flash-image-preview ("Nano Banana 2")
Thinking budgets (Gemini 2.5 models)
To control how much thinking a Gemini 2.5 model can do to generate its
response, you can specify a thinking budget for the amount of thinking
tokens that it's allowed to use.
Set the thinking budget
Click your Gemini API provider to view provider-specific content
and code on this page.
Set the thinking budget in a GenerationConfig as part of creating the GenerativeModel instance for a Gemini 2.5 model. The configuration is
maintained for the lifetime of the instance. If you want to use different
thinking budgets for different requests, then create GenerativeModel instances
configured with each budget.
Swift
Set the thinking budget in a GenerationConfig as part of creating a GenerativeModel instance.
// ...

// Set the thinking configuration
// Use a thinking budget value appropriate for your model (example value shown here)
let generationConfig = GenerationConfig(
  thinkingConfig: ThinkingConfig(thinkingBudget: 1024)
)

// Specify the config as part of creating the `GenerativeModel` instance
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "GEMINI_2.5_MODEL_NAME",
  generationConfig: generationConfig
)

// ...
Kotlin
Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.
// ...

// Set the thinking configuration
// Use a thinking budget value appropriate for your model (example value shown here)
val generationConfig = generationConfig {
  thinkingConfig = thinkingConfig {
    thinkingBudget = 1024
  }
}

// Specify the config as part of creating the `GenerativeModel` instance
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
  modelName = "GEMINI_2.5_MODEL_NAME",
  generationConfig,
)

// ...
Java
Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.
// ...

// Set the thinking configuration
// Use a thinking budget value appropriate for your model (example value shown here)
ThinkingConfig thinkingConfig = new ThinkingConfig.Builder()
    .setThinkingBudget(1024)
    .build();
GenerationConfig generationConfig = GenerationConfig.builder()
    .setThinkingConfig(thinkingConfig)
    .build();

// Specify the config as part of creating the `GenerativeModel` instance
GenerativeModelFutures model = GenerativeModelFutures.from(
    FirebaseAI.getInstance(GenerativeBackend.googleAI())
        .generativeModel(
            /* modelName */ "GEMINI_2.5_MODEL_NAME",
            /* generationConfig */ generationConfig));

// ...
Web
Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.
// ...

const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Set the thinking configuration
// Use a thinking budget value appropriate for your model (example value shown here)
const generationConfig = {
  thinkingConfig: {
    thinkingBudget: 1024
  }
};

// Specify the config as part of creating the `GenerativeModel` instance
const model = getGenerativeModel(ai, { model: "GEMINI_2.5_MODEL_NAME", generationConfig });

// ...
Dart
Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.
// ...

// Set the thinking configuration
// Use a thinking budget value appropriate for your model (example value shown here)
final thinkingConfig = ThinkingConfig.withThinkingBudget(1024);
final generationConfig = GenerationConfig(thinkingConfig: thinkingConfig);

// Specify the config as part of creating the `GenerativeModel` instance
final model = FirebaseAI.googleAI().generativeModel(
  model: 'GEMINI_2.5_MODEL_NAME',
  config: generationConfig,
);

// ...
Unity
Set the values of the parameters in a GenerationConfig as part of creating a GenerativeModel instance.
// ...

// Set the thinking configuration
// Use a thinking budget value appropriate for your model (example value shown here)
var thinkingConfig = new ThinkingConfig(thinkingBudget: 1024);
var generationConfig = new GenerationConfig(thinkingConfig: thinkingConfig);

// Specify the config as part of creating the `GenerativeModel` instance
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "GEMINI_2.5_MODEL_NAME",
  generationConfig: generationConfig
);

// ...
For some easier tasks, the thinking capability isn't as
necessary, and traditional inference is sufficient. Also, if reducing latency or
cost is a priority, you may not want the model to take more time, or cost
more, than necessary to generate a response.
In these situations, you can disable (or turn off) thinking for some models:
Gemini 2.5 Pro: thinking cannot be disabled
Gemini 2.5 Flash: disable thinking by setting thinkingBudget to 0 tokens
Gemini 2.5 Flash-Lite: thinking is disabled by default
(so either don't set thinkingBudget explicitly, or set it to 0)
Note that for all Gemini 3 models, thinking cannot be disabled.
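The per-model rules above can be sketched as a small helper that decides the thinkingConfig value for a Gemini 2.5 request. The thinkingConfigFor helper and the model-family strings are hypothetical app code; only the budget rules themselves come from the list above.

```javascript
// Hypothetical helper applying the disable rules for Gemini 2.5 models:
// - "pro": thinking cannot be disabled
// - "flash": disable with thinkingBudget: 0
// - "flash-lite": thinking is off by default; 0 is also valid
function thinkingConfigFor(modelFamily, disableThinking) {
  if (modelFamily === "pro" && disableThinking) {
    throw new Error("Thinking cannot be disabled for Gemini 2.5 Pro.");
  }
  if (disableThinking) {
    return { thinkingBudget: 0 };
  }
  // undefined means "omit thinkingConfig and use the model's default behavior"
  return undefined;
}

console.log(thinkingConfigFor("flash", true)); // → { thinkingBudget: 0 }
```

The returned object would be passed as the thinkingConfig field of a GenerationConfig, as in the platform snippets above.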
Enable dynamic thinking for Gemini 2.5 models
With dynamic thinking, the model decides when and how much it thinks (up
to a max thinking budget, as described below).
Enable dynamic thinking by setting the thinkingBudget to -1.
When dynamic thinking is enabled, the max thinking tokens will always be
8,192 tokens.
Note that all Gemini 3 models always use dynamic thinking.
Task complexity for all thinking models
Easy tasks — thinking isn't as necessary
Straightforward requests where complex reasoning isn't required, such as
fact retrieval or classification. Examples:
"Where was DeepMind founded?"
"Is this email asking for a meeting or just providing information?"
Moderate tasks — some thinking is likely necessary
Common requests that benefit from a degree of step-by-step processing or
deeper understanding. Examples:
"Create an analogy between photosynthesis and growing up."
"Compare and contrast electric cars and hybrid cars."
Hard tasks — maximum thinking may be necessary
Truly complex challenges, such as solving complex math problems or coding
tasks. These types of tasks require the model to engage its full reasoning
and planning capabilities, often involving many internal steps before
providing an answer. Examples:
"Solve problem 1 in AIME 2025: Find the sum of all integer
bases b > 9 for which 17b is a divisor of 97b."
"Write Python code for a web application that visualizes
real-time stock market data, including user authentication. Make it as
efficient as possible."
Thought summaries
Thought summaries are synthesized versions of the model's raw thoughts
and offer insights into the model's internal reasoning process.
Here are some reasons to include thought summaries in responses:
You can display the thought summary in your app's UI or make it accessible
to your users. The thought summary is returned as a separate part in the
response so that you have more control over how it's used in your app.
If you also enable AI monitoring in the Firebase console,
then thought summaries display in the console where you can inspect the
model's detailed reasoning to help you debug and refine your prompts.
Here are some key notes about thought summaries:
Thought summaries are not controlled by thinking budgets (budgets only apply to the model's raw
thoughts). However, if thinking is disabled, then the
model won't return a thought summary.
Thought summaries are considered part of the model's regular generated-text
response and count as output tokens.
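Because thought summaries count as output, the billable output total for a response is the sum of its text tokens and its thinking tokens. A rough sketch (the field names candidatesTokenCount and thoughtsTokenCount are borrowed from the response's usageMetadata; the helper itself is hypothetical):

```javascript
// Hypothetical tally of billable output tokens for one response.
// Thinking tokens (which include any thought summary) are billed as output.
function billableOutputTokens(usageMetadata) {
  const text = usageMetadata.candidatesTokenCount ?? 0;
  const thoughts = usageMetadata.thoughtsTokenCount ?? 0;
  return text + thoughts;
}

console.log(billableOutputTokens({ candidatesTokenCount: 120, thoughtsTokenCount: 480 })); // → 600
```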
Enable thought summaries
Click your Gemini API provider to view provider-specific content
and code on this page.
You can enable thought summaries by setting includeThoughts to true in your
model configuration. You can then access the summary by checking the thoughtSummary field from the response.
Here's an example demonstrating how to enable and retrieve thought summaries
with the response:
Swift
Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.
// ...

// Set the thinking configuration
// Optionally enable thought summaries in the generated response (default is false)
let generationConfig = GenerationConfig(
  thinkingConfig: ThinkingConfig(includeThoughts: true)
)

// Specify the config as part of creating the `GenerativeModel` instance
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "GEMINI_MODEL_NAME",
  generationConfig: generationConfig
)

let response = try await model.generateContent("solve x^2 + 4x + 4 = 0")

// Handle the response that includes thought summaries
if let thoughtSummary = response.thoughtSummary {
  print("Thought Summary: \(thoughtSummary)")
}
guard let text = response.text else {
  fatalError("No text in response.")
}
print("Answer: \(text)")
Kotlin
Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.
// ...

// Set the thinking configuration
// Optionally enable thought summaries in the generated response (default is false)
val generationConfig = generationConfig {
  thinkingConfig = thinkingConfig {
    includeThoughts = true
  }
}

// Specify the config as part of creating the `GenerativeModel` instance
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
  modelName = "GEMINI_MODEL_NAME",
  generationConfig,
)

val response = model.generateContent("solve x^2 + 4x + 4 = 0")

// Handle the response that includes thought summaries
response.thoughtSummary?.let {
  println("Thought Summary: $it")
}
response.text?.let {
  println("Answer: $it")
}
Java
Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.
// ...

// Set the thinking configuration
// Optionally enable thought summaries in the generated response (default is false)
ThinkingConfig thinkingConfig = new ThinkingConfig.Builder()
    .setIncludeThoughts(true)
    .build();
GenerationConfig generationConfig = GenerationConfig.builder()
    .setThinkingConfig(thinkingConfig)
    .build();

// Specify the config as part of creating the `GenerativeModel` instance
GenerativeModelFutures model = GenerativeModelFutures.from(
    FirebaseAI.getInstance(GenerativeBackend.googleAI())
        .generativeModel(
            /* modelName */ "GEMINI_MODEL_NAME",
            /* generationConfig */ generationConfig));

// Handle the response that includes thought summaries
ListenableFuture<GenerateContentResponse> responseFuture =
    model.generateContent("solve x^2 + 4x + 4 = 0");
Futures.addCallback(responseFuture, new FutureCallback<GenerateContentResponse>() {
  @Override
  public void onSuccess(GenerateContentResponse response) {
    if (response.getThoughtSummary() != null) {
      System.out.println("Thought Summary: " + response.getThoughtSummary());
    }
    if (response.getText() != null) {
      System.out.println("Answer: " + response.getText());
    }
  }

  @Override
  public void onFailure(Throwable t) {
    // Handle error
  }
}, MoreExecutors.directExecutor());
Web
Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.
// ...

const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Set the thinking configuration
// Optionally enable thought summaries in the generated response (default is false)
const generationConfig = {
  thinkingConfig: {
    includeThoughts: true
  }
};

// Specify the config as part of creating the `GenerativeModel` instance
const model = getGenerativeModel(ai, { model: "GEMINI_MODEL_NAME", generationConfig });

const result = await model.generateContent("solve x^2 + 4x + 4 = 0");
const response = result.response;

// Handle the response that includes thought summaries
if (response.thoughtSummary()) {
  console.log(`Thought Summary: ${response.thoughtSummary()}`);
}
const text = response.text();
console.log(`Answer: ${text}`);
Dart
Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.
// ...

// Set the thinking configuration
// Optionally enable thought summaries in the generated response (default is false)
final thinkingConfig = ThinkingConfig(includeThoughts: true);
final generationConfig = GenerationConfig(thinkingConfig: thinkingConfig);

// Specify the config as part of creating the `GenerativeModel` instance
final model = FirebaseAI.googleAI().generativeModel(
  model: 'GEMINI_MODEL_NAME',
  generationConfig: generationConfig,
);

final response = await model.generateContent('solve x^2 + 4x + 4 = 0');

// Handle the response that includes thought summaries
if (response.thoughtSummary != null) {
  print('Thought Summary: ${response.thoughtSummary}');
}
if (response.text != null) {
  print('Answer: ${response.text}');
}
Unity
Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.
// ...

// Set the thinking configuration
// Optionally enable thought summaries in the generated response (default is false)
var thinkingConfig = new ThinkingConfig(includeThoughts: true);
var generationConfig = new GenerationConfig(thinkingConfig: thinkingConfig);

// Specify the config as part of creating the `GenerativeModel` instance
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "GEMINI_MODEL_NAME",
  generationConfig: generationConfig
);

var response = await model.GenerateContentAsync("solve x^2 + 4x + 4 = 0");

// Handle the response that includes thought summaries
if (response.ThoughtSummary != null) {
  Debug.Log($"Thought Summary: {response.ThoughtSummary}");
}
if (response.Text != null) {
  Debug.Log($"Answer: {response.Text}");
}
View the response and thought summary
# Example Response:
# Okay, let's solve the quadratic equation x² + 4x + 4 = 0.
# ...
# **Answer:**
# The solution to the equation x² + 4x + 4 = 0 is x = -2. This is a repeated root (or a root with multiplicity 2).
# Example Thought Summary:
# **My Thought Process for Solving the Quadratic Equation**
#
# Alright, let's break down this quadratic, x² + 4x + 4 = 0. First things first:
# it's a quadratic; the x² term gives it away, and we know the general form is
# ax² + bx + c = 0.
#
# So, let's identify the coefficients: a = 1, b = 4, and c = 4. Now, what's the
# most efficient path to the solution? My gut tells me to try factoring; it's
# often the fastest route if it works. If that fails, I'll default to the quadratic
# formula, which is foolproof. Completing the square? It's good for deriving the
# formula or when factoring is difficult, but not usually my first choice for
# direct solving, but it can't hurt to keep it as an option.
#
# Factoring, then. I need to find two numbers that multiply to 'c' (4) and add
# up to 'b' (4). Let's see... 1 and 4 don't work (add up to 5). 2 and 2? Bingo!
# They multiply to 4 and add up to 4. This means I can rewrite the equation as
# (x + 2)(x + 2) = 0, or more concisely, (x + 2)² = 0. Solving for x is now
# trivial: x + 2 = 0, thus x = -2.
#
# Okay, just to be absolutely certain, I'll run the quadratic formula just to
# double-check. x = [-b ± √(b² - 4ac)] / 2a. Plugging in the values, x = [-4 ±
# √(4² - 4 * 1 * 4)] / (2 * 1). That simplifies to x = [-4 ± √0] / 2. So, x =
# -2 again - a repeated root. Nice.
#
# Now, let's check via completing the square. Starting from the same equation,
# (x² + 4x) = -4. Take half of the b-value (4/2 = 2), square it (2² = 4), and
# add it to both sides, so x² + 4x + 4 = -4 + 4. Which simplifies into (x + 2)²
# = 0. The square root on both sides gives us x + 2 = 0, therefore x = -2, as
# expected.
#
# Always, *always* confirm! Let's substitute x = -2 back into the original
# equation: (-2)² + 4(-2) + 4 = 0. That's 4 - 8 + 4 = 0. It checks out.
#
# Conclusion: the solution is x = -2. Confirmed.
Stream thought summaries
You can also view thought summaries if you choose to stream a response using generateContentStream. This will return rolling, incremental summaries during
the response generation.
Swift
Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.
// ...

// Set the thinking configuration
// Optionally enable thought summaries in the generated response (default is false)
let generationConfig = GenerationConfig(
  thinkingConfig: ThinkingConfig(includeThoughts: true)
)

// Specify the config as part of creating the `GenerativeModel` instance
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "GEMINI_MODEL_NAME",
  generationConfig: generationConfig
)

let stream = try model.generateContentStream("solve x^2 + 4x + 4 = 0")

// Handle the streamed response that includes thought summaries
var thoughts = ""
var answer = ""
for try await response in stream {
  if let thought = response.thoughtSummary {
    if thoughts.isEmpty {
      print("--- Thoughts Summary ---")
    }
    print(thought)
    thoughts += thought
  }
  if let text = response.text {
    if answer.isEmpty {
      print("--- Answer ---")
    }
    print(text)
    answer += text
  }
}
Kotlin
Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.
// ...

// Set the thinking configuration
// Optionally enable thought summaries in the generated response (default is false)
val generationConfig = generationConfig {
  thinkingConfig = thinkingConfig {
    includeThoughts = true
  }
}

// Specify the config as part of creating the `GenerativeModel` instance
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
  modelName = "GEMINI_MODEL_NAME",
  generationConfig,
)

// Handle the streamed response that includes thought summaries
var thoughts = ""
var answer = ""
model.generateContentStream("solve x^2 + 4x + 4 = 0").collect { response ->
  response.thoughtSummary?.let {
    if (thoughts.isEmpty()) {
      println("--- Thoughts Summary ---")
    }
    print(it)
    thoughts += it
  }
  response.text?.let {
    if (answer.isEmpty()) {
      println("--- Answer ---")
    }
    print(it)
    answer += it
  }
}
Java
Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.
// ...

// Set the thinking configuration
// Optionally enable thought summaries in the generated response (default is false)
ThinkingConfig thinkingConfig = new ThinkingConfig.Builder()
    .setIncludeThoughts(true)
    .build();
GenerationConfig generationConfig = GenerationConfig.builder()
    .setThinkingConfig(thinkingConfig)
    .build();

// Specify the config as part of creating the `GenerativeModel` instance
GenerativeModelFutures model = GenerativeModelFutures.from(
    FirebaseAI.getInstance(GenerativeBackend.googleAI())
        .generativeModel(
            /* modelName */ "GEMINI_MODEL_NAME",
            /* generationConfig */ generationConfig));

// Streaming with Java is complex and depends on the async library used.
// This is a conceptual example using a reactive stream.
Flowable<GenerateContentResponse> responseStream =
    model.generateContentStream("solve x^2 + 4x + 4 = 0");

// Handle the streamed response that includes thought summaries
StringBuilder thoughts = new StringBuilder();
StringBuilder answer = new StringBuilder();
responseStream.subscribe(response -> {
  if (response.getThoughtSummary() != null) {
    if (thoughts.length() == 0) {
      System.out.println("--- Thoughts Summary ---");
    }
    System.out.print(response.getThoughtSummary());
    thoughts.append(response.getThoughtSummary());
  }
  if (response.getText() != null) {
    if (answer.length() == 0) {
      System.out.println("--- Answer ---");
    }
    System.out.print(response.getText());
    answer.append(response.getText());
  }
}, throwable -> {
  // Handle error
});
Web
Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.
// ...

const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Set the thinking configuration
// Optionally enable thought summaries in the generated response (default is false)
const generationConfig = {
  thinkingConfig: {
    includeThoughts: true
  }
};

// Specify the config as part of creating the `GenerativeModel` instance
const model = getGenerativeModel(ai, { model: "GEMINI_MODEL_NAME", generationConfig });

const result = await model.generateContentStream("solve x^2 + 4x + 4 = 0");

// Handle the streamed response that includes thought summaries
let thoughts = "";
let answer = "";
for await (const chunk of result.stream) {
  if (chunk.thoughtSummary()) {
    if (thoughts === "") {
      console.log("--- Thoughts Summary ---");
    }
    // In Node.js, process.stdout.write(chunk.thoughtSummary()) could be used
    // to avoid extra newlines.
    console.log(chunk.thoughtSummary());
    thoughts += chunk.thoughtSummary();
  }
  const text = chunk.text();
  if (text) {
    if (answer === "") {
      console.log("--- Answer ---");
    }
    // In Node.js, process.stdout.write(text) could be used.
    console.log(text);
    answer += text;
  }
}
Dart
Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.
// ...

// Set the thinking configuration
// Optionally enable thought summaries in the generated response (default is false)
final thinkingConfig = ThinkingConfig(includeThoughts: true);
final generationConfig = GenerationConfig(thinkingConfig: thinkingConfig);

// Specify the config as part of creating the `GenerativeModel` instance
final model = FirebaseAI.googleAI().generativeModel(
  model: 'GEMINI_MODEL_NAME',
  generationConfig: generationConfig,
);

final responses = model.generateContentStream('solve x^2 + 4x + 4 = 0');

// Handle the streamed response that includes thought summaries
var thoughts = '';
var answer = '';
await for (final response in responses) {
  if (response.thoughtSummary != null) {
    if (thoughts.isEmpty) {
      print('--- Thoughts Summary ---');
    }
    thoughts += response.thoughtSummary!;
  }
  if (response.text != null) {
    if (answer.isEmpty) {
      print('--- Answer ---');
    }
    answer += response.text!;
  }
}
Unity
Enable thought summaries in the GenerationConfig as part of creating a GenerativeModel instance.
// ...

// Set the thinking configuration
// Optionally enable thought summaries in the generated response (default is false)
var thinkingConfig = new ThinkingConfig(includeThoughts: true);
var generationConfig = new GenerationConfig(thinkingConfig: thinkingConfig);

// Specify the config as part of creating the `GenerativeModel` instance
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "GEMINI_MODEL_NAME",
  generationConfig: generationConfig
);

var stream = model.GenerateContentStreamAsync("solve x^2 + 4x + 4 = 0");

// Handle the streamed response that includes thought summaries
var thoughts = "";
var answer = "";
await foreach (var response in stream) {
  if (response.ThoughtSummary != null) {
    if (string.IsNullOrEmpty(thoughts)) {
      Debug.Log("--- Thoughts Summary ---");
    }
    Debug.Log(response.ThoughtSummary);
    thoughts += response.ThoughtSummary;
  }
  if (response.Text != null) {
    if (string.IsNullOrEmpty(answer)) {
      Debug.Log("--- Answer ---");
    }
    Debug.Log(response.Text);
    answer += response.Text;
  }
}
Thought signatures
When using thinking in multi-turn interactions, the model doesn't have access
to thought context from previous turns. However, if you're using function calling, you can take advantage of thought signatures to maintain thought context across turns. Thought
signatures are encrypted representations of the model's internal thought
process, and they're available when using thinking and function calling.
Specifically, thought signatures are generated when:
Thinking is enabled and thoughts are generated.
The request includes function declarations.
To take advantage of thought signatures, use function calling as normal. The Firebase AI Logic SDKs simplify the process by managing the state
and automatically handling thought signatures for you. The SDKs automatically
pass any generated thought signatures between subsequent sendMessage or sendMessageStream calls in a Chat session.
Pricing and counting thinking tokens
Thinking tokens use the same pricing as text-output tokens. If you enable thought summaries,
they are considered to be thinking tokens and are priced accordingly.
You can get the total number of thinking tokens from the thoughtsTokenCount field in the usageMetadata attribute of the response:
Swift
// ...
let response = try await model.generateContent("Why is the sky blue?")
if let usageMetadata = response.usageMetadata {
  print("Thoughts Token Count: \(usageMetadata.thoughtsTokenCount)")
}
Kotlin
// ...
val response = model.generateContent("Why is the sky blue?")
response.usageMetadata?.let { usageMetadata ->
  println("Thoughts Token Count: ${usageMetadata.thoughtsTokenCount}")
}
Java
// ...
ListenableFuture<GenerateContentResponse> response =
    model.generateContent("Why is the sky blue?");
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
  @Override
  public void onSuccess(GenerateContentResponse result) {
    UsageMetadata usageMetadata = result.getUsageMetadata();
    if (usageMetadata != null) {
      System.out.println("Thoughts Token Count: " + usageMetadata.getThoughtsTokenCount());
    }
  }

  @Override
  public void onFailure(Throwable t) {
    t.printStackTrace();
  }
}, executor);
Web
// ...
const response = await model.generateContent("Why is the sky blue?");
if (response?.usageMetadata?.thoughtsTokenCount != null) {
  console.log(`Thoughts Token Count: ${response.usageMetadata.thoughtsTokenCount}`);
}
Dart
// ...
final response = await model.generateContent([Content.text("Why is the sky blue?")]);
if (response.usageMetadata case final usageMetadata?) {
  print("Thoughts Token Count: ${usageMetadata.thoughtsTokenCount}");
}
Unity
// ...
var response = await model.GenerateContentAsync("Why is the sky blue?");
if (response.UsageMetadata != null) {
  UnityEngine.Debug.Log($"Thoughts Token Count: {response.UsageMetadata?.ThoughtsTokenCount}");
}
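Since thinking tokens are priced like text-output tokens, the thoughtsTokenCount can feed directly into a spend estimate. A sketch with a placeholder rate (the estimateOutputCostUSD helper is hypothetical, and the price constant below is illustrative only, not an actual Gemini rate; check the current pricing page):

```javascript
// Hypothetical cost estimate: thinking tokens are billed at the same rate
// as text-output tokens. The rate below is a placeholder, not a real price.
const PRICE_PER_MILLION_OUTPUT_TOKENS = 2; // USD, illustrative only

function estimateOutputCostUSD(candidatesTokenCount, thoughtsTokenCount) {
  const totalOutputTokens = candidatesTokenCount + thoughtsTokenCount;
  return (totalOutputTokens / 1_000_000) * PRICE_PER_MILLION_OUTPUT_TOKENS;
}

console.log(estimateOutputCostUSD(200_000, 300_000)); // → 1
```

Pairing this with the thoughtsTokenCount snippets above makes it easy to log a per-request cost estimate alongside AI monitoring data.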
Last updated 2026-03-10 UTC.