You can ask a Gemini model to generate and edit images using both text-only and text-and-image prompts. When you use Firebase AI Logic, you can make this request directly from your app.
With this capability, you can do things like:
-  Iteratively generate images through conversation with natural language, adjusting images while maintaining consistency and context. 
-  Generate images with high-quality text rendering, including long strings of text. 
-  Generate interleaved text-image output. For example, a blog post with text and images in a single turn. Previously, this required stringing together multiple models. 
-  Generate images using Gemini's world knowledge and reasoning capabilities. 
You can find a complete list of supported modalities and capabilities (along with example prompts) later on this page.
Choosing between Gemini and Imagen models
The Firebase AI Logic SDKs support image generation and editing using either a Gemini model or an Imagen model.
For most use cases, start with Gemini, and then choose Imagen only for specialized tasks where image quality is critical. A minimal sketch comparing the two entry points follows the lists below.
Choose Gemini when you want:
- To use world knowledge and reasoning to generate contextually relevant images.
- To seamlessly blend text and images or to interleave text and image output.
- To embed accurate visuals within long text sequences.
- To edit images conversationally while maintaining context.
Choose Imagen when you want:
- To prioritize image quality, photorealism, artistic detail, or specific styles (for example, impressionism or anime).
- To infuse branding or style, or to generate logos and product designs.
- To explicitly specify the aspect ratio or format of generated images.
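To make the two entry points concrete, here is a minimal Kotlin sketch contrasting how you create each model type with Firebase AI Logic. It is a sketch, not a drop-in sample: the Imagen model name is an assumption (check the Imagen documentation for current model names), import paths follow the firebase-ai Android SDK and may differ by version, and depending on your SDK version the Imagen API may require a preview opt-in.

```kotlin
import com.google.firebase.Firebase
import com.google.firebase.ai.ai
import com.google.firebase.ai.type.GenerativeBackend
import com.google.firebase.ai.type.ResponseModality
import com.google.firebase.ai.type.generationConfig

// Gemini: a general-purpose model that can reason about context and
// interleave text and image output in a single response
val geminiModel = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)

// Imagen: a specialized image model for quality-critical generation
// ("imagen-3.0-generate-002" is an assumed model name; verify before use)
val imagenModel = Firebase.ai(backend = GenerativeBackend.googleAI())
    .imagenModel(modelName = "imagen-3.0-generate-002")
```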
Before you begin
If you haven't already, complete the getting started guide, which describes how to set up your Firebase project, connect your app to Firebase, add the SDK, initialize the backend service for your chosen Gemini API provider, and create a `GenerativeModel` instance.
For testing and iterating on your prompts, we recommend using Google AI Studio.
Models that support this capability
- `gemini-2.5-flash-image` (aka "nano banana")

Note that the SDKs also support image generation using Imagen models.
Generate and edit images
You can generate and edit images using a Gemini model.
Generate images (text-only input)
You can ask a Gemini model to generate images by prompting with text.
Make sure to create a `GenerativeModel` instance, include `responseModalities: ["TEXT", "IMAGE"]` in the model's configuration, and call `generateContent`.
Swift
```swift
import UIKit
import FirebaseAILogic

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Provide a text prompt instructing the model to generate an image
let prompt = "Generate an image of the Eiffel tower with fireworks in the background."

// To generate an image, call `generateContent` with the text input
let response = try await model.generateContent(prompt)

// Handle the generated image
guard let inlineDataPart = response.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let uiImage = UIImage(data: inlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}
```
Kotlin
```kotlin
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)

// Provide a text prompt instructing the model to generate an image
val prompt = "Generate an image of the Eiffel tower with fireworks in the background."

// To generate image output, call `generateContent` with the text input
val generatedImageAsBitmap = model.generateContent(prompt)
    // Handle the generated image
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image
```
Java
```java
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);
GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide a text prompt instructing the model to generate an image
Content prompt = new Content.Builder()
    .addText("Generate an image of the Eiffel Tower with fireworks in the background.")
    .build();

// To generate an image, call `generateContent` with the text input
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        // Iterate over all the parts in the first candidate in the result object
        for (Part part : result.getCandidates().get(0).getContent().getParts()) {
            if (part instanceof ImagePart) {
                ImagePart imagePart = (ImagePart) part;
                // The returned image as a bitmap
                Bitmap generatedImageAsBitmap = imagePart.getImage();
                break;
            }
        }
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);
```
Web
```javascript
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Provide a text prompt instructing the model to generate an image
const prompt = 'Generate an image of the Eiffel Tower with fireworks in the background.';

// To generate an image, call `generateContent` with the text input
const result = await model.generateContent(prompt);

// Handle the generated image
try {
  const inlineDataParts = result.response.inlineDataParts();
  if (inlineDataParts?.[0]) {
    const image = inlineDataParts[0].inlineData;
    console.log(image.mimeType, image.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}
```
Dart
```dart
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(
      responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);

// Provide a text prompt instructing the model to generate an image
final prompt = [
  Content.text('Generate an image of the Eiffel Tower with fireworks in the background.')
];

// To generate an image, call `generateContent` with the text input
final response = await model.generateContent(prompt);

// Handle the generated image
if (response.inlineDataParts.isNotEmpty) {
  final imageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}
```
Unity
```csharp
using Firebase;
using Firebase.AI;
using System.Linq;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new[] { ResponseModality.Text, ResponseModality.Image })
);

// Provide a text prompt instructing the model to generate an image
var prompt = "Generate an image of the Eiffel Tower with fireworks in the background.";

// To generate an image, call `GenerateContentAsync` with the text input
var response = await model.GenerateContentAsync(prompt);

var text = response.Text;
if (!string.IsNullOrWhiteSpace(text)) {
  // Do something with the text
}

// Handle the generated image
var imageParts = response.Candidates.First().Content.Parts
  .OfType<ModelContent.InlineDataPart>()
  .Where(part => part.MimeType == "image/png");
foreach (var imagePart in imageParts) {
  // Load the Image into a Unity Texture2D object
  UnityEngine.Texture2D texture2D = new(2, 2);
  if (texture2D.LoadImage(imagePart.Data.ToArray())) {
    // Do something with the image
  }
}
```
Generate interleaved images and text
You can ask a Gemini model to generate images interleaved with its text responses. For example, you can generate images showing what each step of a generated recipe might look like alongside that step's instructions, without making separate requests to the model or using different models.
Make sure to create a `GenerativeModel` instance, include `responseModalities: ["TEXT", "IMAGE"]` in the model's configuration, and call `generateContent`.
Swift
```swift
import UIKit
import FirebaseAILogic

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Provide a text prompt instructing the model to generate interleaved text and images
let prompt = """
Generate an illustrated recipe for a paella.
Create images to go alongside the text as you generate the recipe
"""

// To generate interleaved text and images, call `generateContent` with the text input
let response = try await model.generateContent(prompt)

// Handle the generated text and image
guard let candidate = response.candidates.first else {
  fatalError("No candidates in response.")
}
for part in candidate.content.parts {
  switch part {
  case let textPart as TextPart:
    // Do something with the generated text
    let text = textPart.text
  case let inlineDataPart as InlineDataPart:
    // Do something with the generated image
    guard let uiImage = UIImage(data: inlineDataPart.data) else {
      fatalError("Failed to convert data to UIImage.")
    }
  default:
    fatalError("Unsupported part type: \(part)")
  }
}
```
 
Kotlin
```kotlin
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)

// Provide a text prompt instructing the model to generate interleaved text and images
val prompt = """
    Generate an illustrated recipe for a paella.
    Create images to go alongside the text as you generate the recipe
    """.trimIndent()

// To generate interleaved text and images, call `generateContent` with the text input
val responseContent = model.generateContent(prompt).candidates.first().content

// The response will contain image and text parts interleaved
for (part in responseContent.parts) {
    when (part) {
        is ImagePart -> {
            // ImagePart as a bitmap
            val generatedImageAsBitmap: Bitmap? = part.asImageOrNull()
        }
        is TextPart -> {
            // Text content from the TextPart
            val text = part.text
        }
    }
}
```
Java
```java
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);
GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide a text prompt instructing the model to generate interleaved text and images
Content prompt = new Content.Builder()
    .addText("Generate an illustrated recipe for a paella.\n"
        + "Create images to go alongside the text as you generate the recipe")
    .build();

// To generate interleaved text and images, call `generateContent` with the text input
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        Content responseContent = result.getCandidates().get(0).getContent();
        // The response will contain image and text parts interleaved
        for (Part part : responseContent.getParts()) {
            if (part instanceof ImagePart) {
                // ImagePart as a bitmap
                Bitmap generatedImageAsBitmap = ((ImagePart) part).getImage();
            } else if (part instanceof TextPart) {
                // Text content from the TextPart
                String text = ((TextPart) part).getText();
            }
        }
    }

    @Override
    public void onFailure(Throwable t) {
        System.err.println(t);
    }
}, executor);
```
Web
```javascript
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Provide a text prompt instructing the model to generate interleaved text and images
const prompt = 'Generate an illustrated recipe for a paella.\n' +
  'Create images to go alongside the text as you generate the recipe';

// To generate interleaved text and images, call `generateContent` with the text input
const result = await model.generateContent(prompt);

// Handle the generated text and image
try {
  const response = result.response;
  if (response.candidates?.[0].content?.parts) {
    for (const part of response.candidates?.[0].content?.parts) {
      if (part.text) {
        // Do something with the text
        console.log(part.text);
      }
      if (part.inlineData) {
        // Do something with the image
        const image = part.inlineData;
        console.log(image.mimeType, image.data);
      }
    }
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}
```
Dart
```dart
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(
      responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);

// Provide a text prompt instructing the model to generate interleaved text and images
final prompt = [
  Content.text('Generate an illustrated recipe for a paella\n'
      'Create images to go alongside the text as you generate the recipe')
];

// To generate interleaved text and images, call `generateContent` with the text input
final response = await model.generateContent(prompt);

// Handle the generated text and image
final parts = response.candidates.firstOrNull?.content.parts;
if (parts != null && parts.isNotEmpty) {
  for (final part in parts) {
    if (part is TextPart) {
      // Do something with the text part
      final text = part.text;
    }
    if (part is InlineDataPart) {
      // Process the image
      final imageBytes = part.bytes;
    }
  }
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}
```
Unity
```csharp
using Firebase;
using Firebase.AI;
using System.Linq;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new[] { ResponseModality.Text, ResponseModality.Image })
);

// Provide a text prompt instructing the model to generate interleaved text and images
var prompt = "Generate an illustrated recipe for a paella \n" +
  "Create images to go alongside the text as you generate the recipe";

// To generate interleaved text and images, call `GenerateContentAsync` with the text input
var response = await model.GenerateContentAsync(prompt);

// Handle the generated text and image
foreach (var part in response.Candidates.First().Content.Parts) {
  if (part is ModelContent.TextPart textPart) {
    if (!string.IsNullOrWhiteSpace(textPart.Text)) {
      // Do something with the text
    }
  } else if (part is ModelContent.InlineDataPart dataPart) {
    if (dataPart.MimeType == "image/png") {
      // Load the Image into a Unity Texture2D object
      UnityEngine.Texture2D texture2D = new(2, 2);
      if (texture2D.LoadImage(dataPart.Data.ToArray())) {
        // Do something with the image
      }
    }
  }
}
```
Edit images (text-and-image input)
You can ask a Gemini model to edit images by prompting with text and one or more images.
Make sure to create a `GenerativeModel` instance, include `responseModalities: ["TEXT", "IMAGE"]` in the model's configuration, and call `generateContent`.
Swift
```swift
import UIKit
import FirebaseAILogic

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Provide an image for the model to edit
guard let image = UIImage(named: "scones") else {
  fatalError("Image file not found.")
}

// Provide a text prompt instructing the model to edit the image
let prompt = "Edit this image to make it look like a cartoon"

// To edit the image, call `generateContent` with the image and text input
let response = try await model.generateContent(image, prompt)

// Handle the generated image
guard let inlineDataPart = response.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let uiImage = UIImage(data: inlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}
```
Kotlin
```kotlin
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)

// Provide an image for the model to edit
val bitmap = BitmapFactory.decodeResource(context.resources, R.drawable.scones)

// Provide a text prompt instructing the model to edit the image
val prompt = content {
    image(bitmap)
    text("Edit this image to make it look like a cartoon")
}

// To edit the image, call `generateContent` with the prompt (image and text input)
val generatedImageAsBitmap = model.generateContent(prompt)
    // Handle the generated text and image
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image
```
 
Java
```java
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);
GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide an image for the model to edit
Bitmap bitmap = BitmapFactory.decodeResource(resources, R.drawable.scones);

// Provide a text prompt instructing the model to edit the image
Content promptContent = new Content.Builder()
    .addImage(bitmap)
    .addText("Edit this image to make it look like a cartoon")
    .build();

// To edit the image, call `generateContent` with the prompt (image and text input)
ListenableFuture<GenerateContentResponse> response = model.generateContent(promptContent);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        // Iterate over all the parts in the first candidate in the result object
        for (Part part : result.getCandidates().get(0).getContent().getParts()) {
            if (part instanceof ImagePart) {
                ImagePart imagePart = (ImagePart) part;
                Bitmap generatedImageAsBitmap = imagePart.getImage();
                break;
            }
        }
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);
```
 
Web
```javascript
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Prepare an image for the model to edit
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

// Provide a text prompt instructing the model to edit the image
const prompt = "Edit this image to make it look like a cartoon";

const fileInputEl = document.querySelector("input[type=file]");
const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

// To edit the image, call `generateContent` with the image and text input
const result = await model.generateContent([prompt, imagePart]);

// Handle the generated image
try {
  const inlineDataParts = result.response.inlineDataParts();
  if (inlineDataParts?.[0]) {
    const image = inlineDataParts[0].inlineData;
    console.log(image.mimeType, image.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}
```
 
Dart
```dart
import 'dart:io';

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(
      responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);

// Prepare an image for the model to edit
final image = await File('scones.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// Provide a text prompt instructing the model to edit the image
final prompt = TextPart("Edit this image to make it look like a cartoon");

// To edit the image, call `generateContent` with the image and text input
final response = await model.generateContent([
  Content.multi([prompt, imagePart])
]);

// Handle the generated image
if (response.inlineDataParts.isNotEmpty) {
  final imageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}
```
 
Unity
```csharp
using Firebase;
using Firebase.AI;
using System.Linq;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new[] { ResponseModality.Text, ResponseModality.Image })
);

// Prepare an image for the model to edit
var imageFile = System.IO.File.ReadAllBytes(System.IO.Path.Combine(
  UnityEngine.Application.streamingAssetsPath, "scones.jpg"));
var image = ModelContent.InlineData("image/jpeg", imageFile);

// Provide a text prompt instructing the model to edit the image
var prompt = ModelContent.Text("Edit this image to make it look like a cartoon.");

// To edit the image, call `GenerateContentAsync` with the image and text input
var response = await model.GenerateContentAsync(new [] { prompt, image });

var text = response.Text;
if (!string.IsNullOrWhiteSpace(text)) {
  // Do something with the text
}

// Handle the generated image
var imageParts = response.Candidates.First().Content.Parts
  .OfType<ModelContent.InlineDataPart>()
  .Where(part => part.MimeType == "image/png");
foreach (var imagePart in imageParts) {
  // Load the Image into a Unity Texture2D object
  Texture2D texture2D = new Texture2D(2, 2);
  if (texture2D.LoadImage(imagePart.Data.ToArray())) {
    // Do something with the image
  }
}
```
 
Iterate and edit images using multi-turn chat
Using multi-turn chat, you can iterate with a Gemini model on the images that it generates or that you supply.
Make sure to create a `GenerativeModel` instance, include `responseModalities: ["TEXT", "IMAGE"]` in the model's configuration, and call `startChat()` and `sendMessage()` to send new user messages.
Swift
```swift
import UIKit
import FirebaseAILogic

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Initialize the chat
let chat = model.startChat()

guard let image = UIImage(named: "scones") else {
  fatalError("Image file not found.")
}

// Provide an initial text prompt instructing the model to edit the image
let prompt = "Edit this image to make it look like a cartoon"

// To generate an initial response, send a user message with the image and text prompt
let response = try await chat.sendMessage(image, prompt)

// Inspect the generated image
guard let inlineDataPart = response.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let uiImage = UIImage(data: inlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}

// Follow-up requests do not need to specify the image again
let followUpResponse = try await chat.sendMessage("But make it old-school line drawing style")

// Inspect the edited image after the follow-up request
guard let followUpInlineDataPart = followUpResponse.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let followUpUIImage = UIImage(data: followUpInlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}
```
 
Kotlin
```kotlin
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)

// Provide an image for the model to edit
val bitmap = BitmapFactory.decodeResource(context.resources, R.drawable.scones)

// Create the initial prompt instructing the model to edit the image
val prompt = content {
    image(bitmap)
    text("Edit this image to make it look like a cartoon")
}

// Initialize the chat
val chat = model.startChat()

// To generate an initial response, send a user message with the image and text prompt
var response = chat.sendMessage(prompt)

// Inspect the returned image
var generatedImageAsBitmap = response
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image

// Follow-up requests do not need to specify the image again
response = chat.sendMessage("But make it old-school line drawing style")
generatedImageAsBitmap = response
    .candidates.first().content.parts.filterIsInstance<ImagePart>().firstOrNull()?.image
```
 
Java
```java
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);
GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide an image for the model to edit
Bitmap bitmap = BitmapFactory.decodeResource(resources, R.drawable.scones);

// Initialize the chat
ChatFutures chat = model.startChat();

// Create the initial prompt instructing the model to edit the image
Content prompt = new Content.Builder()
    .setRole("user")
    .addImage(bitmap)
    .addText("Edit this image to make it look like a cartoon")
    .build();

// To generate an initial response, send a user message with the image and text prompt
ListenableFuture<GenerateContentResponse> response = chat.sendMessage(prompt);

// Extract the image from the initial response
ListenableFuture<@Nullable Bitmap> initialRequest = Futures.transform(response, result -> {
    for (Part part : result.getCandidates().get(0).getContent().getParts()) {
        if (part instanceof ImagePart) {
            ImagePart imagePart = (ImagePart) part;
            return imagePart.getImage();
        }
    }
    return null;
}, executor);

// Follow-up requests do not need to specify the image again
ListenableFuture<GenerateContentResponse> modelResponseFuture = Futures.transformAsync(
    initialRequest,
    generatedImage -> {
        Content followUpPrompt = new Content.Builder()
            .addText("But make it old-school line drawing style")
            .build();
        return chat.sendMessage(followUpPrompt);
    },
    executor);

// Add a final callback to check the reworked image
Futures.addCallback(modelResponseFuture, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        for (Part part : result.getCandidates().get(0).getContent().getParts()) {
            if (part instanceof ImagePart) {
                ImagePart imagePart = (ImagePart) part;
                Bitmap generatedImageAsBitmap = imagePart.getImage();
                break;
            }
        }
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);
```
 
Web
```javascript
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Prepare an image for the model to edit
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

const fileInputEl = document.querySelector("input[type=file]");
const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

// Provide an initial text prompt instructing the model to edit the image
const prompt = "Edit this image to make it look like a cartoon";

// Initialize the chat
const chat = model.startChat();

// To generate an initial response, send a user message with the image and text prompt
const result = await chat.sendMessage([prompt, imagePart]);

// Request and inspect the generated image
try {
  const inlineDataParts = result.response.inlineDataParts();
  if (inlineDataParts?.[0]) {
    // Inspect the generated image
    const image = inlineDataParts[0].inlineData;
    console.log(image.mimeType, image.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}

// Follow-up requests do not need to specify the image again
const followUpResult = await chat.sendMessage("But make it old-school line drawing style");

// Request and inspect the returned image
try {
  const followUpInlineDataParts = followUpResult.response.inlineDataParts();
  if (followUpInlineDataParts?.[0]) {
    // Inspect the generated image
    const followUpImage = followUpInlineDataParts[0].inlineData;
    console.log(followUpImage.mimeType, followUpImage.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}
```
Dart
```dart
import 'dart:io';

import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(
      responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);

// Prepare an image for the model to edit
final image = await File('scones.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// Provide an initial text prompt instructing the model to edit the image
final prompt = TextPart("Edit this image to make it look like a cartoon");

// Initialize the chat
final chat = model.startChat();

// To generate an initial response, send a user message with the image and text prompt
final response = await chat.sendMessage(
  Content.multi([prompt, imagePart]),
);

// Inspect the returned image
if (response.inlineDataParts.isNotEmpty) {
  final imageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}

// Follow-up requests do not need to specify the image again
final followUpResponse = await chat.sendMessage(
  Content.text("But make it old-school line drawing style"),
);

// Inspect the returned image
if (followUpResponse.inlineDataParts.isNotEmpty) {
  final followUpImageBytes = followUpResponse.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}
```
Unity
```csharp
using Firebase;
using Firebase.AI;
using System.Linq;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new[] { ResponseModality.Text, ResponseModality.Image })
);

// Prepare an image for the model to edit
var imageFile = System.IO.File.ReadAllBytes(System.IO.Path.Combine(
  UnityEngine.Application.streamingAssetsPath, "scones.jpg"));
var image = ModelContent.InlineData("image/jpeg", imageFile);

// Provide an initial text prompt instructing the model to edit the image
var prompt = ModelContent.Text("Edit this image to make it look like a cartoon.");

// Initialize the chat
var chat = model.StartChat();

// To generate an initial response, send a user message with the image and text prompt
var response = await chat.SendMessageAsync(new [] { prompt, image });

// Inspect the returned image
var imageParts = response.Candidates.First().Content.Parts
  .OfType<ModelContent.InlineDataPart>()
  .Where(part => part.MimeType == "image/png");

// Load the image into a Unity Texture2D object
UnityEngine.Texture2D texture2D = new(2, 2);
if (texture2D.LoadImage(imageParts.First().Data.ToArray())) {
  // Do something with the image
}

// Follow-up requests do not need to specify the image again
var followUpResponse = await chat.SendMessageAsync("But make it old-school line drawing style");

// Inspect the returned image
var followUpImageParts = followUpResponse.Candidates.First().Content.Parts
  .OfType<ModelContent.InlineDataPart>()
  .Where(part => part.MimeType == "image/png");

// Load the image into a Unity Texture2D object
UnityEngine.Texture2D followUpTexture2D = new(2, 2);
if (followUpTexture2D.LoadImage(followUpImageParts.First().Data.ToArray())) {
  // Do something with the image
}
```
 
Supported features, limitations, and best practices
Supported modalities and capabilities
The following are the supported modalities and capabilities for image output from a Gemini model. Each capability shows an example prompt and has a corresponding code sample above.

- Text → image (text-only to image): Generate an image of the Eiffel tower with fireworks in the background.
- Text → image (text rendering within image): Generate a cinematic photo of a large building with this giant text projection mapped on the front of the building.
- Text → image(s) and text (interleaved):
  - Generate an illustrated recipe for a paella. Create images alongside the text as you generate the recipe.
  - Generate a story about a dog in a 3D cartoon animation style. For each scene, generate an image.
- Image(s) and text → image(s) and text (interleaved): [image of a furnished room] + What other color sofas would work in my space? Can you update the image?
- Image editing (text-and-image to image), including multi-image input (see the sketch after this list):
  - [image of scones] + Edit this image to make it look like a cartoon
  - [image of a cat] + [image of a pillow] + Create a cross stitch of my cat on this pillow.
- Multi-turn image editing (chat): [image of a blue car] + Turn this car into a convertible, then Now change the color to yellow.
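The samples earlier on this page pass at most one input image, but multi-image prompts like the cross-stitch example use the same request shape. Here is a minimal Kotlin sketch, assuming a `model` configured as in the editing samples above and two already-decoded bitmaps (`catBitmap` and `pillowBitmap` are hypothetical names):

```kotlin
// Build one user message containing both images plus the instruction;
// the content builder accepts any number of image and text parts
val prompt = content {
    image(catBitmap)
    image(pillowBitmap)
    text("Create a cross stitch of my cat on this pillow.")
}

// Send it exactly like a single-image edit request
val generatedImageAsBitmap = model.generateContent(prompt)
    .candidates.first().content.parts
    .filterIsInstance<ImagePart>()
    .firstOrNull()?.image
```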
 
Limitations and best practices
The following are limitations and best practices for image output from a Gemini model.

- Image-generating Gemini models support the following:
  - Generating PNG images with a maximum dimension of 1024 px.
  - Generating and editing images of people.
  - Using safety filters that provide a flexible and less restrictive user experience.
- Image-generating Gemini models do not support the following:
  - Including audio or video inputs.
  - Generating only images. The models always return both text and images, and you must include `responseModalities: ["TEXT", "IMAGE"]` in the model's configuration.
- For best performance, use the following languages: en, es-mx, ja-jp, zh-cn, hi-in.
- Image generation may not always trigger. Here are some known issues (a defensive retry sketch follows this list):
  - The model may output text only. Try asking for image outputs explicitly (for example, "generate an image", "provide images as you go along", "update the image").
  - The model may stop generating partway through. Try again or try a different prompt.
  - The model may generate text as an image. Try asking for text outputs explicitly. For example, "generate narrative text along with illustrations."
- When generating text for an image, Gemini works best if you first generate the text and then ask for an image with the text.
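Because image generation may not always trigger, a defensive pattern is to check the response for an image part and retry with a more explicit instruction. The following Kotlin sketch is one possible pattern, not an official API: it assumes a `model` configured as in the samples above, the helper name is hypothetical, and import paths follow the firebase-ai Android SDK and may differ by version.

```kotlin
import android.graphics.Bitmap
import com.google.firebase.ai.GenerativeModel
import com.google.firebase.ai.type.ImagePart

// Request an image; if the model responds with text only (a known issue
// noted above), retry with an explicit instruction to include an image.
suspend fun generateImageWithRetry(
    model: GenerativeModel,
    prompt: String,
    maxAttempts: Int = 3,
): Bitmap? {
    var attemptPrompt = prompt
    repeat(maxAttempts) {
        val image = model.generateContent(attemptPrompt)
            .candidates.firstOrNull()?.content?.parts
            ?.filterIsInstance<ImagePart>()
            ?.firstOrNull()?.image
        if (image != null) return image
        // Ask for image output explicitly on the next attempt
        attemptPrompt = "$prompt\nGenerate an image as part of your response."
    }
    return null
}
```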

