You can ask a Gemini model to generate and edit images using both text-only and text-and-image prompts. When you use Firebase AI Logic, you can make this request directly from your app.
With this capability, you can do things like:
- Iteratively generate images through conversation with natural language, adjusting images while maintaining consistency and context.
- Generate images with high-quality text rendering, including long strings of text.
- Generate interleaved text-image output, for example a blog post with text and images in a single turn. Previously, this required stringing together multiple models.
- Generate images using Gemini's world knowledge and reasoning capabilities.
You can find a complete list of supported modalities and capabilities (along with example prompts) later on this page.
Choosing between Gemini and Imagen models
The Firebase AI Logic SDKs support image generation and editing using either a Gemini model or an Imagen model.
For most use cases, start with Gemini, and then choose Imagen only for specialized tasks where image quality is critical.
Choose Gemini when you want:
- To use world knowledge and reasoning to generate contextually relevant images.
- To seamlessly blend text and images or to interleave text and image output.
- To embed accurate visuals within long text sequences.
- To edit images conversationally while maintaining context.
Choose Imagen when you want:
- To prioritize image quality, photorealism, artistic detail, or specific styles (for example, impressionism or anime).
- To infuse branding or style, or to generate logos and product designs.
- To explicitly specify the aspect ratio or format of generated images.
Before you begin
If you haven't already, complete the getting started guide, which describes how to set up your Firebase project, connect your app to Firebase, add the SDK, initialize the backend service for your chosen Gemini API provider, and create a GenerativeModel instance.
For testing and iterating on your prompts and even getting a generated code snippet, we recommend using Google AI Studio .
Models that support this capability
- gemini-2.5-flash-image-preview (aka "nano banana")
- gemini-2.0-flash-preview-image-generation
Note that the order of the name segments differs between the 2.0 and 2.5 model names. Also, image output from Gemini is not supported by the standard Flash models, such as gemini-2.5-flash or gemini-2.0-flash.
The SDKs also support image generation using Imagen models.
Generate and edit images
You can generate and edit images using a Gemini model.
Generate images (text-only input)
Before trying this sample, make sure that you've completed the Before you begin section of this guide.
You can ask a Gemini model to generate images by prompting with text.
Make sure to create a GenerativeModel instance, include responseModalities: ["TEXT", "IMAGE"] in your model configuration, and call generateContent.
Swift
import FirebaseAI

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image-preview",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Provide a text prompt instructing the model to generate an image
let prompt = "Generate an image of the Eiffel tower with fireworks in the background."

// To generate an image, call `generateContent` with the text input
let response = try await model.generateContent(prompt)

// Handle the generated image
guard let inlineDataPart = response.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let uiImage = UIImage(data: inlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}
Kotlin
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image-preview",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)

// Provide a text prompt instructing the model to generate an image
val prompt = "Generate an image of the Eiffel tower with fireworks in the background."

// To generate image output, call `generateContent` with the text input
val generatedImageAsBitmap = model.generateContent(prompt)
    // Handle the generated image
    .candidates.first().content.parts.filterIsInstance<ImagePart>()
    .firstOrNull()
    ?.image
Java
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image-preview",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);
GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide a text prompt instructing the model to generate an image
Content prompt = new Content.Builder()
    .addText("Generate an image of the Eiffel Tower with fireworks in the background.")
    .build();

// To generate an image, call `generateContent` with the text input
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        // Iterate over all the parts in the first candidate in the result object
        for (Part part : result.getCandidates().get(0).getContent().getParts()) {
            if (part instanceof ImagePart) {
                ImagePart imagePart = (ImagePart) part;
                // The returned image as a bitmap
                Bitmap generatedImageAsBitmap = imagePart.getImage();
                break;
            }
        }
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);
Web
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image-preview",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Provide a text prompt instructing the model to generate an image
const prompt = 'Generate an image of the Eiffel Tower with fireworks in the background.';

// To generate an image, call `generateContent` with the text input
const result = await model.generateContent(prompt);

// Handle the generated image
try {
  const inlineDataParts = result.response.inlineDataParts();
  if (inlineDataParts?.[0]) {
    const image = inlineDataParts[0].inlineData;
    console.log(image.mimeType, image.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}
Dart
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image-preview',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(
      responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);

// Provide a text prompt instructing the model to generate an image
final prompt = [
  Content.text('Generate an image of the Eiffel Tower with fireworks in the background.')
];

// To generate an image, call `generateContent` with the text input
final response = await model.generateContent(prompt);
if (response.inlineDataParts.isNotEmpty) {
  final imageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}
Unity
using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image-preview",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new [] { ResponseModality.Text, ResponseModality.Image })
);

// Provide a text prompt instructing the model to generate an image
var prompt = "Generate an image of the Eiffel Tower with fireworks in the background.";

// To generate an image, call `GenerateContentAsync` with the text input
var response = await model.GenerateContentAsync(prompt);

var text = response.Text;
if (!string.IsNullOrWhiteSpace(text)) {
  // Do something with the text
}

// Handle the generated image
var imageParts = response.Candidates.First().Content.Parts
    .OfType<ModelContent.InlineDataPart>()
    .Where(part => part.MimeType == "image/png");
foreach (var imagePart in imageParts) {
  // Load the image into a Unity Texture2D object
  UnityEngine.Texture2D texture2D = new(2, 2);
  if (texture2D.LoadImage(imagePart.Data.ToArray())) {
    // Do something with the image
  }
}
Generate interleaved images and text
Before trying this sample, make sure that you've completed the Before you begin section of this guide.
You can ask a Gemini model to generate images interleaved with its text responses. For example, you can generate images of what each step of a generated recipe might look like alongside the step's instructions, without making separate requests to the model or to different models.
Make sure to create a GenerativeModel instance, include responseModalities: ["TEXT", "IMAGE"] in your model configuration, and call generateContent.
Swift
import FirebaseAI

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image-preview",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Provide a text prompt instructing the model to generate interleaved text and images
let prompt = """
  Generate an illustrated recipe for a paella.
  Create images to go alongside the text as you generate the recipe
  """

// To generate interleaved text and images, call `generateContent` with the text input
let response = try await model.generateContent(prompt)

// Handle the generated text and image
guard let candidate = response.candidates.first else {
  fatalError("No candidates in response.")
}
for part in candidate.content.parts {
  switch part {
  case let textPart as TextPart:
    // Do something with the generated text
    let text = textPart.text
  case let inlineDataPart as InlineDataPart:
    // Do something with the generated image
    guard let uiImage = UIImage(data: inlineDataPart.data) else {
      fatalError("Failed to convert data to UIImage.")
    }
  default:
    fatalError("Unsupported part type: \(part)")
  }
}
Kotlin
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image-preview",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)

// Provide a text prompt instructing the model to generate interleaved text and images
val prompt = """
    Generate an illustrated recipe for a paella.
    Create images to go alongside the text as you generate the recipe
    """.trimIndent()

// To generate interleaved text and images, call `generateContent` with the text input
val responseContent = model.generateContent(prompt).candidates.first().content

// The response will contain image and text parts interleaved
for (part in responseContent.parts) {
    when (part) {
        is ImagePart -> {
            // ImagePart as a bitmap
            val generatedImageAsBitmap: Bitmap? = part.asImageOrNull()
        }
        is TextPart -> {
            // Text content from the TextPart
            val text = part.text
        }
    }
}
Java
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image-preview",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);
GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide a text prompt instructing the model to generate interleaved text and images
Content prompt = new Content.Builder()
    .addText("Generate an illustrated recipe for a paella.\n"
        + "Create images to go alongside the text as you generate the recipe")
    .build();

// To generate interleaved text and images, call `generateContent` with the text input
ListenableFuture<GenerateContentResponse> response = model.generateContent(prompt);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        Content responseContent = result.getCandidates().get(0).getContent();
        // The response will contain image and text parts interleaved
        for (Part part : responseContent.getParts()) {
            if (part instanceof ImagePart) {
                // ImagePart as a bitmap
                Bitmap generatedImageAsBitmap = ((ImagePart) part).getImage();
            } else if (part instanceof TextPart) {
                // Text content from the TextPart
                String text = ((TextPart) part).getText();
            }
        }
    }

    @Override
    public void onFailure(Throwable t) {
        System.err.println(t);
    }
}, executor);
Web
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image-preview",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Provide a text prompt instructing the model to generate interleaved text and images
const prompt = 'Generate an illustrated recipe for a paella.\n' +
  'Create images to go alongside the text as you generate the recipe';

// To generate interleaved text and images, call `generateContent` with the text input
const result = await model.generateContent(prompt);

// Handle the generated text and image
try {
  const response = result.response;
  if (response.candidates?.[0].content?.parts) {
    for (const part of response.candidates?.[0].content?.parts) {
      if (part.text) {
        // Do something with the text
        console.log(part.text);
      }
      if (part.inlineData) {
        // Do something with the image
        const image = part.inlineData;
        console.log(image.mimeType, image.data);
      }
    }
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}
Dart
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image-preview',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(
      responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);

// Provide a text prompt instructing the model to generate interleaved text and images
final prompt = [
  Content.text('Generate an illustrated recipe for a paella\n'
      'Create images to go alongside the text as you generate the recipe')
];

// To generate interleaved text and images, call `generateContent` with the text input
final response = await model.generateContent(prompt);

// Handle the generated text and image
final parts = response.candidates.firstOrNull?.content.parts;
if (parts != null && parts.isNotEmpty) {
  for (final part in parts) {
    if (part is TextPart) {
      // Do something with the text part
      final text = part.text;
    }
    if (part is InlineDataPart) {
      // Process the image
      final imageBytes = part.bytes;
    }
  }
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}
Unity
using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image-preview",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new [] { ResponseModality.Text, ResponseModality.Image })
);

// Provide a text prompt instructing the model to generate interleaved text and images
var prompt = "Generate an illustrated recipe for a paella \n"
    + "Create images to go alongside the text as you generate the recipe";

// To generate interleaved text and images, call `GenerateContentAsync` with the text input
var response = await model.GenerateContentAsync(prompt);

// Handle the generated text and image
foreach (var part in response.Candidates.First().Content.Parts) {
  if (part is ModelContent.TextPart textPart) {
    if (!string.IsNullOrWhiteSpace(textPart.Text)) {
      // Do something with the text
    }
  } else if (part is ModelContent.InlineDataPart dataPart) {
    if (dataPart.MimeType == "image/png") {
      // Load the image into a Unity Texture2D object
      UnityEngine.Texture2D texture2D = new(2, 2);
      if (texture2D.LoadImage(dataPart.Data.ToArray())) {
        // Do something with the image
      }
    }
  }
}
Edit images (text-and-image input)
Before trying this sample, make sure that you've completed the Before you begin section of this guide.
You can ask a Gemini model to edit images by prompting with text and one or more images.
Make sure to create a GenerativeModel instance, include responseModalities: ["TEXT", "IMAGE"] in your model configuration, and call generateContent.
Swift
import FirebaseAI

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image-preview",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Provide an image for the model to edit
guard let image = UIImage(named: "scones") else {
  fatalError("Image file not found.")
}

// Provide a text prompt instructing the model to edit the image
let prompt = "Edit this image to make it look like a cartoon"

// To edit the image, call `generateContent` with the image and text input
let response = try await model.generateContent(image, prompt)

// Handle the generated image
guard let inlineDataPart = response.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let uiImage = UIImage(data: inlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}
Kotlin
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image-preview",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)

// Provide an image for the model to edit
val bitmap = BitmapFactory.decodeResource(context.resources, R.drawable.scones)

// Provide a text prompt instructing the model to edit the image
val prompt = content {
    image(bitmap)
    text("Edit this image to make it look like a cartoon")
}

// To edit the image, call `generateContent` with the prompt (image and text input)
val generatedImageAsBitmap = model.generateContent(prompt)
    // Handle the generated text and image
    .candidates.first().content.parts.filterIsInstance<ImagePart>()
    .firstOrNull()
    ?.image
Java
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image-preview",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);
GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide an image for the model to edit
Bitmap bitmap = BitmapFactory.decodeResource(resources, R.drawable.scones);

// Provide a text prompt instructing the model to edit the image
Content promptContent = new Content.Builder()
    .addImage(bitmap)
    .addText("Edit this image to make it look like a cartoon")
    .build();

// To edit the image, call `generateContent` with the prompt (image and text input)
ListenableFuture<GenerateContentResponse> response = model.generateContent(promptContent);
Futures.addCallback(response, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        // Iterate over all the parts in the first candidate in the result object
        for (Part part : result.getCandidates().get(0).getContent().getParts()) {
            if (part instanceof ImagePart) {
                ImagePart imagePart = (ImagePart) part;
                Bitmap generatedImageAsBitmap = imagePart.getImage();
                break;
            }
        }
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);
Web
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image-preview",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Prepare an image for the model to edit
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

// Provide a text prompt instructing the model to edit the image
const prompt = "Edit this image to make it look like a cartoon";

const fileInputEl = document.querySelector("input[type=file]");
const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

// To edit the image, call `generateContent` with the image and text input
const result = await model.generateContent([prompt, imagePart]);

// Handle the generated image
try {
  const inlineDataParts = result.response.inlineDataParts();
  if (inlineDataParts?.[0]) {
    const image = inlineDataParts[0].inlineData;
    console.log(image.mimeType, image.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}
Dart
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image-preview',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(
      responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);

// Prepare an image for the model to edit
final image = await File('scones.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// Provide a text prompt instructing the model to edit the image
final prompt = TextPart("Edit this image to make it look like a cartoon");

// To edit the image, call `generateContent` with the image and text input
final response = await model.generateContent([
  Content.multi([prompt, imagePart])
]);

// Handle the generated image
if (response.inlineDataParts.isNotEmpty) {
  final imageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}
Unity
using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image-preview",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new [] { ResponseModality.Text, ResponseModality.Image })
);

// Prepare an image for the model to edit
var imageFile = System.IO.File.ReadAllBytes(System.IO.Path.Combine(
  UnityEngine.Application.streamingAssetsPath, "scones.jpg"));
var image = ModelContent.InlineData("image/jpeg", imageFile);

// Provide a text prompt instructing the model to edit the image
var prompt = ModelContent.Text("Edit this image to make it look like a cartoon.");

// To edit the image, call `GenerateContentAsync` with the image and text input
var response = await model.GenerateContentAsync(new [] { prompt, image });

var text = response.Text;
if (!string.IsNullOrWhiteSpace(text)) {
  // Do something with the text
}

// Handle the generated image
var imageParts = response.Candidates.First().Content.Parts
    .OfType<ModelContent.InlineDataPart>()
    .Where(part => part.MimeType == "image/png");
foreach (var imagePart in imageParts) {
  // Load the image into a Unity Texture2D object
  Texture2D texture2D = new Texture2D(2, 2);
  if (texture2D.LoadImage(imagePart.Data.ToArray())) {
    // Do something with the image
  }
}
Iterate and edit images using multi-turn chat
Before trying this sample, make sure that you've completed the Before you begin section of this guide.
Using multi-turn chat, you can iterate with a Gemini model on the images that it generates or that you supply.
Make sure to create a GenerativeModel instance, include responseModalities: ["TEXT", "IMAGE"] in your model configuration, and call startChat() and sendMessage() to send new user messages.
Swift
import FirebaseAI

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
let model = FirebaseAI.firebaseAI(backend: .googleAI()).generativeModel(
  modelName: "gemini-2.5-flash-image-preview",
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(responseModalities: [.text, .image])
)

// Initialize the chat
let chat = model.startChat()

guard let image = UIImage(named: "scones") else {
  fatalError("Image file not found.")
}

// Provide an initial text prompt instructing the model to edit the image
let prompt = "Edit this image to make it look like a cartoon"

// To generate an initial response, send a user message with the image and text prompt
let response = try await chat.sendMessage(image, prompt)

// Inspect the generated image
guard let inlineDataPart = response.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let uiImage = UIImage(data: inlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}

// Follow up requests do not need to specify the image again
let followUpResponse = try await chat.sendMessage("But make it old-school line drawing style")

// Inspect the edited image after the follow up request
guard let followUpInlineDataPart = followUpResponse.inlineDataParts.first else {
  fatalError("No image data in response.")
}
guard let followUpUIImage = UIImage(data: followUpInlineDataPart.data) else {
  fatalError("Failed to convert data to UIImage.")
}
Kotlin
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
val model = Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
    modelName = "gemini-2.5-flash-image-preview",
    // Configure the model to respond with text and images (required)
    generationConfig = generationConfig {
        responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
    }
)

// Provide an image for the model to edit
val bitmap = BitmapFactory.decodeResource(context.resources, R.drawable.scones)

// Create the initial prompt instructing the model to edit the image
val prompt = content {
    image(bitmap)
    text("Edit this image to make it look like a cartoon")
}

// Initialize the chat
val chat = model.startChat()

// To generate an initial response, send a user message with the image and text prompt
var response = chat.sendMessage(prompt)

// Inspect the returned image
var generatedImageAsBitmap = response
    .candidates.first().content.parts.filterIsInstance<ImagePart>()
    .firstOrNull()
    ?.image

// Follow up requests do not need to specify the image again
response = chat.sendMessage("But make it old-school line drawing style")
generatedImageAsBitmap = response
    .candidates.first().content.parts.filterIsInstance<ImagePart>()
    .firstOrNull()
    ?.image
Java
// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
GenerativeModel ai = FirebaseAI.getInstance(GenerativeBackend.googleAI()).generativeModel(
    "gemini-2.5-flash-image-preview",
    // Configure the model to respond with text and images (required)
    new GenerationConfig.Builder()
        .setResponseModalities(Arrays.asList(ResponseModality.TEXT, ResponseModality.IMAGE))
        .build()
);
GenerativeModelFutures model = GenerativeModelFutures.from(ai);

// Provide an image for the model to edit
Bitmap bitmap = BitmapFactory.decodeResource(resources, R.drawable.scones);

// Initialize the chat
ChatFutures chat = model.startChat();

// Create the initial prompt instructing the model to edit the image
Content prompt = new Content.Builder()
    .setRole("user")
    .addImage(bitmap)
    .addText("Edit this image to make it look like a cartoon")
    .build();

// To generate an initial response, send a user message with the image and text prompt
ListenableFuture<GenerateContentResponse> response = chat.sendMessage(prompt);

// Extract the image from the initial response
ListenableFuture<@Nullable Bitmap> initialRequest = Futures.transform(response, result -> {
    for (Part part : result.getCandidates().get(0).getContent().getParts()) {
        if (part instanceof ImagePart) {
            ImagePart imagePart = (ImagePart) part;
            return imagePart.getImage();
        }
    }
    return null;
}, executor);

// Follow up requests do not need to specify the image again
ListenableFuture<GenerateContentResponse> modelResponseFuture = Futures.transformAsync(
    initialRequest,
    generatedImage -> {
        Content followUpPrompt = new Content.Builder()
            .addText("But make it old-school line drawing style")
            .build();
        return chat.sendMessage(followUpPrompt);
    },
    executor);

// Add a final callback to check the reworked image
Futures.addCallback(modelResponseFuture, new FutureCallback<GenerateContentResponse>() {
    @Override
    public void onSuccess(GenerateContentResponse result) {
        for (Part part : result.getCandidates().get(0).getContent().getParts()) {
            if (part instanceof ImagePart) {
                ImagePart imagePart = (ImagePart) part;
                Bitmap generatedImageAsBitmap = imagePart.getImage();
                break;
            }
        }
    }

    @Override
    public void onFailure(Throwable t) {
        t.printStackTrace();
    }
}, executor);
Web
import { initializeApp } from "firebase/app";
import { getAI, getGenerativeModel, GoogleAIBackend, ResponseModality } from "firebase/ai";

// TODO(developer) Replace the following with your app's Firebase configuration
// See: https://firebase.google.com/docs/web/learn-more#config-object
const firebaseConfig = {
  // ...
};

// Initialize FirebaseApp
const firebaseApp = initializeApp(firebaseConfig);

// Initialize the Gemini Developer API backend service
const ai = getAI(firebaseApp, { backend: new GoogleAIBackend() });

// Create a `GenerativeModel` instance with a model that supports your use case
const model = getGenerativeModel(ai, {
  model: "gemini-2.5-flash-image-preview",
  // Configure the model to respond with text and images (required)
  generationConfig: {
    responseModalities: [ResponseModality.TEXT, ResponseModality.IMAGE],
  },
});

// Prepare an image for the model to edit
async function fileToGenerativePart(file) {
  const base64EncodedDataPromise = new Promise((resolve) => {
    const reader = new FileReader();
    reader.onloadend = () => resolve(reader.result.split(',')[1]);
    reader.readAsDataURL(file);
  });
  return {
    inlineData: { data: await base64EncodedDataPromise, mimeType: file.type },
  };
}

const fileInputEl = document.querySelector("input[type=file]");
const imagePart = await fileToGenerativePart(fileInputEl.files[0]);

// Provide an initial text prompt instructing the model to edit the image
const prompt = "Edit this image to make it look like a cartoon";

// Initialize the chat
const chat = model.startChat();

// To generate an initial response, send a user message with the image and text prompt
const result = await chat.sendMessage([prompt, imagePart]);

// Request and inspect the generated image
try {
  const inlineDataParts = result.response.inlineDataParts();
  if (inlineDataParts?.[0]) {
    // Inspect the generated image
    const image = inlineDataParts[0].inlineData;
    console.log(image.mimeType, image.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}

// Follow up requests do not need to specify the image again
const followUpResult = await chat.sendMessage("But make it old-school line drawing style");

// Request and inspect the returned image
try {
  const followUpInlineDataParts = followUpResult.response.inlineDataParts();
  if (followUpInlineDataParts?.[0]) {
    // Inspect the generated image
    const followUpImage = followUpInlineDataParts[0].inlineData;
    console.log(followUpImage.mimeType, followUpImage.data);
  }
} catch (err) {
  console.error('Prompt or candidate was blocked:', err);
}
Dart
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
final model = FirebaseAI.googleAI().generativeModel(
  model: 'gemini-2.5-flash-image-preview',
  // Configure the model to respond with text and images (required)
  generationConfig: GenerationConfig(
      responseModalities: [ResponseModalities.text, ResponseModalities.image]),
);

// Prepare an image for the model to edit
final image = await File('scones.jpg').readAsBytes();
final imagePart = InlineDataPart('image/jpeg', image);

// Provide an initial text prompt instructing the model to edit the image
final prompt = TextPart("Edit this image to make it look like a cartoon");

// Initialize the chat
final chat = model.startChat();

// To generate an initial response, send a user message with the image and text prompt
final response = await chat.sendMessage([
  Content.multi([prompt, imagePart])
]);

// Inspect the returned image
if (response.inlineDataParts.isNotEmpty) {
  final imageBytes = response.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}

// Follow up requests do not need to specify the image again
final followUpResponse = await chat.sendMessage([
  Content.text("But make it old-school line drawing style")
]);

// Inspect the returned image
if (followUpResponse.inlineDataParts.isNotEmpty) {
  final followUpImageBytes = followUpResponse.inlineDataParts[0].bytes;
  // Process the image
} else {
  // Handle the case where no images were generated
  print('Error: No images were generated.');
}
Unity
using Firebase;
using Firebase.AI;

// Initialize the Gemini Developer API backend service
// Create a `GenerativeModel` instance with a Gemini model that supports image output
var model = FirebaseAI.GetInstance(FirebaseAI.Backend.GoogleAI()).GetGenerativeModel(
  modelName: "gemini-2.5-flash-image-preview",
  // Configure the model to respond with text and images (required)
  generationConfig: new GenerationConfig(
    responseModalities: new [] { ResponseModality.Text, ResponseModality.Image })
);

// Prepare an image for the model to edit
var imageFile = System.IO.File.ReadAllBytes(System.IO.Path.Combine(
  UnityEngine.Application.streamingAssetsPath, "scones.jpg"));
var image = ModelContent.InlineData("image/jpeg", imageFile);

// Provide an initial text prompt instructing the model to edit the image
var prompt = ModelContent.Text("Edit this image to make it look like a cartoon.");

// Initialize the chat
var chat = model.StartChat();

// To generate an initial response, send a user message with the image and text prompt
var response = await chat.SendMessageAsync(new [] { prompt, image });

// Inspect the returned image
var imageParts = response.Candidates.First().Content.Parts
    .OfType<ModelContent.InlineDataPart>()
    .Where(part => part.MimeType == "image/png");

// Load the image into a Unity Texture2D object
UnityEngine.Texture2D texture2D = new(2, 2);
if (texture2D.LoadImage(imageParts.First().Data.ToArray())) {
  // Do something with the image
}

// Follow up requests do not need to specify the image again
var followUpResponse = await chat.SendMessageAsync("But make it old-school line drawing style");

// Inspect the returned image
var followUpImageParts = followUpResponse.Candidates.First().Content.Parts
    .OfType<ModelContent.InlineDataPart>()
    .Where(part => part.MimeType == "image/png");

// Load the image into a Unity Texture2D object
UnityEngine.Texture2D followUpTexture2D = new(2, 2);
if (followUpTexture2D.LoadImage(followUpImageParts.First().Data.ToArray())) {
  // Do something with the image
}
Supported features, limitations, and best practices
Supported modalities and capabilities
The following modalities and capabilities are supported for image output from a Gemini model. Each capability includes an example prompt and has an example code sample above.
- Text → Image(s) (text-only to image)
  Example: Generate an image of the Eiffel tower with fireworks in the background.
- Text → Image(s) (text rendering within image)
  Example: Generate a cinematic photo of a large building with this giant text projection mapped on the front of the building.
- Text → Image(s) & Text (interleaved)
  Examples: Generate an illustrated recipe for a paella. Create images alongside the text as you generate the recipe. Or: Generate a story about a dog in a 3D cartoon animation style. For each scene, generate an image.
- Image(s) & Text → Image(s) & Text (interleaved)
  Example: [image of a furnished room] + What other color sofas would work in my space? Can you update the image?
- Image editing (text-and-image to image); see the Kotlin sketch after this list.
  Examples: [image of scones] + Edit this image to make it look like a cartoon. Or: [image of a cat] + [image of a pillow] + Create a cross stitch of my cat on this pillow.
- Multi-turn image editing (chat)
  Example: [image of a blue car] + Turn this car into a convertible., then Now change the color to yellow.
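The multi-image editing prompt above (cat + pillow) doesn't have a dedicated code sample earlier on this page. The following is a minimal Kotlin sketch of how such a prompt could be assembled; it assumes the same model configuration as the Kotlin samples above, and the drawable resources named cat and pillow are hypothetical.
// Minimal sketch: pass multiple images plus text in a single editing prompt.
// Assumes `model` is configured with responseModalities TEXT and IMAGE (as above),
// and that R.drawable.cat and R.drawable.pillow are hypothetical resources.
val catBitmap = BitmapFactory.decodeResource(context.resources, R.drawable.cat)
val pillowBitmap = BitmapFactory.decodeResource(context.resources, R.drawable.pillow)

val prompt = content {
    image(catBitmap)
    image(pillowBitmap)
    text("Create a cross stitch of my cat on this pillow.")
}

// Extract the first returned image as a bitmap, if any
val generatedImageAsBitmap = model.generateContent(prompt)
    .candidates.first().content.parts
    .filterIsInstance<ImagePart>()
    .firstOrNull()
    ?.image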
Limitations and best practices
The following are limitations and best practices for image output from a Gemini model.
- Image-generating Gemini models support the following:
  - Generating PNG images with a maximum dimension of 1024 px.
  - Generating and editing images of people.
  - Using safety filters that provide a flexible and less restrictive user experience.
- Image-generating Gemini models do not support the following:
  - Including audio or video inputs.
  - Generating only images. The models always return both text and images, and you must include responseModalities: ["TEXT", "IMAGE"] in your model configuration.
- For best performance, use the following languages: en, es-mx, ja-jp, zh-cn, hi-in.
- Image generation may not always trigger. Here are some known issues:
  - The model may output text only. Try asking for image outputs explicitly (for example, "generate an image", "provide images as you go along", "update the image").
  - The model may stop generating partway through. Try again or try a different prompt.
  - The model may generate text as an image. Try asking for text outputs explicitly. For example, "generate narrative text along with illustrations."
- When generating text for an image, Gemini works best if you first generate the text and then ask for an image with the text (see the sketch below).
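As an illustration of that last best practice, here is a minimal Kotlin sketch that uses multi-turn chat to generate the text first and then request an image that renders it. It assumes the same model configuration as the Kotlin samples above; the prompts are only examples.
// Minimal sketch: generate the text in one turn, then ask for an image with that text.
// Assumes `model` is configured with responseModalities TEXT and IMAGE (as above).
val chat = model.startChat()

// Turn 1: generate the text first
val textResponse = chat.sendMessage("Write a short, catchy slogan for a neighborhood bakery.")
val slogan = textResponse.text

// Turn 2: ask for an image that renders that text
val imageResponse = chat.sendMessage(
    "Generate an image of the bakery storefront with the slogan \"$slogan\" painted on the window."
)
val generatedImageAsBitmap = imageResponse.candidates.first().content.parts
    .filterIsInstance<ImagePart>()
    .firstOrNull()
    ?.image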