Editing with Imagen is only supported if you're using the Vertex AI Gemini API. It's also currently only supported for Android and Flutter apps. Support for other platforms is coming later in the year.
This page describes how to use the customization capability from Imagen to edit or generate images based on a specified control using the Firebase AI Logic SDKs.
How it works: You provide a text prompt and at least one control reference image (like a drawing or a Canny edge image). The model uses these inputs to generate a new image based on the control images. For example, you can provide the model with a drawing of a rocket ship and the moon along with a text prompt to create a watercolor painting based on the drawing.
The reference image for controlled customization can be a scribble, a Canny edge image, or a face mesh.
What's a scribble?
A scribble is a rough, hand-drawn sketch or outline that provides the model with a basic structure, spatial arrangement, and layout to follow. The text prompt provides the details, color, and texture for the generated image.
Example: You provide a drawing of a house, a tree, and a sun, and you also
provide a text prompt like "A whimsical watercolor painting of a cottage with
a large oak tree next to it at sunrise." The model will then generate an image
that matches the described scene while following the general layout from your
drawing.
What's a Canny edge image?
A Canny edge image is a source image that has been processed by the Canny edge detector algorithm to map the edges of the objects within it. These edges help the model maintain the precise structure of the objects while changing the style, color, or other attributes specified by the text prompt.
Example: You have a photo of a dog sitting on a couch. You run the
Canny edge detector on the photo to get an image of just the dog's and couch's
outlines. You then use this edge map as the control image and a text prompt
like "a photo of a golden retriever puppy on a leather sofa." The model will
generate a new photo that matches the exact pose of the original dog and the
composition of the couch, but with a golden retriever puppy and a leather sofa
instead of the original subjects.
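The Firebase AI Logic SDKs don't create edge maps for you. If you want to produce one on-device, the following is a minimal sketch that assumes you've added and initialized the OpenCV Android SDK (for example, with OpenCVLoader.initLocal()); the helper name toCannyEdgeBitmap and the threshold values are illustrative only.

import android.graphics.Bitmap
import org.opencv.android.Utils
import org.opencv.core.Mat
import org.opencv.imgproc.Imgproc

// Minimal sketch: derive a Canny edge map Bitmap from a source photo.
// Assumes the OpenCV Android SDK has already been loaded in your app.
fun toCannyEdgeBitmap(source: Bitmap, threshold1: Double = 100.0, threshold2: Double = 200.0): Bitmap {
  // Convert the Bitmap to an OpenCV Mat.
  val srcMat = Mat()
  Utils.bitmapToMat(source, srcMat)

  // Convert to grayscale before running edge detection.
  val grayMat = Mat()
  Imgproc.cvtColor(srcMat, grayMat, Imgproc.COLOR_RGBA2GRAY)

  // Run the Canny edge detector; the thresholds control edge sensitivity.
  val edgesMat = Mat()
  Imgproc.Canny(grayMat, edgesMat, threshold1, threshold2)

  // Convert the single-channel edge map back to a Bitmap for use as a control image.
  val edgeBitmap = Bitmap.createBitmap(source.width, source.height, Bitmap.Config.ARGB_8888)
  Utils.matToBitmap(edgesMat, edgeBitmap)
  return edgeBitmap
}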
What's a face mesh?
A face mesh is an image that helps the model understand and replicate a specific face. It's a digital representation of a human face in 3D, typically a network of interconnected points (vertices) and triangles that define the shape and contours of the face. This provides the model with key landmarks (like the eyes, nose, and mouth) and textures.
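The Firebase AI Logic SDKs don't produce face meshes for you. As a rough sketch only, the code below assumes ML Kit's Face Mesh Detection API (beta) and simply draws the detected mesh points onto a black Bitmap; verify the ML Kit calls against its own documentation, and note that whether this rendering is an acceptable control image is also an assumption.

import android.graphics.Bitmap
import android.graphics.Canvas
import android.graphics.Color
import android.graphics.Paint
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.facemesh.FaceMeshDetection

// Rough sketch (assumes ML Kit Face Mesh Detection, beta): detect mesh points
// in a portrait photo and render them onto a black Bitmap as white dots.
fun renderFaceMeshPoints(portrait: Bitmap, onResult: (Bitmap?) -> Unit) {
  val detector = FaceMeshDetection.getClient()
  detector.process(InputImage.fromBitmap(portrait, 0))
    .addOnSuccessListener { meshes ->
      val mesh = meshes.firstOrNull()
      if (mesh == null) {
        onResult(null)
        return@addOnSuccessListener
      }
      val output = Bitmap.createBitmap(portrait.width, portrait.height, Bitmap.Config.ARGB_8888)
      val canvas = Canvas(output)
      canvas.drawColor(Color.BLACK)
      val paint = Paint().apply { color = Color.WHITE; strokeWidth = 3f }
      // Each detected point carries a 3D position; draw its x/y projection.
      for (point in mesh.allPoints) {
        canvas.drawPoint(point.position.x, point.position.y, paint)
      }
      onResult(output)
    }
    .addOnFailureListener { onResult(null) }
}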
Before you begin
Only available when using the Vertex AI Gemini API as your API provider.
If you haven't already, complete the getting started guide, which describes how to set up your Firebase project, connect your app to Firebase, add the SDK, initialize the backend service for your chosen API provider, and create an ImagenModel instance.
Models that support this capability
Imagen offers image editing through its capability model:
imagen-3.0-capability-001
Note that for Imagen models, the global location is not supported.
Send a controlled customization request
The following sample shows a controlled customization request that asks the model to generate a new image based on the provided reference image (in this example, a drawing of space, like a rocket and the moon). Since the reference image is a rough, hand-drawn sketch or outline, it uses the control type CONTROL_TYPE_SCRIBBLE.
If your reference image is a Canny edge image or a face mesh, you can also use this example but with the following changes:
If your reference image is a Canny edge image, use the control type CONTROL_TYPE_CANNY.
If your reference image is a face mesh, use the control type CONTROL_TYPE_FACE_MESH.
Review the prompt templates later on this page to learn about writing prompts and how to use reference images within them.
Swift
Image editing with Imagen models isn't supported for Swift. Check back later this year!
Kotlin
// Using this SDK to access Imagen models is a Preview release and requires opt-in
@OptIn(PublicPreviewAPI::class)
suspend fun customizeImage() {
  // Initialize the Vertex AI Gemini API backend service
  // Optionally specify the location to access the model (for example, `us-central1`)
  val ai = Firebase.ai(backend = GenerativeBackend.vertexAI(location = "us-central1"))

  // Create an `ImagenModel` instance with an Imagen "capability" model
  val model = ai.imagenModel("imagen-3.0-capability-001")

  // This example assumes 'referenceImage' is a pre-loaded Bitmap.
  // In a real app, this might come from the user's device or a URL.
  val referenceImage: Bitmap = TODO("Load your reference image Bitmap here")

  // Define the control reference using the reference image.
  val controlReference = ImagenControlReference(
    image = referenceImage,
    referenceID = 1,
    controlType = CONTROL_TYPE_SCRIBBLE
  )

  // Provide a prompt that describes the final image.
  // The "[1]" links the prompt to the control reference with ID 1.
  val prompt = "A cat flying through outer space arranged like the space scribble[1]"

  // Use the editImage API to perform the controlled customization.
  // Pass the list of references, the prompt, and an editing configuration.
  val editedImage = model.editImage(
    references = listOf(controlReference),
    prompt = prompt,
    config = ImagenEditingConfig(
      editSteps = 50 // Number of editing steps, a higher value can improve quality
    )
  )

  // Process the result
}
Java
// Initialize the Vertex AI Gemini API backend service
// Optionally specify the location to access the model (for example, `us-central1`)
// Create an `ImagenModel` instance with an Imagen "capability" model
ImagenModel imagenModel = FirebaseAI.getInstance(GenerativeBackend.vertexAI("us-central1"))
        .imagenModel(/* modelName */ "imagen-3.0-capability-001");
ImagenModelFutures model = ImagenModelFutures.from(imagenModel);

// This example assumes 'referenceImage' is a pre-loaded Bitmap.
// In a real app, this might come from the user's device or a URL.
Bitmap referenceImage = null; // TODO("Load your image Bitmap here");

// Define the control reference using the reference image.
ImagenControlReference controlReference = new ImagenControlReference.Builder()
        .setImage(referenceImage)
        .setReferenceID(1)
        .setControlType(CONTROL_TYPE_SCRIBBLE)
        .build();

// Provide a prompt that describes the final image.
// The "[1]" links the prompt to the control reference with ID 1.
String prompt = "A cat flying through outer space arranged like the space scribble[1]";

// Define the editing configuration.
ImagenEditingConfig imagenEditingConfig = new ImagenEditingConfig.Builder()
        .setEditSteps(50) // Number of editing steps, a higher value can improve quality
        .build();

// Use the editImage API to perform the controlled customization.
// Pass the list of references, the prompt, and an editing configuration.
Futures.addCallback(
        model.editImage(Collections.singletonList(controlReference), prompt, imagenEditingConfig),
        new FutureCallback<ImagenGenerationResponse>() {
            @Override
            public void onSuccess(ImagenGenerationResponse result) {
                if (result.getImages().isEmpty()) {
                    Log.d("TAG", "No images generated");
                    return;
                }
                Bitmap bitmap = result.getImages().get(0).asBitmap();
                // Use the bitmap to display the image in your UI
            }

            @Override
            public void onFailure(Throwable t) {
                // ...
            }
        },
        Executors.newSingleThreadExecutor());
Web
Image editing with Imagen models isn't supported for Web apps. Check back later this year!
Dart
import 'dart:typed_data';
import 'package:firebase_ai/firebase_ai.dart';
import 'package:firebase_core/firebase_core.dart';
import 'firebase_options.dart';

// Initialize FirebaseApp
await Firebase.initializeApp(
  options: DefaultFirebaseOptions.currentPlatform,
);

// Initialize the Vertex AI Gemini API backend service
// Optionally specify a location to access the model (for example, `us-central1`)
final ai = FirebaseAI.vertexAI(location: 'us-central1');

// Create an `ImagenModel` instance with an Imagen "capability" model
final model = ai.imagenModel(model: 'imagen-3.0-capability-001');

// This example assumes 'referenceImage' is a pre-loaded Uint8List.
// In a real app, this might come from the user's device or a URL.
final Uint8List referenceImage = Uint8List(0); // TODO: Load your reference image data here

// Define the control reference using the reference image.
final controlReference = ImagenControlReference(
  image: referenceImage,
  referenceId: 1,
  controlType: ImagenControlType.scribble,
);

// Provide a prompt that describes the final image.
// The "[1]" links the prompt to the control reference with ID 1.
final prompt = "A cat flying through outer space arranged like the space scribble[1]";

try {
  // Use the editImage API to perform the controlled customization.
  // Pass the list of references, the prompt, and an editing configuration.
  final response = await model.editImage(
    [controlReference],
    prompt,
    config: ImagenEditingConfig(
      editSteps: 50, // Number of editing steps, a higher value can improve quality
    ),
  );

  // Process the result.
  if (response.images.isNotEmpty) {
    final editedImage = response.images.first.bytes;
    // Use the editedImage (a Uint8List) to display the image, save it, etc.
    print('Image successfully generated!');
  } else {
    // Handle the case where no images were generated.
    print('Error: No images were generated.');
  }
} catch (e) {
  // Handle any potential errors during the API call.
  print('An error occurred: $e');
}
Unity
Image editing with Imagen models isn't supported for Unity. Check back later this year!
Prompt templates
In the request, you provide reference images (up to 4 images) by defining an ImagenControlReference in which you specify a reference ID for an image. Note that multiple images can have the same reference ID (for example, multiple scribbles of the same idea).
Then, when writing the prompt, you refer to these IDs. For example, you use [1] in the prompt to refer to images with the reference ID 1.
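For example, in Kotlin you could pass two scribbles of the same idea under a single reference ID and cite that ID once in the prompt. This sketch reuses the API from the Kotlin sample above; scribbleA and scribbleB are placeholder Bitmaps you'd load yourself.

// Two scribble images that share reference ID 1; the prompt cites them once as "[1]".
val references = listOf(
  ImagenControlReference(image = scribbleA, referenceID = 1, controlType = CONTROL_TYPE_SCRIBBLE),
  ImagenControlReference(image = scribbleB, referenceID = 1, controlType = CONTROL_TYPE_SCRIBBLE)
)

val editedImage = model.editImage(
  references = references,
  prompt = "A watercolor painting that aligns with the scribble map [1]",
  config = ImagenEditingConfig(editSteps = 50)
)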
The following prompt templates can be a starting point for writing prompts for customization based on a control.
Use case: Controlled customization
Reference images: Scribble map (1)
Prompt template: Generate an image that aligns with the scribble map [1] to match the description: ${STYLE_PROMPT} ${PROMPT}.
Example: Generate an image that aligns with the scribble map [1] to match the description: The image should be in the style of an impressionistic oil painting with relaxed brushstrokes. It possesses a naturally-lit ambience and noticeable brushstrokes. A side-view of a car. The car is parked on a wet, reflective road surface, with city lights reflecting in the puddles.

Use case: Controlled customization
Reference images: Canny control image (1)
Prompt template: Generate an image aligning with the edge map [1] to match the description: ${STYLE_PROMPT} ${PROMPT}
Example: Generate an image aligning with the edge map [1] to match the description: The image should be in the style of an impressionistic oil painting, with relaxed brushstrokes. It possesses a naturally-lit ambience and noticeable brushstrokes. A side-view of a car. The car is parked on a wet, reflective road surface, with city lights reflecting in the puddles.

Use case: Person image stylization with FaceMesh input
Reference images: Subject image (1-3), FaceMesh control image (1)
Prompt template: Create an image about SUBJECT_DESCRIPTION [1] in the pose of the CONTROL_IMAGE [2] to match the description: a portrait of SUBJECT_DESCRIPTION [1] ${PROMPT}
Example: Create an image about a woman with short hair [1] in the pose of the control image [2] to match the description: a portrait of a woman with short hair [1] in 3D-cartoon style with a blurred background. A cute and lovely character, with a smiling face, looking at the camera, pastel color tone ...

Use case: Person image stylization with FaceMesh input
Reference images: Subject image (1-3), FaceMesh control image (1)
Prompt template: Create a ${STYLE_PROMPT} image about SUBJECT_DESCRIPTION [1] in the pose of the CONTROL_IMAGE [2] to match the description: a portrait of SUBJECT_DESCRIPTION [1] ${PROMPT}
Example: Create a 3D-cartoon style image about a woman with short hair [1] in the pose of the control image [2] to match the description: a portrait of a woman with short hair [1] in 3D-cartoon style with a blurred background. A cute and lovely character, with a smiling face, looking at the camera, pastel color tone ...
Best practices and limitations
Use cases
The customization capability offers free-style prompting, which can give the
impression that the model can do more than it's trained to do. The following
sections describe intended use cases for customization, and non-exhaustive examples of unintended use cases.
We recommend using this capability for the intended use cases, since we've
trained the model on those use cases and expect good results for them.
Conversely, if you push the model to do things outside of the intended use
cases, you should expect poor results.
Intended use cases
The following are intended use cases for customization based on a control:
Generate an image that follows the prompt and the canny edge control images.
Generate an image that follows the prompt and the scribble images.
Stylize a photo of a person while preserving the facial expression.
Examples of unintended use cases
The following is a non-exhaustive list of unintended use cases for customization based on a control. The model isn't trained for these use cases,
and will likely produce poor results.
Generate an image using a style specified in the prompt.
Generate an image from text that follows a specific style provided by a
reference image, with some level of control on the image composition using
a control image.
Generate an image from text that follows a specific style provided by a
reference image, with some level of control on the image composition using a
control scribble.
Generate an image from text that follows a specific style provided by the
reference image, with some level of control on the image composition using a
control image. The person in the image has a specific facial expression.
Stylize a photo of two or more people, and preserve their facial expressions.
Stylize a photo of a pet, and turn it into a drawing. Preserve or specify the
composition of the image (for example, watercolor).
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Missing the information I need","missingTheInformationINeed","thumb-down"],["Too complicated / too many steps","tooComplicatedTooManySteps","thumb-down"],["Out of date","outOfDate","thumb-down"],["Samples / code issue","samplesCodeIssue","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-09-09 UTC."],[],[],null,[]]