The Imagen API lets you create high quality images in seconds, using text prompts and reference images to guide subject or style generation.
View Imagen for Editing and Customization model card
Supported Models
Model | Code |
---|---|
Customization using reference images (few-shot) | imagen-3.0-capability-001 |
For more information about the features that each model supports, see Imagen models.
HTTP method and URL
POST https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/imagen-3.0-capability-001:predict
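The URL pattern above can be assembled from a project ID and a region. The following Python sketch is illustrative only: the function name `predict_url` and the sample values `my-project` and `us-central1` are assumptions, not part of the API.

```python
def predict_url(project_id: str, location: str,
                model: str = "imagen-3.0-capability-001") -> str:
    """Return the :predict endpoint URL for a project and region.

    The helper name and its defaults are hypothetical; only the URL
    pattern itself comes from this page.
    """
    return (
        f"https://{location}-aiplatform.googleapis.com/v1"
        f"/projects/{project_id}/locations/{location}"
        f"/publishers/google/models/{model}:predict"
    )


# Placeholder project ID and region, for illustration only.
print(predict_url("my-project", "us-central1"))
```

The region appears twice in the URL, once in the host name and once in the path, so passing it as a single argument keeps the two in sync.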
Example syntax
Syntax to customize an image from a text prompt and reference images.
REST
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/imagen-3.0-capability-001:predict \
  -d '{
    "instances": [
      {
        // Use [1] to refer to the reference images with referenceId=1,
        // [2] to refer to the reference images with referenceId=2,
        // following the same format for all reference IDs that you provide.
        "prompt": "${TEXT_PROMPT}",
        "referenceImages": [
          // A list of at most 4 reference image objects.
          [...]
        ]
      }
    ],
    "parameters": {
      [...]
    }
  }'
Sample request body:
This request is for person customization with a face mesh control image and three reference images.
{
  "instances": [
    {
      "prompt": "Create an image about a man with short hair [1] in the pose of control image [2] to match the description: A pencil style sketch of a full-body portrait of a man with short hair [1] with hatch-cross drawing, hatch drawing of portrait with 6B and graphite pencils, white background, pencil drawing, high quality, pencil stroke, looking at camera, natural human eyes",
      "referenceImages": [
        {
          "referenceType": "REFERENCE_TYPE_CONTROL",
          "referenceId": 2,
          "referenceImage": { "bytesBase64Encoded": "${IMAGE_BYTES_1}" },
          "controlImageConfig": {
            "controlType": "CONTROL_TYPE_FACE_MESH",
            "enableControlImageComputation": true
          }
        },
        {
          "referenceType": "REFERENCE_TYPE_SUBJECT",
          "referenceId": 1,
          "referenceImage": { "bytesBase64Encoded": "${IMAGE_BYTES_2}" },
          "subjectImageConfig": {
            "subjectDescription": "a man with short hair",
            "subjectType": "SUBJECT_TYPE_PERSON"
          }
        },
        {
          "referenceType": "REFERENCE_TYPE_SUBJECT",
          "referenceId": 1,
          "referenceImage": { "bytesBase64Encoded": "${IMAGE_BYTES_3}" },
          "subjectImageConfig": {
            "subjectDescription": "a man with short hair",
            "subjectType": "SUBJECT_TYPE_PERSON"
          }
        },
        {
          "referenceType": "REFERENCE_TYPE_SUBJECT",
          "referenceId": 1,
          "referenceImage": { "bytesBase64Encoded": "${IMAGE_BYTES_4}" },
          "subjectImageConfig": {
            "subjectDescription": "a man with short hair",
            "subjectType": "SUBJECT_TYPE_PERSON"
          }
        }
      ]
    }
  ],
  "parameters": {
    "negativePrompt": "wrinkles, noise, Low quality, dirty, low res, multi face, rough texture, messy, messy background, color background, photo realistic, photo, super realistic, signature, autograph, sign, text, characters, alphabet, letter",
    "seed": 1,
    "language": "en",
    "sampleCount": 4
  }
}
Parameter list
See examples for implementation details.
Customize images
REST
referenceType
- REFERENCE_TYPE_RAW
  - A raw reference image is required for editing use cases.
  - A raw reference image isn't needed for other use cases.
  - A request can contain at most one raw reference image.
  - The output image has the same size as the raw reference input image.
- REFERENCE_TYPE_MASK
  - A mask reference image is required for masked editing use cases.
  - A mask reference image isn't required for other use cases.
  - If a raw reference image is present, the mask image must be the same size as the raw reference image.
  - You can either provide your own mask or let Imagen compute the mask from the provided reference image.
  - If the mask reference image is empty and maskMode is not set to MASK_MODE_USER_PROVIDED, the mask is computed based on the raw reference image.
- REFERENCE_TYPE_CONTROL
  - If a raw reference image is present, the control image must be the same size as the raw reference image.
  - If the control reference image is empty and enableControlImageComputation is set to true, the control image is computed based on the raw reference image.
- REFERENCE_TYPE_SUBJECT
  - You can provide multiple reference images with the same reference ID. For example, multiple images of the same subject can share one reference ID, which might improve the output quality.
- REFERENCE_TYPE_STYLE
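Two of the limits stated on this page (at most 4 reference image objects, and at most one raw reference image per request) can be checked on the client before sending a request. The following Python sketch is one possible pre-flight check; the function name `check_reference_images` is hypothetical, and the API itself may enforce additional rules that are not covered here.

```python
from collections import Counter

# Limit taken from the syntax comment on this page:
# "A list of at most 4 reference image objects."
MAX_REFERENCE_IMAGES = 4


def check_reference_images(reference_images: list[dict]) -> list[str]:
    """Return a list of detected problems; an empty list means none were found.

    Hypothetical helper: it only checks the two limits quoted above.
    """
    problems = []
    if len(reference_images) > MAX_REFERENCE_IMAGES:
        problems.append(
            f"{len(reference_images)} reference images given, "
            f"at most {MAX_REFERENCE_IMAGES} allowed"
        )
    # Count how many entries use each referenceType value.
    counts = Counter(ref.get("referenceType") for ref in reference_images)
    if counts["REFERENCE_TYPE_RAW"] > 1:
        problems.append("more than one REFERENCE_TYPE_RAW image")
    return problems


refs = [
    {"referenceType": "REFERENCE_TYPE_RAW", "referenceId": 1},
    {"referenceType": "REFERENCE_TYPE_RAW", "referenceId": 2},
]
print(check_reference_images(refs))
```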
referenceId
integer
The reference ID. Use this reference ID in the prompt. For example, use [1] to refer to the reference images with referenceId=1, and [2] to refer to the reference images with referenceId=2.
referenceImage.bytesBase64Encoded
string
A Base64 string for the encoded reference image.
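As an illustration of producing the Base64 string, the following Python sketch encodes raw image bytes with the standard library. The helper names `encode_image_bytes` and `reference_image_from_file` are assumptions made for this example.

```python
import base64


def encode_image_bytes(image_bytes: bytes) -> str:
    """Return the Base64 text form of raw image bytes."""
    return base64.b64encode(image_bytes).decode("ascii")


def reference_image_from_file(path: str) -> dict:
    """Read an image file and wrap it as a referenceImage object.

    Hypothetical helper; the returned shape mirrors the
    "referenceImage": {"bytesBase64Encoded": ...} field in the samples.
    """
    with open(path, "rb") as image_file:
        return {"bytesBase64Encoded": encode_image_bytes(image_file.read())}


# The first four bytes of a PNG file, used here as stand-in image data.
print(encode_image_bytes(b"\x89PNG"))
```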
maskImageConfig.maskMode
- MASK_MODE_USER_PROVIDED, if the reference image is a mask image.
- MASK_MODE_BACKGROUND, to automatically generate a mask using background segmentation.
- MASK_MODE_FOREGROUND, to automatically generate a mask using foreground segmentation.
- MASK_MODE_SEMANTIC, to automatically generate a mask using semantic segmentation and the given mask class.
Specified when referenceType is set to REFERENCE_TYPE_MASK.
maskImageConfig.dilation
float. Range: [0, 1]. The percentage of image width to dilate this mask by.
Specified when referenceType is set to REFERENCE_TYPE_MASK.
maskImageConfig.maskClasses
list[Integer]. Mask classes for MASK_MODE_SEMANTIC mode.
Specified when referenceType is set to REFERENCE_TYPE_MASK.
controlImageConfig.controlType
- CONTROL_TYPE_FACE_MESH for face mesh (person customization).
- CONTROL_TYPE_CANNY for canny edge.
- CONTROL_TYPE_SCRIBBLE for scribble.
Specified when referenceType is set to REFERENCE_TYPE_CONTROL.
controlImageConfig.enableControlImageComputation
bool. Default: false.
- Set to false if you provide your own control image.
- Set to true if you want to let Imagen compute the control image from the reference image.
Specified when referenceType is set to REFERENCE_TYPE_CONTROL.
language
Optional: string (imagen-3.0-capability-001, imagen-3.0-generate-001, and imagegeneration@006 only)
The language code that corresponds to your text prompt language. The following values are supported:
- auto: Automatic detection. If Imagen detects a supported language, the prompt and an optional negative prompt are translated to English. If the detected language isn't supported, Imagen uses the input text verbatim, which might result in unexpected output. No error code is returned.
- es: Spanish
- hi: Hindi
- ja: Japanese
- ko: Korean
- pt: Portuguese
- zh-TW: Chinese (traditional)
- zh or zh-CN: Chinese (simplified)
- en: English (the default value if omitted)
subjectImageConfig.subjectDescription
string. A short description of the subject in the image. For example, a woman with short brown hair.
Specified when referenceType is set to REFERENCE_TYPE_SUBJECT.
subjectImageConfig.subjectType
- SUBJECT_TYPE_PERSON: Person subject type.
- SUBJECT_TYPE_ANIMAL: Animal subject type.
- SUBJECT_TYPE_PRODUCT: Product subject type.
- SUBJECT_TYPE_DEFAULT: Default subject type.
Specified when referenceType is set to REFERENCE_TYPE_SUBJECT.
styleImageConfig.styleDescription
string. A short description for the style.
Specified when referenceType is set to REFERENCE_TYPE_STYLE.
Response
The response body from the REST request.
predictions
An array of VisionGenerativeModelResult objects, one for each requested sampleCount. Images that are filtered by responsible AI are not included.
Vision generative model result object
Information about the model result.
bytesBase64Encoded
The base64 encoded generated image. Not present if the output image did not pass responsible AI filters.
mimeType
The type of the generated image. Not present if the output image did not pass responsible AI filters.
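A response shaped as described above can be turned back into image bytes on the client. The following Python sketch assumes the response has already been parsed into a dictionary; the function name `decode_predictions` is hypothetical.

```python
import base64


def decode_predictions(response: dict) -> list[tuple[str, bytes]]:
    """Return (mimeType, image bytes) pairs for predictions that carry image data."""
    images = []
    for prediction in response.get("predictions", []):
        encoded = prediction.get("bytesBase64Encoded")
        if encoded is None:
            # Per the field descriptions above, the image fields are not
            # present when an image did not pass responsible AI filters.
            continue
        images.append((prediction.get("mimeType", ""), base64.b64decode(encoded)))
    return images


# "aGVsbG8=" is the Base64 form of b"hello", used as stand-in image data.
sample = {"predictions": [{"bytesBase64Encoded": "aGVsbG8=", "mimeType": "image/png"}]}
for mime_type, data in decode_predictions(sample):
    print(mime_type, len(data))
```

Because filtered images are omitted, the number of decoded images can be smaller than the requested sampleCount.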
Examples
The following examples show how to use the Imagen model to customize images.
Customize images
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your Google Cloud project ID.
- LOCATION: Your project's region. For example, us-central1, europe-west2, or asia-northeast3. For a list of available regions, see Generative AI on Vertex AI locations.
- TEXT_PROMPT: The text prompt that guides what images the model generates. To use Imagen 3 Customization, include the referenceId of the reference image or images you provide in the format [$referenceId]. For example:
  - The following text prompt is for a request that has two reference images with "referenceId": 1. Both images have an optional description of "subjectDescription": "man with short hair":
    Create an image about a man with short hair to match the description: A pencil style sketch of a full-body portrait of a man with short hair [1] with hatch-cross drawing, hatch drawing of portrait with 6B and graphite pencils, white background, pencil drawing, high quality, pencil stroke, looking at camera, natural human eyes
- "referenceId": The ID of the reference image, or the ID for a series of reference images that correspond to the same subject or style. In this example the two reference images are of the same person, so they share the same referenceId (1).
- BASE64_REFERENCE_IMAGE: A reference image to guide image generation. The image must be specified as a base64-encoded byte string.
- SUBJECT_DESCRIPTION: Optional. A text description of the reference image that you can then use in the prompt field. For example:
  "prompt": "a full-body portrait of a man with short hair [1] with hatch-cross drawing", [...], "subjectDescription": "man with short hair"
- IMAGE_COUNT: The number of generated images. Accepted integer values: 1-4. Default value: 4.
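Because the prompt refers to reference images by their ID in square brackets, it can help to confirm that every referenceId you send is actually cited in the prompt. The following Python sketch is one way to do that; the function name `missing_prompt_references` is hypothetical, and treating an uncited ID as a problem is an assumption of this example rather than a documented API rule.

```python
import re


def missing_prompt_references(prompt: str, reference_images: list[dict]) -> set[int]:
    """Return referenceId values that never appear in the prompt as [id]."""
    # Collect every bracketed integer in the prompt, such as [1] or [2].
    cited = {int(number) for number in re.findall(r"\[(\d+)\]", prompt)}
    provided = {ref["referenceId"] for ref in reference_images}
    return provided - cited


prompt = "A pencil style sketch of a man with short hair [1]"
refs = [{"referenceId": 1}, {"referenceId": 1}, {"referenceId": 2}]
print(missing_prompt_references(prompt, refs))
```

In this usage example the reference with ID 2 is never cited, so the function reports it.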
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict
Request JSON body:
{
  "instances": [
    {
      "prompt": "TEXT_PROMPT",
      "referenceImages": [
        {
          "referenceType": "REFERENCE_TYPE_SUBJECT",
          "referenceId": 1,
          "referenceImage": {
            "bytesBase64Encoded": "BASE64_REFERENCE_IMAGE"
          },
          "subjectImageConfig": {
            "subjectDescription": "SUBJECT_DESCRIPTION",
            "subjectType": "SUBJECT_TYPE_PERSON"
          }
        },
        {
          "referenceType": "REFERENCE_TYPE_SUBJECT",
          "referenceId": 1,
          "referenceImage": {
            "bytesBase64Encoded": "BASE64_REFERENCE_IMAGE"
          },
          "subjectImageConfig": {
            "subjectDescription": "SUBJECT_DESCRIPTION",
            "subjectType": "SUBJECT_TYPE_PERSON"
          }
        }
      ]
    }
  ],
  "parameters": {
    "sampleCount": IMAGE_COUNT
  }
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict"
PowerShell
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagen-3.0-capability-001:predict" | Select-Object -Expand Content
The following sample response is for a request with "sampleCount": 2. The response returns two prediction objects, with the generated image bytes base64-encoded.
{
  "predictions": [
    {
      "bytesBase64Encoded": "BASE64_IMG_BYTES",
      "mimeType": "image/png"
    },
    {
      "mimeType": "image/png",
      "bytesBase64Encoded": "BASE64_IMG_BYTES"
    }
  ]
}
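The request body used in this example can also be assembled programmatically. The following Python sketch builds the same structure from a prompt, a list of already-encoded images, and a subject description. The function name `build_subject_request` and the sample argument values are assumptions for illustration; the field names and the 1-4 range for the image count come from this page.

```python
import json


def build_subject_request(prompt: str, encoded_images: list[str],
                          subject_description: str,
                          image_count: int = 4, reference_id: int = 1) -> dict:
    """Build a person-customization request body (hypothetical helper)."""
    if not 1 <= image_count <= 4:
        raise ValueError("image_count must be between 1 and 4")
    # Every image describes the same subject, so all share one referenceId.
    reference_images = [
        {
            "referenceType": "REFERENCE_TYPE_SUBJECT",
            "referenceId": reference_id,
            "referenceImage": {"bytesBase64Encoded": encoded},
            "subjectImageConfig": {
                "subjectDescription": subject_description,
                "subjectType": "SUBJECT_TYPE_PERSON",
            },
        }
        for encoded in encoded_images
    ]
    return {
        "instances": [{"prompt": prompt, "referenceImages": reference_images}],
        "parameters": {"sampleCount": image_count},
    }


# Placeholder strings stand in for real Base64 image data.
body = build_subject_request(
    "A pencil sketch of a man with short hair [1]",
    ["BASE64_REFERENCE_IMAGE", "BASE64_REFERENCE_IMAGE"],
    "man with short hair",
    image_count=2,
)
print(json.dumps(body, indent=2))
```

The printed JSON can be saved as request.json and sent with either of the commands shown earlier.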
Class IDs
Use the following object class IDs to automatically create an image mask based on specific objects.
Class ID (class_) | Object |
---|---|
0 | backpack |
1 | umbrella |
2 | bag |
3 | tie |
4 | suitcase |
5 | case |
6 | bird |
7 | cat |
8 | dog |
9 | horse |
10 | sheep |
11 | cow |
12 | elephant |
13 | bear |
14 | zebra |
15 | giraffe |
16 | animal (other) |
17 | microwave |
18 | radiator |
19 | oven |
20 | toaster |
21 | storage tank |
22 | conveyor belt |
23 | sink |
24 | refrigerator |
25 | washer dryer |
26 | fan |
27 | dishwasher |
28 | toilet |
29 | bathtub |
30 | shower |
31 | tunnel |
32 | bridge |
33 | pier wharf |
34 | tent |
35 | building |
36 | ceiling |
37 | laptop |
38 | keyboard |
39 | mouse |
40 | remote |
41 | cell phone |
42 | television |
43 | floor |
44 | stage |
45 | banana |
46 | apple |
47 | sandwich |
48 | orange |
49 | broccoli |
50 | carrot |
51 | hot dog |
52 | pizza |
53 | donut |
54 | cake |
55 | fruit (other) |
56 | food (other) |
57 | chair (other) |
58 | armchair |
59 | swivel chair |
60 | stool |
61 | seat |
62 | couch |
63 | trash can |
64 | potted plant |
65 | nightstand |
66 | bed |
67 | table |
68 | pool table |
69 | barrel |
70 | desk |
71 | ottoman |
72 | wardrobe |
73 | crib |
74 | basket |
75 | chest of drawers |
76 | bookshelf |
77 | counter (other) |
78 | bathroom counter |
79 | kitchen island |
80 | door |
81 | light (other) |
82 | lamp |
83 | sconce |
84 | chandelier |
85 | mirror |
86 | whiteboard |
87 | shelf |
88 | stairs |
89 | escalator |
90 | cabinet |
91 | fireplace |
92 | stove |
93 | arcade machine |
94 | gravel |
95 | platform |
96 | playingfield |
97 | railroad |
98 | road |
99 | snow |
100 | sidewalk pavement |
101 | runway |
102 | terrain |
103 | book |
104 | box |
105 | clock |
106 | vase |
107 | scissors |
108 | plaything (other) |
109 | teddy bear |
110 | hair dryer |
111 | toothbrush |
112 | painting |
113 | poster |
114 | bulletin board |
115 | bottle |
116 | cup |
117 | wine glass |
118 | knife |
119 | fork |
120 | spoon |
121 | bowl |
122 | tray |
123 | range hood |
124 | plate |
125 | person |
126 | rider (other) |
127 | bicyclist |
128 | motorcyclist |
129 | paper |
130 | streetlight |
131 | road barrier |
132 | mailbox |
133 | cctv camera |
134 | junction box |
135 | traffic sign |
136 | traffic light |
137 | fire hydrant |
138 | parking meter |
139 | bench |
140 | bike rack |
141 | billboard |
142 | sky |
143 | pole |
144 | fence |
145 | railing banister |
146 | guard rail |
147 | mountain hill |
148 | rock |
149 | frisbee |
150 | skis |
151 | snowboard |
152 | sports ball |
153 | kite |
154 | baseball bat |
155 | baseball glove |
156 | skateboard |
157 | surfboard |
158 | tennis racket |
159 | net |
160 | base |
161 | sculpture |
162 | column |
163 | fountain |
164 | awning |
165 | apparel |
166 | banner |
167 | flag |
168 | blanket |
169 | curtain (other) |
170 | shower curtain |
171 | pillow |
172 | towel |
173 | rug floormat |
174 | vegetation |
175 | bicycle |
176 | car |
177 | autorickshaw |
178 | motorcycle |
179 | airplane |
180 | bus |
181 | train |
182 | truck |
183 | trailer |
184 | boat ship |
185 | slow wheeled object |
186 | river lake |
187 | sea |
188 | water (other) |
189 | swimming pool |
190 | waterfall |
191 | wall |
192 | window |
193 | window blind |
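As an illustration of how these class IDs might be used, the following Python sketch builds a mask reference object that combines MASK_MODE_SEMANTIC with maskClasses. The function name `semantic_mask_reference` and the `CLASS_IDS` excerpt are hypothetical, and omitting referenceImage so that the mask is computed from the raw reference image is an assumption based on the referenceType notes on this page.

```python
# A small excerpt of the class ID table above; extend it as needed.
CLASS_IDS = {"cat": 7, "dog": 8, "person": 125, "sky": 142, "car": 176}


def semantic_mask_reference(reference_id: int, objects: list[str],
                            dilation: float = 0.0) -> dict:
    """Build a REFERENCE_TYPE_MASK object for semantic segmentation.

    Hypothetical helper: the exact request shape is an assumption.
    """
    if not 0.0 <= dilation <= 1.0:
        raise ValueError("dilation must be in the range [0, 1]")
    return {
        "referenceType": "REFERENCE_TYPE_MASK",
        "referenceId": reference_id,
        "maskImageConfig": {
            "maskMode": "MASK_MODE_SEMANTIC",
            # Translate object names into the integer class IDs.
            "maskClasses": [CLASS_IDS[name] for name in objects],
            "dilation": dilation,
        },
    }


print(semantic_mask_reference(2, ["cat", "dog"], dilation=0.01))
```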
What's next
- For more information, see Imagen on Vertex AI .