Generate images with Gemini

Gemini 2.5 Flash Image supports response generation in multiple modalities, including text and images.

Image generation

Gemini 2.5 Flash Image (gemini-2.5-flash-image) can generate images in addition to text. This expands Gemini's capabilities to include the following:

  • Iteratively generate images through conversation with natural language, adjusting images while maintaining consistency and context.
  • Generate images with high-quality long text rendering.
  • Generate interleaved text-image output. For example, a blog post with text and images in a single turn. Previously, this required stringing together multiple models.
  • Generate images using Gemini's world knowledge and reasoning capabilities.

With this public experimental release, Gemini 2.5 Flash Image can generate images at 1024 px resolution, supports generating images of people, and includes updated safety filters that provide a more flexible and less restrictive user experience.

It supports the following modalities and capabilities:

  • Text to image

    • Example prompt: "Generate an image of the Eiffel tower with fireworks in the background."
  • Text to image (text rendering)

    • Example prompt: "generate a cinematic photo of a large building with this giant text projection mapped on the front of the building: 'Gemini 2.5 can now generate long form text'"
  • Text to image(s) and text (interleaved)

    • Example prompt: "Generate an illustrated recipe for a paella. Create images alongside the text as you generate the recipe."
    • Example prompt: "Generate a story about a dog in a 3D cartoon animation style. For each scene, generate an image."
  • Image(s) and text to image(s) and text (interleaved)

    • Example prompt: (With an image of a furnished room) "What other color sofas would work in my space? Can you update the image?" (A code sketch of this pattern follows this list.)
  • Locale-aware image generation

    • Example prompt: "Generate an image of a breakfast meal."

Best practices

To improve your image generation results, follow these best practices:

  • Be specific: More details give you more control. For example, instead of "fantasy armor," try "ornate elven plate armor, etched with silver leaf patterns, with a high collar and pauldrons shaped like falcon wings."

  • Provide context and intent: Explain the purpose of the image to help the model understand the context. For example, "Create a logo for a high-end, minimalist skincare brand" works better than "Create a logo."

  • Iterate and refine: Don't expect a perfect image on your first attempt. Use follow-up prompts to make small changes, for example, "Make the lighting warmer" or "Change the character's expression to be more serious." (A chat-based sketch of this pattern follows this list.)

  • Use step-by-step instructions: For complex scenes, split your request into steps. For example, "First, create a background of a serene, misty forest at dawn. Then, in the foreground, add a moss-covered ancient stone altar. Finally, place a single, glowing sword on top of the altar."

  • Describe what you want, not what you don't: Instead of saying "no cars," describe the scene positively, for example, "an empty, deserted street with no signs of traffic."

  • Control the camera: Guide the camera view. Use photographic and cinematic terms to describe the composition, for example, "wide-angle shot," "macro shot," or "low-angle perspective."

  • Prompt for images: Describe the intent by using phrases such as "create an image of" or "generate an image of". Otherwise, the multimodal model might respond with text instead of the image.
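Iterative refinement maps naturally onto a chat session, which keeps earlier images and instructions in context so that each follow-up prompt edits the previous result instead of starting over. The following is a minimal sketch, assuming the google-genai Python SDK; the prompts and file names are illustrative:

from io import BytesIO

from google import genai
from google.genai.types import GenerateContentConfig, Modality
from PIL import Image

client = genai.Client()

# A chat session carries prior turns, so each follow-up refines the last image.
chat = client.chats.create(
    model="gemini-2.5-flash-image",
    config=GenerateContentConfig(response_modalities=[Modality.TEXT, Modality.IMAGE]),
)

def save_images(response, prefix):
    # Write any inline image parts from a response to disk.
    for i, part in enumerate(response.candidates[0].content.parts):
        if part.inline_data:
            Image.open(BytesIO(part.inline_data.data)).save(f"{prefix}-{i}.png")

save_images(chat.send_message("Generate an image of a cozy reading nook."), "draft")
save_images(chat.send_message("Make the lighting warmer."), "refined")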

Limitations

  • For best performance, use the following languages: EN, es-MX, ja-JP, zh-CN, hi-IN.

  • Image generation doesn't support audio or video inputs.

  • The model might not create the exact number of images you ask for.

  • For best results, include a maximum of three images in an input.

  • When generating an image containing text, first generate the text and then generate an image with that text.

  • Image or text generation might not work as expected in these situations:

    • The model might only create text. If you want images, clearly ask for images in your request. For example, "provide images as you go along."

    • The model might create text as an image. To generate text, specifically ask for text output. For example, "generate narrative text along with illustrations."

    • The model might stop generating content even when it's not finished. If this occurs, try again or use a different prompt.

    • If a prompt is potentially unsafe, the model might not process the request and instead returns a response indicating that it can't create unsafe images. In this case, the FinishReason is STOP. (The sketch after this list shows one way to detect these cases.)
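Because a filtered or text-only response still returns normally, it helps to check the candidate's finish reason and confirm that an image part actually arrived before using the result. A minimal sketch, assuming a `response` object from the google-genai Python SDK as in the samples later on this page:

# Defensive handling for the cases above: the model may return no image,
# stop early, or decline an unsafe prompt (finish reason STOP with no image).
candidate = response.candidates[0]
print(f"Finish reason: {candidate.finish_reason}")

image_parts = [part for part in candidate.content.parts if part.inline_data]
if not image_parts:
    # No image came back; retry and ask for images explicitly,
    # for example, "provide images as you go along."
    print("No image in response; retry with an explicit image request.")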

Generate images

The following sections cover how to generate images using either Vertex AI Studio or the API.

For guidance and best practices for prompting, see Design multimodal prompts.

Console

To use image generation:

  1. Open Vertex AI Studio > Create prompt.
  2. Click Switch model and select gemini-2.5-flash-image from the menu.
  3. In the Outputs panel, select Image and text from the drop-down menu.
  4. Write a description of the image you want to generate in the Write a prompt text area.
  5. Click the Prompt button.

Gemini will generate an image based on your description. This process should take a few seconds but might take longer depending on capacity.

Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

from google import genai
from google.genai.types import GenerateContentConfig, Modality
from PIL import Image
from io import BytesIO

client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash-image",
    contents=(
        "Generate an image of the Eiffel tower with fireworks in the background."
    ),
    config=GenerateContentConfig(
        response_modalities=[Modality.TEXT, Modality.IMAGE],
        candidate_count=1,
        safety_settings=[
            {
                "method": "PROBABILITY",
                "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
                "threshold": "BLOCK_MEDIUM_AND_ABOVE",
            },
        ],
    ),
)

for part in response.candidates[0].content.parts:
    if part.text:
        print(part.text)
    elif part.inline_data:
        image = Image.open(BytesIO(part.inline_data.data))
        image.save("output_folder/example-image-eiffel-tower.png")

# Example response:
#   I will generate an image of the Eiffel Tower at night, with a vibrant display of
#   colorful fireworks exploding in the dark sky behind it. The tower will be
#   illuminated, standing tall as the focal point of the scene, with the bursts of
#   light from the fireworks creating a festive atmosphere.

Node.js

Install

npm install @google/genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

const fs = require('fs');
const {GoogleGenAI, Modality} = require('@google/genai');

const GOOGLE_CLOUD_PROJECT = process.env.GOOGLE_CLOUD_PROJECT;
const GOOGLE_CLOUD_LOCATION = process.env.GOOGLE_CLOUD_LOCATION || 'us-central1';

async function generateImage(
  projectId = GOOGLE_CLOUD_PROJECT,
  location = GOOGLE_CLOUD_LOCATION
) {
  const client = new GoogleGenAI({
    vertexai: true,
    project: projectId,
    location: location,
  });

  const response = await client.models.generateContentStream({
    model: 'gemini-2.5-flash-image',
    contents:
      'Generate an image of the Eiffel tower with fireworks in the background.',
    config: {
      responseModalities: [Modality.TEXT, Modality.IMAGE],
    },
  });

  const generatedFileNames = [];
  let imageIndex = 0;
  for await (const chunk of response) {
    const text = chunk.text;
    const data = chunk.data;
    if (text) {
      console.debug(text);
    } else if (data) {
      const outputDir = 'output-folder';
      if (!fs.existsSync(outputDir)) {
        fs.mkdirSync(outputDir, {recursive: true});
      }
      const fileName = `${outputDir}/generate_content_streaming_image_${imageIndex++}.png`;
      console.debug(`Writing response image to file: ${fileName}.`);
      try {
        fs.writeFileSync(fileName, data);
        generatedFileNames.push(fileName);
      } catch (error) {
        console.error(`Failed to write image file ${fileName}:`, error);
      }
    }
  }

  // Example response:
  //   I will generate an image of the Eiffel Tower at night, with a vibrant display of
  //   colorful fireworks exploding in the dark sky behind it. The tower will be
  //   illuminated, standing tall as the focal point of the scene, with the bursts of
  //   light from the fireworks creating a festive atmosphere.
  return generatedFileNames;
}

Java

Learn how to install or update the Gen AI SDK for Java.

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

import com.google.genai.Client;
import com.google.genai.types.Blob;
import com.google.genai.types.Candidate;
import com.google.genai.types.Content;
import com.google.genai.types.GenerateContentConfig;
import com.google.genai.types.GenerateContentResponse;
import com.google.genai.types.Part;
import com.google.genai.types.SafetySetting;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.ImageIO;

public class ImageGenMmFlashWithText {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String modelId = "gemini-2.5-flash-image";
    String outputFile = "resources/output/example-image-eiffel-tower.png";
    generateContent(modelId, outputFile);
  }

  // Generates an image with text input
  public static void generateContent(String modelId, String outputFile) throws IOException {
    // Client Initialization. Once created, it can be reused for multiple requests.
    try (Client client = Client.builder().location("global").vertexAI(true).build()) {

      GenerateContentConfig contentConfig =
          GenerateContentConfig.builder()
              .responseModalities("TEXT", "IMAGE")
              .candidateCount(1)
              .safetySettings(
                  SafetySetting.builder()
                      .method("PROBABILITY")
                      .category("HARM_CATEGORY_DANGEROUS_CONTENT")
                      .threshold("BLOCK_MEDIUM_AND_ABOVE")
                      .build())
              .build();

      GenerateContentResponse response =
          client.models.generateContent(
              modelId,
              "Generate an image of the Eiffel tower with fireworks in the background.",
              contentConfig);

      // Get parts of the response
      List<Part> parts =
          response
              .candidates()
              .flatMap(candidates -> candidates.stream().findFirst())
              .flatMap(Candidate::content)
              .flatMap(Content::parts)
              .orElse(new ArrayList<>());

      // For each part print text if present, otherwise read image data if present and
      // write it to the output file
      for (Part part : parts) {
        if (part.text().isPresent()) {
          System.out.println(part.text().get());
        } else if (part.inlineData().flatMap(Blob::data).isPresent()) {
          BufferedImage image =
              ImageIO.read(new ByteArrayInputStream(part.inlineData().flatMap(Blob::data).get()));
          ImageIO.write(image, "png", new File(outputFile));
        }
      }
      System.out.println("Content written to: " + outputFile);
      // Example response:
      //   Here is the Eiffel Tower with fireworks in the background...
      //
      //   Content written to: resources/output/example-image-eiffel-tower.png
    }
  }
}

REST

Run the following command in the terminal to create or overwrite the response.json file in the current directory:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${API_ENDPOINT}:generateContent \
  -d '{
    "contents": {
      "role": "USER",
      "parts": { "text": "Create a tutorial explaining how to make a peanut butter and jelly sandwich in three easy steps." }
    },
    "generation_config": {
      "response_modalities": ["TEXT", "IMAGE"],
      "image_config": {
        "aspect_ratio": "16:9"
      }
    },
    "safetySettings": {
      "method": "PROBABILITY",
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    }
  }' 2>/dev/null >response.json

Gemini will generate an image based on your description. This process should take a few seconds but might take longer depending on capacity.

Generate interleaved images and text

Gemini 2.5 Flash Image can generate interleaved images with its text responses. For example, you can generate images of what each step of a generated recipe might look like to go along with the text of that step, without having to make separate requests to the model to do so.

Console

To generate interleaved images with text responses:

  1. Open Vertex AI Studio > Create prompt.
  2. Click Switch model and select gemini-2.5-flash-image from the menu.
  3. In the Outputs panel, select Image and text from the drop-down menu.
  4. Write a description of the image you want to generate in the Write a prompt text area. For example, "Create a tutorial explaining how to make a peanut butter and jelly sandwich in three easy steps. For each step, provide a title with the number of the step, an explanation, and also generate an image, generate each image in a 1:1 aspect ratio."
  5. Click the Prompt button.

Gemini will generate a response based on your description. This process should take a few seconds but might take longer depending on capacity.

Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

from google import genai
from google.genai.types import GenerateContentConfig, Modality
from PIL import Image
from io import BytesIO

client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash-image",
    contents=(
        "Generate an illustrated recipe for a paella."
        "Create images to go alongside the text as you generate the recipe"
    ),
    config=GenerateContentConfig(response_modalities=[Modality.TEXT, Modality.IMAGE]),
)

with open("output_folder/paella-recipe.md", "w") as fp:
    for i, part in enumerate(response.candidates[0].content.parts):
        if part.text is not None:
            fp.write(part.text)
        elif part.inline_data is not None:
            image = Image.open(BytesIO(part.inline_data.data))
            image.save(f"output_folder/example-image-{i + 1}.png")
            fp.write(f"![image](example-image-{i + 1}.png)")

# Example response:
#   A markdown page for a Paella recipe (`paella-recipe.md`) has been generated.
#   It includes detailed steps and several images illustrating the cooking process.

Java

Learn how to install or update the Gen AI SDK for Java.

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

import com.google.genai.Client;
import com.google.genai.types.Blob;
import com.google.genai.types.Candidate;
import com.google.genai.types.Content;
import com.google.genai.types.GenerateContentConfig;
import com.google.genai.types.GenerateContentResponse;
import com.google.genai.types.Part;
import java.awt.image.BufferedImage;
import java.io.BufferedWriter;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.ImageIO;

public class ImageGenMmFlashTextAndImageWithText {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String modelId = "gemini-2.5-flash-image";
    String outputFile = "resources/output/paella-recipe.md";
    generateContent(modelId, outputFile);
  }

  // Generates text and image with text input
  public static void generateContent(String modelId, String outputFile) throws IOException {
    // Client Initialization. Once created, it can be reused for multiple requests.
    try (Client client = Client.builder().location("global").vertexAI(true).build()) {

      GenerateContentResponse response =
          client.models.generateContent(
              modelId,
              Content.fromParts(
                  Part.fromText("Generate an illustrated recipe for a paella."),
                  Part.fromText(
                      "Create images to go alongside the text as you generate the recipe.")),
              GenerateContentConfig.builder().responseModalities("TEXT", "IMAGE").build());

      try (BufferedWriter writer = new BufferedWriter(new FileWriter(outputFile))) {
        // Get parts of the response
        List<Part> parts =
            response
                .candidates()
                .flatMap(candidates -> candidates.stream().findFirst())
                .flatMap(Candidate::content)
                .flatMap(Content::parts)
                .orElse(new ArrayList<>());

        int index = 1;
        // For each part print text if present, otherwise read image data if present and
        // write it to the output file
        for (Part part : parts) {
          if (part.text().isPresent()) {
            writer.write(part.text().get());
          } else if (part.inlineData().flatMap(Blob::data).isPresent()) {
            BufferedImage image =
                ImageIO.read(
                    new ByteArrayInputStream(part.inlineData().flatMap(Blob::data).get()));
            ImageIO.write(
                image, "png", new File("resources/output/example-image-" + index + ".png"));
            writer.write("![image](example-image-" + index + ".png)");
          }
          index++;
        }
        System.out.println("Content written to: " + outputFile);
        // Example response:
        //   A markdown page for a Paella recipe (`paella-recipe.md`) has been generated.
        //   It includes detailed steps and several images illustrating the cooking process.
        //
        //   Content written to: resources/output/paella-recipe.md
      }
    }
  }
}

REST

Run the following command in the terminal to create or overwrite the response.json file in the current directory:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${API_ENDPOINT}:generateContent \
  -d '{
    "contents": {
      "role": "USER",
      "parts": { "text": "Create a tutorial explaining how to make a peanut butter and jelly sandwich in three easy steps. For each step, provide a title with the number of the step, an explanation, and also generate an image, generate each image in a 1:1 aspect ratio." }
    },
    "generation_config": {
      "response_modalities": ["TEXT", "IMAGE"],
      "image_config": {
        "aspect_ratio": "16:9"
      }
    },
    "safetySettings": {
      "method": "PROBABILITY",
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    }
  }' 2>/dev/null >response.json

Gemini will generate an image based on your description. This process should take a few seconds but might take longer depending on capacity.

Locale-aware image generation

Gemini 2.5 Flash Image can also take your location into account when providing text or image responses. For example, you can generate images of locations or experiences typical of your region without having to specify your location to the model.

Console

To use locale-aware image generation:

  1. Open Vertex AI Studio > Create prompt.
  2. Click Switch model and select gemini-2.5-flash-image from the menu.
  3. In the Outputs panel, select Image and text from the drop-down menu.
  4. Write a description of the image you want to generate in the Write a prompt text area. For example, "Generate a photo of a typical breakfast."
  5. Click the Prompt button.

Gemini will generate a response based on your description. This process should take a few seconds but might take longer depending on capacity.

Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

from google import genai
from google.genai.types import GenerateContentConfig, Modality
from PIL import Image
from io import BytesIO

client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash-image",
    contents=("Generate a photo of a breakfast meal."),
    config=GenerateContentConfig(response_modalities=[Modality.TEXT, Modality.IMAGE]),
)

for part in response.candidates[0].content.parts:
    if part.text:
        print(part.text)
    elif part.inline_data:
        image = Image.open(BytesIO(part.inline_data.data))
        image.save("output_folder/example-breakfast-meal.png")

# Example response:
#   Generates a photo of a vibrant and appetizing breakfast meal.
#   The scene will feature a white plate with golden-brown pancakes
#   stacked neatly, drizzled with rich maple syrup and ...

Java

Learn how to install or update the Gen AI SDK for Java.

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=global
export GOOGLE_GENAI_USE_VERTEXAI=True

import com.google.genai.Client;
import com.google.genai.types.Blob;
import com.google.genai.types.Candidate;
import com.google.genai.types.Content;
import com.google.genai.types.GenerateContentConfig;
import com.google.genai.types.GenerateContentResponse;
import com.google.genai.types.Part;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.ImageIO;

public class ImageGenMmFlashLocaleAwareWithText {

  public static void main(String[] args) throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String modelId = "gemini-2.5-flash-image";
    String outputFile = "resources/output/example-breakfast-meal.png";
    generateContent(modelId, outputFile);
  }

  // Generates an image with text input
  public static void generateContent(String modelId, String outputFile) throws IOException {
    // Client Initialization. Once created, it can be reused for multiple requests.
    try (Client client = Client.builder().location("global").vertexAI(true).build()) {

      GenerateContentResponse response =
          client.models.generateContent(
              modelId,
              "Generate a photo of a breakfast meal.",
              GenerateContentConfig.builder().responseModalities("TEXT", "IMAGE").build());

      // Get parts of the response
      List<Part> parts =
          response
              .candidates()
              .flatMap(candidates -> candidates.stream().findFirst())
              .flatMap(Candidate::content)
              .flatMap(Content::parts)
              .orElse(new ArrayList<>());

      // For each part print text if present, otherwise read image data if present and
      // write it to the output file
      for (Part part : parts) {
        if (part.text().isPresent()) {
          System.out.println(part.text().get());
        } else if (part.inlineData().flatMap(Blob::data).isPresent()) {
          BufferedImage image =
              ImageIO.read(new ByteArrayInputStream(part.inlineData().flatMap(Blob::data).get()));
          ImageIO.write(image, "png", new File(outputFile));
        }
      }
      System.out.println("Content written to: " + outputFile);
      // Example response:
      //   Here is a photo of a breakfast meal for you!
      //
      //   Content written to: resources/output/example-breakfast-meal.png
    }
  }
}

REST

Run the following command in the terminal to create or overwrite the response.json file in the current directory:

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${API_ENDPOINT}:generateContent \
  -d '{
    "contents": {
      "role": "USER",
      "parts": { "text": "Generate a photo of a typical breakfast." }
    },
    "generation_config": {
      "response_modalities": ["TEXT", "IMAGE"],
      "image_config": {
        "aspect_ratio": "16:9"
      }
    },
    "safetySettings": {
      "method": "PROBABILITY",
      "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    }
  }' 2>/dev/null >response.json

Gemini will generate an image based on your description. This process should take a few seconds but might take longer depending on capacity.
