Get started with Prompt API

This page describes how to do the following:

  • Configure your project to use Prompt API
  • Provide text-only input and receive a response
  • Provide an image input with related text input and receive a response

For more details about the Prompt API, see the reference documentation for Kotlin (com.google.mlkit.genai.prompt) and Java (com.google.mlkit.genai.prompt.java, com.google.mlkit.genai.prompt).

Configure project

Add the ML Kit Prompt API as a dependency in your build.gradle configuration:

  implementation("com.google.mlkit:genai-prompt:1.0.0-alpha1")

Implement generative model

To implement the code in your project, follow these steps:

  • Create a generativeModel object:

    Kotlin

      // Get a GenerativeModel instance
      val generativeModel = Generation.getClient()

    Java

      // Get a GenerativeModel instance
      GenerativeModelFutures generativeModelFutures =
          GenerativeModelFutures.from(Generation.INSTANCE.getClient());
    
  • Check whether Gemini Nano is AVAILABLE, DOWNLOADABLE, DOWNLOADING, or UNAVAILABLE. Then, download the feature if it is downloadable:

    Kotlin

      val status = generativeModel.checkStatus()
      when (status) {
          FeatureStatus.UNAVAILABLE -> {
              // Gemini Nano not supported on this device, or the device hasn't
              // fetched the latest configuration to support it
          }
          FeatureStatus.DOWNLOADABLE -> {
              // Gemini Nano can be downloaded on this device, but is not currently downloaded
              generativeModel.download().collect { status ->
                  when (status) {
                      is DownloadStatus.DownloadStarted ->
                          Log.d(TAG, "starting download for Gemini Nano")

                      is DownloadStatus.DownloadProgress ->
                          Log.d(TAG, "Nano ${status.totalBytesDownloaded} bytes downloaded")

                      DownloadStatus.DownloadCompleted -> {
                          Log.d(TAG, "Gemini Nano download complete")
                          modelDownloaded = true
                      }

                      is DownloadStatus.DownloadFailed -> {
                          Log.e(TAG, "Nano download failed ${status.e.message}")
                      }
                  }
              }
          }
          FeatureStatus.DOWNLOADING -> {
              // Gemini Nano is currently being downloaded
          }
          FeatureStatus.AVAILABLE -> {
              // Gemini Nano is downloaded and available to use on this device
          }
      }
    

    Java

      ListenableFuture<Integer> status = generativeModelFutures.checkStatus();
      Futures.addCallback(status, new FutureCallback<Integer>() {
          @Override
          public void onSuccess(Integer featureStatus) {
              switch (featureStatus) {
                  case FeatureStatus.AVAILABLE -> {
                      // Gemini Nano is downloaded and available to use on this device
                  }
                  case FeatureStatus.UNAVAILABLE -> {
                      // Gemini Nano not supported on this device, or the device hasn't
                      // fetched the latest configuration to support it
                  }
                  case FeatureStatus.DOWNLOADING -> {
                      // Gemini Nano is currently being downloaded
                  }
                  case FeatureStatus.DOWNLOADABLE -> {
                      generativeModelFutures.download(new DownloadCallback() {
                          @Override
                          public void onDownloadStarted(long totalBytesToDownload) {
                              Log.d(TAG, "starting download for Gemini Nano");
                          }

                          @Override
                          public void onDownloadProgress(long totalBytesDownloaded) {
                              Log.d(TAG, "Nano " + totalBytesDownloaded + " bytes downloaded");
                          }

                          @Override
                          public void onDownloadCompleted() {
                              Log.d(TAG, "Gemini Nano download complete");
                          }

                          @Override
                          public void onDownloadFailed(@NonNull GenAiException e) {
                              Log.e(TAG, "Nano download failed: " + e.getMessage());
                          }
                      });
                  }
              }
          }

          @Override
          public void onFailure(@NonNull Throwable t) {
              // Failed to check status
          }
      }, ContextCompat.getMainExecutor(context));
    

Provide text-only input

Kotlin

  val response = generativeModel.generateContent("Write a 3 sentence story about a magical dog.")

Java

  GenerateContentResponse response = generativeModelFutures.generateContent(
      new GenerateContentRequest.Builder(
          new TextPart("Write a 3 sentence story about a magical dog."))
          .build())
      .get();

Alternatively, add optional parameters:

Kotlin

  val response = generativeModel.generateContent(
      generateContentRequest(
          TextPart("Write a 3 sentence story about a magical dog."),
      ) {
          // Optional parameters
          temperature = 0.2f
          topK = 10
          candidateCount = 3
      },
  )

Java

  GenerateContentRequest.Builder requestBuilder =
      new GenerateContentRequest.Builder(
          new TextPart("Write a 3 sentence story about a magical dog."));
  requestBuilder.setTemperature(.2f);
  requestBuilder.setTopK(10);
  requestBuilder.setCandidateCount(3);
  GenerateContentResponse response =
      generativeModelFutures.generateContent(requestBuilder.build()).get();

For more information about the optional parameters, see Optional configurations.

Provide multimodal (image and text) input

Bundle an image and a text input together in the generateContentRequest() function, where the text prompt is a question or instruction related to the image:

Kotlin

  val response = generativeModel.generateContent(
      generateContentRequest(ImagePart(bitmap), TextPart(textPrompt)) {
          // Optional parameters
          ...
      },
  )

Java

  GenerateContentResponse response = generativeModelFutures.generateContent(
      new GenerateContentRequest.Builder(
          new ImagePart(bitmap),
          new TextPart(textPrompt))
          // Optional parameters
          .build())
      .get();

Process inference result

  • Run the inference and retrieve the result. For both text-only and multimodal prompts, you can either wait for the complete result or stream the response as it's generated.

    • This snippet uses non-streaming inference, which retrieves the entire result from the AI model before returning it:

    Kotlin

      // Call the AI model to generate content and store the complete
      // response in a new variable named 'response' once it's finished
      val response = generativeModel.generateContent("Write a 3 sentence story about a magical dog")
    

    Java

      GenerateContentResponse response = generativeModelFutures.generateContent(
          new GenerateContentRequest.Builder(
              new TextPart("Write a 3 sentence story about a magical dog."))
              .build())
          .get();
    
    • The following snippets use streaming inference, which retrieves the result in chunks as it's generated:

    Kotlin

      // Streaming inference
      var fullResponse = ""
      generativeModel.generateContentStream("Write a 3 sentence story about a magical dog")
          .collect { chunk ->
              val newChunkReceived = chunk.candidates[0].text
              print(newChunkReceived)
              fullResponse += newChunkReceived
          }
    

    Java

      // Streaming inference
      StringBuilder fullResponse = new StringBuilder();
      generativeModelFutures.generateContent(
          new GenerateContentRequest.Builder(
              new TextPart("Write a 3 sentence story about a magical dog"))
              .build(),
          chunk -> {
              Log.d(TAG, chunk);
              fullResponse.append(chunk);
          });
    

For more information about streaming and non-streaming inference, see Streaming versus non-streaming.

Latency optimization

To optimize the latency of the first inference call, your application can optionally call warmup(). This loads Gemini Nano into memory and initializes runtime components.
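
For example, you might warm up once the model is confirmed available, before the user sends the first prompt. The sketch below assumes a coroutine scope such as lifecycleScope is available; that wrapping is an app-structure assumption, not part of the API:

  // Hypothetical placement: run during screen startup so the first
  // generateContent() call doesn't pay the model-loading cost.
  lifecycleScope.launch {
      if (generativeModel.checkStatus() == FeatureStatus.AVAILABLE) {
          // warmup() loads Gemini Nano into memory and initializes
          // runtime components, as described above.
          generativeModel.warmup()
      }
  }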

Optional configurations

As part of each GenerateContentRequest, you can set the following optional parameters:

  • temperature: Controls the degree of randomness in token selection.
  • seed: Enables generating stable and deterministic results.
  • topK: Controls randomness and diversity in the results.
  • candidateCount: Sets the number of unique responses to return. Note that the exact number of responses may differ from candidateCount because duplicate responses are removed automatically.
  • maxOutputTokens: Defines the maximum number of tokens that can be generated in the response.

For more guidance on setting optional configurations, see GenerateContentRequest.
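
For example, a request that fixes the seed and caps the output length could reuse the Kotlin generateContentRequest builder shown earlier. Treat this as a sketch: exposing seed and maxOutputTokens as builder properties (like temperature and topK above) is an assumption, so confirm the exact setters in the GenerateContentRequest reference.

  val response = generativeModel.generateContent(
      generateContentRequest(
          TextPart("Write a 3 sentence story about a magical dog."),
      ) {
          temperature = 0.2f
          seed = 42             // assumed property name; pins sampling for reproducible output
          maxOutputTokens = 128 // assumed property name; caps response length
      },
  )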

Supported features and limitations

  • Input must be under 4000 tokens (or approximately 3000 English words); see the size pre-check sketch after this list. For more information, see the countTokens reference.
  • Use cases that require long output (more than 256 tokens) should be avoided.
  • AICore enforces an inference quota per app. For more information, see Quota per application.
  • The following languages have been validated for Prompt API:
    • English
    • Korean
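
As noted in the first item, input must stay under 4000 tokens. Use the countTokens reference for an exact count; as a coarse client-side guard, you can lean on the documented approximation of roughly 3000 English words. In this sketch, prompt is an assumed String variable holding the user's input:

  // Rough guard only: ~4000 tokens is approximately 3000 English words.
  // Use the countTokens API for an exact count instead of this heuristic.
  val wordCount = prompt.trim().split(Regex("\\s+")).size
  if (wordCount > 3000) {
      Log.w(TAG, "Prompt likely exceeds the 4000-token input limit; shorten it")
  }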

Common setup issues

ML Kit GenAI APIs rely on the Android AICore app to access Gemini Nano. When a device has just been set up (or reset), or the AICore app has just been reset (for example, by clearing its data or uninstalling and reinstalling it), the AICore app may not have had enough time to finish initialization (including downloading the latest configurations from the server). As a result, the ML Kit GenAI APIs may not function as expected. Here are the common setup error messages you may see and how to handle them:

  • "AICore failed with error type 4-CONNECTION_ERROR and error code 601-BINDING_FAILURE: AICore service failed to bind."
    How to handle: This can happen when you install an app that uses the ML Kit GenAI APIs immediately after device setup, or when AICore is uninstalled after your app is installed. Updating the AICore app and then reinstalling your app should fix it.

  • "AICore failed with error type 3-PREPARATION_ERROR and error code 606-FEATURE_NOT_FOUND: Feature ... is not available."
    How to handle: This can happen when AICore hasn't finished downloading the latest configurations. While the device is connected to the internet, the update usually takes a few minutes to a few hours; restarting the device can speed it up. Note that if the device's bootloader is unlocked, you'll also see this error; this API does not support devices with unlocked bootloaders.

  • "AICore failed with error type 1-DOWNLOAD_ERROR and error code 0-UNKNOWN: Feature ... failed with failure status 0 and error esz: UNAVAILABLE: Unable to resolve host ..."
    How to handle: Keep the device connected to the network, wait a few minutes, and retry.
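
Because most of these conditions resolve once AICore finishes initializing, one pragmatic pattern is to re-check feature status on a delay before surfacing an error to the user. This is only a sketch: the retry count and delay are arbitrary, and it assumes a suspending context (delay comes from kotlinx.coroutines):

  // Re-check availability a few times, since AICore may still be
  // downloading configurations shortly after device or app setup.
  suspend fun awaitFeatureReady(retries: Int = 5, delayMillis: Long = 60_000L): Boolean {
      repeat(retries) {
          if (generativeModel.checkStatus() != FeatureStatus.UNAVAILABLE) return true
          delay(delayMillis) // status may flip once configurations finish downloading
      }
      return false
  }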