Preview
This product or feature is a Generative AI Preview offering, subject to the "Pre-GA Offerings Terms" of the Google Cloud Service Specific Terms . For this Generative AI Preview offering, Customers may elect to use it for production or commercial purposes, or disclose Generated Output to third-parties, and may process personal data as outlined in the Cloud Data Processing Addendum , subject to the obligations and restrictions described in the agreement under which you access Google Cloud.
Gemini 3.1 Flash-Lite is our most cost-efficient Gemini model, optimized for low latency use cases for high-volume, cost-sensitive LLM traffic. It provides a significant quality increase over Gemini 2.0 Flash-Lite and Gemini 2.5 Flash-Lite models, matching Gemini 2.5 Flash performance across key capability areas:
- Improved response quality:Aims to match 2.5 Flash performance.
- Improved instruction following:Targeted improvements to serve as a reliable migration path for complex chatbot and instruction-heavy workflows.
- Improved audio input:Improved audio-input quality for tasks like Automated Speech Recognition (ASR).
- Expanded thinking support:You can control how much reasoning the model performs by choosing from minimal, low, medium, or high thinking levels . This feature lets you balance response quality and speed for your specific use case.
Try in Vertex AI (Preview) Deploy example app
gemini-3.1-flash-lite-preview
- Inputs:
Text , Code , Images , Audio , Video , PDF - Outputs:
Text
- Maximum input tokens: 1,048,576
- Maximum output tokens: 65,535 (default)
- Supported
- Not supported
- Maximum images per prompt: 3,000
- Maximum file size per file for inline data or direct uploads through the console: 7 MB
- Maximum file size per file from Google Cloud Storage: 30 MB
- Maximum number of output images per prompt: 10
- Supported MIME types:
image/png,image/jpeg,image/webp,image/heic,image/heif
- Maximum number of files per prompt: 3,000
- Maximum number of pages per file: 1,000
- Maximum file size per file: 50 MB
- Supported MIME types:
application/pdf,text/plain
- Maximum video length (with audio): Approximately 45 minutes
- Maximum video length (without audio): Approximately 1 hour
- Maximum number of videos per prompt: 10
- Supported MIME types:
video/x-flv,video/quicktime,video/mpeg,video/mpegs,video/mpg,video/mp4,video/webm,video/wmv,video/3gpp
- Maximum audio length per prompt: Approximately 8.4 hours, or up to 1 million tokens
- Maximum number of audio files per prompt: 1
- Supported MIME types:
audio/x-aac,audio/flac,audio/mp3,audio/m4a,audio/mpeg,audio/mpga,audio/mp4,audio/ogg,audio/pcm,audio/wav,audio/webm
- Temperature: 0.0-2.0 (default 1.0)
- topP: 0.0-1.0 (default 0.95)
- topK: 64 (fixed)
- candidateCount: 1–8 (default 1)
Model availability
- Global
- global
-
gemini-3.1-flash-lite-preview - Launch stage: Public preview
- Release date: March 3, 2026

