Gemini 2.0 Flash-Lite is our fastest Gemini 2.0 model, optimized for cost efficiency and low latency.
Model ID
gemini-2.0-flash-lite
Supported inputs & outputs
- Inputs: Text, Code, Images, Audio, Video
- Outputs: Text
Token limits
- Maximum input tokens: 1,048,576
- Maximum output tokens: 8,192 (default)
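The token limits above can be checked client-side before a request is sent. The following is a minimal sketch, assuming the prompt's token count is already known; `fits_token_limits` and the constant names are hypothetical helpers, not part of any SDK.

```python
# Limits copied from the "Token limits" list above.
MAX_INPUT_TOKENS = 1_048_576
MAX_OUTPUT_TOKENS = 8_192


def fits_token_limits(prompt_tokens: int,
                      requested_output_tokens: int = MAX_OUTPUT_TOKENS) -> bool:
    """Return True if a request stays within the documented token limits.

    Hypothetical helper: the caller is assumed to have counted the
    prompt tokens already.
    """
    if prompt_tokens < 0 or requested_output_tokens < 1:
        raise ValueError("prompt_tokens must be >= 0 and output tokens >= 1")
    return (prompt_tokens <= MAX_INPUT_TOKENS
            and requested_output_tokens <= MAX_OUTPUT_TOKENS)
```

A prompt of exactly 1,048,576 tokens passes; one token more does not.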
Capabilities
- Supported
- Not supported
- Grounding with Google Search
- Code execution
- Live API (Preview)
- Thinking
- Vertex AI RAG Engine
Usage types
- Supported
- Not supported
Input size limit
500 MB
Technical specifications
Images
- Maximum images per prompt: 3,000
- Maximum image size: 7 MB
- Maximum tokens per minute (TPM):
  - High/Medium/Default media resolution:
    - US/Asia: 6.7 M
    - EU: 2.6 M
  - Low media resolution:
    - US/Asia: 2.6 M
    - EU: 2.6 M
- Supported MIME types: image/png, image/jpeg, image/webp
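The image limits above lend themselves to a pre-flight check. This sketch assumes "7 MB" means 7 × 1024 × 1024 bytes (the source does not say whether MB is decimal or binary); `ImagePart` and `image_problems` are hypothetical names, not SDK types.

```python
from dataclasses import dataclass

# Values taken from the Images section above.
MAX_IMAGES_PER_PROMPT = 3_000
MAX_IMAGE_BYTES = 7 * 1024 * 1024  # assumption: "7 MB" read as 7 MiB
SUPPORTED_IMAGE_MIME_TYPES = {"image/png", "image/jpeg", "image/webp"}


@dataclass
class ImagePart:
    """Hypothetical container describing one image in a prompt."""
    mime_type: str
    size_bytes: int


def image_problems(images: list[ImagePart]) -> list[str]:
    """Collect human-readable violations of the documented image limits."""
    problems = []
    if len(images) > MAX_IMAGES_PER_PROMPT:
        problems.append(
            f"too many images: {len(images)} > {MAX_IMAGES_PER_PROMPT}")
    for i, img in enumerate(images):
        if img.mime_type not in SUPPORTED_IMAGE_MIME_TYPES:
            problems.append(f"image {i}: unsupported MIME type {img.mime_type}")
        if img.size_bytes > MAX_IMAGE_BYTES:
            problems.append(f"image {i}: {img.size_bytes} bytes exceeds limit")
    return problems
```

An empty list means the prompt passes all three checks; returning every violation, instead of stopping at the first, makes it easier to fix a batch in one pass.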
Documents
- Maximum number of files per prompt: 3,000
- Maximum number of pages per file: 1,000
- Maximum file size per file for the API or Cloud Storage imports: 50 MB
- Maximum file size per file for direct uploads through the console: 7 MB
- Maximum tokens per minute (TPM) per project:
  - US/Asia: 3.4 M
  - EU: 3.4 M
- Supported MIME types:
Video
- Maximum video length (with audio): Approximately 45 minutes
- Maximum video length (without audio): Approximately 1 hour
- Maximum number of videos per prompt: 10
- Maximum tokens per minute (TPM):
  - High/Medium/Default media resolution:
    - US/Asia: 6.3 M
    - EU: 3.2 M
  - Low media resolution:
    - US/Asia: 3.2 M
    - EU: 3.2 M
- Supported MIME types: video/x-flv, video/quicktime, video/mpeg, video/mpegs, video/mpg, video/mp4, video/webm, video/wmv, video/3gpp
Audio
- Maximum audio length per prompt: Approximately 8.4 hours, or up to 1 million tokens
- Maximum number of audio files per prompt: 1
- Speech understanding for: Audio summarization, transcription, and translation
- Maximum tokens per minute (TPM):
  - US/Asia: 3.5 M
  - EU: 3.5 M
- Supported MIME types:
Parameter defaults
- Temperature: 0.0-2.0 (default 1.0)
- topP: 0.0-1.0 (default 0.95)
- topK: 64 (fixed)
- candidateCount: 1–8 (default 1)
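The ranges above can be validated before a request is built. This is a sketch under stated assumptions: `build_generation_config` is a hypothetical helper, and the lowercase `temperature` key is assumed to follow the same camelCase naming as `topP` and `candidateCount`. Because topK is listed as fixed at 64, the helper does not set it.

```python
# Ranges copied from the "Parameter defaults" list above.
TEMPERATURE_RANGE = (0.0, 2.0)
TOP_P_RANGE = (0.0, 1.0)
CANDIDATE_COUNT_RANGE = (1, 8)


def build_generation_config(temperature: float = 1.0,
                            top_p: float = 0.95,
                            candidate_count: int = 1) -> dict:
    """Return a config dict after checking the documented ranges.

    Hypothetical helper; topK is omitted because it is fixed at 64.
    """
    if not TEMPERATURE_RANGE[0] <= temperature <= TEMPERATURE_RANGE[1]:
        raise ValueError(f"temperature {temperature} outside {TEMPERATURE_RANGE}")
    if not TOP_P_RANGE[0] <= top_p <= TOP_P_RANGE[1]:
        raise ValueError(f"topP {top_p} outside {TOP_P_RANGE}")
    if not CANDIDATE_COUNT_RANGE[0] <= candidate_count <= CANDIDATE_COUNT_RANGE[1]:
        raise ValueError(
            f"candidateCount {candidate_count} outside {CANDIDATE_COUNT_RANGE}")
    return {
        "temperature": temperature,
        "topP": top_p,
        "candidateCount": candidate_count,
    }
```

Calling the helper with no arguments returns the documented defaults; an out-of-range value raises `ValueError` instead of being silently clamped.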
Supported regions
Model availability
(Includes dynamic shared quota & Provisioned Throughput)
- Global
  - global
- United States
  - us-central1
  - us-east1
  - us-east4
  - us-east5
  - us-south1
  - us-west1
  - us-west4
- Europe
  - europe-central2
  - europe-north1
  - europe-southwest1
  - europe-west1
  - europe-west4
  - europe-west8
  - europe-west9
ML processing
- United States
  - Multi-region
- Europe
  - Multi-region
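The model availability list above can be encoded as a lookup table, for example to reject an unsupported location before deployment. A minimal sketch; `SUPPORTED_LOCATIONS` and `location_group` are hypothetical names, and the table reflects only the locations listed on this page.

```python
from typing import Optional

# Locations copied from the "Model availability" list above.
SUPPORTED_LOCATIONS = {
    "Global": ["global"],
    "United States": [
        "us-central1", "us-east1", "us-east4", "us-east5",
        "us-south1", "us-west1", "us-west4",
    ],
    "Europe": [
        "europe-central2", "europe-north1", "europe-southwest1",
        "europe-west1", "europe-west4", "europe-west8", "europe-west9",
    ],
}


def location_group(location: str) -> Optional[str]:
    """Return the geography a location belongs to, or None if not listed."""
    for group, locations in SUPPORTED_LOCATIONS.items():
        if location in locations:
            return group
    return None
```

For instance, `location_group("europe-west4")` returns `"Europe"`, while a location that is not listed returns `None`.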
Knowledge cutoff date
June 2024
Versions
- gemini-2.0-flash-lite-001
  - Launch stage: Generally available
  - Release date: February 25, 2025
  - Discontinuation date: February 25, 2026
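Because the version above has a fixed discontinuation date, a deployment that pins it may want to track how much time remains. A small sketch using only the standard library; `DISCONTINUATION_DATES` and `days_until_discontinuation` are hypothetical names.

```python
from datetime import date

# Date copied from the "Versions" list above.
DISCONTINUATION_DATES = {
    "gemini-2.0-flash-lite-001": date(2026, 2, 25),
}


def days_until_discontinuation(version: str, today: date) -> int:
    """Return days remaining for a pinned version (negative once past)."""
    return (DISCONTINUATION_DATES[version] - today).days
```

Passing `today` explicitly, instead of calling `date.today()` inside the function, keeps the check deterministic and easy to test.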
Security controls
Online prediction
- Data residency (at rest): Supported
- Customer-managed encryption keys (CMEK): Supported
- VPC Service Controls: Supported
- Access Transparency (AXT): Supported
Batch prediction
- Data residency (at rest): Supported
- Customer-managed encryption keys (CMEK): Not supported
- VPC Service Controls: Supported
- Access Transparency (AXT): Not supported
Tuning
- Data residency (at rest): Not supported
- Customer-managed encryption keys (CMEK): Not supported
- VPC Service Controls: Not supported
- Access Transparency (AXT): Not supported
Supported languages
Pricing

