Gemini 2.0 Flash-Lite is our fastest Gemini 2.0 model, optimized for cost efficiency and low latency.
Try in Vertex AI View in Model Garden (Preview) Deploy example app
Model ID
 
 gemini-2.0-flash-lite 
Supported inputs & outputs
 
 - Inputs: Text , Code , Images , Audio , Video 
- Outputs: Text 
Token limits
 
 - Maximum input tokens: 1,048,576
- Maximum output tokens: 8,192 (default)
Capabilities
 
 - Supported
- Not supported
- Grounding with Google Search
- Code execution
- Live API Preview feature
- Thinking
- Vertex AI RAG Engine
Usage types
 
 - Supported
- Not supported
Input size limit
 
 500 MB
 
Technical specifications
 
 Images 
  
 
 - Maximum images per prompt: 3,000
- Maximum image size: 7 MB
- Maximum tokens per minute (TPM): - High/Medium/Default media resolution: - US/Asia: 6.7 M
- EU: 2.6 M
 
- Low media resolution: - US/Asia: 2.6 M
- EU: 2.6 M
 
 
- High/Medium/Default media resolution: 
- Supported MIME types: image/png,image/jpeg,image/webp
 Documents 
  
 
 - Maximum number of files per prompt: 3,000
- Maximum number of pages per file: 1,000
- Maximum file size per file for the API or Cloud Storage imports: 50 MB
- Maximum file size per file for direct uploads through the console: 7 MB
- Maximum tokens per minute (TPM) per project1: - US/Asia: 3.4 M
- EU: 3.4 M
 
- Supported MIME types:
 Video 
  
 
 - Maximum video length (with audio): Approximately 45 minutes
- Maximum video length (without audio): Approximately 1 hour
- Maximum number of videos per prompt: 10
- Maximum tokens per minute (TPM): - High/Medium/Default media resolution: - US/Asia: 6.3 M
- EU: 3.2 M
 
- Low media resolution: - US/Asia: 3.2 M
- EU: 3.2 M
 
 
- High/Medium/Default media resolution: 
- Supported MIME types: video/x-flv,video/quicktime,video/mpeg,video/mpegs,video/mpg,video/mp4,video/webm,video/wmv,video/3gpp
 Audio 
  
 
 - Maximum audio length per prompt: Appropximately 8.4 hours, or up to 1 million tokens
- Maximum number of audio files per prompt: 1
- Speech understanding for: Audio summarization, transcription, and translation
- Maximum tokens per minute (TPM): - US/Asia: 3.5 M
- EU: 3.5 M
 
- Supported MIME types:
 Parameter defaults 
  
 
 - Temperature: 0.0-2.0 (default 1.0)
- topP: 0.0-1.0 (default 0.95)
- topK: 64 (fixed)
- candidateCount: 1–8 (default 1)
Supported regions
 
Model availability
(Includes dynamic shared quota & Provisioned Throughput)
- Global
- global
- United States
- us-central1
- us-east1
- us-east4
- us-east5
- us-south1
- us-west1
- us-west4
- Europe
- europe-central2
- europe-north1
- europe-southwest1
- europe-west1
- europe-west4
- europe-west8
- europe-west9
ML processing
- United States
- Multi-region
- Europe
- Multi-region
Knowledge cutoff date
 
 June 2024
 
Versions
 
 -  gemini-2.0-flash-lite-001
- Launch stage: Generally available
- Release date: February 25, 2025
- Discontinuation date: February 25, 2026
Security controls
 
 Online prediction 
 
 - Data residency (at rest) Supported
- Customer-managed encryption keys (CMEK) Supported
- VPC Service Controls Supported
- Access Transparency (AXT) Supported
 Batch prediction 
 
 - Data residency (at rest) Supported
- Customer-managed encryption keys (CMEK) Not supported
- VPC Service Controls Supported
- Access Transparency (AXT) Not supported
 Tuning 
 
 - Data residency (at rest) Not supported
- Customer-managed encryption keys (CMEK) Not supported
- VPC Service Controls Not supported
- Access Transparency (AXT) Not supported
Supported languages
 
  
Pricing
 
  

