Gemini 2.0 Flash-Lite is our fastest and most cost-efficient Flash model. It is an upgrade path for Gemini 1.5 Flash users who want better quality at the same price and speed.
For more detailed technical information on Gemini 2.0 Flash-Lite, such as performance benchmarks, training datasets, sustainability efforts, intended usage and limitations, and our approach to ethics and safety, see the model card for Gemini 2.0 Flash-Lite.
gemini-2.0-flash-lite
- Inputs: Text, Code, Images, Audio, Video
- Outputs: Text
- Maximum input tokens: 1,048,576
- Maximum output tokens: 8,192 (default)
- Supported
- Not supported
- Grounding with Google Search
- Code execution
- Live API (Preview feature)
- Thinking
- Vertex AI RAG Engine
- Maximum images per prompt: 3,000
- Maximum image size: 7 MB
- Maximum tokens per minute (TPM):
  - High/Medium/Default media resolution:
    - US/Asia: 6.7 M
    - EU: 2.6 M
  - Low media resolution:
    - US/Asia: 2.6 M
    - EU: 2.6 M
- Supported MIME types: image/png, image/jpeg, image/webp
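As an illustration of how the image limits above might be enforced on the client before a request is sent, here is a minimal sketch. The `ImagePart` class and `check_images` function are hypothetical names invented for this example and are not part of any Google SDK. Treating "7 MB" as 7 × 1024 × 1024 bytes is also an assumption, since the list does not say which definition of megabyte applies.

```python
from dataclasses import dataclass

# Limits transcribed from the list above.
MAX_IMAGES_PER_PROMPT = 3_000
# Assumption: "7 MB" is read as 7 * 1024 * 1024 bytes.
MAX_IMAGE_BYTES = 7 * 1024 * 1024
SUPPORTED_IMAGE_MIME_TYPES = frozenset({"image/png", "image/jpeg", "image/webp"})


@dataclass(frozen=True)
class ImagePart:
    """Hypothetical container for one image attached to a prompt."""
    mime_type: str
    size_bytes: int


def check_images(images: list[ImagePart]) -> list[str]:
    """Return a list of problems; an empty list means no listed limit is exceeded."""
    problems = []
    if len(images) > MAX_IMAGES_PER_PROMPT:
        problems.append(
            f"{len(images)} images exceeds the limit of {MAX_IMAGES_PER_PROMPT}"
        )
    for index, image in enumerate(images):
        if image.mime_type not in SUPPORTED_IMAGE_MIME_TYPES:
            problems.append(f"image {index}: unsupported MIME type {image.mime_type!r}")
        if image.size_bytes > MAX_IMAGE_BYTES:
            problems.append(
                f"image {index}: {image.size_bytes} bytes exceeds {MAX_IMAGE_BYTES}"
            )
    return problems
```

Returning a list of problems instead of raising on the first one lets a caller report every violation in a single pass.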
- Maximum number of files per prompt: 3,000
- Maximum number of pages per file: 1,000
- Maximum file size per file: 50 MB
- Maximum tokens per minute (TPM):
  - US/Asia: 3.4 M
  - EU: 3.4 M
- Supported MIME types: application/pdf, text/plain
- Maximum video length (with audio): Approximately 45 minutes
- Maximum video length (without audio): Approximately 1 hour
- Maximum number of videos per prompt: 10
- Maximum tokens per minute (TPM):
  - High/Medium/Default media resolution:
    - US/Asia: 6.3 M
    - EU: 3.2 M
  - Low media resolution:
    - US/Asia: 3.2 M
    - EU: 3.2 M
- Supported MIME types: video/x-flv, video/quicktime, video/mpeg, video/mpegs, video/mpg, video/mp4, video/webm, video/wmv, video/3gpp
- Maximum audio length per prompt: Approximately 8.4 hours, or up to 1 million tokens
- Maximum number of audio files per prompt: 1
- Speech understanding for: Audio summarization, transcription, and translation
- Maximum tokens per minute (TPM):
  - US/Asia: 3.5 M
  - EU: 3.5 M
- Supported MIME types: audio/x-aac, audio/flac, audio/mp3, audio/m4a, audio/mpeg, audio/mpga, audio/mp4, audio/opus, audio/pcm, audio/wav, audio/webm
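The tokens-per-minute quotas listed above for images, documents, video, and audio can be turned into a rough request budget. The sketch below is only arithmetic over the listed figures. It assumes that "M" means one million tokens, and the modality labels, resolution keys, and the `max_requests_per_minute` function are hypothetical names chosen for this example.

```python
# TPM quotas transcribed from the lists above, in tokens.
# Assumption: "M" means one million tokens.
# Keys are (modality, resolution, region). "default" covers the
# High/Medium/Default media resolution; modalities that list no
# resolution split use "default" only.
TPM_QUOTA = {
    ("image", "default", "US/Asia"): 6_700_000,
    ("image", "default", "EU"): 2_600_000,
    ("image", "low", "US/Asia"): 2_600_000,
    ("image", "low", "EU"): 2_600_000,
    ("document", "default", "US/Asia"): 3_400_000,
    ("document", "default", "EU"): 3_400_000,
    ("video", "default", "US/Asia"): 6_300_000,
    ("video", "default", "EU"): 3_200_000,
    ("video", "low", "US/Asia"): 3_200_000,
    ("video", "low", "EU"): 3_200_000,
    ("audio", "default", "US/Asia"): 3_500_000,
    ("audio", "default", "EU"): 3_500_000,
}


def max_requests_per_minute(
    modality: str, region: str, tokens_per_request: int, resolution: str = "default"
) -> int:
    """Return how many requests of the given size fit within the listed TPM quota."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    quota = TPM_QUOTA[(modality, resolution, region)]
    return quota // tokens_per_request
```

For example, under these assumptions `max_requests_per_minute("video", "US/Asia", 100_000)` gives the number of 100,000-token video requests that fit in one minute of the US/Asia quota. Actual throttling behavior depends on the service and is not modeled here.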
- Temperature: 0.0-2.0 (default 1.0)
- topP: 0.0-1.0 (default 0.95)
- topK: 64 (fixed)
- candidateCount: 1–8 (default 1)
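The parameter ranges above can be checked before a request is built. The following is a minimal sketch, not an official client: `build_generation_config` is a hypothetical helper, and the dictionary keys simply reuse the parameter names from the list. Because topK is listed as fixed at 64, the sketch does not expose it as an argument.

```python
def build_generation_config(
    temperature: float = 1.0, top_p: float = 0.95, candidate_count: int = 1
) -> dict:
    """Validate sampling parameters against the listed ranges and return them as a dict."""
    if not 0.0 <= temperature <= 2.0:
        raise ValueError(f"temperature must be in [0.0, 2.0], got {temperature}")
    if not 0.0 <= top_p <= 1.0:
        raise ValueError(f"topP must be in [0.0, 1.0], got {top_p}")
    if not 1 <= candidate_count <= 8:
        raise ValueError(f"candidateCount must be in [1, 8], got {candidate_count}")
    # topK is fixed at 64 for this model, so it is intentionally omitted.
    return {
        "temperature": temperature,
        "topP": top_p,
        "candidateCount": candidate_count,
    }
```

Calling the helper with no arguments returns the listed defaults, and an out-of-range value such as `temperature=2.5` raises `ValueError` before any request is sent.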
Model availability
(Includes dynamic shared quota & Provisioned Throughput)
- Global
  - global
- United States
  - us-central1
  - us-east1
  - us-east4
  - us-east5
  - us-south1
  - us-west1
  - us-west4
- Europe
  - europe-central2
  - europe-north1
  - europe-southwest1
  - europe-west1
  - europe-west4
  - europe-west8
  - europe-west9
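To show how a location from the list above might be used, the sketch below builds a request URL for the model. The URL pattern is an assumption based on the commonly documented Vertex AI REST layout (a regional host of the form `LOCATION-aiplatform.googleapis.com`, and an unprefixed host for `global`). It does not appear in this page, so verify it against the API reference before relying on it. The `generate_content_url` function and the `my-project` ID are hypothetical.

```python
MODEL_ID = "gemini-2.0-flash-lite"

# Locations transcribed from the availability list above.
SUPPORTED_LOCATIONS = frozenset({
    "global",
    "us-central1", "us-east1", "us-east4", "us-east5",
    "us-south1", "us-west1", "us-west4",
    "europe-central2", "europe-north1", "europe-southwest1",
    "europe-west1", "europe-west4", "europe-west8", "europe-west9",
})


def generate_content_url(project_id: str, location: str) -> str:
    """Build a generateContent URL, rejecting locations that are not listed."""
    if location not in SUPPORTED_LOCATIONS:
        raise ValueError(f"{MODEL_ID} is not listed as available in {location!r}")
    # Assumption: the global endpoint uses the host without a region prefix.
    if location == "global":
        host = "aiplatform.googleapis.com"
    else:
        host = f"{location}-aiplatform.googleapis.com"
    return (
        f"https://{host}/v1/projects/{project_id}/locations/{location}"
        f"/publishers/google/models/{MODEL_ID}:generateContent"
    )
```

Checking the location up front turns a typo or an unlisted region, such as an Asia region, into an immediate local error instead of a failed network call.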
ML processing
- United States
  - Multi-region
- Europe
  - Multi-region
- gemini-2.0-flash-lite-001
  - Launch stage: Generally available
  - Release date: February 25, 2025
  - Discontinuation date: February 25, 2026
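Because this model version has a fixed discontinuation date, a deployment may want to warn well ahead of it. The sketch below only does date arithmetic on the dates listed above. The function name and the idea of a warning window are assumptions for illustration.

```python
from datetime import date

MODEL_VERSION = "gemini-2.0-flash-lite-001"
# Dates transcribed from the version details above.
RELEASE_DATE = date(2025, 2, 25)
DISCONTINUATION_DATE = date(2026, 2, 25)


def days_until_discontinuation(today: date) -> int:
    """Return the days left before the listed discontinuation date (negative once past)."""
    return (DISCONTINUATION_DATE - today).days


def should_warn(today: date, warning_window_days: int = 90) -> bool:
    """Return True when the discontinuation date is within the warning window or past."""
    return days_until_discontinuation(today) <= warning_window_days
```

A scheduled job could call `should_warn(date.today())` and flag services that are still pinned to this version. The 90-day default is arbitrary and should be tuned to the team's migration lead time.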
- Data residency (at rest): Supported
- Customer-managed encryption keys (CMEK): Supported
- VPC Service Controls: Supported
- Access Transparency (AXT): Supported
- Data residency (at rest): Supported
- Customer-managed encryption keys (CMEK): Not supported
- VPC Service Controls: Supported
- Access Transparency (AXT): Not supported
- Data residency (at rest): Not supported
- Customer-managed encryption keys (CMEK): Not supported
- VPC Service Controls: Not supported
- Access Transparency (AXT): Not supported