Gemini 2.0 Flash-Lite

Gemini 2.0 Flash-Lite is our fastest Gemini 2.0 model, optimized for cost efficiency and low latency.

Try in Vertex AI View in Model Garden (Preview) Deploy example app

Note: To use the "Deploy example app" feature, you need a Google Cloud project with billing and Vertex AI API enabled.

Model ID

gemini-2.0-flash-lite

Supported inputs & outputs

Inputs:
Text , Code , Images , Audio , Video
Outputs:
Text

Token limits

Maximum input tokens: 1,048,576
Maximum output tokens: 8,192 (default)

Capabilities

Supported

Not supported

Usage types

Supported

Not supported

Fixed quota

Input size limit

500 MB

Technical specifications

Images

Maximum images per prompt: 3,000
Maximum file size per file for inline data or direct uploads through the console: 7 MB
Maximum file size per file from Google Cloud Storage: 30 MB
Maximum tokens per minute (TPM):
- High/Medium/Default media resolution:
  - US/Asia: 6.7 M
  - EU: 2.6 M
- Low media resolution:
  - US/Asia: 2.6 M
  - EU: 2.6 M
Supported MIME types:
image/png , image/jpeg , image/webp , image/heic , image/heif

Documents

Maximum number of files per prompt: 3,000
Maximum number of pages per file: 1,000
Maximum file size per file for the API or Cloud Storage imports: 50 MB
Maximum file size per file for direct uploads through the console: 7 MB
Maximum tokens per minute (TPM) per project1:
- US/Asia: 3.4 M
- EU: 3.4 M
Supported MIME types:

Video

Maximum video length (with audio): Approximately 45 minutes
Maximum video length (without audio): Approximately 1 hour
Maximum number of videos per prompt: 10
Maximum tokens per minute (TPM):
- High/Medium/Default media resolution:
  - US/Asia: 6.3 M
  - EU: 3.2 M
- Low media resolution:
  - US/Asia: 3.2 M
  - EU: 3.2 M
Supported MIME types:
video/x-flv , video/quicktime , video/mpeg , video/mpegs , video/mpg , video/mp4 , video/webm , video/wmv , video/3gpp

Audio

Maximum audio length per prompt: Appropximately 8.4 hours, or up to 1 million tokens
Maximum number of audio files per prompt: 1
Speech understanding for: Audio summarization, transcription, and translation
Maximum tokens per minute (TPM):
- US/Asia: 3.5 M
- EU: 3.5 M
Supported MIME types:

Parameter defaults

Temperature: 0.0-2.0 (default 1.0)
topP: 0.0-1.0 (default 0.95)
topK: 64 (fixed)
candidateCount: 1–8 (default 1)

Supported regions

Model availability

(Includes dynamic shared quota & Provisioned Throughput)

Global

global

United States

us-central1
us-east1
us-east4
us-east5
us-south1
us-west1
us-west4

Europe

europe-central2
europe-north1
europe-southwest1
europe-west1
europe-west4
europe-west8
europe-west9

ML processing

United States

Multi-region

Europe

Multi-region

See Deployments and endpoints for more information.

Knowledge cutoff date

June 2024

Versions

gemini-2.0-flash-lite-001

Launch stage: GA
Release date: February 25, 2025
Discontinuation date: February 25, 2026

Security controls

Online prediction

Data residency (at rest) Supported
Customer-managed encryption keys (CMEK) Supported
VPC Service Controls Supported
Access Transparency (AXT) Supported

Batch prediction

Data residency (at rest) Supported
Customer-managed encryption keys (CMEK) Not supported
VPC Service Controls Supported
Access Transparency (AXT) Not supported

Tuning

Data residency (at rest) Not supported
Customer-managed encryption keys (CMEK) Not supported
VPC Service Controls Not supported
Access Transparency (AXT) Not supported

See Security controls for more information.

Supported languages

See Supported languages .

Pricing

See Pricing .

Gemini 2.0 Flash-Lite Stay organized with collections Save and categorize content based on your preferences.

Gemini 2.0 Flash-Lite