Gemini 2.0 Flash

Gemini 2.0 Flash delivers next-generation features and improved capabilities designed for the agentic era, including superior speed, built-in tool use, multimodal generation, and a 1M token context window. Gemini 2.0 Flash improves upon our previous Flash model and offers enhanced quality at similar speeds.

2.0 Flash

Caution: As of March 6, 2026, gemini-2.0-flash-001 and gemini-2.0-flash-lite-001 are available only to existing customers. This applies to both model serving and Provisioned Throughput. New projects should use gemini-2.5-flash, gemini-2.5-flash-lite, or a more recent release.

Model ID

gemini-2.0-flash

Supported inputs & outputs

Inputs:
Text, Code, Images, Audio, Video
Outputs:
Text

Token limits

Maximum input tokens: 1,048,576
Maximum output tokens: 8,192 (default)
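
A minimal sketch of how the two limits above might be checked client-side before a request is sent. The function name is hypothetical, and it assumes the token counts have already been obtained elsewhere (for example, from a token-counting call); it is not part of any SDK.

```python
MAX_INPUT_TOKENS = 1_048_576   # maximum input tokens listed above
MAX_OUTPUT_TOKENS = 8_192      # maximum (default) output tokens listed above


def check_token_budget(input_tokens: int,
                       requested_output_tokens: int = MAX_OUTPUT_TOKENS) -> None:
    """Raise ValueError if a request would exceed the documented token limits.

    Hypothetical helper: both counts are assumed to be computed by the caller.
    """
    if input_tokens < 0:
        raise ValueError("input_tokens must not be negative")
    if requested_output_tokens < 1:
        raise ValueError("requested_output_tokens must be at least 1")
    if input_tokens > MAX_INPUT_TOKENS:
        raise ValueError(
            f"input has {input_tokens} tokens; the limit is {MAX_INPUT_TOKENS}"
        )
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError(
            f"requested {requested_output_tokens} output tokens; "
            f"the limit is {MAX_OUTPUT_TOKENS}"
        )
```

Failing early like this avoids sending a large payload that would be rejected anyway.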

Capabilities

Supported

Not supported

Consumption options

Supported

Not supported

See Consumption options for more information.

Input size limit

500 MB

Technical specifications

Images

Maximum images per prompt: 3,000
Maximum file size per file for inline data or direct uploads through the console: 7 MB
Maximum file size per file from Google Cloud Storage: 30 MB
Maximum tokens per minute (TPM) per project:
- High/Medium/Default media resolution:
  - US/Asia: 40 M
  - EU: 10 M
- Low media resolution:
  - US/Asia: 10 M
  - EU: 2.6 M
Supported MIME types:
image/png, image/jpeg, image/webp, image/heic, image/heif
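
The image limits above can be sketched as a pre-flight check. Everything here is illustrative: `ImagePart` and `validate_images` are hypothetical names, and the code assumes 1 MB means 1,000,000 bytes, which the source does not specify.

```python
from dataclasses import dataclass

MB = 1_000_000  # assumption: decimal megabytes; the source does not say which unit applies

MAX_IMAGES_PER_PROMPT = 3_000
MAX_INLINE_BYTES = 7 * MB    # inline data or direct console uploads
MAX_GCS_BYTES = 30 * MB      # files read from Google Cloud Storage
SUPPORTED_IMAGE_MIME_TYPES = {
    "image/png", "image/jpeg", "image/webp", "image/heic", "image/heif",
}


@dataclass
class ImagePart:
    """Hypothetical description of one image attached to a prompt."""
    mime_type: str
    size_bytes: int
    from_cloud_storage: bool = False


def validate_images(parts: list[ImagePart]) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if len(parts) > MAX_IMAGES_PER_PROMPT:
        problems.append(f"{len(parts)} images exceed the limit of {MAX_IMAGES_PER_PROMPT}")
    for index, part in enumerate(parts):
        if part.mime_type not in SUPPORTED_IMAGE_MIME_TYPES:
            problems.append(f"image {index}: unsupported MIME type {part.mime_type}")
        limit = MAX_GCS_BYTES if part.from_cloud_storage else MAX_INLINE_BYTES
        if part.size_bytes > limit:
            problems.append(f"image {index}: {part.size_bytes} bytes exceeds {limit}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every invalid attachment at once.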

Documents

Maximum number of files per prompt: 3,000
Maximum number of pages per file: 1,000
Maximum file size per file for the API or Cloud Storage imports: 50 MB
Maximum file size per file for direct uploads through the console: 7 MB
Maximum tokens per minute (TPM) per project:
- US/Asia: 3.4 M
- EU: 3.4 M
Supported MIME types:
application/pdf, text/plain

Video

Maximum video length (with audio): Approximately 45 minutes
Maximum video length (without audio): Approximately 1 hour
Maximum number of videos per prompt: 10
Maximum tokens per minute (TPM):
- High/Medium/Default media resolution:
  - US/Asia: 38 M
  - EU: 10 M
- Low media resolution:
  - US/Asia: 10 M
  - EU: 2.5 M
Supported MIME types:
video/x-flv, video/quicktime, video/mpeg, video/mpegs, video/mpg, video/mp4, video/webm, video/wmv, video/3gpp

Audio

Maximum audio length per prompt: Approximately 8.4 hours, or up to 1 million tokens
Maximum number of audio files per prompt: 1
Speech understanding for: Audio summarization, transcription, and translation
Maximum tokens per minute (TPM):
- US/Asia: 3.5 M
- EU: 3.5 M
Supported MIME types:
audio/x-aac, audio/flac, audio/mp3, audio/m4a, audio/mpeg, audio/mpga, audio/mp4, audio/ogg, audio/pcm, audio/wav, audio/webm
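
The audio length limit above pairs a duration (about 8.4 hours) with a token count (about 1 million), which implies a rough tokens-per-second rate. The sketch below derives that rate and uses it to estimate the token cost of a clip. It is back-of-the-envelope arithmetic from the two rounded figures, not the model's actual tokenization rate, and the function names are hypothetical.

```python
import math

AUDIO_LIMIT_SECONDS = 8.4 * 3600   # "approximately 8.4 hours"
AUDIO_LIMIT_TOKENS = 1_000_000     # "up to 1 million tokens" (rounded figure)

# Rate implied by the two rounded figures above; an estimate only.
IMPLIED_TOKENS_PER_SECOND = AUDIO_LIMIT_TOKENS / AUDIO_LIMIT_SECONDS


def estimate_audio_tokens(duration_seconds: float) -> int:
    """Estimate the tokens consumed by an audio clip of the given length."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must not be negative")
    return math.ceil(duration_seconds * IMPLIED_TOKENS_PER_SECOND)


def fits_audio_limit(duration_seconds: float) -> bool:
    """Check a clip against the approximate per-prompt length limit."""
    return duration_seconds <= AUDIO_LIMIT_SECONDS
```

With these assumptions, the implied rate comes out at roughly 33 tokens per second of audio.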

Parameter defaults

Temperature: 0.0–2.0 (default 1.0)
topP: 0.0–1.0 (default 0.95)
topK: 64 (fixed)
candidateCount: 1–8 (default 1)
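
As an illustration, the ranges and defaults above could be enforced when assembling a generation config. The helper name is hypothetical; the dictionary keys mirror the parameter names listed above, and topK is left out because it is fixed at 64.

```python
def build_generation_config(temperature: float = 1.0,
                            top_p: float = 0.95,
                            candidate_count: int = 1) -> dict:
    """Build a config dict, rejecting values outside the documented ranges.

    Hypothetical helper; defaults match the parameter defaults listed above.
    """
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    if not 0.0 <= top_p <= 1.0:
        raise ValueError("topP must be between 0.0 and 1.0")
    if not 1 <= candidate_count <= 8:
        raise ValueError("candidateCount must be between 1 and 8")
    # topK is fixed at 64 for this model, so it is not exposed as an argument.
    return {
        "temperature": temperature,
        "topP": top_p,
        "candidateCount": candidate_count,
    }
```

Calling `build_generation_config()` with no arguments yields the documented defaults.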

Supported regions

Model availability

Global

global

United States

us-central1
us-east1
us-east4
us-east5
us-south1
us-west1
us-west4

Europe

europe-central2
europe-north1
europe-southwest1
europe-west1
europe-west4
europe-west8
europe-west9

ML processing

See Deployments and endpoints for more information.
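
A sketch of how a location from the list above might be turned into a request URL. The URL pattern reflects the commonly documented Vertex AI REST layout and should be treated as an assumption to verify against the Deployments and endpoints page; `generate_content_url` and the project ID in the usage note are hypothetical.

```python
SUPPORTED_LOCATIONS = {
    "global",
    "us-central1", "us-east1", "us-east4", "us-east5",
    "us-south1", "us-west1", "us-west4",
    "europe-central2", "europe-north1", "europe-southwest1",
    "europe-west1", "europe-west4", "europe-west8", "europe-west9",
}


def generate_content_url(project_id: str,
                         location: str = "global",
                         model: str = "gemini-2.0-flash") -> str:
    """Build a generateContent URL for one of the locations listed above.

    Assumption: regional hosts are prefixed with the location, while the
    global location uses the unprefixed host.
    """
    if location not in SUPPORTED_LOCATIONS:
        raise ValueError(f"{location} is not in the supported region list")
    if location == "global":
        host = "aiplatform.googleapis.com"
    else:
        host = f"{location}-aiplatform.googleapis.com"
    return (
        f"https://{host}/v1/projects/{project_id}/locations/{location}"
        f"/publishers/google/models/{model}:generateContent"
    )
```

For example, `generate_content_url("my-project", "europe-west4")` targets the `europe-west4` regional host, where `my-project` is a placeholder project ID.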

Knowledge cutoff date

June 2024

Versions

gemini-2.0-flash-001

Launch stage: GA
Release date: February 5, 2025
Discontinuation date: June 1, 2026

Security controls

Online prediction

Data residency
CMEK
VPC-SC
AXT

Batch prediction

Data residency
CMEK
VPC-SC
AXT

Tuning

Data residency
CMEK
VPC-SC
AXT

Context caching

Data residency
CMEK
VPC-SC
AXT

RAG Engine

Data residency
CMEK
VPC-SC
AXT

Grounding with Google Search and Grounding with Google Maps

Data residency
CMEK
VPC-SC
AXT

See Security controls for more information.

Supported languages

See Supported languages.

Pricing

See Pricing.

Live API

Model ID

gemini-2.0-flash-live-preview-04-09

Supported inputs & outputs

Inputs:
Audio, Video
Outputs:
Audio

Token limits

Maximum input tokens: 32,768
Maximum output tokens: 8,192 (default)

Capabilities

Supported

Not supported

Consumption options

Supported

Standard PayGo

Not supported

See Consumption options for more information.

Input size limit

500 MB

Technical specifications

Video

Maximum video length (with audio): Approximately 45 minutes
Maximum video length (without audio): Approximately 1 hour
Maximum number of videos per prompt: 10
Maximum tokens per minute (TPM):
- High/Medium/Default media resolution:
  - US/Asia: 37.9 M
  - EU: 9.5 M
- Low media resolution:
  - US/Asia: 1 G
  - EU: 2.5 M
Supported MIME types:
video/x-flv, video/quicktime, video/mpeg, video/mpegs, video/mpg, video/mp4, video/webm, video/wmv, video/3gpp

Audio

Maximum audio length per prompt: Approximately 8.4 hours, or up to 1 million tokens
Maximum number of audio files per prompt: 1
Speech understanding for: Audio summarization, transcription, and translation
Maximum tokens per minute (TPM):
- US/Asia: 1.7 M
- EU: 0.4 M
Supported MIME types:
audio/x-aac, audio/flac, audio/mp3, audio/m4a, audio/mpeg, audio/mpga, audio/mp4, audio/ogg, audio/pcm, audio/wav, audio/webm

Parameter defaults

Temperature: 0.0–2.0 (default 1.0)
topP: 0.0–1.0 (default 0.95)
topK: 64 (fixed)
candidateCount: 1–8 (default 1)

Supported regions

Model availability

United States

us-central1

See Deployments and endpoints for more information.

Knowledge cutoff date

June 2024

Versions

gemini-2.0-flash-live-preview-04-09

Launch stage: Public preview
Release date: April 9, 2025

Supported languages

See Supported languages.

Pricing

See Pricing.
