Llama 4 Maverick 17B-128E is the largest and most capable model in the Llama 4 family. It uses a Mixture-of-Experts (MoE) architecture with early fusion for native multimodality, providing coding, reasoning, and image-understanding capabilities.
Managed API (MaaS) specifications
llama-4-maverick-17b-128e-instruct-maas
- Inputs: Text, Code, Images
- Outputs: Text
- Launch stage: GA
- Release date: April 29, 2025
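
The managed (MaaS) endpoint is exposed through Vertex AI's OpenAI-compatible Chat Completions API. The following is a minimal sketch, not an official sample: it assumes the openai and google-auth Python packages, Application Default Credentials, a placeholder project ID, and the meta/ prefix conventionally used for Llama model IDs on Vertex AI MaaS.

```python
# Minimal sketch: text chat request to the Llama 4 Maverick MaaS endpoint
# via Vertex AI's OpenAI-compatible API. PROJECT_ID is a placeholder.
import google.auth
import google.auth.transport.requests
import openai

PROJECT_ID = "your-project-id"   # assumption: replace with your project
REGION = "us-east5"              # region listed under "Model availability"

# Obtain an OAuth access token from Application Default Credentials.
credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
credentials.refresh(google.auth.transport.requests.Request())

client = openai.OpenAI(
    base_url=(
        f"https://{REGION}-aiplatform.googleapis.com/v1/projects/"
        f"{PROJECT_ID}/locations/{REGION}/endpoints/openapi"
    ),
    api_key=credentials.token,
)

response = client.chat.completions.create(
    model="meta/llama-4-maverick-17b-128e-instruct-maas",
    messages=[
        {"role": "user", "content": "Summarize the MoE architecture in two sentences."}
    ],
    max_tokens=512,  # must stay within the 8,192 max-output limit
)
print(response.choices[0].message.content)
```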
Model availability
- United States: us-east5

ML processing
- United States: Multi-region

us-east5 limits:
- Max output tokens: 8,192
- Context length: 524,288 tokens
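
Because the specifications list Images among the supported inputs, an image-plus-text request can be sent through the same endpoint. The sketch below reuses the client object from the previous example; the image URL is a placeholder, and acceptance of remote HTTPS image URLs (versus Cloud Storage URIs) is an assumption to verify against the request documentation.

```python
# Minimal sketch: multimodal (image + text) request through the same
# OpenAI-compatible endpoint. Reuses `client` from the previous example.
response = client.chat.completions.create(
    model="meta/llama-4-maverick-17b-128e-instruct-maas",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image in one sentence."},
            # Placeholder image URL; replace with your own.
            {"type": "image_url", "image_url": {"url": "https://example.com/sample.jpg"}},
        ],
    }],
    max_tokens=256,  # keep well under the 8,192 max-output limit
)
print(response.choices[0].message.content)
```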
Deploy as a self-deployed model
To self-deploy the model, navigate to the Llama 4 Maverick 17B-128E model card in the Model Garden console and click Deploy model. For more information about deploying and using partner models, see Deploy a partner model and make prediction requests.
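
Once the model is deployed from Model Garden, the resulting Vertex AI endpoint can be queried with the google-cloud-aiplatform SDK. This is a minimal sketch under stated assumptions, not an official sample: the project, region, and endpoint ID are placeholders, and the instance fields (prompt, max_tokens) assume a vLLM-style serving container; the exact request schema depends on the container selected at deploy time.

```python
# Minimal sketch: prediction request against a self-deployed endpoint.
from google.cloud import aiplatform

PROJECT_ID = "your-project-id"   # assumption: replace with your project
REGION = "us-east5"              # assumption: your deployment region
ENDPOINT_ID = "1234567890"       # assumption: shown after the deploy step

aiplatform.init(project=PROJECT_ID, location=REGION)
endpoint = aiplatform.Endpoint(ENDPOINT_ID)

# Request schema depends on the serving container chosen at deploy time;
# vLLM-based containers in Model Garden typically accept a prompt plus
# sampling parameters like the ones below.
response = endpoint.predict(
    instances=[{"prompt": "Explain early fusion in one paragraph.", "max_tokens": 256}]
)
print(response.predictions)
```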

