Llama 4 Maverick 17B-128E is the largest and most capable model in the Llama 4 family. It uses a Mixture-of-Experts (MoE) architecture with early fusion for native multimodality, providing coding, reasoning, and image-understanding capabilities.
Managed API (MaaS) specifications
llama-4-maverick-17b-128e-instruct-maas
- Inputs: Text, Code, Images
- Outputs: Text
- Launch stage: GA
- Release date: April 29, 2025
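
The managed (MaaS) endpoint is exposed through Vertex AI's OpenAI-compatible Chat Completions API. The following is a minimal sketch, not an official sample: it assumes the openai and google-auth Python packages, Application Default Credentials, a placeholder project ID, and the meta/ prefix conventionally used for Llama model IDs on Vertex AI MaaS.

```python
# Minimal sketch: text chat request to the Llama 4 Maverick MaaS endpoint
# via Vertex AI's OpenAI-compatible API. PROJECT_ID is a placeholder.
import google.auth
import google.auth.transport.requests
import openai

PROJECT_ID = "your-project-id"   # assumption: replace with your project
REGION = "us-east5"              # region listed under "Model availability"

# Obtain an OAuth access token from Application Default Credentials.
credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
credentials.refresh(google.auth.transport.requests.Request())

client = openai.OpenAI(
    base_url=(
        f"https://{REGION}-aiplatform.googleapis.com/v1/projects/"
        f"{PROJECT_ID}/locations/{REGION}/endpoints/openapi"
    ),
    api_key=credentials.token,
)

response = client.chat.completions.create(
    model="meta/llama-4-maverick-17b-128e-instruct-maas",
    messages=[
        {"role": "user", "content": "Summarize the MoE architecture in two sentences."}
    ],
    max_tokens=512,  # must stay within the 8,192 max-output limit
)
print(response.choices[0].message.content)
```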
Model availability
- United States: us-east5

ML processing
- United States: Multi-region

us-east5 limits:
- Max output tokens: 8,192
- Context length: 524,288 tokens
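
Because the specifications list Images among the supported inputs, an image-plus-text request can be sent through the same endpoint. The sketch below reuses the client object from the previous example; the image URL is a placeholder, and acceptance of remote HTTPS image URLs (versus Cloud Storage URIs) is an assumption to verify against the request documentation.

```python
# Minimal sketch: multimodal (image + text) request through the same
# OpenAI-compatible endpoint. Reuses `client` from the previous example.
response = client.chat.completions.create(
    model="meta/llama-4-maverick-17b-128e-instruct-maas",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image in one sentence."},
            # Placeholder image URL; replace with your own.
            {"type": "image_url", "image_url": {"url": "https://example.com/sample.jpg"}},
        ],
    }],
    max_tokens=256,  # keep well under the 8,192 max-output limit
)
print(response.choices[0].message.content)
```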
Deploy as a self-deployed model
To self-deploy the model, navigate to the Llama 4 Maverick 17B-128E model card in the Model Garden console and click Deploy model. For more information about deploying and using partner models, see Deploy a partner model and make prediction requests.
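
Once the model is deployed from Model Garden, the resulting Vertex AI endpoint can be queried with the google-cloud-aiplatform SDK. This is a minimal sketch under stated assumptions, not an official sample: the project, region, and endpoint ID are placeholders, and the instance fields (prompt, max_tokens) assume a vLLM-style serving container; the exact request schema depends on the container selected at deploy time.

```python
# Minimal sketch: prediction request against a self-deployed endpoint.
from google.cloud import aiplatform

PROJECT_ID = "your-project-id"   # assumption: replace with your project
REGION = "us-east5"              # assumption: your deployment region
ENDPOINT_ID = "1234567890"       # assumption: shown after the deploy step

aiplatform.init(project=PROJECT_ID, location=REGION)
endpoint = aiplatform.Endpoint(ENDPOINT_ID)

# Request schema depends on the serving container chosen at deploy time;
# vLLM-based containers in Model Garden typically accept a prompt plus
# sampling parameters like the ones below.
response = endpoint.predict(
    instances=[{"prompt": "Explain early fusion in one paragraph.", "max_tokens": 256}]
)
print(response.predictions)
```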

