Llama 4 Scout 17B-16E is a multimodal model that uses a Mixture-of-Experts (MoE) architecture with early fusion, delivering state-of-the-art results for its size class.
Model ID
llama-4-scout-17b-16e-instruct-maas
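The model is served as a managed API ("model as a service") and is addressed by this model ID. Below is a minimal sketch of a call, assuming the OpenAI-compatible chat completions endpoint that Vertex AI exposes for Llama MaaS models and the `meta/` publisher prefix; the project ID, region, and prompt are placeholders:

```python
import openai
from google.auth import default
from google.auth.transport.requests import Request

PROJECT_ID = "your-project-id"  # placeholder
REGION = "us-east5"             # region where the model is available (see Supported regions below)

# Obtain an access token via Application Default Credentials.
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(Request())

# OpenAI-compatible endpoint for Vertex AI model-as-a-service (assumed path).
client = openai.OpenAI(
    base_url=(
        f"https://{REGION}-aiplatform.googleapis.com/v1beta1/"
        f"projects/{PROJECT_ID}/locations/{REGION}/endpoints/openapi"
    ),
    api_key=credentials.token,
)

response = client.chat.completions.create(
    model="meta/llama-4-scout-17b-16e-instruct-maas",
    messages=[{"role": "user", "content": "Summarize mixture-of-experts routing in two sentences."}],
)
print(response.choices[0].message.content)
```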
Launch stage
GA
Supported inputs & outputs
- Inputs: Text, Code, Images
- Outputs: Text
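Because image inputs are supported, a single request can mix text and image parts. A sketch reusing the `client` from the example above; the OpenAI-style `image_url` content part (HTTPS URL or base64 data URI) is an assumption carried over from other multimodal Llama MaaS models:

```python
# Reuses `client` from the previous sketch.
response = client.chat.completions.create(
    model="meta/llama-4-scout-17b-16e-instruct-maas",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is shown in this image?"},
            # Hypothetical image URL; a base64 data URI should also work.
            {"type": "image_url", "image_url": {"url": "https://example.com/sample.png"}},
        ],
    }],
)
print(response.choices[0].message.content)
```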
Capabilities
- Supported
- Not supported
Usage types
- Supported
- Not supported
Knowledge cutoff date
August 2024
Versions
- llama-4-scout-17b-16e-instruct-maas
  - Launch stage: GA
  - Release date: April 29, 2025
Supported regions
Model availability
- United States
  - us-east5
ML processing
- United States
  - Multi-region
Quota limits
us-east5:
- Maximum output tokens: 8,192
- Context length: 1,310,720 tokens
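In practice these limits bound the request shape: the prompt plus generated output must fit within the 1,310,720-token context window, and the requested output cannot exceed 8,192 tokens. A sketch, again reusing the `client` from the first example; `report.txt` is a placeholder input:

```python
# Reuses `client` from the first sketch. The prompt must stay within the
# 1,310,720-token context window for us-east5.
long_document = open("report.txt").read()

response = client.chat.completions.create(
    model="meta/llama-4-scout-17b-16e-instruct-maas",
    messages=[{"role": "user", "content": long_document + "\n\nSummarize the key findings."}],
    max_tokens=8192,  # output cap for us-east5 per the quota limits above
)
print(response.choices[0].message.content)
```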
Pricing