Llama 4 Scout 17B-16E is a multimodal model that uses a Mixture-of-Experts (MoE) architecture with early fusion, delivering state-of-the-art results for its size class.
Model ID
llama-4-scout-17b-16e-instruct-maas
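The model is served as a managed API ("model as a service") and is addressed by this model ID. Below is a minimal sketch of a call, assuming the OpenAI-compatible chat completions endpoint that Vertex AI exposes for Llama MaaS models and the `meta/` publisher prefix; the project ID, region, and prompt are placeholders:

```python
import openai
from google.auth import default
from google.auth.transport.requests import Request

PROJECT_ID = "your-project-id"  # placeholder
REGION = "us-east5"             # region where the model is available (see Supported regions below)

# Obtain an access token via Application Default Credentials.
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
credentials.refresh(Request())

# OpenAI-compatible endpoint for Vertex AI model-as-a-service (assumed path).
client = openai.OpenAI(
    base_url=(
        f"https://{REGION}-aiplatform.googleapis.com/v1beta1/"
        f"projects/{PROJECT_ID}/locations/{REGION}/endpoints/openapi"
    ),
    api_key=credentials.token,
)

response = client.chat.completions.create(
    model="meta/llama-4-scout-17b-16e-instruct-maas",
    messages=[{"role": "user", "content": "Summarize mixture-of-experts routing in two sentences."}],
)
print(response.choices[0].message.content)
```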
Launch stage
GA
Supported inputs & outputs
- Inputs: Text, Code, Images
- Outputs: Text
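Because image inputs are supported, a single request can mix text and image parts. A sketch reusing the `client` from the example above; the OpenAI-style `image_url` content part (HTTPS URL or base64 data URI) is an assumption carried over from other multimodal Llama MaaS models:

```python
# Reuses `client` from the previous sketch.
response = client.chat.completions.create(
    model="meta/llama-4-scout-17b-16e-instruct-maas",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is shown in this image?"},
            # Hypothetical image URL; a base64 data URI should also work.
            {"type": "image_url", "image_url": {"url": "https://example.com/sample.png"}},
        ],
    }],
)
print(response.choices[0].message.content)
```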
Capabilities
- Supported
- Not supported
Usage types
- Supported
- Not supported
Knowledge cutoff date
August 2024
Versions
- llama-4-scout-17b-16e-instruct-maas
  - Launch stage: GA
  - Release date: April 29, 2025
Supported regions
Model availability
- United States
  - us-east5
ML processing
- United States
  - Multi-region
Quota limits
us-east5:
- Maximum output tokens: 8,192
- Context length: 1,310,720 tokens
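In practice these limits bound the request shape: the prompt plus generated output must fit within the 1,310,720-token context window, and the requested output cannot exceed 8,192 tokens. A sketch, again reusing the `client` from the first example; `report.txt` is a placeholder input:

```python
# Reuses `client` from the first sketch. The prompt must stay within the
# 1,310,720-token context window for us-east5.
long_document = open("report.txt").read()

response = client.chat.completions.create(
    model="meta/llama-4-scout-17b-16e-instruct-maas",
    messages=[{"role": "user", "content": long_document + "\n\nSummarize the key findings."}],
    max_tokens=8192,  # output cap for us-east5 per the quota limits above
)
print(response.choices[0].message.content)
```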
Pricing