Llama 4 Scout 17B-16E

Llama 4 Scout 17B-16E is a multimodal model that uses the Mixture-of-Experts (MoE) architecture and early fusion, delivering state-of-the-art results for its size class.

Managed API (MaaS) specifications

Try in Vertex AI View model card in Model Garden

Model ID
llama-4-scout-17b-16e-instruct-maas
Launch stage
GA
Supported inputs & outputs
  • Inputs:
    Text , Code , Images
  • Outputs:
    Text
Capabilities
Usage types
Knowledge cutoff date
August 2024
Versions
  • llama-4-scout-17b-16e-instruct-maas
    • Launch stage: GA
    • Release date: April 29, 2025
Supported regions

Model availability

  • United States
    • us-east5

ML processing

  • United States
    • Multi-region
Quota limits

us-east5:

  • Max output: 8,192
  • Context length: 1,310,720
Pricing
See Pricing .

Deploy as a self-deployed model

To self-deploy the model, navigate to the Llama 4 Scout 17B-16E model card in the Model Garden console and click Deploy model. For more information about deploying and using partner models, see Deploy a partner model and make prediction requests .

Design a Mobile Site
View Site in Mobile | Classic
Share by: