Run LLM inference on Cloud Run GPUs with Hugging Face Transformers.js

The following codelab shows how to run a backend service built on the Transformers.js package together with Google's Gemma 2 model. Transformers.js is the JavaScript counterpart of the Hugging Face transformers Python library and is designed to be functionally equivalent to it.

See the entire codelab at How to Run Transformers.js on Cloud Run GPUs.
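For orientation, deploying such a service to Cloud Run with an attached GPU looks roughly like the command below. The service name, image path, and region are placeholders, and the exact flags (GPU support has moved between `gcloud beta run` and `gcloud run` over time) should be checked against the codelab and current `gcloud` documentation.

```shell
# Hypothetical deploy command; SERVICE_NAME, PROJECT_ID, and the image path
# are placeholders, not values from the codelab.
gcloud run deploy transformers-js-service \
  --image us-docker.pkg.dev/PROJECT_ID/repo/transformers-js \
  --gpu 1 \
  --gpu-type nvidia-l4 \
  --region us-central1
```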
