Choose a transcription function

This document compares the transcription functions available in BigQuery ML: ML.GENERATE_TEXT and ML.TRANSCRIBE.

Use this information to decide which function to use where their capabilities overlap.

At a high level, the difference between these functions is as follows:

  • ML.GENERATE_TEXT is a good choice for transcription of audio clips that are 10 minutes or shorter, and you can also use it to perform natural language processing (NLP) tasks. Audio transcription with ML.GENERATE_TEXT is less expensive than with ML.TRANSCRIBE when you use the gemini-1.5-flash model.

  • ML.TRANSCRIBE is a good choice for performing transcription on audio clips that are longer than 10 minutes. It also supports a wider range of languages than ML.GENERATE_TEXT.
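
The duration guidance above can be sketched as a small helper. This is a rule-of-thumb illustration only: the function name suggest_function and the constant SHORT_CLIP_MAX_MINUTES are hypothetical, and the sketch ignores cost, language coverage, and NLP needs, which may also affect the choice.

```python
# Rule-of-thumb selector based on the 10-minute guidance above.
# Names here are hypothetical and not part of any BigQuery API.
SHORT_CLIP_MAX_MINUTES = 10  # threshold taken from the comparison above


def suggest_function(duration_minutes: float) -> str:
    """Return the function that the duration guidance favors for one clip."""
    if duration_minutes < 0:
        raise ValueError("duration_minutes must be non-negative")
    if duration_minutes <= SHORT_CLIP_MAX_MINUTES:
        return "ML.GENERATE_TEXT"
    return "ML.TRANSCRIBE"


if __name__ == "__main__":
    for minutes in (3, 10, 45):
        print(minutes, suggest_function(minutes))
```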

Supported models

Supported models are as follows:

Supported tasks

Supported tasks are as follows:

  • ML.GENERATE_TEXT: you can perform audio transcription and natural language processing (NLP) tasks.
  • ML.TRANSCRIBE: you can perform audio transcription.

Pricing

Pricing is as follows:

Supervised tuning

Supervised tuning support is as follows:

  • ML.GENERATE_TEXT: supervised tuning is supported for some models.
  • ML.TRANSCRIBE: supervised tuning isn't supported.

Queries per minute (QPM) limit

QPM limits are as follows:

  • ML.GENERATE_TEXT: 60 QPM in the default us-central1 region for gemini-1.5-pro models, and 200 QPM in the default us-central1 region for gemini-1.5-flash models. For more information, see Generative AI on Vertex AI quotas.
  • ML.TRANSCRIBE: 900 QPM per project. For more information, see Quotas and limits.

To increase your quota, see Request a quota adjustment.
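
To get a feel for what these limits mean for a batch job, the following sketch estimates a lower bound on processing time. It assumes one request per audio clip and that the quota is the only bottleneck, which is a simplification; the names QPM_LIMITS and min_minutes_for_batch are hypothetical, and the quota values are copied from the list above and may change.

```python
import math

# QPM values copied from the list above; treat them as inputs, not constants.
QPM_LIMITS = {
    "ML.GENERATE_TEXT (gemini-1.5-pro)": 60,
    "ML.GENERATE_TEXT (gemini-1.5-flash)": 200,
    "ML.TRANSCRIBE": 900,
}


def min_minutes_for_batch(num_requests: int, qpm: int) -> int:
    """Lower bound on whole minutes needed to send num_requests at qpm."""
    if num_requests < 0:
        raise ValueError("num_requests must be non-negative")
    if qpm <= 0:
        raise ValueError("qpm must be positive")
    return math.ceil(num_requests / qpm)


if __name__ == "__main__":
    # Assumption: one request per clip, 1,000 clips in the batch.
    for name, qpm in QPM_LIMITS.items():
        print(name, min_minutes_for_batch(1000, qpm))
```

Actual run time is likely longer, because each request also takes time to complete and other workloads in the project may share the same quota.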

Token limit

Token limits are as follows:

  • ML.GENERATE_TEXT: 700 input tokens, and 8,192 output tokens. This output token limit means that ML.GENERATE_TEXT has a limit of approximately 39 minutes for an individual audio clip.
  • ML.TRANSCRIBE: no token limit. However, this function does have a limit of 480 minutes for an individual audio clip.
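
The per-clip duration limits above can be used as a pre-flight check before submitting audio. The sketch below is illustrative: MAX_CLIP_MINUTES and functions_that_accept are hypothetical names, and the 39-minute figure is approximate because it is derived from the output token limit, so a clip near that length may still be truncated.

```python
from typing import List

# Per-clip duration limits in minutes, taken from the list above.
# The ML.GENERATE_TEXT value is approximate (derived from output tokens).
MAX_CLIP_MINUTES = {
    "ML.GENERATE_TEXT": 39,
    "ML.TRANSCRIBE": 480,
}


def functions_that_accept(duration_minutes: float) -> List[str]:
    """List the functions whose per-clip limit covers this duration."""
    if duration_minutes < 0:
        raise ValueError("duration_minutes must be non-negative")
    return [
        name
        for name, limit in MAX_CLIP_MINUTES.items()
        if duration_minutes <= limit
    ]


if __name__ == "__main__":
    for minutes in (30, 120, 500):
        print(minutes, functions_that_accept(minutes))
```

An empty result means the clip exceeds both limits and would need to be split before transcription.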

Supported languages

Supported languages are as follows:

Region availability

Region availability is as follows:

  • ML.GENERATE_TEXT: available in all Generative AI for Vertex AI regions.
  • ML.TRANSCRIBE: available in the EU and US multi-regions for all speech recognizers.