
How-to guides

Perform speech recognition

Transcribe short audio files

Perform synchronous speech recognition on local and remote audio files.
Transcribe audio from streaming input

Perform streaming speech recognition on local audio files.
Transcribe long audio files

Perform batch speech recognition on local and remote audio files.

Configuring recognition requests

Create and use Recognizers

Store and reuse recognition configuration using Recognizers.
Automatically detect spoken language

Provide multiple language codes for audio transcription requests sent to Cloud Speech-to-Text.
Select a transcription model

Select a specialized machine learning model for audio transcription.
Transcribe audio with multiple channels

Transcribe audio files that include more than one channel.
Get automatic punctuation

Include punctuation in transcription results from Speech-to-Text.
Enable word-level confidence

Specify that transcriptions should include a confidence value for each recognized word.
Enable spoken punctuation and spoken emojis

Include spoken punctuation and spoken emojis in transcription results from Speech-to-Text.
Encrypt Speech-to-Text resources

Encrypt Speech-to-Text resources using customer-managed encryption keys (CMEK).

Learn about transcription models

Base64 encoding

Learn how to base64 encode audio.
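
As a rough illustration of what base64 encoding audio involves, the sketch below uses only Python's standard `base64` module. The helper names `encode_audio` and `decode_audio` and the sample bytes are hypothetical, chosen for this example; they are not part of any Speech-to-Text client library. Encoding of this kind is typically needed when raw audio bytes must be embedded in a text-based request body such as JSON.

```python
import base64


def encode_audio(audio_bytes: bytes) -> str:
    """Return the base64 text representation of raw audio bytes (hypothetical helper)."""
    return base64.b64encode(audio_bytes).decode("ascii")


def decode_audio(encoded: str) -> bytes:
    """Reverse encode_audio, recovering the original bytes (hypothetical helper)."""
    return base64.b64decode(encoded)


# Placeholder bytes standing in for real audio content read from a file.
sample = b"RIFF\x00\x01placeholder-audio"
encoded = encode_audio(sample)
```

In practice the bytes would come from reading an audio file in binary mode, for example `open(path, "rb").read()`, where `path` is whatever local file you want to transcribe.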

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
