Speech-to-Text is an API that is powered by Google's artificial intelligence (AI) technology. You send your audio data to Speech-to-Text, then receive a text transcription of your audio data in response.
For more information about how to construct a Speech-to-Text request, see the requests page.
Overview
Before you can begin sending requests to Speech-to-Text, you must enable the API in the Google Cloud console. The steps on this page walk you through the following actions:
- Enable Speech-to-Text on a project.
- Make sure billing is enabled for Speech-to-Text.
- (Optional) Create a new Google Cloud Storage bucket to store your audio data.
Before You Begin
There are two ways to access the service: by using the REST API, or by using the Speech-to-Text Console. We provide code samples that show you how to make a request to the REST API and receive a response. You can learn how to use these samples by following the Speech-to-Text quickstarts and how-to guides. If you prefer to use Speech-to-Text with minimal coding, you can use the Cloud Speech-to-Text Console.
This guide walks you through the steps necessary to start sending requests to the REST API. If you are new to coding, we recommend that you start with the step-by-step in-console tutorials in Google Cloud Platform before beginning this quickstart.
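To make the REST workflow described above more concrete, here is a minimal sketch of what a request body could look like. It assumes the v1 `speech:recognize` endpoint and an audio file that is already stored in Cloud Storage. The helper name `build_recognize_body`, the example bucket name, and the default encoding, sample rate, and language code are illustrative assumptions, not values prescribed by this page. The sketch only builds the JSON; sending it also requires credentials, which are not shown here.

```python
import json

# Assumed v1 REST endpoint for short, synchronous transcription requests.
RECOGNIZE_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_body(gcs_uri, language_code="en-US",
                         encoding="LINEAR16", sample_rate_hertz=16000):
    """Return a JSON-serializable request body (hypothetical helper)."""
    if not gcs_uri.startswith("gs://"):
        raise ValueError("expected a Cloud Storage URI such as gs://bucket/object")
    return {
        # Describes how the audio is encoded and which language it contains.
        "config": {
            "encoding": encoding,
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
        },
        # Points at audio stored in Cloud Storage instead of inline bytes.
        "audio": {"uri": gcs_uri},
    }


if __name__ == "__main__":
    # "my-example-bucket" is a placeholder, not a real bucket.
    body = build_recognize_body("gs://my-example-bucket/audio.raw")
    print(RECOGNIZE_URL)
    print(json.dumps(body, indent=2))
```

Keeping the body construction separate from the HTTP call makes it easy to inspect the payload before you send anything to the API.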
Set up your Google Cloud project for Speech-to-Text
- Go to the project selector page. You can either choose an existing project or create a new one. For more information about creating a project, see Creating and managing projects.
- If you create a new project, you will be prompted to link a billing account to this project. If you are using a pre-existing project, make sure that you have billing enabled. Learn how to confirm that billing is enabled for your project.
- Once you have selected a project and linked it to a billing account, you can enable the Speech-to-Text API. Go to the Search products and resources bar at the top of the page and type in "speech". Select the Cloud Speech-to-Text API from the list of results.
- To try Speech-to-Text without linking it to your project, choose the TRY THIS API option. To enable the Speech-to-Text API for use with your project, click ENABLE.
- (Optional) Enable data logging. By opting in to data logging, you allow Google to record any audio data that you send to Speech-to-Text. This data is used to improve the Speech-to-Text models. Users who opt in to data logging benefit from lower pricing. See the pricing and data logging terms and conditions pages for more information.
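
If you prefer the command line to the console, the enable step can be approximated with the gcloud CLI. The following is a sketch, assuming that the gcloud CLI is installed and authenticated and that `speech.googleapis.com` is the service name of the API. The function `enable_speech_command` and the project ID `my-project-id` are hypothetical names used only for illustration.

```python
import shlex

# Assumed service name of the Cloud Speech-to-Text API.
SPEECH_SERVICE = "speech.googleapis.com"


def enable_speech_command(project_id):
    """Build the gcloud argument list that enables the API on a project."""
    if not project_id:
        raise ValueError("project_id must not be empty")
    return ["gcloud", "services", "enable", SPEECH_SERVICE,
            f"--project={project_id}"]


if __name__ == "__main__":
    # "my-project-id" is a placeholder; substitute your own project ID.
    argv = enable_speech_command("my-project-id")
    # Print the command so that it can be reviewed before it is run.
    print(shlex.join(argv))
    # To execute it, pass argv to subprocess.run(argv, check=True).
```

Building the command as a list, rather than a single string, avoids shell quoting problems if you later run it with `subprocess.run`.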

Optional: Create a Cloud Storage bucket
If you intend to transcribe audio longer than 60 seconds or with a file size larger than 10 MB, you must store the audio data in a Cloud Storage bucket before you can transcribe it using Speech-to-Text. The following steps walk you through the process of creating a new bucket.
- For Name your bucket, enter a unique bucket name. Don't include sensitive information in the bucket name, because the bucket namespace is global and publicly visible.
- In the Choose where to store your data section, do the following:
  - Select a Location type.
  - Choose a location where your bucket's data is permanently stored from the Location type drop-down menu.
    - If you select the dual-region location type, you can also choose to enable turbo replication by using the relevant checkbox.
  - To set up cross-bucket replication, select Add cross-bucket replication via Storage Transfer Service and follow these steps:
    - In the Bucket menu, select a bucket.
    - In the Replication settings section, click Configure to configure settings for the replication job. The Configure cross-bucket replication pane appears.
      - To filter objects to replicate by object name prefix, enter a prefix that you want to include or exclude objects from, then click Add a prefix.
      - To set a storage class for the replicated objects, select a storage class from the Storage class menu. If you skip this step, the replicated objects will use the destination bucket's storage class by default.
      - Click Done.
- In the Choose how to store your data section, do the following:
  - Select a default storage class for the bucket or Autoclass for automatic storage class management of your bucket's data.
  - To enable hierarchical namespace, in the Optimize storage for data-intensive workloads section, select Enable hierarchical namespace on this bucket.
- In the Choose how to control access to objects section, select whether or not your bucket enforces public access prevention, and select an access control method for your bucket's objects.
- In the Choose how to protect object data section, do the following:
  - Select any of the options under Data protection that you want to set for your bucket.
    - To enable soft delete, click the Soft delete policy (For data recovery) checkbox, and specify the number of days you want to retain objects after deletion.
    - To set Object Versioning, click the Object versioning (For version control) checkbox, and specify the maximum number of versions per object and the number of days after which the noncurrent versions expire.
    - To enable the retention policy on objects and buckets, click the Retention (For compliance) checkbox, and then do the following:
      - To enable Object Retention Lock, click the Enable object retention checkbox.
      - To enable Bucket Lock, click the Set bucket retention policy checkbox, and choose a unit of time and a length of time for your retention period.
  - To choose how your object data will be encrypted, expand the Data encryption section, and select a Data encryption method.
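
The size rule and the console choices in this section can also be sketched in code. The example below first checks whether a recording exceeds the 60-second or 10 MB limit stated above, then shows how a bucket might be created with the `google-cloud-storage` Python client library. Everything named here is an assumption for illustration: the helper `needs_cloud_storage`, the `BucketSettings` class and its default values, and the reading of "10 MB" as 10 * 1024 * 1024 bytes. The `create_bucket` function also requires the client library and valid credentials, so treat it as a starting point rather than a definitive implementation.

```python
from dataclasses import dataclass

# Limits taken from the text above. Treating "10 MB" as 10 * 1024 * 1024
# bytes is an assumption.
MAX_INLINE_SECONDS = 60
MAX_INLINE_BYTES = 10 * 1024 * 1024


def needs_cloud_storage(duration_seconds, size_bytes):
    """Return True if the audio exceeds either inline limit."""
    return (duration_seconds > MAX_INLINE_SECONDS
            or size_bytes > MAX_INLINE_BYTES)


@dataclass(frozen=True)
class BucketSettings:
    """Hypothetical container mirroring a few of the console choices."""
    name: str
    location: str = "US"
    storage_class: str = "STANDARD"
    public_access_prevention: str = "enforced"
    uniform_access: bool = True
    versioning: bool = False


def create_bucket(settings, project_id=None):
    """Create a bucket with the given settings (needs google-cloud-storage)."""
    # Imported lazily so the helpers above work without the library installed.
    from google.cloud import storage

    client = storage.Client(project=project_id)
    bucket = client.bucket(settings.name)
    bucket.storage_class = settings.storage_class
    bucket.versioning_enabled = settings.versioning
    iam = bucket.iam_configuration
    iam.public_access_prevention = settings.public_access_prevention
    iam.uniform_bucket_level_access_enabled = settings.uniform_access
    return client.create_bucket(bucket, location=settings.location)


if __name__ == "__main__":
    # A 90-second, 2 MB clip exceeds the duration limit.
    print(needs_cloud_storage(90, 2 * 1024 * 1024))
```

Options such as soft delete, retention policies, and cross-bucket replication are left out of the sketch to keep it short; you can still configure them in the console as described above.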
Disable the Speech-to-Text API
Complete the following steps if you no longer need to use the Speech-to-Text API.
- Navigate to your Google Cloud dashboard and click the Go to APIs overview link in the APIs box.
- Select Cloud Speech-to-Text API.
- Click the DISABLE API button at the top of the Cloud Speech-to-Text API page.
What's next
Learn how to send a transcription request to the Speech-to-Text API using client libraries, gcloud, the command line, or the Speech-to-Text UI.

