Transcribe speech to text by using the API

This page shows you how to send a speech recognition request to Speech-to-Text using the REST interface and the curl command.

Speech-to-Text enables easy integration of Google speech recognition technologies into developer applications. You can send audio data to the Speech-to-Text API, which then returns a text transcription of that audio file. For more information about the service, see Speech-to-Text basics.

Before you begin

Before you can send a request to the Speech-to-Text API, you must complete the following actions. See the before you begin page for details.

Enable Speech-to-Text on a Google Cloud project.
- Make sure billing is enabled for Speech-to-Text.
Verify that you have the permissions required to complete this guide. If you created a new project for this guide, then you already have the required permissions.
Install the Google Cloud CLI. After installation, initialize the Google Cloud CLI by running the following command:
```
gcloud init
```
If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
(Optional) Create a new Google Cloud Storage bucket to store your audio data.

Required roles

To get the permissions that you need to transcribe speech to text, ask your administrator to grant you the Service Usage Consumer (roles/serviceusage.serviceUsageConsumer) IAM role on your project. For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

Make an audio transcription request

Now you can use Speech-to-Text to transcribe an audio file to text. Use the following code sample to send a recognize REST request to the Speech-to-Text API.

Create a JSON request file with the following text, and save it as a plain text file named sync-request.json:
```
{
  "config": {
      "encoding":"FLAC",
      "sampleRateHertz": 16000,
      "languageCode": "en-US",
      "enableWordTimeOffsets": false
  },
  "audio": {
      "uri":"gs://cloud-samples-tests/speech/brooklyn.flac"
  }
}
```
This JSON snippet indicates that the audio file uses the FLAC encoding format, has a sample rate of 16000 Hz, contains US English speech (en-US), and is stored on Google Cloud Storage at the given URI. The audio file is publicly accessible, so you don't need authentication credentials to access the file.
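If you prefer to create the request file from a terminal, a shell here-document is one option. The following is a minimal sketch that assumes a POSIX shell; it writes the same request body shown above and then runs a rough check that the audio URI made it into the file. The check is a plain text match, not a full JSON validation.

```shell
# Write the request body to sync-request.json in the current directory.
# Quoting the 'EOF' delimiter stops the shell from expanding anything inside.
cat > sync-request.json <<'EOF'
{
  "config": {
      "encoding":"FLAC",
      "sampleRateHertz": 16000,
      "languageCode": "en-US",
      "enableWordTimeOffsets": false
  },
  "audio": {
      "uri":"gs://cloud-samples-tests/speech/brooklyn.flac"
  }
}
EOF

# Rough sanity check: confirm that the audio URI was written to the file.
grep -q '"uri":"gs://cloud-samples-tests/speech/brooklyn.flac"' sync-request.json \
  && echo "sync-request.json created"
```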

Use curl to make a speech:recognize request, passing it the filename of the JSON request file that you created. The sample curl command uses the gcloud auth print-access-token command to get an authentication token:

```
curl -s -H "Content-Type: application/json" \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    https://speech.googleapis.com/v1/speech:recognize \
    -d @sync-request.json
```

Note that to pass a filename to curl you use the -d option (for "data") and precede the filename with an @ sign. This file should be in the same directory in which you execute the curl command.
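Because the request depends on the file being readable from the current directory, it can help to check for the file before calling the API. The sketch below wraps the same curl command in a shell function. The function name recognize and its error message are hypothetical, chosen for illustration; they are not part of curl or the gcloud CLI.

```shell
# recognize FILE: send FILE as the body of a speech:recognize request.
# Hypothetical helper for illustration; not part of curl or the gcloud CLI.
recognize() {
  request_file=$1
  # Fail early with a clear message if the request file is missing.
  if [ ! -f "$request_file" ]; then
    echo "request file not found: $request_file" >&2
    return 1
  fi
  # Same request as the sample command, with the filename passed as an argument.
  curl -s -H "Content-Type: application/json" \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      https://speech.googleapis.com/v1/speech:recognize \
      -d @"$request_file"
}
```

With the function defined, running `recognize sync-request.json` sends the request, and running it with a missing file returns a nonzero status without contacting the API.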

You should see a response similar to the following:

```
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "how old is the Brooklyn Bridge",
          "confidence": 0.98267895
        }
      ]
    }
  ]
}
```
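To use the transcript in a script, you can pull it out of the response. The following sketch assumes the response was saved to a file named response.json (a name chosen here for illustration, for example by adding `-o response.json` to the curl command) and that each field sits on its own line, as in the sample response. It uses sed for a plain text match; a JSON-aware tool such as jq would be more robust if it is available.

```shell
# Save a copy of the sample response so that this example runs on its own.
cat > response.json <<'EOF'
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "how old is the Brooklyn Bridge",
          "confidence": 0.98267895
        }
      ]
    }
  ]
}
EOF

# Print the text between the quotes on each "transcript" line.
transcript=$(sed -n 's/.*"transcript": *"\(.*\)".*/\1/p' response.json)
echo "$transcript"
```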

Congratulations! You've sent your first request to Speech-to-Text.

If you receive an error or an empty response from Speech-to-Text, take a look at the troubleshooting and error mitigation steps.

Clean up

To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.

Use the Google Cloud console to delete your project if you do not need it.

What's next

Practice transcribing short audio files.
Learn how to batch long audio files for speech recognition.
Learn how to transcribe streaming audio, such as audio from a microphone.
Get started with Speech-to-Text in your language of choice by using a Speech-to-Text client library.
Work through the sample applications.
For best performance, accuracy, and other tips, see the best practices documentation.