Audio data is binary data. You can read the binary data directly from a gRPC response; however, REST responses use JSON. Because JSON is a text format that does not directly support binary data, Text-to-Speech returns the audio as a Base64-encoded string. You must convert the Base64-encoded text from the response to binary before you can play it on a device.
JSON responses from Text-to-Speech include the base64-encoded audio content in the audioContent field. For example:
{ "audioContent": " //NExAARqoIIAAhEuWAAAGNmBGMY4EBcxvABAXBPmPIAF//yAuh9Tn5CEap3/o... " }
To decode base64 into an audio file:
Linux
- Copy only the base64-encoded content into a text file.
- Decode the source text file using the base64 command line tool with the -d flag:
$ base64 SOURCE_BASE64_TEXT_FILE -d > DESTINATION_AUDIO_FILE
Mac OSX
- Copy only the base64-encoded content into a text file.
- Decode the source text file using the base64 command line tool:
$ base64 --decode -i SOURCE_BASE64_TEXT_FILE > DESTINATION_AUDIO_FILE
Windows
- Copy only the base64-encoded content into a text file.
- Decode the source text file using the certutil command:
certutil -decode SOURCE_BASE64_TEXT_FILE DESTINATION_AUDIO_FILE
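On any platform, the same decoding can also be done in code instead of with a command line tool. The following is a minimal sketch, not an official client: it assumes you already hold the raw JSON response body as a string, and the function name save_audio_content and the destination path are hypothetical names chosen for illustration. It uses only the Python standard library.

```python
import base64
import json


def save_audio_content(response_json: str, destination_path: str) -> int:
    """Decode the audioContent field of a JSON response and write it to a file.

    Hypothetical helper for illustration. Returns the number of bytes written.
    """
    response = json.loads(response_json)

    # Remove any surrounding whitespace that may have been copied with the string.
    encoded_audio = response["audioContent"].strip()

    # Convert the base64 text back into the original binary audio data.
    audio_bytes = base64.b64decode(encoded_audio)

    # Open the file in binary mode so the bytes are written unchanged.
    with open(destination_path, "wb") as audio_file:
        audio_file.write(audio_bytes)

    return len(audio_bytes)
```

For example, calling save_audio_content(response_body, "output.mp3") would write the decoded audio to output.mp3. The file extension here is only an assumption; it should match whichever audio encoding was requested.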