Export Speech-to-Text transcript to Cloud Storage (Beta)
This sample demonstrates how to transcribe an audio file with Speech-to-Text and export the transcript to a Cloud Storage bucket.
Code sample
### Python

To learn how to install and use the client library for Speech-to-Text, see [Speech-to-Text client libraries](/speech-to-text/docs/client-libraries).

For more information, see the [Speech-to-Text Python API reference documentation](/python/docs/reference/speech/latest).

To authenticate to Speech-to-Text, set up Application Default Credentials. For more information, see [Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

    from google.cloud import speech
    from google.cloud import storage
    from google.cloud.speech_v1 import types


    def export_transcript_to_storage_beta(
        audio_uri: str,
        output_bucket_name: str,
        output_filename: str,
    ) -> types.LongRunningRecognizeResponse:
        """Transcribes an audio file from Cloud Storage and exports the transcript to a Cloud Storage bucket.

        Args:
            audio_uri (str): The Cloud Storage URI of the input audio, e.g., gs://[BUCKET]/[FILE]
            output_bucket_name (str): Name of the Cloud Storage bucket to store the output transcript.
            output_filename (str): Name of the output file to store the transcript.
        Returns:
            types.LongRunningRecognizeResponse: The response containing the transcription results.
        """

        audio = speech.RecognitionAudio(uri=audio_uri)
        output_storage_uri = f"gs://{output_bucket_name}/{output_filename}"

        # Pass in the URI of the Cloud Storage object that will hold the transcription
        output_config = speech.TranscriptOutputConfig(gcs_uri=output_storage_uri)

        # Speech configuration object
        config = speech.RecognitionConfig(
            encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
            sample_rate_hertz=8000,
            language_code="en-US",
        )

        # Compose the long-running request
        request = speech.LongRunningRecognizeRequest(
            audio=audio, config=config, output_config=output_config
        )

        # Create the speech client
        speech_client = speech.SpeechClient()
        # Create the storage client
        storage_client = storage.Client()

        # Run the recognizer to export transcript
        operation = speech_client.long_running_recognize(request=request)
        print("Waiting for operation to complete...")
        operation.result(timeout=90)

        # Get bucket with name
        bucket = storage_client.get_bucket(output_bucket_name)
        # Get blob (file) from bucket
        blob = bucket.get_blob(output_filename)

        # Get content as bytes
        results_bytes = blob.download_as_bytes()
        # Get transcript exported in storage bucket
        storage_transcript = types.LongRunningRecognizeResponse.from_json(
            results_bytes, ignore_unknown_fields=True
        )

        # Each result is for a consecutive portion of the audio. Iterate through
        # them to get the transcripts for the entire audio file.
        for result in storage_transcript.results:
            # The first alternative is the most likely one for this portion.
            print(f"Transcript: {result.alternatives[0].transcript}")
            print(f"Confidence: {result.alternatives[0].confidence}")

        # Return the parsed transcript, as the return annotation declares
        return storage_transcript

What's next
-----------

To search and filter code samples for other Google Cloud products, see the [Google Cloud sample browser](/docs/samples?product=speech).
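The sample parses the downloaded `results_bytes` with the client library and indexes `alternatives[0]` directly, which would raise an `IndexError` for a result that has no alternatives. As a minimal sketch, assuming the exported file is JSON with a top-level `results` list whose entries carry `alternatives` with `transcript` and `confidence` (the same fields the sample reads), the bytes could also be inspected with only the standard library. The helper name `extract_transcripts` and the `SAMPLE_BYTES` payload are hypothetical and exist only for illustration; they are not part of the Speech-to-Text API.

```python
import json
from typing import List, Tuple


def extract_transcripts(results_bytes: bytes) -> List[Tuple[str, float]]:
    """Return (transcript, confidence) for the first alternative of each result.

    Hypothetical helper: assumes the JSON layout described in the lead-in.
    """
    payload = json.loads(results_bytes)
    pairs = []
    for result in payload.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            # Skip results without alternatives instead of raising IndexError.
            continue
        top = alternatives[0]
        # Missing fields fall back to an empty transcript and 0.0 confidence.
        pairs.append((top.get("transcript", ""), float(top.get("confidence", 0.0))))
    return pairs


# Hypothetical payload standing in for the bytes downloaded from the bucket.
SAMPLE_BYTES = json.dumps(
    {
        "results": [
            {"alternatives": [{"transcript": "hello world", "confidence": 0.9}]},
            {"alternatives": []},
            {"alternatives": [{"transcript": "second segment"}]},
        ]
    }
).encode("utf-8")

for transcript, confidence in extract_transcripts(SAMPLE_BYTES):
    print(f"Transcript: {transcript}")
    print(f"Confidence: {confidence}")
```

In the sample, this helper would take the value returned by `blob.download_as_bytes()`. It complements rather than replaces `LongRunningRecognizeResponse.from_json`, which remains the way to get typed response objects.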