Transcribe a local multi-channel file
Transcribe a local audio file that includes more than one channel.
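The samples on this page assume a two-channel (for example, stereo) LINEAR16 recording, and the channel count you declare in the request must match the file. As a quick sketch, using only standard-library Python and a hypothetical file path, you can confirm how many channels a local WAV file contains before sending it:

import wave

# Hypothetical path; replace with your own multi-channel recording.
audio_path = "resources/multi.wav"

# wave is Python's standard-library WAV reader; getnchannels() reports how
# many channels the file contains (2 for a stereo recording).
with wave.open(audio_path, "rb") as wav_file:
    print(f"Channels: {wav_file.getnchannels()}")
    print(f"Sample rate: {wav_file.getframerate()} Hz")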
Explore further
For detailed documentation that includes this code sample, see the following:

- [Transcribe audio with multiple channels](/speech-to-text/docs/multi-channel)
Code sample
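All four samples below build the same request: LINEAR16 audio at 44,100 Hz with two channels, and separate recognition per channel enabled so that each channel is transcribed and tagged on its own. A minimal sketch of that shared configuration in Python (assuming the google-cloud-speech package and a stereo LINEAR16 file):

from google.cloud import speech

# The last two fields are what distinguish multi-channel recognition from a
# basic single-channel transcription request.
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=44100,
    language_code="en-US",
    audio_channel_count=2,  # must match the channel count of the audio file
    enable_separate_recognition_per_channel=True,  # tag each result with its channel
)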
Java

To learn how to install and use the client library for Speech-to-Text, see
[Speech-to-Text client libraries](/speech-to-text/docs/client-libraries).

For more information, see the
[Speech-to-Text Java API reference documentation](/java/docs/reference/google-cloud-speech/latest/overview).

To authenticate to Speech-to-Text, set up Application Default Credentials.
For more information, see
[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

// Requires the google-cloud-speech client library. Relevant imports:
// com.google.cloud.speech.v1.{SpeechClient, RecognitionAudio, RecognitionConfig,
// RecognitionConfig.AudioEncoding, RecognizeResponse, SpeechRecognitionAlternative,
// SpeechRecognitionResult}, com.google.protobuf.ByteString, and java.nio.file.{Files, Path, Paths}.

/**
 * Transcribe a local audio file with multi-channel recognition
 *
 * @param fileName the path to the local audio file
 */
public static void transcribeMultiChannel(String fileName) throws Exception {
  Path path = Paths.get(fileName);
  byte[] content = Files.readAllBytes(path);

  try (SpeechClient speechClient = SpeechClient.create()) {
    // Get the contents of the local audio file
    RecognitionAudio recognitionAudio =
        RecognitionAudio.newBuilder().setContent(ByteString.copyFrom(content)).build();

    // Configure the request to enable multiple channels
    RecognitionConfig config =
        RecognitionConfig.newBuilder()
            .setEncoding(AudioEncoding.LINEAR16)
            .setLanguageCode("en-US")
            .setSampleRateHertz(44100)
            .setAudioChannelCount(2)
            .setEnableSeparateRecognitionPerChannel(true)
            .build();

    // Perform the transcription request
    RecognizeResponse recognizeResponse = speechClient.recognize(config, recognitionAudio);

    // Print out the results
    for (SpeechRecognitionResult result : recognizeResponse.getResultsList()) {
      // There can be several alternative transcripts for a given chunk of speech.
      // Just use the first (most likely) one here.
      SpeechRecognitionAlternative alternative = result.getAlternatives(0);
      System.out.format("Transcript : %s\n", alternative.getTranscript());
      System.out.printf("Channel Tag : %s\n", result.getChannelTag());
    }
  }
}
Node.js

To learn how to install and use the client library for Speech-to-Text, see
[Speech-to-Text client libraries](/speech-to-text/docs/client-libraries).

For more information, see the
[Speech-to-Text Node.js API reference documentation](/nodejs/docs/reference/speech/latest).

To authenticate to Speech-to-Text, set up Application Default Credentials.
For more information, see
[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

// Note: this snippet uses await, so run it inside an async function.
const fs = require('fs');

// Imports the Google Cloud client library
const speech = require('@google-cloud/speech').v1;

// Creates a client
const client = new speech.SpeechClient();

/**
 * TODO(developer): Uncomment the following lines before running the sample.
 */
// const fileName = 'Local path to audio file, e.g. /path/to/audio.raw';

const config = {
  encoding: 'LINEAR16',
  languageCode: 'en-US',
  audioChannelCount: 2,
  enableSeparateRecognitionPerChannel: true,
};

const audio = {
  content: fs.readFileSync(fileName).toString('base64'),
};

const request = {
  config: config,
  audio: audio,
};

const [response] = await client.recognize(request);
const transcription = response.results
  .map(
    result =>
      ` Channel Tag: ${result.channelTag} ${result.alternatives[0].transcript}`
  )
  .join('\n');
console.log(`Transcription: \n${transcription}`);

Python

To learn how to install and use the client library for Speech-to-Text, see
[Speech-to-Text client libraries](/speech-to-text/docs/client-libraries).

For more information, see the
[Speech-to-Text Python API reference documentation](/python/docs/reference/speech/latest).

To authenticate to Speech-to-Text, set up Application Default Credentials.
For more information, see
[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

from google.cloud import speech


def transcribe_file_with_multichannel(audio_file: str) -> speech.RecognizeResponse:
    """Transcribe the given audio file synchronously with multi-channel recognition.

    Args:
        audio_file (str): Path to the local audio file to be transcribed.
            Example: "resources/multi.wav"

    Returns:
        speech.RecognizeResponse: The full response object, which includes the transcription results.
    """
    client = speech.SpeechClient()

    # Read the local audio file into memory.
    with open(audio_file, "rb") as f:
        audio_content = f.read()

    audio = speech.RecognitionAudio(content=audio_content)

    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=44100,
        language_code="en-US",
        audio_channel_count=2,
        enable_separate_recognition_per_channel=True,
    )

    response = client.recognize(config=config, audio=audio)

    for i, result in enumerate(response.results):
        alternative = result.alternatives[0]
        print("-" * 20)
        print(f"First alternative of result {i}")
        print(f"Transcript: {alternative.transcript}")
        print(f"Channel Tag: {result.channel_tag}")

    # Return the full response, as promised by the declared return type.
    return response
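As a usage sketch for the Python function above (the file path is hypothetical, and Application Default Credentials are assumed to be configured), the returned response can be regrouped so that each channel's transcript reads as one string:

from collections import defaultdict

# Hypothetical stereo recording; replace with your own file.
response = transcribe_file_with_multichannel("resources/multi.wav")

# With separate recognition per channel enabled, every result carries a
# channel_tag, so the transcripts can be grouped per channel.
transcripts_by_channel = defaultdict(list)
for result in response.results:
    transcripts_by_channel[result.channel_tag].append(result.alternatives[0].transcript)

for channel, transcripts in sorted(transcripts_by_channel.items()):
    print(f"Channel {channel}: {' '.join(transcripts)}")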
Ruby

To learn how to install and use the client library for Speech-to-Text, see
[Speech-to-Text client libraries](/speech-to-text/docs/client-libraries).

To authenticate to Speech-to-Text, set up Application Default Credentials.
For more information, see
[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).

# audio_file_path = "path/to/audio.wav"

require "google/cloud/speech"

# Creates a v1 Speech-to-Text client
speech = Google::Cloud::Speech.speech version: :v1

# Configure the request for a two-channel LINEAR16 file with separate
# recognition per channel
config = {
  encoding: :LINEAR16,
  sample_rate_hertz: 44_100,
  language_code: "en-US",
  audio_channel_count: 2,
  enable_separate_recognition_per_channel: true
}

# Read the local audio file and build the request payload
audio_file = File.binread audio_file_path
audio = { content: audio_file }

response = speech.recognize config: config, audio: audio

results = response.results

# Print the first (most likely) alternative and the channel tag for each result
results.each_with_index do |result, i|
  alternative = result.alternatives.first
  puts "-" * 20
  puts "First alternative of result #{i}"
  puts "Transcript: #{alternative.transcript}"
  puts "Channel Tag: #{result.channel_tag}"
end

What's next

To search and filter code samples for other Google Cloud products, see the
[Google Cloud sample browser](/docs/samples?product=speech).