Note:
This documentation is for version 3.0.0-beta09 of the library.
Some samples may not work with other versions.
Installation
Install the Google.Cloud.Speech.V1P1Beta1 package from NuGet. Add it to
your project in the normal way (for example by right-clicking on the
project in Visual Studio and choosing "Manage NuGet Packages...").
Please ensure you enable pre-release packages (for example, in the
Visual Studio NuGet user interface, check the "Include prerelease"
box). Some of the following samples might only work with the latest
pre-release version (3.0.0-beta09) of Google.Cloud.Speech.V1P1Beta1.
Authentication
When running on Google Cloud, no action needs to be taken to authenticate.
Otherwise, the simplest way of authenticating your API calls is to
set up Application Default Credentials.
The credentials will automatically be used to authenticate. See
Set up Application Default Credentials (https://cloud.google.com/docs/authentication/provide-credentials-adc) for more details.
Getting started
The simplest option is to use the synchronous, one-shot API as shown
below in the sample code. More complex scenarios are considered further down this page.
Note that the audio data should be mono rather than stereo, and the
format needs to be explicitly specified in the request.
Sample code
Constructing a RecognitionAudio object
There are various factory methods on the RecognitionAudio class to allow
instances to be constructed from files, streams, byte arrays and URIs.
RecognitionAudio audio1 = RecognitionAudio.FromFile("Sound/SpeechSample.flac");
RecognitionAudio audio2 = RecognitionAudio.FetchFromUri("https://.../HostedSpeech.flac");
RecognitionAudio audio3 = RecognitionAudio.FromStorageUri("gs://my-bucket/my-file");
byte[] bytes = ReadAudioData(); // For example, from a database
RecognitionAudio audio4 = RecognitionAudio.FromBytes(bytes);
using (Stream stream = OpenAudioStream()) // Any regular .NET stream
{
    RecognitionAudio audio5 = RecognitionAudio.FromStream(stream);
}
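Detect speech in a single file
// "audio" is a RecognitionAudio instance constructed as in the previous sample
SpeechClient client = SpeechClient.Create();
RecognitionConfig config = new RecognitionConfig
{
    Encoding = AudioEncoding.Linear16,
    SampleRateHertz = 16000,
    LanguageCode = LanguageCodes.English.UnitedStates,
    UseEnhanced = true
};
RecognizeResponse response = client.Recognize(config, audio);
Console.WriteLine(response);
Immediate, long-running and streaming operations
The underlying RPC API contains three modes of operation.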
The simplest is via the Recognize method. You make a single
request, and get a single response with the result of the analysis.
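As a rough sketch of consuming that single response: each result in a RecognizeResponse carries a list of alternatives, with the most probable one first. The helper name BestTranscripts and the hand-built response in Main are illustrative assumptions, not part of the library; in real code the response would come from a call to client.Recognize.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Google.Cloud.Speech.V1P1Beta1;

public static class TranscriptReader
{
    // Hypothetical helper: returns the first (most probable) transcript
    // of each result, skipping results that have no alternatives.
    public static List<string> BestTranscripts(RecognizeResponse response) =>
        response.Results
            .Where(result => result.Alternatives.Count > 0)
            .Select(result => result.Alternatives[0].Transcript)
            .ToList();

    public static void Main()
    {
        // Hand-built response so the sketch runs without credentials
        // or network access. The transcripts are made-up values.
        RecognizeResponse response = new RecognizeResponse
        {
            Results =
            {
                new SpeechRecognitionResult
                {
                    Alternatives =
                    {
                        new SpeechRecognitionAlternative { Transcript = "hello world", Confidence = 0.9f },
                        new SpeechRecognitionAlternative { Transcript = "hollow word", Confidence = 0.4f }
                    }
                }
            }
        };

        foreach (string transcript in BestTranscripts(response))
        {
            Console.WriteLine(transcript);
        }
    }
}
```

Taking only the first alternative keeps the output to one line per recognized segment; the remaining alternatives are still available on each result if lower-ranked hypotheses are of interest.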
The LongRunningRecognize method still requires all of the audio data to be
passed in a single request, but the response from the RPC is a
Google.LongRunning.Operation, representing an operation which could
take some time to complete. It contains a token which can be used to
retrieve the results later. To a first approximation, you can think of it
as a more persistent, remote Task<T>.
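The long-running mode might look roughly like the sketch below. It assumes a FLAC file stored in Cloud Storage (the gs:// URI is a placeholder) and US English speech; BuildConfig is a hypothetical helper introduced only for this example. Running it requires valid credentials and a real audio object.

```csharp
using System;
using Google.Cloud.Speech.V1P1Beta1;
using Google.LongRunning;
using static Google.Cloud.Speech.V1P1Beta1.RecognitionConfig.Types;

public static class LongRunningExample
{
    // Hypothetical helper. Assumed audio format for this sketch: FLAC, US English.
    public static RecognitionConfig BuildConfig() => new RecognitionConfig
    {
        Encoding = AudioEncoding.Flac,
        LanguageCode = "en-US"
    };

    public static void Main()
    {
        SpeechClient client = SpeechClient.Create();

        // Placeholder Cloud Storage URI; replace it with a real object.
        RecognitionAudio audio = RecognitionAudio.FromStorageUri("gs://my-bucket/my-file");

        // The RPC returns an operation handle rather than the transcript itself.
        Operation<LongRunningRecognizeResponse, LongRunningRecognizeMetadata> operation =
            client.LongRunningRecognize(BuildConfig(), audio);

        // The operation name is the token that identifies the work on the server.
        Console.WriteLine($"Started operation: {operation.Name}");

        // Block, polling periodically, until the server reports completion.
        Operation<LongRunningRecognizeResponse, LongRunningRecognizeMetadata> completed =
            operation.PollUntilCompleted();

        foreach (SpeechRecognitionResult result in completed.Result.Results)
        {
            if (result.Alternatives.Count > 0)
            {
                Console.WriteLine(result.Alternatives[0].Transcript);
            }
        }
    }
}
```

Instead of blocking in PollUntilCompleted, the operation name could be stored and used later to check on the operation, for example from a different process. To my understanding the generated client exposes a PollOnceLongRunningRecognize method for that purpose, but treat that as an assumption and check the SpeechClient reference.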
Finally, the RPC API supports StreamingRecognize, which is a
bidirectional streaming API: the client makes a number of requests,
and the server emits a number of responses. This enables a
conversation to be transcribed in near real time, for example,
without the client needing to split it into chunks for single
operations.
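A minimal sketch of the streaming mode follows, assuming 16 kHz mono LINEAR16 audio arriving on an ordinary .NET stream and a compiler that supports C# 8 (for await foreach). The class name, the BuildConfigRequest helper and the 32 KB chunk size are choices made for this example, not requirements of the library. The first request carries only configuration; every later request carries a chunk of audio.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using Google.Cloud.Speech.V1P1Beta1;
using Google.Protobuf;
using static Google.Cloud.Speech.V1P1Beta1.RecognitionConfig.Types;

public static class StreamingExample
{
    // Hypothetical helper: builds the initial, configuration-only request.
    public static StreamingRecognizeRequest BuildConfigRequest() => new StreamingRecognizeRequest
    {
        StreamingConfig = new StreamingRecognitionConfig
        {
            Config = new RecognitionConfig
            {
                Encoding = AudioEncoding.Linear16, // assumed audio format
                SampleRateHertz = 16000,           // assumed sample rate
                LanguageCode = "en-US"
            },
            InterimResults = true // also receive provisional transcripts
        }
    };

    public static async Task TranscribeAsync(Stream audio)
    {
        SpeechClient client = SpeechClient.Create();
        SpeechClient.StreamingRecognizeStream stream = client.StreamingRecognize();

        await stream.WriteAsync(BuildConfigRequest());

        // Print responses as the server emits them, while audio is still being sent.
        Task printer = Task.Run(async () =>
        {
            await foreach (StreamingRecognizeResponse response in stream.GetResponseStream())
            {
                foreach (StreamingRecognitionResult result in response.Results)
                {
                    if (result.Alternatives.Count == 0) continue;
                    string kind = result.IsFinal ? "final" : "interim";
                    Console.WriteLine($"{kind}: {result.Alternatives[0].Transcript}");
                }
            }
        });

        // Send the audio in chunks; each chunk is a separate request on the same call.
        byte[] buffer = new byte[32 * 1024];
        int read;
        while ((read = await audio.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            await stream.WriteAsync(new StreamingRecognizeRequest
            {
                AudioContent = ByteString.CopyFrom(buffer, 0, read)
            });
        }

        // Tell the server no more audio is coming, then wait for the last responses.
        await stream.WriteCompleteAsync();
        await printer;
    }
}
```

A caller might pass a FileStream or a microphone capture stream to TranscribeAsync. Reading responses on a separate task is what lets transcripts appear while audio is still being written.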
Last updated 2025-09-04 UTC.