Send a transcription request to Cloud Speech-to-Text On-Prem

Prerequisites

  1. Complete all required steps in the before you begin quickstart.
  2. Deploy the API.
  3. Query the API to make sure it's working.

Install dependencies

  1. Clone python-speech and change the directory to the sample directory.

     $ git clone https://github.com/googleapis/python-speech.git
     $ cd python-speech/samples/snippets
    
  2. Install pip and virtualenv if you have not already done so. Refer to the Google Cloud Platform Python Development Environment Setup Guide for more information.

  3. Create a virtualenv. The samples below are compatible with Python 2.7 and 3.4+.

     $ virtualenv env
     $ source env/bin/activate
    
  4. Install the dependencies needed to run the samples.

     $ pip install -r requirements.txt
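
Before running the sample, it can help to confirm that the modules the sample refers to are visible in the active virtualenv. The following is a minimal sketch, not part of the python-speech samples: the helper name `missing_modules` is hypothetical, and the module list (`grpc` and `google.cloud.speech_v1p1beta1`) is an assumption based on the names used in the sample code on this page.

```python
import importlib.util

# Modules the sample on this page relies on (assumed import names).
REQUIRED_MODULES = ("grpc", "google.cloud.speech_v1p1beta1")


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of `names` that cannot be located for import."""
    missing = []
    for name in names:
        try:
            # find_spec locates the module; for a dotted name it imports
            # the parent packages and raises if one of them is absent.
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing
```

If `missing_modules()` returns a non-empty list, rerunning the `pip install -r requirements.txt` step inside the activated virtualenv is a reasonable first thing to try.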
    

Code sample

The code sample below uses the google-cloud-speech library. You can use GitHub to browse the source and report issues.

Transcribe an audio file

You can use the code sample below to transcribe an audio file using either a public IP or a cluster level IP. For more information on IP types, see the documentation on querying the API.

Public IP:

   
    # Using a Public IP
    $ python transcribe_onprem.py --file_path="../resources/two_channel_16k.wav" --api_endpoint=${PUBLIC_IP}:443

Cluster level IP:

    # Using a cluster level IP
    $ kubectl port-forward -n $NAMESPACE $POD 10000:10000
    $ python transcribe_onprem.py --file_path="../resources/two_channel_16k.wav" --api_endpoint="0.0.0.0:10000"
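
The commands above pass `--file_path` and `--api_endpoint` to transcribe_onprem.py. The sketch below shows one way a command-line entry point could map those two flags onto a transcription function's arguments. It is an illustration only and not necessarily how the script in the repository is written: `build_parser` and `main` are hypothetical names, and `main` takes the transcription function as a parameter so that the sketch stays self-contained.

```python
import argparse
from typing import Callable, Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the two flags used in the commands above."""
    parser = argparse.ArgumentParser(
        description="Transcribe a local audio file against an on-prem endpoint."
    )
    parser.add_argument(
        "--file_path", required=True, help="Path to the local audio file."
    )
    parser.add_argument(
        "--api_endpoint", required=True, help="host:port of the speech endpoint."
    )
    return parser


def main(
    transcribe: Callable[[str, str], object],
    argv: Optional[Sequence[str]] = None,
) -> object:
    """Parse the flags and forward them to `transcribe(file_path, endpoint)`."""
    # When argv is None, argparse reads the flags from sys.argv[1:].
    args = build_parser().parse_args(argv)
    return transcribe(args.file_path, args.api_endpoint)
```

With the `transcribe_onprem` function from this page in scope, calling `main(transcribe_onprem)` from the script would forward the flag values from the command line to it.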

Python

To authenticate to Speech-to-Text, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

    import io

    import grpc
    from google.cloud import speech_v1p1beta1


    def transcribe_onprem(
        local_file_path: str,
        api_endpoint: str,
    ) -> speech_v1p1beta1.RecognizeResponse:
        """
        Transcribe a short audio file using synchronous speech recognition on-prem

        Args:
          local_file_path: The path to local audio file, e.g. /path/audio.wav
          api_endpoint: Endpoint to call for speech recognition, e.g. 0.0.0.0:10000

        Returns:
          The speech recognition response
        """
        # api_endpoint = '0.0.0.0:10000'
        # local_file_path = '../resources/two_channel_16k.raw'

        # Create a gRPC channel to your server
        channel = grpc.insecure_channel(target=api_endpoint)
        transport = speech_v1p1beta1.services.speech.transports.SpeechGrpcTransport(
            channel=channel
        )

        client = speech_v1p1beta1.SpeechClient(transport=transport)

        # The language of the supplied audio
        language_code = "en-US"

        # Sample rate in Hertz of the audio data sent
        sample_rate_hertz = 16000

        # Encoding of audio data sent. This sample sets this explicitly.
        # This field is optional for FLAC and WAV audio formats.
        encoding = speech_v1p1beta1.RecognitionConfig.AudioEncoding.LINEAR16
        config = {
            "encoding": encoding,
            "language_code": language_code,
            "sample_rate_hertz": sample_rate_hertz,
        }

        # Read the whole audio file into memory as raw bytes
        with io.open(local_file_path, "rb") as f:
            content = f.read()
        audio = {"content": content}

        response = client.recognize(request={"config": config, "audio": audio})
        for result in response.results:
            # First alternative is the most probable result
            alternative = result.alternatives[0]
            print(f"Transcript: {alternative.transcript}")

        return response
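
The sample hard-codes `sample_rate_hertz = 16000` and the `LINEAR16` encoding (16-bit linear PCM), so it assumes the input file was recorded with those settings. As a sketch, assuming the input is a WAV file, the standard-library `wave` module can report the file's actual parameters before it is sent. `check_wav_matches_config` is a hypothetical helper and not part of google-cloud-speech.

```python
import wave


def check_wav_matches_config(path, expected_rate=16000, expected_sample_width=2):
    """Compare a WAV file's header against the values the sample assumes.

    expected_sample_width is in bytes; 2 bytes corresponds to 16-bit PCM.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        channels = wav.getnchannels()

    problems = []
    if rate != expected_rate:
        problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")
    if width != expected_sample_width:
        problems.append(
            f"sample width is {width} bytes, expected {expected_sample_width} bytes"
        )

    return {
        "sample_rate_hertz": rate,
        "sample_width_bytes": width,
        "channels": channels,
        "problems": problems,
    }
```

An empty `problems` list suggests that the header agrees with the hard-coded config; otherwise, either the file or the `sample_rate_hertz` value in the config would need to change.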
 