Represents the natural language speech audio to be processed.
| JSON representation |
|---|
| `{ "config": { object }, "audio": string }` |
| Fields | |
|---|---|
| config | Required. Instructs the speech recognizer how to process the speech audio. |
| audio | Required. The natural language speech audio to be processed. A single request can contain up to 2 minutes of speech audio data. The transcribed text cannot contain more than 256 bytes for virtual agent interactions. A base64-encoded string. |

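The following is a minimal sketch of how a client might assemble this object, assuming the audio is available as raw bytes. The helper names `build_audio_input` and `transcript_within_limit` are hypothetical, and the keys inside the example `config` are placeholders only; the actual fields of the config object are defined by its own reference. The 256-byte check assumes the limit is measured on the UTF-8 encoding of the transcript, which this page does not state.

```python
import base64
import json


def build_audio_input(audio_bytes: bytes, config: dict) -> dict:
    """Return a dict shaped like the JSON representation above.

    `config` is passed through unchanged; `audio` is base64-encoded
    because the field is documented as a base64-encoded string.
    """
    if not audio_bytes:
        raise ValueError("audio is required and must not be empty")
    if config is None:
        raise ValueError("config is required")
    return {
        "config": config,
        "audio": base64.b64encode(audio_bytes).decode("ascii"),
    }


def transcript_within_limit(text: str, max_bytes: int = 256) -> bool:
    """Check a transcript against the 256-byte limit.

    Assumption: the limit applies to the UTF-8 byte length.
    """
    return len(text.encode("utf-8")) <= max_bytes


if __name__ == "__main__":
    # Placeholder config keys, for illustration only.
    example_config = {"exampleEncoding": "PLACEHOLDER", "exampleSampleRate": 16000}
    payload = build_audio_input(b"\x00\x01\x02", example_config)
    print(json.dumps(payload, indent=2))
```

Encoding the bytes before serialization keeps the payload valid JSON, since raw binary audio cannot be embedded in a JSON string directly.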
