This page shows you how to use Gemini Enterprise Agent Platform Studio to convert speech to text.
To learn how to convert text to speech, see Convert text to speech.
Convert speech to text
To convert speech to text, do the following:
- In the Agent Platform section of the Google Cloud console, go to the Gemini Enterprise Agent Platform Studio page.
- Click Generate speech.
- Select the Speech-to-text tab.
- In Speech, click Browse to select the audio file that you want to convert to text.
- In the Language selector box, select the language of the speech in the audio file.
- Click Submit.
The converted text appears in Text.
Limitations
- Audio files can be a maximum of 60 seconds long or 10 MB in size, whichever limit is reached first.
- Files are transcribed with the Chirp model.
- Only 16-bit linear PCM WAV files are supported.
You can use the Speech-to-Text UI directly to overcome these limitations.
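Before uploading, it can help to check a file against the limits listed above. The following is a minimal sketch using only the Python standard library, not part of the product itself. The function name `check_wav` is hypothetical, and the reading of "10 MB" as 10,000,000 bytes is an assumption (the stricter of the two common interpretations).

```python
import os
import wave

MAX_SECONDS = 60.0
# Assumption: "10 MB" is read as decimal megabytes (the stricter interpretation).
MAX_BYTES = 10 * 1000 * 1000


def check_wav(path, max_seconds=MAX_SECONDS, max_bytes=MAX_BYTES):
    """Return a list of problems; an empty list means the file appears to fit the limits."""
    problems = []

    # Size limit: compare the size of the file on disk.
    size = os.path.getsize(path)
    if size > max_bytes:
        problems.append(f"file is {size} bytes, over the {max_bytes}-byte limit")

    # Format check: the wave module raises wave.Error for files it cannot
    # parse as PCM WAV, and EOFError for empty or truncated files.
    try:
        with wave.open(path, "rb") as wav:
            sample_width = wav.getsampwidth()  # bytes per sample
            frames = wav.getnframes()
            rate = wav.getframerate()
    except (wave.Error, EOFError) as exc:
        problems.append(f"not a readable PCM WAV file: {exc}")
        return problems

    # 16-bit linear PCM means 2 bytes per sample.
    if sample_width != 2:
        problems.append(f"sample width is {sample_width * 8}-bit, expected 16-bit")

    # Duration limit: frame count divided by frames per second.
    if rate > 0:
        duration = frames / rate
        if duration > max_seconds:
            problems.append(f"audio is {duration:.1f} s, over the {max_seconds:.0f} s limit")
    else:
        problems.append("sample rate is zero, cannot compute duration")

    return problems
```

Calling `check_wav("recording.wav")` (with a path of your own) returns an empty list when no problem is found. A passing check only indicates that the file matches the limits as interpreted here; the console performs its own validation.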
What's next
- For more models, advanced features, and the ability to transcribe files up to 8 hours long, see Speech-to-Text.

