4 Replies
Voiceflow doesn’t yet offer a built-in connector for Google Cloud Speech-to-Text.
Today the Telephony channel lets you pick from the supported STT engines below, and Google isn’t on the list:
• Cartesia Ink-Whisper (Whisper-based)
• AssemblyAI Universal
• Deepgram (Nova-2, Nova-3, Nova-3 Medical)
If you want to use Google’s model you would need to run it outside of Voiceflow – e.g., forward the raw audio stream from Twilio (or another carrier) to your own middleware, call Google’s API there, then pass the transcript back to Voiceflow through an API/Function step. That custom proxy is currently the only workaround until Google STT is added to the native provider list.
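For anyone trying the proxy route, here is a rough Python sketch of the middleware half, not an official integration. It assumes the carrier is Twilio and that the audio arrives as Media Streams messages (JSON with base64-encoded 8 kHz mu-law audio), and that the `google-cloud-speech` package and Google credentials are set up. The function names `collect_twilio_audio` and `transcribe_with_google` are made up for illustration.

```python
import base64
import json


def collect_twilio_audio(messages):
    """Concatenate the audio payloads from Twilio Media Streams messages.

    Each "media" message carries base64-encoded 8 kHz mu-law audio in
    message["media"]["payload"]; other events (start, stop, mark) are skipped.
    """
    chunks = []
    for raw in messages:
        msg = json.loads(raw)
        if msg.get("event") == "media":
            chunks.append(base64.b64decode(msg["media"]["payload"]))
    return b"".join(chunks)


def transcribe_with_google(audio_bytes, language_code="en-US"):
    """Send mu-law audio to Google Cloud Speech-to-Text and return the text.

    Assumption: short utterances only, since the synchronous recognize()
    call is limited to roughly one minute of audio.
    """
    # Imported lazily so collect_twilio_audio works without the library.
    from google.cloud import speech

    client = speech.SpeechClient()  # picks up Application Default Credentials
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.MULAW,
        sample_rate_hertz=8000,
        language_code=language_code,
    )
    audio = speech.RecognitionAudio(content=audio_bytes)
    response = client.recognize(config=config, audio=audio)
    # Join the top alternative of each result into a single transcript.
    return " ".join(
        r.alternatives[0].transcript.strip()
        for r in response.results
        if r.alternatives
    )
```

For longer or live audio you would probably want the streaming API instead of a single `recognize()` call; this sketch only covers the simplest case.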
Docs: https://docs.voiceflow.com
OK, how can I forward the audio stream from Voiceflow to Google's STT and get the transcript back in Voiceflow?
Can someone answer the previous question? How can I forward an audio stream from Voiceflow to Google's speech-to-text and get the transcript back?
I'm not familiar with the Google voice service, but one way to achieve what you want would be to create an automation on Make that fetches the audio file, transcribes it with Whisper (OpenAI), and sends the transcript to Voiceflow via the API.
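If you would rather script the "send it to Voiceflow via the API" step than build it in Make, a minimal Python sketch could look like the one below. It assumes the Dialog Manager API's interact endpoint and a text action payload as I understand them, so check both against the current docs. `build_interact_request` and `send_transcript` are hypothetical helper names, and nothing here confirms that a request sent this way reaches a live Telephony call rather than a separate session for that user ID.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumed base URL of the Voiceflow Dialog Manager API; verify in the docs.
VOICEFLOW_RUNTIME = "https://general-runtime.voiceflow.com"


def build_interact_request(api_key, user_id, transcript):
    """Build (but do not send) a POST that submits the transcript as user text."""
    body = json.dumps(
        {"action": {"type": "text", "payload": transcript}}
    ).encode("utf-8")
    url = f"{VOICEFLOW_RUNTIME}/state/user/{quote(user_id, safe='')}/interact"
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Authorization": api_key, "Content-Type": "application/json"},
    )


def send_transcript(api_key, user_id, transcript):
    """Send the transcript and return the parsed JSON response from Voiceflow."""
    req = build_interact_request(api_key, user_id, transcript)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

The request is built separately from the network call so you can inspect the URL and body before sending anything.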
If you need, we can have a call.
Hey @Francois🇫🇷 thanks for the reply! For my use case, the available speech models didn't transcribe the speech correctly, and the one from Google seemed the most likely to succeed. I got it sort of working with a TwiML custom action step and a Cloud Run job that does the processing with Google's STT. Then, on the Voiceflow side, I poll the Cloud Run job until it returns a success state and the transcribed text. I tried the Voiceflow API route, but couldn't get it to work. If you send something back to a Voiceflow project via the Voiceflow API, how/where does it get captured?
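For reference, the polling approach described above boils down to something like this sketch. It is written in Python only to show the logic (inside Voiceflow the same loop would live in a Function or API step), and the status shape with `state`, `transcript`, and `error` keys is an assumption, not what the Cloud Run job actually returns. `fetch_status` is a hypothetical callable that performs one status request.

```python
import time


def poll_for_transcript(fetch_status, interval_s=1.0, max_attempts=30,
                        sleep=time.sleep):
    """Call fetch_status() until the job succeeds, fails, or attempts run out.

    fetch_status must return a dict such as {"state": "pending"} or
    {"state": "success", "transcript": "..."} (assumed shape).
    """
    for attempt in range(max_attempts):
        status = fetch_status()
        state = status.get("state")
        if state == "success":
            return status.get("transcript", "")
        if state == "failed":
            raise RuntimeError(status.get("error", "transcription failed"))
        # Wait before the next attempt, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"no transcript after {max_attempts} attempts")
```

Capping the number of attempts keeps the caller from sitting in silence indefinitely if the job never finishes.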