eager-peach 2y ago

Voice AI Integrations with Twilio

So I'm currently connecting Voiceflow and ElevenLabs to give the speech capabilities a boost in audio quality, with the intention of integrating all this into calling and number software like Twilio. How we initially had it working: I call the ElevenLabs text-to-speech API, wait for it to finish, and then convert the result into base64 audio, which exactly follows NiKo's tutorial. What happened was that if the character length exceeded 250, the function would always return invalid payload=null, so we had to build a custom API to handle this externally.

The problem we ran into after doing all this is that, with the current system, we can't get it integrated into any external platform that can actually use it. For example, we can't get it working properly with Twilio, which is where we would like people to be able to converse with it. We've only seen the repository from @NiKo regarding text-based integrations into Twilio, and we haven't been able to adapt it to work for voice. Any help you could offer on this would be greatly appreciated!
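One way to stay under a per-request character limit like the 250 mentioned above is to split the text into smaller chunks before sending each one to text-to-speech. The sketch below is only an illustration, not the custom API described in this post: `split_for_tts` is a hypothetical helper, and the default of 250 simply mirrors the number reported here.

```python
def split_for_tts(text: str, limit: int = 250) -> list[str]:
    """Split text into chunks of at most `limit` characters, breaking on spaces."""
    chunks: list[str] = []
    current = ""
    for word in text.split():
        # A single word longer than the limit has to be hard-split.
        while len(word) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:limit])
            word = word[limit:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= limit:
            current = candidate
        else:
            # Adding this word would overflow, so close the current chunk.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be synthesized separately and the resulting audio played back in order.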
21 Replies
Tico | Voiceflow Assistant
Integrating Voiceflow with Twilio for voice capabilities, especially after enhancing audio quality through ElevenLabs, involves a few steps and considerations. Firstly, it's important to ensure that your text-to-speech conversion through ElevenLabs does not exceed character limits that could lead to errors. For integrating with external platforms like Twilio, you might need to adapt existing text-based integration examples to suit voice functionalities. Voiceflow provides guidance on connecting your agent to a phone number using Twilio IVR through open-source examples. This involves leveraging the Dialog Manager API and possibly customizing the integration to fit your specific needs.

Key points to consider:
- Ensure text-to-speech conversions are within character limits to avoid errors.
- Adapt text-based integration guides for voice functionalities when necessary.
- Use Voiceflow's guidance and open-source examples for Twilio IVR integration.
- Customize the integration using the Dialog Manager API to fit your requirements.

For detailed guidance on integrating with Twilio IVR, you can refer to our documentation here.
eager-peach OP 2y ago
Thanks. In the documentation you referred to, I notice that it is calling "VOICEFLOW_API_URL": "https://general-runtime.voiceflow.com", which refers to a feature in the images that is only present in the text-based chatbots. Is there documentation on how this needs to be adapted in order for this to properly work with voice-based chatbots as well? And either way, how can we adapt existing text-based integration examples to suit voice functionalities? We are struggling to get this portion of it working.
absent-sapphire 2y ago
Good question - so either assistant will work. Are you creating a voice experience for phone? Twilio has its own speech recognition and text-to-speech that it uses, so I don't believe you'd be able to use ElevenLabs with it. @Utkarsh-Marwaha for your voice assistant, are you just using WhatsApp voice notes or some other telephony system?
optimistic-gold 2y ago
Hey! I'm using Twilio. I have found a way to use Twilio just for routing calls. The audio chunks are processed on my system, where I can choose any TTS or STT provider.
eager-peach OP 2y ago
That's awesome @Utkarsh-Marwaha - would you be willing to offer any insight/help on this? We're trying to accomplish something similar.
optimistic-gold 2y ago
DMd you
eager-peach OP 2y ago
Yes, I am trying to create a voice experience for the phone. Currently the problem is that Twilio isn't properly accepting the voice functionality coming from Voiceflow (and we wrote a custom API between ElevenLabs and Voiceflow behind this). If you have a solution to this it would be greatly appreciated, as currently there is no real way to actually use the voice functionality we have built lol @Daniel
KimLooo 2y ago
@NiKo | Voiceflow any thoughts here?
absent-sapphire 2y ago
Yeah, going to defer to NiKo on this - Twilio is a bit challenging since it forces you to use their stack (which isn't great)
NiKo | Voiceflow
I will investigate. Can you explain a bit more what you mean by 'Twilio isn't properly accepting the voice functionality coming from Voiceflow'?
manual-pink 2y ago
Hey @NiKo | Voiceflow, so currently we are trying to set up a connection between a Voiceflow voice assistant and Twilio calls. We have the voice assistant using an audio step to speak with an ElevenLabs voice, which is base64 audio produced by the code you wrote between ElevenLabs and Voiceflow (and we set it up as a custom API to avoid character limitations). When I use the Twilio connection from the GitHub repo you guys shared and try to connect to a phone call through Twilio, no audio plays. We even tried setting a basic speak step and still got no audio to play.
eager-peach OP 2y ago
@NiKo | Voiceflow this is my problem as well, by the way. We're both struggling with it.
NiKo | Voiceflow
Ok, thanks for the details. I will check the integration. You should be able to generate and return the audio within the integration instead of the VF project. I also need to double check what audio Twilio can handle. If I recall correctly, the current integration used Twilio voice to render text, not raw audio.
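For reference, Twilio's TwiML has two relevant verbs: `<Say>` renders text with Twilio's own voice, and `<Play>` plays an audio file fetched from a URL. The sketch below shows one way a webhook could choose between them. It is not the integration's actual code: `twiml_for` is a hypothetical helper, and it assumes the audio is already hosted at a URL Twilio can reach.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def twiml_for(message: str, audio_url: Optional[str] = None) -> str:
    """Return a TwiML document that plays audio if a URL is given, else speaks text."""
    response = ET.Element("Response")
    if audio_url:
        # <Play> makes Twilio fetch the audio file from the URL and play it.
        ET.SubElement(response, "Play").text = audio_url
    else:
        # <Say> uses Twilio's built-in text-to-speech.
        ET.SubElement(response, "Say").text = message
    return ET.tostring(response, encoding="unicode")


print(twiml_for("Hello"))
print(twiml_for("", "https://example.com/reply.mp3"))
```

Building the XML with `xml.etree.ElementTree` rather than string formatting keeps characters such as `&` in the message properly escaped.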
manual-pink 2y ago
Ok thanks for clearing that up! Do you know of a way to make Twilio handle the raw audio?
eager-peach OP 2y ago
Just tagging you in this in case it doesn’t notify otherwise @NiKo | Voiceflow
NiKo | Voiceflow
An update has been pushed to our code example to support audio (data URI and URL).
manual-pink 2y ago
Hey @NiKo | Voiceflow, so I've been trying to get it working on my end and I must be missing something. I followed the readme with the IVR connection and set all the connections. I'm using a voice agent with the URL: https://general-runtime.voiceflow.com/state/user/userID/interact

When I run it on Replit I see that the connection is working, and I even get the text when Twilio changes my voice to text.

Console:
Express server listening on port 3000
GET / 200 5.657 ms - 46
GET / 200 0.628 ms - 46
POST /ivr/launch 200 780.138 ms - 229
Utterance:
POST /ivr/interaction 200 234.870 ms - 214
Utterance:
POST /ivr/interaction 200 219.326 ms - 214
Utterance: What services do you provide
POST /ivr/interaction 200 6657.896 ms - 229
Utterance:
POST /ivr/interaction 200 360.386 ms - 214
Utterance:
POST /ivr/interaction 200 436.102 ms - 214

I tested the Voiceflow project separately and it returns the audioURI like this:
[{'type': 'speak', 'payload': {'message': '', 'type': 'audio', 'src': 'data:audio/mpeg;base64,the rest'}}]

But I'm still hearing nothing 😦 Would love some more help on this, thanks.
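To check what such a trace actually contains, the data URI in `src` can be decoded back into raw audio bytes. This is a minimal sketch that assumes the trace shape shown above; `audio_sources` and `decode_audio_data_uri` are hypothetical helper names, not part of the Voiceflow example repository.

```python
import base64


def audio_sources(traces: list[dict]) -> list[str]:
    """Collect the `src` of every speak trace whose payload is audio."""
    return [
        trace["payload"]["src"]
        for trace in traces
        if trace.get("type") == "speak"
        and trace.get("payload", {}).get("type") == "audio"
    ]


def decode_audio_data_uri(src: str) -> tuple[str, bytes]:
    """Split a base64 data URI into its MIME type and decoded bytes."""
    header, _, encoded = src.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URI")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(encoded)
```

If the decoded bytes are empty or the MIME type is unexpected, the problem is on the Voiceflow side; otherwise it is more likely in how the audio is handed to Twilio.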
NiKo | Voiceflow
What is the value of the BASE_URL in your .env (or secrets on Replit)? Should be your replit address here. Also double check that the 'tmp' folder is present. Should be automatically created but maybe Replit doesn't allow that.
manual-pink 2y ago
@NiKo | Voiceflow sorry for taking so long to reply. For BASE_URL, are you referring to the voiceflow_URL? This is what my secrets look like (they have the real values though):
{
  "TWILIO_ACCOUNT_SID": "My SID",
  "VOICEFLOW_VERSION_ID": "MY VOICEFLOW VERSION ID",
  "VOICEFLOW_API_KEY": "VF.DM.API KEY",
  "VOICEFLOW_API_URL": "https://general-runtime.voiceflow.com",
  "TWILIO_PHONE_NUMBER": "Number",
  "TWILIO_AUTH_TOKEN": "Token"
}
And I don't see a tmp folder in the GitHub repo or on Replit, so I must be missing it.
NiKo | Voiceflow
Add "BASE_URL": "https://your_replit_endpoint" to your secrets in Replit, as you need this to generate a full audio URL for Twilio to play the audio. Also create a "tmp" folder in the root folder to allow the audio files to be saved.
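The reason both settings matter can be sketched as follows: the audio has to be written to disk and then exposed at a full public URL that Twilio can fetch. This is only an illustration under assumptions, not the example repository's code. `save_audio` is a hypothetical helper, and the `audio` route name is an assumption about how the server exposes the folder.

```python
import uuid
from pathlib import Path


def save_audio(audio: bytes, base_url: str, tmp_dir: str = "tmp", route: str = "audio") -> str:
    """Write audio bytes to tmp_dir and return the public URL for the file."""
    folder = Path(tmp_dir)
    # Create the folder if the host did not create it automatically.
    folder.mkdir(parents=True, exist_ok=True)
    filename = f"{uuid.uuid4().hex}.mp3"
    (folder / filename).write_bytes(audio)
    # BASE_URL is what turns a local file name into a URL Twilio can reach.
    return f"{base_url.rstrip('/')}/{route}/{filename}"
```

Without a base URL the server can only produce a relative path, which Twilio has no way to resolve, and without the folder the write fails before any URL is generated.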
manual-pink 17mo ago
I'll try that, thanks.
