Streaming Voice AI with Asterisk ARI Interface

Introduction

Here is my contribution to the Asterisk open source community in the area of Voice AI. I used ARI (Asterisk REST Interface) to build a call streaming service and integrate it with a bot server. With the help of an external media channel, I stream the call's media to an external application, which processes it as needed. I built a complete voice bot solution from scratch that can be used to create voice assistants with Asterisk.

The project has two main servers: telephony_server and bot_server. The telephony_server is an ARI application that produces the media stream and sends it to the bot server as JSON events. The bot_server accepts these events and processes them through STT, LLM, and TTS operations.
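To make the idea of media carried in JSON events concrete, here is a minimal sketch of how one audio chunk could be wrapped and unwrapped. The field names (`event`, `call_id`, `seq`, `payload`) and the helper functions are assumptions made for illustration; the actual event schema is defined in the repository and may differ.

```python
import base64
import json


def encode_media_event(call_id: str, seq: int, audio: bytes) -> str:
    """Serialize one audio chunk as a JSON text message (hypothetical schema)."""
    return json.dumps({
        "event": "media",  # hypothetical event name
        "call_id": call_id,
        "seq": seq,
        # Raw audio is binary, so it is base64-encoded to fit inside JSON.
        "payload": base64.b64encode(audio).decode("ascii"),
    })


def decode_media_event(message: str) -> tuple[str, int, bytes]:
    """Parse a media event and return (call_id, seq, raw audio bytes)."""
    data = json.loads(message)
    if data.get("event") != "media":
        raise ValueError(f"unexpected event type: {data.get('event')!r}")
    return data["call_id"], data["seq"], base64.b64decode(data["payload"])


if __name__ == "__main__":
    frame = bytes(160)  # 20 ms of 8 kHz G.711 audio is 160 bytes
    message = encode_media_event("call-1", 0, frame)
    print(decode_media_event(message) == ("call-1", 0, frame))
```

Base64 adds roughly a third to the payload size, which is usually acceptable for narrowband telephony audio in exchange for plain-text messages that are easy to log and debug.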

Architecture

The image below shows the overall call flow between the components.

First, a call arrives at the Asterisk server and is sent to the Stasis dialplan application. Once the call enters the Stasis application, the Telephony server (the ARI application) handles it and produces media streams. The Bot server receives the media streams and processes them. The Telephony server and the Bot server are connected via WebSocket, and the events exchanged between them are shown as colored arrows.

The role of the Telephony server is to produce media streams and send them to the Bot server over the WebSocket. The role of the Bot server is to process the media streams and use AI to produce meaningful responses for the user. Inside the Bot server, the STT (speech-to-text), LLM (large language model) query, and TTS (text-to-speech) operations take place. The generated TTS responses are played back to the user on the call.
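As a rough illustration of the external media step, the sketch below builds the URL for ARI's `POST /channels/externalMedia` request, which asks Asterisk to create a channel that sends the call audio to an external host. The base URL, the application name `voicebot`, the host `127.0.0.1:4000`, and the helper function are assumptions for this example. The code only constructs the URL; a real request also needs ARI credentials, and the repository may use an ARI client library instead.

```python
from urllib.parse import urlencode


def external_media_url(base_url: str, app: str, external_host: str,
                       audio_format: str = "ulaw") -> str:
    """Build the URL for an ARI externalMedia request (to be sent with POST).

    app           -- Stasis application that will own the new channel
    external_host -- "host:port" where Asterisk should send the media
    audio_format  -- codec of the streamed audio, e.g. "ulaw" or "slin16"
    """
    query = urlencode({
        "app": app,
        "external_host": external_host,
        "format": audio_format,
    })
    return f"{base_url}/channels/externalMedia?{query}"


if __name__ == "__main__":
    # Hypothetical values: a local Asterisk HTTP server and a media receiver on port 4000.
    print(external_media_url("http://localhost:8088/ari", "voicebot", "127.0.0.1:4000"))
```

In a typical setup, the channel created this way is then added to a bridge together with the caller's channel, so that the caller's audio reaches the external host and audio sent back is heard on the call.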

How to Setup

For setup instructions, visit the GitHub repository: https://github.com/ankitjayswal87/AsteriskVoiceBot
The steps needed to install and try it are described there. The source code is written in Python.

Features

The solution has the following features:
– media streaming to the Bot server with well-defined JSON events
– filler prompts that play during the silence while the Bot processes and answers the user's query
– any STT, TTS, and LLM engine can be integrated (OpenAI is used here)

Use Cases

This can serve as quick starter code for building AI voice assistants. You can clone the repository and start building RAG voice assistants, create custom journey-based conversational call flows, or turn your call center into an AI call center that provides human-like support at any time.
