A Hyper-Scalable and Secure Media Server for VoiceAI Applications
IraVoice is a bidirectional media server that brings together Dialer, Recorder, and BotStream modules to enable seamless telephony services for both legacy CX systems and modern conversational AI applications.
It delivers enterprise-grade capabilities including API-driven call control, real-time recording, conferencing, media streaming, trunk management, and QoS monitoring. Designed for flexibility, IraVoice integrates with PSTN or any PBX over E1 or SIP, while connecting to VoiceBots and AI platforms through secure WebSocket interfaces.
Built on the proven FreeSWITCH telephony platform and deployed on Kubernetes, IraVoice offers high performance, horizontal scalability, and fault-tolerant operation. With low-latency messaging powered by NATS, it ensures reliable communication across distributed data centers.
By unifying dialer functions, compliance-grade recording, and real-time AI streaming in a single platform, IraVoice provides a versatile, future-ready backbone for mission-critical voice applications.
Use Cases
- Telephony Media Server for Agent Desktop and Call Centre Applications
- API Dialer for Outbound Campaign Manager Applications
- Adding AI Capabilities to Legacy PBX and Contact Centre Systems
- Call Tapping & Media Streaming for Real-Time VoiceAI Analytics in CX Use Cases
- Telephony Stack for VoiceBot Applications
Load Testing Suite
Telephony Control, Compliance Recording, and AI Streaming in One Platform.
Unified Architecture
Combines Dialer, Recorder, and BotStream modules in a single media server.
Media Streaming
Real-time access to raw audio streams for AI/ML enrichment.
PSTN & PBX Integration
Seamless inbound/outbound connectivity over E1 or SIP with any PBX.
NATS Messaging
High-speed, low-latency communication between telephony switches and VoiceAI engines.
Auto-Scaling
Kubernetes-native scaling to handle fluctuating traffic loads.
Answering Machine Detection
Improve efficiency by distinguishing live voice, fax, or machines.
Lightweight Deployment
Run on-premise or in cloud containers with minimal resource usage.
Secure Audio
Encrypted media streams with TLS and Secure RTP.
Enterprise-Grade Reliability
99.999% availability and compliance with industry security standards.
Built-in Telephony Features
Conferencing, DTMF, barge-in, audio playback, and more out of the box.
Real-Time Analytics Ready
Streamlined for monitoring, reporting, and AI-driven insights.
Future-Ready Platform
Modular design enabling seamless integration with next-gen CX and AI systems.
IraVoice
FAQs
What is VoiceAPI?
VoiceAPI is a set of tools that lets you add calling capabilities (making and receiving calls) to your applications through an Application Programming Interface (API). Our telephony layer handles all telephony functions so that you can focus on your area of expertise.
What are the common use cases of using VoiceAPIs?
Our products are used across industry verticals such as lead generation, debt collection, voicebots, consent collection, election campaigns, and outbound campaign managers.
What are the benefits of using VoiceAPIs?
Businesses can add value to their existing customer engagement channels by adding voice. This is especially useful for those who run legacy software or do not want to get into the telephony domain: they can draw on our years of domain knowledge to improve their customer experience.
Can these APIs be used to create omnichannel solutions?
Yes. Along with VoiceAPIs, our CPaaS platform provides APIs for WhatsApp, SMS, and other channels, which enterprise applications can consume to create an omnichannel solution.
Do you need telephony knowledge to use these VoiceAPIs?
No. Our solution handles all SIP trunking, gateway management, and any troubleshooting required at the telephony end. Your application can use the published APIs to add the features it needs.
Do you support media streaming?
Yes. The media of established calls can be sent over a secure WebSocket connection to the required application.
Any concurrency limitations?
IraVoice, when hosted on an 8-core, 8 GB RAM server, can handle up to 800 simultaneous
voicebot sessions with recording enabled, and up to 1000 simultaneous sessions without
recording.
Besides CPU resource constraints, the call capacity also depends on the SIP trunk configuration,
including the number of channels allocated and the CPS (Calls Per Second) limit defined by the
telecom provider.
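The two limits described above can be combined into a rough planning estimate. The following is a minimal sketch, assuming a single server of the size quoted in this answer; the function name `estimate_capacity` and its parameters are hypothetical and are not part of any IraVoice API. The trunk channel count and CPS limit are values you would obtain from your telecom provider.

```python
import math

# Figures quoted above for a single 8-core, 8 GB RAM server.
SERVER_LIMIT_WITH_RECORDING = 800
SERVER_LIMIT_WITHOUT_RECORDING = 1000


def estimate_capacity(trunk_channels: int, cps_limit: float, recording: bool) -> dict:
    """Return a rough concurrency ceiling and the minimum ramp-up time.

    Hypothetical helper: the ceiling is the smaller of the server limit and
    the number of SIP trunk channels, and the ramp-up time is how long it
    takes to place that many calls at the provider's CPS limit.
    """
    if trunk_channels < 0 or cps_limit <= 0:
        raise ValueError("trunk_channels must be >= 0 and cps_limit must be > 0")
    server_limit = (
        SERVER_LIMIT_WITH_RECORDING if recording else SERVER_LIMIT_WITHOUT_RECORDING
    )
    max_sessions = min(server_limit, trunk_channels)
    ramp_up_seconds = math.ceil(max_sessions / cps_limit)
    return {"max_sessions": max_sessions, "ramp_up_seconds": ramp_up_seconds}


# Example: a 500-channel trunk limited to 10 calls per second, recording enabled.
print(estimate_capacity(trunk_channels=500, cps_limit=10, recording=True))
```

In this example the trunk, not the server, is the bottleneck: 500 channels is below the 800-session server figure, and reaching full load takes at least 50 seconds at 10 CPS.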
Do you support WhatsApp Business Calling API?
Yes, IraVoice supports WhatsApp Business Calling.
Where or how can we adjust dial limits?
Dial limits can be managed via the IraVoice Trunk Manager, either through the HTTP API or
directly in the Trunk Manager interface.
Is there support for 16kHz audio streaming?
IraVoice supports streaming at both 8 kHz and 16 kHz.
How are inbound calls configured on your platform?
Inbound calls in IraVoice are configured through dialplans within the setup. These dialplans
are typically implemented by the Epicode team based on your inbound routing requirements.
Are there any parameters available for VAD configuration, such as speech threshold and silence threshold?
Parameters can be set within the call_params as follows:
• VAD Mode ("enable_vad": true)
– Audio is delivered as complete utterances whenever the user finishes speaking.
• Non-VAD Mode ("enable_vad": false)
– Audio is streamed in chunks.
– chunk_size can be configured under call_params.
– Default: 3200 bytes (200 ms of audio).
• Silence Threshold ("silence_threshold")
– Audio level threshold used to classify a segment as silence.
– Ranges from 1 to 20, with a default of 5.
– Silence duration: calculated as threshold value × 250 ms to determine when a segment is considered silent.
• Speech Threshold ("speech_threshold")
– Minimum amplitude level required to classify a segment as speech.
– Works best when set between 500 and 600. Default: 800.
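The parameters listed above can be put together as in the sketch below. It is illustrative only: the parameter names and defaults come from this answer, but the flat layout of `call_params`, the choice to send `chunk_size` only in non-VAD mode and the thresholds only in VAD mode, and the helper names are assumptions. The chunk duration calculation also assumes 16-bit mono linear PCM, which is consistent with the stated 3200 bytes for 200 ms at 8 kHz but is not confirmed by the source.

```python
# Assumption: audio is 16-bit (2-byte) mono linear PCM.
BYTES_PER_SAMPLE = 2


def chunk_duration_ms(chunk_size: int, sample_rate_hz: int = 8000) -> float:
    """Duration of one streamed chunk in milliseconds."""
    return chunk_size * 1000 / (sample_rate_hz * BYTES_PER_SAMPLE)


def silence_duration_ms(silence_threshold: int) -> int:
    """Silence duration implied by silence_threshold (value x 250 ms)."""
    if not 1 <= silence_threshold <= 20:
        raise ValueError("silence_threshold must be between 1 and 20")
    return silence_threshold * 250


def build_call_params(enable_vad: bool, chunk_size: int = 3200,
                      silence_threshold: int = 5,
                      speech_threshold: int = 800) -> dict:
    """Assemble a call_params dictionary (hypothetical flat layout)."""
    silence_duration_ms(silence_threshold)  # reuse the 1..20 range check
    params = {"enable_vad": enable_vad}
    if enable_vad:
        # Assumption: the thresholds are only relevant when VAD is on.
        params["silence_threshold"] = silence_threshold
        params["speech_threshold"] = speech_threshold
    else:
        # chunk_size is documented for non-VAD (chunked) streaming.
        params["chunk_size"] = chunk_size
    return params


print(build_call_params(enable_vad=False))
print(chunk_duration_ms(3200))          # default chunk at 8 kHz
print(chunk_duration_ms(3200, 16000))   # same chunk size at 16 kHz
print(silence_duration_ms(5))           # default silence threshold
```

Under these assumptions the default chunk covers 200 ms at 8 kHz but only 100 ms at 16 kHz, so a 16 kHz stream may need a larger `chunk_size` to keep the same chunk duration.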
How can I set custom SIP headers for inbound/outbound calls?
Custom SIP headers for IraVoice outbound calls can be configured using the channel_vars
parameter in the make_call API: "channel_vars": { "<sip-header-name>": "value string" }
For inbound calls, custom SIP headers can be defined within the IraVoice dialplan configuration.
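As an illustration of the outbound case, the sketch below adds a `channel_vars` object to a make_call request body. Only the `channel_vars` key and its name-to-value shape come from the answer above; the helper name `with_sip_headers`, the validation rules, and the example header `X-Campaign-Id` are hypothetical, and the other make_call fields are deliberately left out because they are not described here.

```python
import json


def with_sip_headers(payload: dict, headers: dict) -> dict:
    """Return a copy of a make_call payload with custom SIP headers added.

    Hypothetical helper: rejects header names containing whitespace or a
    colon, and values containing line breaks, so that a value cannot
    smuggle an extra header line into the SIP message.
    """
    for name, value in headers.items():
        if not name or any(ch.isspace() or ch == ":" for ch in name):
            raise ValueError(f"invalid SIP header name: {name!r}")
        if "\r" in value or "\n" in value:
            raise ValueError(f"invalid value for SIP header {name!r}")
    merged = dict(payload)
    # Keep any channel_vars already present and add the new headers.
    merged["channel_vars"] = {**payload.get("channel_vars", {}), **headers}
    return merged


# Other make_call fields are omitted; only channel_vars is shown.
body = with_sip_headers({}, {"X-Campaign-Id": "spring-renewals"})
print(json.dumps(body, indent=2))
```

The function returns a new dictionary rather than modifying its input, so the same base payload can be reused for several calls with different headers.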
What are the possible causes of latency and voice distortion in IraVoice, and how can they be minimized or avoided?
Possible causes of latency and voice distortion in IraVoice include:
• Network issues: High jitter, packet loss, or unstable bandwidth between SIP trunks, media
servers, and VoiceAI endpoints.
• Server resource constraints: CPU or memory saturation on the host machine.
• Inconsistent streaming configurations: Mismatched sample rates or chunk sizes
between endpoints leading to distorted audio.
• Media routing complexity: Long network paths or multiple proxy hops introducing
transmission delays.
• VoiceBot response time: Slow response from AI models or APIs used in VoiceBots adding
to overall call latency.
To minimize or avoid these issues:
• Use dedicated bandwidth and maintain network jitter below 30 ms.
• Allocate adequate CPU and memory resources based on expected concurrency.
• Ensure consistent audio streaming configurations across all endpoints.
• Optimize VoiceBot applications to minimize response delays between streaming chunks.
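To check a network path against the 30 ms jitter guideline above, one option is the interarrival jitter estimator defined for RTP in RFC 3550. The sketch below is an illustration only: the source does not say how IraVoice itself measures jitter, and the function name and the sample timestamps are hypothetical. Timestamps are assumed to be in milliseconds, with one send time and one receive time per packet.

```python
JITTER_BUDGET_MS = 30.0  # guideline quoted above


def interarrival_jitter_ms(send_ms: list, recv_ms: list) -> float:
    """Running jitter estimate over a packet sequence (RFC 3550 style).

    For consecutive packets, D is the change in transit time; the estimate
    is smoothed as J = J + (|D| - J) / 16.
    """
    if len(send_ms) != len(recv_ms):
        raise ValueError("send_ms and recv_ms must have the same length")
    jitter = 0.0
    for i in range(1, len(send_ms)):
        d = (recv_ms[i] - recv_ms[i - 1]) - (send_ms[i] - send_ms[i - 1])
        jitter += (abs(d) - jitter) / 16.0
    return jitter


# Hypothetical capture: packets sent every 20 ms, the last one arrives 16 ms late.
sent = [0, 20, 40]
received = [5, 25, 61]
estimate = interarrival_jitter_ms(sent, received)
print(estimate, "within budget" if estimate < JITTER_BUDGET_MS else "over budget")
```

Because the estimator smooths with a gain of 1/16, a single late packet moves the value only slightly; sustained variation in arrival times is what pushes it toward the budget.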
What storage options are available for uploading recordings?
We support recording uploads to your AWS S3 buckets, Azure Cloud Storage, Google Cloud
Storage, or directly to your webhook endpoint.
How are the recordings shared with us? Do we receive downloadable links for each call in a consolidated format (e.g., Excel or API)?
To upload call recordings to your cloud, we will require the necessary credentials for your cloud
storage. Recordings will be uploaded as calls are completed.
If you prefer to use non-cloud storage, please provide a server with adequate storage capacity.
Epicode will upload the recordings to this server and share a secure HTTP endpoint for
download access.
If we provide our own storage bucket, what naming convention will be used for the recording files (e.g., call_uid or additional metadata)?
If you prefer to use your own cloud storage bucket, you can share the necessary credentials
with us so that we can upload the recordings. The recording files follow a standardized naming
convention that includes the call UUID.
Example: e88e852e-da28-4e1f-bc97-fc00469faebf.mp3
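When processing uploaded recordings, the call UUID can be recovered from the file name. The sketch below assumes the file name is exactly the call UUID followed by an extension, as in the example above; if additional metadata is included in the name, the parsing would need to change. The function name is hypothetical.

```python
import uuid
from pathlib import PurePosixPath


def call_uuid_from_recording(filename: str) -> uuid.UUID:
    """Extract the call UUID from a recording file name or object key.

    Assumes the name is "<call UUID>.<extension>"; raises ValueError if
    the part before the extension is not a valid UUID.
    """
    stem = PurePosixPath(filename).stem  # drops any folder prefix and the extension
    return uuid.UUID(stem)


call_id = call_uuid_from_recording("e88e852e-da28-4e1f-bc97-fc00469faebf.mp3")
print(call_id)
```

Parsing into a `uuid.UUID` rather than keeping the raw string makes malformed names fail early, before the value is used to look up a call record.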