Voicegain
Speech Analytics API

 
SPEECH ANALYTICS API
Voicegain Speech Analytics APIs allow developers to analyze audio recordings of calls, web meetings, voicemails, and more. These recordings are accurately transcribed and analyzed using AI. We inspect both the audio and the transcript to extract sentiment plus other metrics and entities. The annotated recordings can be accessed using our APIs. We provide a reference Web UI that can be customized by clients.
INGEST AUDIO FROM DIVERSE SOURCES

Voicegain Speech Analytics can ingest audio from diverse sources such as files, object storage (e.g. S3), Web APIs, and SIP/SIPREC.

SPEECH ANALYTICS AT THE EDGE

The Voicegain platform offers control, data privacy, and security by providing analytics at the Edge (i.e. on the client’s infrastructure). Edge deployment also offers lower pricing.

CUSTOM ACOUSTIC MODELS

Voicegain can train the acoustic model to include utterances related to industry jargon, speaker style, accents, etc. Custom-trained models cut the word error rate in half.
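For context, word error rate (WER) is conventionally computed as the word-level edit distance between a reference transcript and the recognizer output, divided by the number of words in the reference. The following is a minimal, generic sketch of that calculation. It is not Voicegain code; the function name and the sample sentences are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Classic dynamic-programming edit distance, computed over words
    # instead of characters. Only the previous row is kept in memory.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # deletion (reference word missing)
                current[j - 1] + 1,      # insertion (extra hypothesis word)
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("the claim was denied", "the clam was denied"))
```

Comparing this value for the same test recordings before and after training is one way to check how much a custom model helps on a given domain.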

AFFORDABLE PRICING

Voicegain is disruptively priced to analyze 100% of recordings. Why sample when you can know it all?
Pricing is based on usage, so there are no fixed costs.

BOTH AUDIO AND TEXT ANALYZED BY AI

Our AI algorithms are applied both to call audio and the transcript in order to extract features like sentiment throughout the call, relevant entities, etc.

CONTACT CENTER SUPPORT

Use it with the Voicegain Speech-to-Text APIs during IVR and live-agent interactions to enhance automation with Voice Bots.

KEY FEATURES
 

Voicegain 

Speech-to-Text APIs

Voicegain’s Speech-to-Text APIs enable developers to invoke both offline and real-time Speech-to-Text over HTTP, gRPC, and WebSockets. The Transcribe API provides speech-to-text using a large vocabulary model, while the Recognize API supports the use of speech grammars as the language model.
SUPPORT FOR STREAMING

Real-time streaming input is supported over WebSockets, gRPC, and our SDK. We also integrate directly with Twilio Media Streams.

DEPLOY AT THE EDGE OR ON CLOUD

Access the APIs on our Cloud infrastructure or through a containerized deployment at the Edge.

AFFORDABLE PRICE AT GREAT ACCURACY

Our disruptive pricing enables mass adoption of speech. We have a free tier that can help you get started with your app immediately.

RESTFUL API

Voicegain provides standard RESTful APIs. Documentation is provided in OpenAPI 3.0 format and has been verified to work with code generation tools.

COMPREHENSIVE SET OF APIs

We provide a comprehensive set of APIs for a wide array of use cases. For example, we offer a word-tree output for large vocabulary speech-to-text and n-best results for grammar-based speech-to-text.

CUSTOM ACOUSTIC MODELS

Voicegain provides APIs and tools that let clients train the acoustic model for their specific needs. Training does not require time annotation in transcripts.

KEY FEATURES
 

Voicegain RTC

Callback APIs

Voicegain Realtime Communication (RTC) Callback APIs enable backend web developers to build interactive voice apps (Voice Bot, Speech IVR, phone system) in a language of their choice. You establish a single real-time session with our platform, after which your code interacts with it using JSON commands and callbacks. We support telephony using SIP/RTP, and browsers using WebRTC to handle poor network connections.
LONG AUDIO INTERACTION 

RTC Callback APIs are a good fit for long-lasting audio where there are multiple interactions on the same audio stream.

RE-USE CHATBOT LOGIC FOR VOICE

Speed up deployment of Voice Bots by using our Callback APIs with your existing chatbot logic.

LARGE VOCABULARY, SPEECH GRAMMARS

Callback APIs support both speech grammars and large vocabulary speech-to-text. Speech grammars provide entities and intent in addition to the utterance.

TELEPHONY/SIP SUPPORT

BYOC (Bring Your Own SIP Carrier) or use Voicegain’s integrated SIP provider.

COMMAND AND CALLBACK APIs

Invoke commands to answer or make a call, play a prompt, get a spoken response, and transfer a call. Get results and other session events via Web Callbacks.
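The command-and-callback pattern can be pictured as a function that receives one session event and returns the next command. The sketch below is only an illustration of that pattern: the event names, field names, and command payloads are hypothetical and are not Voicegain's actual JSON schema, which is defined in the API documentation.

```python
import json


def handle_callback(event: dict) -> dict:
    """Map one session event (hypothetical shape) to the next command."""
    kind = event.get("event")
    if kind == "call_started":
        # Greet the caller as soon as the call is answered.
        return {"command": "play_prompt", "text": "How can I help you today?"}
    if kind == "prompt_finished":
        # After the prompt, listen for the caller's spoken response.
        return {"command": "get_response", "timeout_ms": 5000}
    if kind == "response":
        utterance = event.get("utterance", "").lower()
        if "agent" in utterance:
            # "support-queue" is a made-up transfer target.
            return {"command": "transfer", "target": "support-queue"}
        return {"command": "play_prompt", "text": "Sorry, I did not catch that."}
    # Unknown events end the session in this sketch.
    return {"command": "hangup"}


if __name__ == "__main__":
    # A callback body arrives as JSON and the reply goes back as JSON.
    body = '{"event": "response", "utterance": "Agent please"}'
    print(json.dumps(handle_callback(json.loads(body))))
```

Because the handler is a plain function over dictionaries, it can sit behind any web framework and can reuse the dialog logic of an existing chatbot.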

APP/BOT LOGIC IN ANY LANGUAGE

Developers can write application logic in the programming language of their choice, such as Python, Node.js, or Java.

KEY FEATURES
 

Voicegain Transcribe
& Captioning

REAL-TIME TRANSCRIPTION
Voicegain Transcribe supports real-time transcription of audio that is streamed to the Voicegain platform using our web APIs. This can be used to caption live-streamed events or to provide instant transcripts of lectures, talks, conference presentations, etc. We also provide convenient export to multiple formats, including several used for closed captioning.
OFFLINE TRANSCRIPTION
Voicegain Transcribe also offers transcription of recorded audio at high accuracy and low cost. We include a Web UI that makes it easy to review and correct the transcript in minimal time. We provide convenient methods both to ingest the audio and to distribute the generated transcript. Use it in your workflow, to provide transcription services, or just to keep notes.
KEY FEATURES
TIMESTAMP INFORMATION

Voicegain provides a timestamp for each word transcribed. Time data is retained when editing or correcting the transcript and can be included in exported files.
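To show why per-word timestamps matter for captioning, here is a minimal sketch that groups timed words into SubRip (SRT) caption cues. It does not use any Voicegain API: the word dictionary shape (`word`, `start_ms`, `end_ms`) and the fixed words-per-cue rule are assumptions made for illustration.

```python
def format_timestamp(ms: int) -> str:
    """Format milliseconds as an SRT timestamp, HH:MM:SS,mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def words_to_srt(words: list[dict], max_words: int = 7) -> str:
    """Group timed words (hypothetical shape) into numbered SRT cues."""
    cues = []
    for index in range(0, len(words), max_words):
        chunk = words[index:index + max_words]
        # A cue spans from its first word's start to its last word's end.
        start = format_timestamp(chunk[0]["start_ms"])
        end = format_timestamp(chunk[-1]["end_ms"])
        text = " ".join(w["word"] for w in chunk)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    # Cues are separated by a blank line.
    return "\n".join(cues)


if __name__ == "__main__":
    sample = [
        {"word": "Hello", "start_ms": 0, "end_ms": 400},
        {"word": "world", "start_ms": 450, "end_ms": 900},
    ]
    print(words_to_srt(sample))
```

Since each word keeps its own timing, a corrected word can replace the original text without changing the cue boundaries.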

TRANSCRIPTION AT THE EDGE

The Voicegain platform offers control, data privacy, and security by providing transcription at the Edge (i.e. on the client’s infrastructure).

CUSTOM ACOUSTIC MODELS

Voicegain can train the acoustic model to include utterances related to industry jargon, speaker style, accents, etc.

CUSTOM VOCABULARY

Voicegain provides the ability to add new words (that may be specific to the client’s domain) to the language model.

SUPPORT FOR A VARIETY OF AUDIO INPUTS

We provide multiple ways to submit audio and receive text output that cover most common use cases.

PUNCTUATION

We add capitalization and punctuation automatically using deep learning, so that the output is more intelligible and can be used with minimal editing. 

 

Voicegain MRCP ASR

SPEECH RECOGNITION FOR TELEPHONY BASED VOICEXML IVRs 
Voicegain MRCP ASR is our real-time speech recognition engine for VoiceXML-based IVR applications. It supports speech grammars in GRXML and JSGF and provides n-best results. We also provide results from our large vocabulary model for utterances that are out-of-grammar (OOG) or have low-confidence results.
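An application might combine the two kinds of results roughly as follows: use the best grammar hypothesis when its confidence is high enough, and otherwise fall back to the large vocabulary text. This is a sketch under stated assumptions, not the engine's actual result format; the field names (`utterance`, `confidence`), the function name, and the 0.5 threshold are all hypothetical.

```python
from typing import Optional


def pick_result(
    n_best: list[dict],
    large_vocab_text: Optional[str],
    min_confidence: float = 0.5,
) -> dict:
    """Prefer the top grammar hypothesis; fall back to large vocabulary text."""
    # An empty n-best list models an out-of-grammar (OOG) utterance.
    best = max(n_best, key=lambda h: h["confidence"], default=None)
    if best is not None and best["confidence"] >= min_confidence:
        return {
            "source": "grammar",
            "utterance": best["utterance"],
            "confidence": best["confidence"],
        }
    # Out-of-grammar or low-confidence: use the free-form transcript instead.
    return {"source": "large_vocabulary", "utterance": large_vocab_text or ""}


if __name__ == "__main__":
    hypotheses = [
        {"utterance": "check balance", "confidence": 0.31},
        {"utterance": "check bills", "confidence": 0.22},
    ]
    print(pick_result(hypotheses, "I want to check on my balance please"))
```

The fallback text can then be logged for grammar tuning or passed to downstream intent logic.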
TEST AND OPTIMIZATION TOOLS
Voicegain ASR comes with easy-to-use tools to analyze, tune, test, and optimize speech grammars. We also have tools and APIs that allow developers to train a custom acoustic model. This can offer much better recognition performance for a client's domain. 
VOICEXML/MRCP SUPPORT

Voicegain ASR is invoked from any VoiceXML IVR platform over MRCP. We are compatible with VXML platforms like Avaya, Genesys, Cisco, etc.

DEPLOY AT THE EDGE OR ON CLOUD

The software is available on the Cloud or deployable On-Premise/at the Edge (i.e. on client’s infrastructure).

CUSTOM ACOUSTIC MODELS

Clients can train the acoustic model to include utterances related to industry jargon, speaker style, accents, etc.

TUNING & TESTING TOOLS

Voicegain provides tools for tuning and testing grammars. The same tools can be used to collect data for acoustic model training.

LICENSING BASED ON USAGE

Voicegain ASR is licensed based on usage. This helps clients avoid significant upfront capital outlays.

SPEECH GRAMMARS + LARGE VOCABULARY

We provide full support for grammars like GRXML and JSGF. We include a library of built-in grammars and support large vocabulary models too.

KEY FEATURES