Voicegain
Speech Analytics API

 
SPEECH ANALYTICS API
Voicegain Speech Analytics APIs allow developers to analyze streamed/live and recorded audio from phone calls, meetings, videos, and more. Audio is accurately transcribed and analyzed using AI. We mine both the audio and the transcript to extract sentiment, intents, and other metrics and entities. Annotated audio can be accessed in real time and offline using our APIs. We provide a source-code license to a reference Web UI that clients can customize.
INGEST AUDIO FROM DIVERSE SOURCES

Voicegain can ingest live and recorded audio from diverse sources such as files, object storage (e.g. S3), Web APIs, and SIP/SIPREC.

SPEECH ANALYTICS AT THE EDGE

Clients can gain control and data privacy and meet compliance goals by deploying speech analytics at the Edge/On-Premise.

CUSTOM ACOUSTIC MODELS

Voicegain provides tools to train the underlying acoustic and language models with client data. This can cut the word error rate in half.
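
To make the word-error-rate claim concrete, here is a minimal sketch of how WER is commonly computed: the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The function name and the sample sentences are illustrative assumptions and are not part of any Voicegain tooling.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Made-up transcripts: a generic model versus one adapted to the domain.
reference = "please refill my metformin prescription today"
baseline = "please refill my met forming prescription to day"
adapted = "please refill my metformin prescription to day"

baseline_wer = word_error_rate(reference, baseline)  # 4 edits over 6 words
adapted_wer = word_error_rate(reference, adapted)    # 2 edits over 6 words
```

In this toy case the adapted output has half the errors of the baseline, which is the kind of improvement the paragraph above describes. Real evaluations use a held-out test set rather than a single sentence.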

AFFORDABLE PRICING

Voicegain is disruptively priced to analyze 100% of recordings. Why sample when you can know it all?
Pricing is based on usage, so you can ramp up and down.

BOTH AUDIO AND TEXT ANALYZED BY AI

Our AI algorithms are applied to both the call audio and the transcript in order to extract features like intent, sentiment, and named entities throughout a call.

CONTACT CENTER SUPPORT

The same platform can also be used to drive automation with IVRs and Voice Bots.

KEY FEATURES
 

Voicegain
Speech-to-Text APIs

Voicegain’s Speech-to-Text APIs enable developers to invoke both offline and real-time Speech-to-Text over web and telephony platforms. The Transcribe API provides speech-to-text using a large vocabulary model, while the Recognize API supports the use of speech grammars as the language model.
SUPPORT FOR STREAMING

Voicegain supports real-time streaming of audio over WebSockets, gRPC, SIPREC, and our SDK. We also integrate directly with the Media Stream APIs of CPaaS platforms.
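
Streaming clients typically send audio in small, fixed-duration frames rather than as one large upload. The following is a minimal sketch of that framing step only, assuming raw 16-bit mono PCM. The sample rate, frame duration, and function name are illustrative assumptions, and the transport itself (WebSocket, gRPC, etc.) is deliberately left out.

```python
from typing import Iterator


def pcm_frames(audio: bytes, sample_rate: int = 8000,
               frame_ms: int = 20, bytes_per_sample: int = 2) -> Iterator[bytes]:
    """Yield consecutive fixed-duration frames of raw mono PCM audio."""
    frame_bytes = sample_rate * frame_ms // 1000 * bytes_per_sample
    if frame_bytes <= 0:
        raise ValueError("frame size must be positive")
    for start in range(0, len(audio), frame_bytes):
        # The final frame may be shorter than frame_bytes.
        yield audio[start:start + frame_bytes]


# One second of silence at 8 kHz, 16-bit mono, split into 20 ms frames.
one_second_of_silence = bytes(8000 * 2)
frames = list(pcm_frames(one_second_of_silence))
```

Each frame would then be written to whichever streaming connection the application uses, ideally paced at the real-time rate of the audio.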

DEPLOY AT THE EDGE OR ON CLOUD

Access the APIs on our Cloud infrastructure or through a containerized deployment at the Edge.

AFFORDABLE PRICE AT GREAT ACCURACY

Our disruptive pricing enables mass adoption of speech. We have a free tier that can help you get started with your app immediately.

RESTFUL API

Voicegain provides standard RESTful APIs. Documentation is provided in OpenAPI 3.0 format and has been verified to work with code generation tools.

COMPREHENSIVE SET OF APIs

We provide a comprehensive set of APIs for a wide array of use cases. For example, we offer word-tree output for large vocabulary transcription and n-best results for grammar-based speech-to-text.
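
An n-best result is a ranked list of alternative interpretations of the same utterance. The sketch below shows one way an application might consume such a list: accept the top alternative only when it is confident and clearly ahead of the runner-up. The `Hypothesis` shape, field names, and thresholds are assumptions made for illustration and do not reflect the actual Voicegain response schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hypothesis:
    text: str
    confidence: float  # assumed to lie between 0.0 and 1.0


def pick_hypothesis(n_best: List[Hypothesis], min_confidence: float = 0.5,
                    min_margin: float = 0.1) -> Optional[Hypothesis]:
    """Return the top hypothesis, or None when the result is weak or ambiguous."""
    ranked = sorted(n_best, key=lambda h: h.confidence, reverse=True)
    if not ranked or ranked[0].confidence < min_confidence:
        return None  # nothing usable; the app could re-prompt the caller
    if len(ranked) > 1 and ranked[0].confidence - ranked[1].confidence < min_margin:
        return None  # top two alternatives are too close; ask for confirmation
    return ranked[0]


clear = pick_hypothesis([Hypothesis("checking", 0.9), Hypothesis("savings", 0.3)])
ambiguous = pick_hypothesis([Hypothesis("fifteen", 0.55), Hypothesis("fifty", 0.5)])
```

Returning `None` instead of guessing lets the dialog logic decide whether to confirm, re-prompt, or escalate.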

CUSTOM ACOUSTIC MODELS

Voicegain provides APIs and tools that let clients train the acoustic model for their specific needs. Training does not require time annotation in transcripts.

KEY FEATURES
 

Voicegain Telephony
Bot APIs

Telephony Bot APIs (formerly called RTC Callback APIs) enable developers to build voice bots and speech IVRs that interface over SIP-based telephony. The API supports both large vocabulary transcription and grammar-based recognition. It is a web callback style API with commands/verbs for prompt playback, speech recognition, and more. The API can integrate with bot frameworks like Rasa and Google Dialogflow.
SIMPLICITY OF INTEGRATION

Telephony Bot APIs are simple to integrate. You invite us to a single session for as long as you need to communicate with a caller.

SUPPORT BOTH NLU AND DIRECTED DIALOG

The API is easy to integrate with popular NLU engines like Rasa and Dialogflow. We also offer easy options to specify directed dialog logic.

LARGE VOCABULARY, SPEECH GRAMMARS

We support both large vocabulary and speech grammars. 

INTEGRATE WITH YOUR CPAAS

The API integrates with CPaaS platforms like Twilio, SignalWire, and others. It works with essentially any platform from which you can send a SIP INVITE.

CALLBACK APIs
 

The callback API makes requests to your app logic upon significant events. In response, your app sends commands to play a prompt or to convert an utterance to text.
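
The request/response pattern described above can be sketched as a single function that receives an event and returns the next command. Everything about the payloads here is hypothetical: the event types (`session_start`, `recognition_result`), the command names, and the field names are invented for illustration and are not the actual Telephony Bot API schema.

```python
from typing import Any, Dict


def handle_callback(event: Dict[str, Any]) -> Dict[str, Any]:
    """Map an incoming callback event to the next command (hypothetical schema)."""
    kind = event.get("type")
    if kind == "session_start":
        # Greet the caller and start listening.
        return {"command": "prompt_and_recognize",
                "prompt": "How can I help you today?"}
    if kind == "recognition_result":
        text = (event.get("text") or "").strip()
        if not text:
            # Nothing was recognized, so ask again.
            return {"command": "prompt_and_recognize",
                    "prompt": "Sorry, I did not catch that."}
        return {"command": "play_prompt", "prompt": f"You said: {text}"}
    # Any other event ends the session in this sketch.
    return {"command": "disconnect"}


first_reply = handle_callback({"type": "session_start"})
echo_reply = handle_callback({"type": "recognition_result", "text": "billing"})
```

In a real deployment this function would sit behind an HTTP endpoint in whatever web framework the application already uses, with the payloads replaced by those defined in the API documentation.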

APP/BOT LOGIC IN ANY LANGUAGE

Developers can write app logic in a language of their choice – Python, Node.js, etc. They can also specify the logic in a YAML document.

KEY FEATURES
 

Voicegain Transcribe
& Captioning

REAL-TIME TRANSCRIPTION
Voicegain Transcribe supports real-time transcription of audio that is streamed to the Voicegain platform using our web APIs. This may be used for captioning live streamed events or for providing instant transcripts of lectures, talks, and conference presentations. Additionally, we provide convenient export to multiple formats, including several used for closed captioning.
OFFLINE TRANSCRIPTION
Voicegain Transcribe also offers transcription of recorded audio at high accuracy and low cost. We include a Web UI that makes it very easy to review and correct the transcript in a minimum amount of time. We provide convenient methods to both ingest the audio and to distribute the generated transcript. Use it in your workflow, to provide transcription services, or just to keep notes.
KEY FEATURES
TIMESTAMP INFORMATION

Voicegain provides a timestamp for each word transcribed. Time data is retained when editing or correcting the transcript and can be included in exported files.
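
Per-word timestamps are what make caption export possible. As a minimal sketch, the code below groups timed words into SubRip (SRT) cues. The `(text, start_ms, end_ms)` word shape, the function names, and the fixed words-per-cue grouping are assumptions for illustration; they do not describe the format of Voicegain's own transcript output or exporter.

```python
from typing import List, Tuple

Word = Tuple[str, int, int]  # (text, start_ms, end_ms); an assumed shape


def srt_time(ms: int) -> str:
    """Format milliseconds as an SRT timestamp, HH:MM:SS,mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def words_to_srt(words: List[Word], max_words: int = 7) -> str:
    """Group timed words into numbered SRT cues of at most max_words each."""
    cues = []
    for index, start in enumerate(range(0, len(words), max_words), start=1):
        group = words[start:start + max_words]
        text = " ".join(word[0] for word in group)
        # A cue spans from the first word's start to the last word's end.
        cues.append(f"{index}\n{srt_time(group[0][1])} --> {srt_time(group[-1][2])}\n{text}\n")
    return "\n".join(cues)


captions = words_to_srt([("Hello", 0, 400), ("world", 450, 900)], max_words=1)
```

A production exporter would usually break cues on pauses and punctuation rather than on a fixed word count.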

TRANSCRIPTION AT THE EDGE

The Voicegain platform offers control, data privacy, and security by providing transcription at the Edge (i.e. on the client’s infrastructure).

CUSTOM ACOUSTIC MODELS

Voicegain can train the acoustic model to include utterances related to industry jargon, speaker style, accents, etc.

CUSTOM VOCABULARY

Voicegain provides the ability to add new words (that may be specific to the client’s domain) to the language model.

SUPPORT FOR A VARIETY OF AUDIO INPUTS

We provide multiple ways to submit audio and receive text output that cover most common use cases.

PUNCTUATION

We add capitalization and punctuation automatically using deep learning, so that the output is more intelligible and can be used with minimal editing. 

 

Voicegain MRCP ASR

Voicegain Automated Speech Recognition (ASR) for IVR
SPEECH RECOGNITION FOR TELEPHONY BASED VOICEXML IVRs 
Voicegain MRCP ASR is our real-time speech recognition engine for VoiceXML-based IVR applications. It supports speech grammars using GRXML and JSGF and provides n-best results. We also provide results from our large vocabulary model for utterances that are out-of-grammar (OOG) or have low-confidence results - see...
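
The fallback described above, where a large vocabulary result is available when the grammar does not match or matches with low confidence, can be sketched as a small decision function. The parameter names, the return shape, and the 0.6 threshold are assumptions for illustration; how the two results are actually delivered over MRCP is not shown here.

```python
from typing import Optional, Tuple


def resolve_utterance(grammar_match: Optional[str], grammar_confidence: float,
                      large_vocab_text: str, threshold: float = 0.6) -> Tuple[str, str]:
    """Return (source, text): the grammar match when trusted, else the transcript."""
    if grammar_match is not None and grammar_confidence >= threshold:
        return ("grammar", grammar_match)
    # Out-of-grammar or low confidence: fall back to the free-form transcript.
    return ("large_vocabulary", large_vocab_text)


in_grammar = resolve_utterance("account balance", 0.92, "my account balance please")
out_of_grammar = resolve_utterance(None, 0.0, "I lost my card while traveling")
```

An IVR could route the `large_vocabulary` case to an NLU engine or an agent instead of simply re-prompting the caller.
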
TEST AND OPTIMIZATION TOOLS
Voicegain ASR comes with easy-to-use tools to analyze, tune, test, and optimize speech grammars. We also have tools and APIs that allow developers to train a custom acoustic model. This can offer much better recognition performance for a client's domain. 
VOICEXML/MRCP SUPPORT

Voicegain ASR can be invoked from any VoiceXML IVR platform over MRCP. We are compatible with VXML platforms like Avaya, Genesys, Cisco, etc.

DEPLOY AT THE EDGE OR ON CLOUD

The software is available on the Cloud or deployable On-Premise/at the Edge (i.e. on the client’s infrastructure).

CUSTOM ACOUSTIC MODELS

Clients can train the acoustic model to include utterances related to industry jargon, speaker style, accents, etc.

TUNING & TESTING TOOLS

Voicegain provides tools for tuning and testing grammars. The same tools can be used to collect data for acoustic model training.

LICENSING BASED ON USAGE

Voicegain ASR is licensed based on usage. This helps clients avoid significant upfront capital outlays.

SPEECH GRAMMARS + LARGE VOCABULARY

We provide full support for grammar formats like GRXML and JSGF. We include a library of built-in grammars and support large vocabulary models too.

KEY FEATURES