Voicegain Speech Analytics API

 
Voicegain Speech-to-Text can be used for Real-Time Transcription (Live Transcription)
SPEECH ANALYTICS API
Voicegain Speech Analytics APIs allow developers to analyze audio recordings of calls, web meetings, voicemails, and more. These recordings are transcribed and analyzed using AI. We inspect both the audio and the transcript to extract sentiment along with other metrics and entities. The annotated recordings can be accessed using our APIs. We also provide a reference Web UI that clients can customize.
INGEST AUDIO FROM DIVERSE SOURCES

Voicegain Speech Analytics can ingest audio from diverse sources such as files, object storage (e.g. S3), Web APIs, and SIP/SIPREC.

SPEECH ANALYTICS AT THE EDGE

The Voicegain platform offers control, data privacy, and security by providing analytics at the Edge (i.e. on the client’s infrastructure). Edge deployment also offers lower pricing.

CUSTOM ACOUSTIC MODELS

Voicegain can train the acoustic model on utterances that reflect industry jargon, speaker styles, and accents. Custom-trained models can cut the word error rate in half.
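
For readers unfamiliar with the metric, word error rate (WER) is conventionally the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The sketch below shows that standard definition only. The function name and the sample sentences are illustrative assumptions, not Voicegain code or measured results.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # matching i reference words against nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Made-up sentences, purely to illustrate the arithmetic.
reference = "please transfer me to the billing department"
generic_output = "please transfer me to the building apartment"  # 2 wrong words out of 7
tuned_output = "please transfer me to the billing apartment"     # 1 wrong word out of 7
```

In this made-up example the second output has half the WER of the first (1/7 versus 2/7), which is the kind of comparison a claim about halving the word error rate refers to.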

AFFORDABLE PRICING

Voicegain is disruptively priced so that you can analyze 100% of recordings. Why sample when you can know it all?
Pricing is based on usage, so there are no fixed costs.

BOTH AUDIO AND TEXT ANALYZED BY AI

Our AI algorithms are applied to both the call audio and the transcript in order to extract features such as sentiment throughout the call and relevant entities.

CONTACT CENTER SUPPORT

Use it with Voicegain Speech-to-Text APIs during IVR and live-agent interactions to enhance automation with voice bots.

KEY FEATURES
 

Voicegain Speech-to-Text APIs

Voicegain Speech-to-Text platform is accessible over Web API
Voicegain’s Speech-to-Text APIs enable developers to invoke both offline and real-time Speech-to-Text over HTTP, gRPC, and WebSockets. The Transcribe API provides speech-to-text using a large vocabulary model, while the Recognize API supports the use of speech grammars as the language model.
SUPPORT FOR STREAMING

Real-time streaming input is supported over WebSockets, gRPC, and our SDK. We also integrate directly with Twilio Media Streams.

DEPLOY AT THE EDGE OR ON CLOUD

Access the APIs on our Cloud infrastructure or through a containerized deployment at the Edge.

AFFORDABLE PRICE AT GREAT ACCURACY

Our disruptive pricing enables mass adoption of speech. We have a free tier that can help you get started with your app immediately.

RESTFUL API

Voicegain provides standard RESTful APIs. Documentation is provided in OpenAPI 3.0 format and has been verified to work with code generation tools.

COMPREHENSIVE SET OF APIs

We provide a comprehensive set of APIs for a wide array of use cases. For example, we offer word-tree output for large vocabulary transcription and n-best results for grammar-based speech-to-text.

CUSTOM ACOUSTIC MODELS

Voicegain provides APIs and tools that let clients train the acoustic model to their specific needs. Training does not require time annotation in transcripts.

KEY FEATURES
 

Voicegain Telephony Bot APIs

Telephony Bot APIs (formerly called RTC Callback APIs) enable developers to build voice bots and speech IVRs that interface over SIP/RTP. The API is flexible and supports combining an open conversational experience with a directed dialog in voice bots. It is a callback-style JSON API that supports commands for prompt playback and speech recognition. It can easily integrate with NLU engines like RASA and Dialogflow for intent capture and with CPaaS platforms like Twilio and SignalWire for SIP/RTP.
SIMPLICITY OF INTEGRATION

Telephony Bot APIs are simple to integrate. You invite us to a single session for as long as you need to communicate with a caller.

SUPPORT BOTH NLU AND DIRECTED DIALOG

The API is easy to integrate with popular NLU engines like RASA and Dialogflow. We also offer easy options to specify directed dialog logic.

LARGE VOCABULARY, SPEECH GRAMMARS

We support both large vocabulary and speech grammars. You can use grammars to constrain the recognizer for higher accuracy and simpler intent capture.

INTEGRATE WITH YOUR CPAAS

The API integrates with CPaaS platforms like Twilio, SignalWire, and Amazon Voice Connector, or any other platform from which you can send a SIP INVITE.

CALLBACK APIs
 

The callback API makes requests to your app logic upon significant events. In response you may invoke commands to play a prompt, ask a question, or convert an utterance to text. 
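
The callback flow described above can be sketched as a function that receives an event and returns the next command. This is a minimal illustration only: the event names (`session_started`, `speech_recognized`), the command names (`ask_question`, `play_prompt`), and every JSON field are hypothetical placeholders, not the actual Telephony Bot API schema, which should be taken from the Voicegain documentation.

```python
import json


def handle_callback(event: dict) -> dict:
    """Map an incoming callback event to the next bot command (hypothetical schema)."""
    kind = event.get("event")

    if kind == "session_started":
        # Open the conversation by asking the caller a question.
        return {"command": "ask_question", "prompt": "How can I help you today?"}

    if kind == "speech_recognized":
        text = event.get("text", "").strip()
        if not text:
            # Nothing usable was recognized, so ask again.
            return {"command": "ask_question", "prompt": "Sorry, could you repeat that?"}
        # Echo the recognized utterance back to the caller.
        return {"command": "play_prompt", "prompt": f"You said: {text}"}

    # Unknown events produce no command in this sketch.
    return {}


# Example: the JSON body your app might send back for a recognized utterance.
response_body = json.dumps(handle_callback({"event": "speech_recognized", "text": "check my balance"}))
```

Because the handler is a plain function from one JSON-like dictionary to another, it can sit behind any web framework and is easy to unit test without placing a call.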

APP/BOT LOGIC IN ANY LANGUAGE

Developers can write app logic in the programming language of their choice, such as Python, Node.js, or Java.

KEY FEATURES
 

Voicegain Transcribe & Captioning

REAL-TIME TRANSCRIPTION
Voicegain Transcribe supports real-time transcription of audio that is streamed to the Voicegain platform using our web APIs. This may be used for captioning live-streamed events or for providing instant transcripts of lectures, talks, and conference presentations. Additionally, we provide convenient export to multiple formats, including several used for closed captioning.
OFFLINE TRANSCRIPTION
Voicegain Transcribe also offers transcription of recorded audio at high accuracy and low cost. We include a Web UI that makes it easy to review and correct the transcript in a minimum amount of time. We provide convenient methods both to ingest the audio and to distribute the generated transcript. Use it in your workflow, to provide transcription services, or just to keep notes.
KEY FEATURES
TIMESTAMP INFORMATION

Voicegain provides a timestamp for each word transcribed. Time data is retained when editing or correcting the transcript and can be included in exported files.

TRANSCRIPTION AT THE EDGE

The Voicegain platform offers control, data privacy, and security by providing transcription at the Edge (i.e. on the client’s infrastructure).

CUSTOM ACOUSTIC MODELS

Voicegain can train the acoustic model on utterances that reflect industry jargon, speaker styles, and accents.

CUSTOM VOCABULARY

Voicegain provides the ability to add new words (that may be specific to the client’s domain) to the language model.

SUPPORT FOR A VARIETY OF AUDIO INPUTS

We provide multiple ways to submit audio and receive text output that cover most common use cases.

PUNCTUATION

We add capitalization and punctuation automatically using deep learning, so that the output is more intelligible and can be used with minimal editing. 

 

Voicegain MRCP ASR

Voicegain Automated Speech Recognition (ASR) for IVR
SPEECH RECOGNITION FOR TELEPHONY BASED VOICEXML IVRs 
Voicegain MRCP ASR is our real-time speech recognition engine for VoiceXML-based IVR applications. It supports speech grammars in GRXML and JSGF and provides n-best results. We also provide results from our large vocabulary model for utterances that are out-of-grammar (OOG) or have low-confidence results - see...
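
One way an application could combine the two kinds of results is sketched below: take the top grammar match when its confidence is high enough, and otherwise fall back to the large vocabulary text. The result shape (`utterance` and `confidence` keys), the function name, and the 0.6 threshold are all assumptions chosen for illustration. They do not describe the actual MRCP result format.

```python
def pick_interpretation(grammar_nbest: list, large_vocab_text: str, min_confidence: float = 0.6) -> tuple:
    """Prefer a confident grammar match; otherwise use the large vocabulary transcript."""
    if grammar_nbest:
        # n-best lists are usually sorted, but take the max to be safe.
        best = max(grammar_nbest, key=lambda alt: alt["confidence"])
        if best["confidence"] >= min_confidence:
            return ("grammar", best["utterance"])
    # Out-of-grammar (empty n-best) or low-confidence match.
    return ("large_vocabulary", large_vocab_text)


# Hypothetical recognition results, for illustration only.
confident = [{"utterance": "billing", "confidence": 0.91}, {"utterance": "building", "confidence": 0.40}]
uncertain = [{"utterance": "billing", "confidence": 0.35}]

in_grammar = pick_interpretation(confident, "I want billing")
low_confidence = pick_interpretation(uncertain, "I have a question about my bill")
out_of_grammar = pick_interpretation([], "can I talk to a person")
```

The fallback text could then be passed to an NLU engine or logged for grammar tuning, rather than re-prompting the caller immediately.
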
Voicegain offers advanced IVR test and optimization tools
TEST AND OPTIMIZATION TOOLS
Voicegain ASR comes with easy-to-use tools to analyze, tune, test, and optimize speech grammars. We also have tools and APIs that allow developers to train a custom acoustic model. This can offer much better recognition performance for a client's domain. 
VOICEXML/MRCP SUPPORT

Voicegain ASR can be invoked from any VoiceXML IVR platform over MRCP. We are compatible with VXML platforms such as Avaya, Genesys, and Cisco.

DEPLOY AT THE EDGE OR ON CLOUD

The software is available on the Cloud or deployable On-Premise/at the Edge (i.e. on client’s infrastructure).

CUSTOM ACOUSTIC MODELS

Clients can train the acoustic model on utterances that reflect industry jargon, speaker styles, and accents.

TUNING & TESTING TOOLS

Voicegain provides tools for tuning and testing grammars. The same tools can be used to collect data for acoustic model training.

LICENSING BASED ON USAGE

Voicegain ASR is licensed based on usage. This helps clients avoid significant upfront capital outlays.

SPEECH GRAMMARS + LARGE VOCABULARY

We provide full support for grammar formats such as GRXML and JSGF. We include a library of built-in grammars and support large vocabulary models too.

KEY FEATURES