that learns

Speech Recognition Platform & APIs for Transcription, Speech Analytics, Voice Bots, Telephony IVRs, and more

Free Tier available

Train Speech-to-Text Models: Acoustic Models, Language Models, Pronunciation Models.

Train custom models to maximize accuracy

  • Train acoustic model & configure language model using your datasets to maximize accuracy
  • Tune ASR to your audio characteristics - length, recording quality, background noise, accents, etc.
Speech-to-Text Transcription (Audio-to-Text Transcription)

Transcribe Speech and extract insights with NLU

  • Transcribe speech into text in real time and offline
  • Extract text analytics (keywords, topics, sentiment) from transcribed text
  • Integrate with Contact Center and Unified Communication systems - both cloud & on-premise
Call center agent headset for listening to phone caller speech

Build Voice Bots, Voice Assistants or Speech IVRs  

  • Build a voice bot, voice assistant or speech IVR using our Telephony Bot APIs
  • Supports both large vocabulary and speech grammars; integrates with 3rd Party NLU engines
  • Stream audio in real time using multiple protocols - WebSockets, gRPC, SIP/RTP, MRCP and WebRTC
Flexibility of Cloud or Edge Deployment

Deploy in your datacenter or use our Cloud  

  • Access Speech-to-text as a cloud service on our Google Cloud GPU infrastructure.
  • Install Voicegain as a containerized application on your GPU infrastructure - datacenter or private cloud.


Achieve high accuracy at an awesome price 

Voicegain is a speech-to-text engine with a great combination of accuracy and price.

Based on our benchmarks on public data (see blog post for data details), the out-of-the-box accuracy is just a few percent behind the Big 4 cloud speech recognizers.

We do this at an amazingly competitive price of $0.0095/minute.
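To make the per-minute price concrete, here is a minimal sketch of a monthly cost estimate. It assumes a flat rate of $0.0095 per minute with no volume discount; the function name `estimate_monthly_cost` and the `free_minutes` parameter (a stand-in for any free allowance on your account) are hypothetical and not part of any Voicegain API. Real-time session billing and Edge pricing are not modeled.

```python
from decimal import Decimal

# List price quoted on this page, in USD per minute of audio.
PRICE_PER_MINUTE = Decimal("0.0095")


def estimate_monthly_cost(minutes, free_minutes=0):
    """Estimate a monthly bill in USD at the flat per-minute rate.

    `free_minutes` is a hypothetical allowance subtracted before billing.
    Volume discounts and commitments are deliberately ignored.
    """
    if minutes < 0 or free_minutes < 0:
        raise ValueError("minutes and free_minutes must be non-negative")
    billable = max(minutes - free_minutes, 0)
    # Round to whole cents for display.
    return (PRICE_PER_MINUTE * billable).quantize(Decimal("0.01"))


print(estimate_monthly_cost(10_000))       # 95.00
print(estimate_monthly_cost(10_000, 600))  # 89.30
```

Under these assumptions, 10,000 minutes of transcription in a month comes to about $95 before any allowance or discount.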

Most importantly, we can build a custom acoustic model, which has been shown to gain about 10% more accuracy, all at the same low price (custom models require usage commitments).



Base and Trained accuracy of Voicegain ASR compared to Big 4



Voicegain Technology: Deep Neural Networks, Kubernetes Container Orchestration.

Voicegain’s Speech to Text engine utilizes multiple Deep Neural Network models running on modern GPUs to achieve high recognition accuracy. 


Voicegain can be deployed either as a containerized application on the Edge or accessed using our APIs on our modern cloud infrastructure. 


Voicegain Applications: Transcription (Speech-to-Text), Recognition (IVR ASR), Web API

Voicegain supports both large-vocabulary speech-to-text and recognition using context-free grammars, enabling applications like:

  • Speech analytics and transcription – both real-time and offline 

  • Voice Bots/Assistants that allow users to speak to the application

  • Live Agent Assist and Speech IVR in call centers or help desks 

  • Embedding Speech-to-Text into products


Voicegain Difference: Edge Deployment, Custom Models, Tuning Tools, MRCP and GRXML support
  1. Full featured APIs that cover most scenarios for your apps

  2. Telephony-ready APIs for Bots, Speech IVRs and Agent Assist

  3. Privacy and Control with Edge deployment 

  4. Custom Acoustic Model to improve accuracy 

  5. Stream audio over gRPC, SIP, WebRTC, WebSockets


Pricing is based on usage of platform resources.
Volume discounts are available above 1M mins/month. Edge pricing has minimum revenue commitments. 
Click here for more information.



Get started with the Voicegain platform today!

Free Tier Offer: For a limited time, new accounts on Voicegain receive 600 free minutes** of platform use each month. No credit card is required if you sign up today. Transcribe recordings of calls, meetings or podcasts using our UI. Use the microphone capture to test our real-time model. If you are a developer, use our APIs (described below) with your app.

Features available immediately after sign-up

  • Transcription of audio files using Web Console

  • Real-time Microphone Capture using Web Console 

  • Speech-to-Text APIs - supports Web & Telephony (SIP/RTP)

  • Telephony Bot APIs - Build telephony based voice bots

  • Speech Analytics APIs - AHT optimization in Contact centers 

  • MRCP ASR - Speech recognition for VoiceXML platforms 

Have questions? Visit the Free Tier FAQs on our support website.


If you are interested in Edge Deployment for Speech IVR, check out our $9999 offer.

** The 600 free minutes apply only to the Speech-to-Text price. For real-time streaming, we require a credit card for billing of the Session Time.

1505 LBJ Fwy, Ste 255

Dallas, TX 75234

Contact: 972-518-0863