that works

Speech Recognition Platform and APIs for Bots, IVR, Assistants, Speech Analytics, Transcription and more

Free Tier available

Build custom models trained on your voice data

  • Significantly enhance recognition accuracy by customizing the core acoustic DNN model.
  • Optimize speech-to-text to your context - audio quality, jargon, background noise, accents, etc.

Transcribe Speech and extract insights with NLU

  • Transcribe speech into text, both in real time and offline
  • Extract analytics (keywords, topics, sentiment) from calls, meetings, podcasts (both live and recorded) 
  • Integrate with Contact Center and Unified Communication systems - both cloud & on-premise

Build Voice Bots, Voice Assistants or Speech IVRs 

  • Build a voice bot, speech IVR, voice assistant or real-time Agent Assist using our APIs
  • Supports both large vocabulary and speech grammars; integrates with multiple NLU engines
  • Stream audio in real time over multiple protocols - WebSockets, gRPC, SIP/RTP, MRCP and WebRTC
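
As a rough illustration of the streaming bullet above: whatever the transport, real-time clients usually send audio in small fixed-duration frames. The sketch below is not Voicegain API code. The helper name `frame_audio` is hypothetical, and the sample rate, sample width and frame length are assumptions chosen only for illustration.

```python
from typing import Iterator


def frame_audio(
    pcm: bytes,
    sample_rate: int = 16000,   # assumed: 16 kHz audio
    sample_width: int = 2,      # assumed: 16-bit samples (2 bytes)
    channels: int = 1,          # assumed: mono
    frame_ms: int = 20,         # assumed: 20 ms per frame
) -> Iterator[bytes]:
    """Yield fixed-duration frames from raw PCM audio (hypothetical helper).

    The last frame may be shorter than the others if the audio length
    is not an exact multiple of the frame size.
    """
    samples_per_frame = sample_rate * frame_ms // 1000
    frame_bytes = samples_per_frame * sample_width * channels
    if frame_bytes <= 0:
        raise ValueError("frame size must be positive")
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]


if __name__ == "__main__":
    one_second_of_silence = bytes(16000 * 2)
    frames = list(frame_audio(one_second_of_silence))
    print(len(frames), len(frames[0]))
```

Each yielded frame would then be written to whichever streaming connection the application uses.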

Deploy on the Edge or access via Cloud  

  • Access speech-to-text as a cloud service on our Google Cloud GPU infrastructure.
  • Install Voicegain as a containerized application on your own GPU infrastructure - data center or private cloud.


Amazing accuracy at an awesome price 

Voicegain is a speech-to-text engine with a great combination of accuracy and price.

Based on an accuracy benchmark conducted in September 2020, Voicegain's accuracy is, on average, slightly better than that of Amazon Transcribe.


We do this at an amazingly competitive price of 1 cent/minute.

Most importantly, we can build a custom acoustic model that can perform even better.

You can read more about this dataset and our continuous improvements in accuracy in our blog post.



Voicegain’s Speech to Text engine utilizes multiple Deep Neural Network models running on modern GPUs to achieve high recognition accuracy. 


Voicegain can be deployed either as a containerized application on the Edge or accessed using our APIs on our modern cloud infrastructure. 



Voicegain supports both large-vocabulary speech-to-text and recognition using context-free grammars, enabling applications like:

  • Speech analytics and transcription – both real-time and offline 

  • Voice Bots/Assistants that allow users to speak to the application

  • Live Agent Assist and Speech IVR in call centers or help desks 

  • Embedding Speech-to-Text into products


  1. Full-featured APIs that cover most scenarios for your apps

  2. Enhanced RTC APIs for Bots, Speech IVRs and Agent Assist

  3. Privacy and Control with Edge deployment 

  4. Custom Acoustic Model to improve accuracy 

  5. Stream audio over gRPC, SIP, WebRTC, WebSockets


Pricing is based on usage of platform resources.
Volume discounts are available above 500K mins/month. Edge pricing has minimum revenue commitments. 
Click here for more information.


0.70 cents per minute
0.95 cents per minute
0.25 cents per minute
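
To get a feel for usage-based pricing, the sketch below estimates a monthly bill from the minutes used and a per-minute rate. The function name `estimate_monthly_cost` is hypothetical. The rates in the usage lines are the per-minute figures listed above. Subtracting a free-minute allowance before billing is an assumption made for simplicity, and volume discounts and Edge minimum commitments are not modelled.

```python
from decimal import Decimal


def estimate_monthly_cost(
    minutes: int,
    cents_per_minute: str,
    free_minutes: int = 0,
) -> Decimal:
    """Return an estimated monthly cost in dollars (hypothetical helper).

    cents_per_minute is passed as a string (e.g. "0.95") so that Decimal
    can represent it exactly instead of going through binary floats.
    """
    # Assumption: free minutes are simply deducted before billing.
    billable = max(minutes - free_minutes, 0)
    cents = Decimal(billable) * Decimal(cents_per_minute)
    # Convert cents to dollars and round to two decimal places.
    return (cents / 100).quantize(Decimal("0.01"))


if __name__ == "__main__":
    print(estimate_monthly_cost(10_000, "0.95"))
    print(estimate_monthly_cost(10_000, "0.70", free_minutes=600))
```

Passing the rate as a string keeps the arithmetic exact, which matters when rates are fractions of a cent.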

Build something awesome with the Voicegain platform today!

Free Tier Offer: For a limited time, new accounts on Voicegain cloud receive 600 free minutes* of platform use each month. No credit card is required if you sign up today. Use our application to transcribe your audio files, such as call recordings, meetings or podcasts. If you are a developer, use our APIs to build a Voice Bot or Assistant, integrate with NLU engines and SIP telephony, or build a call or meeting analytics app to analyze your audio.

Features available immediately after sign-up

  • Full set of RESTful Speech-to-Text (STT) APIs

  • Real-time streaming with gRPC, WebSockets and MRCP

  • Transcription of audio using Voicegain User Interface

  • Live Transcription with Broadcast via WebSockets

  • RTC Callback APIs for Telephony Apps (STT integrated with SIP/RTP sessions)

Early access Alpha features

  • Speech Analytics APIs - for contact centers

Have questions? Visit the Free Tier FAQs on our support website.


If you are interested in Edge Deployment for Speech IVR, check out our $9999 offer.

* The 600 free minutes apply only to the Speech-to-Text price. For real-time streaming, we require a credit card for billing of the Session Time.

1505 LBJ Fwy, Ste 255

Dallas, TX 75234

Contact: 972-518-0863

Contact Us