Announcing the launch of Voicegain Whisper ASR/Speech Recognition API for Gen AI developers

Today we are really excited to announce the launch of Voicegain Whisper, an optimized version of Open AI's Whisper Speech recognition/ASR model that runs on Voicegain managed cloud infrastructure and accessible using Voicegain APIs. Developers can use the same well-documented robust APIs and infrastructure that processes over 60 Million minutes of audio every month for leading enterprises like Samsung, Aetna and other innovative startups like Level.AI, Onvisource and DataOrb.

The Voicegain Whisper API is a robust and affordable batch Speech-to-Text API for developersa that are looking to integrate conversation transcripts with LLMs like GPT 3.5 and 4 (from Open AI) PaLM2 (from Google), Claude (from Anthropic), LLAMA 2 (Open Source from Meta), and their own private LLMs to power generative AI apps. Open AI open-sourced several versions of the Whisper models released. With today's release Voicegain supports Whisper-medium, Whisper-small and Whisper-base. Voicegain now supports transcription in over multiple languages that are supported by Whisper. 

Here is a link to our product page

There are four main reasons for developers to use Voicegain Whisper over other offerings:

1. Support for Private Cloud/On-Premise deployment (integrate with Private LLMs)

While developers can use Voicegain Whisper on our multi-tenant cloud offering, a big differentiator for Voicegain is our support for the Edge. The Voicegain platform has been architected and designed for single-tenant private cloud and datacenter deployment. In addition to the core deep-learning-based Speech-to-text model, our platform includes our REST API services, logging and monitoring systems, auto-scaling and offline task and queue management. Today the same APIs are enabling Voicegain to processes over 60 Million minutes a month. We can bring this practical real-world experience of running AI models at scale to our developer community.

Since the Voicegain platform is deployed on Kubernetes clusters, it is well suited for modern AI SaaS product companies and innovative enterprises that want to integrate with their private LLMs.

2. Affordable pricing - 40% less expensive than Open AI 

At Voicegain, we have optimized Whisper for higher throughput. As a result, we are able to offer access to the Whisper model at a price that is 40% lower than what Open AI offers.

3. Enhanced features for Contact Centers & Meetings.

Voicegain also offers critical features for contact centers and meetings. Our APIs support two-channel stereo audio - which is common in contact center recording systems. Word-level timestamps is another important feature that our API offers which is needed to map audio to text. There is another feature that we have for the Voicegain models - enhanced diarization models - which is a required feature for contact center and meeting use-cases - will soon be made available on Whisper.

4. Premium Support and uptime SLAs.

We also offer premium support and uptime SLAs for our multi-tenant cloud offering. These APIs today process over 60 millions minutes of audio every month for our enterprise and startup customers.

About OpenAI-Whisper Model

OpenAI Whisper is an open-source automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. The architecture of the model is based on encoder-decoder transformers system and has shown significant performance improvement compared to previous models because it has been trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection.

OpenAI Whisper model encoder-decoder transformer architecture


Getting Started with Voicegain Whisper

Learn more about Voicegain Whisper by clicking here. Any developer - whether a one person startup or a large enterprise - can access Voicegain Whisper model by signing up for a free developer account. We offer 15,000 mins of free credits when you sign up today.

There are two ways to test Voicegain Whisper. They are outlined here. If you would like more information or if you have any questions, please drop us an email

Voicegain: Voice AI Under Your Control

Voicegain: Build Voice AI apps with our Speech-to-Text and LLM-powered NLU APIs. Record & Transcribe meetings, contact center calls, videos, etc. Get LLM-powered Summary, Sentiment and more. Build Conversational Voice Bots that integrate with your On-prem or cloud CCaaS platform. Get started today.

See how Voicegain works — get a demo of Voicegain today.

Sign up for an app today
* No credit card required.


Interested in customizing the ASR or deploying Voicegain on your infrastructure?

Contact Us → 
Voicegain - Speech-to-Text
Under Your Control