Our Blog

News, Insights, sample code & more!

ASR, Benchmark
Speech-to-Text Accuracy Benchmark - June 2022

It has been over 7 months since we published our last speech recognition accuracy benchmark. Back then, the results were as follows (from most to least accurate): Microsoft and Amazon (a close second), then Voicegain and Google Enhanced, and then, far behind, IBM Watson and Google Standard.

Since then we have obtained more training data and added additional features to our training process. This resulted in a further increase in the accuracy of our model.

As far as the other recognizers are concerned:

  • Microsoft and Amazon both improved, with Microsoft improving a lot on the more difficult files from the benchmark set
  • Google has released a new model, "latest-long", which is quite a bit better than Google's previous best model, Video Enhanced. The accuracy of Video Enhanced stayed pretty much unchanged.

We have decided to no longer report on Google Standard and IBM Watson, which have always been far behind the rest in accuracy.


Methodology

We repeated the test using a methodology similar to before: we used 44 files from the Jason Kincaid data set and 20 files published by rev.ai, and removed all files on which none of the recognizers could achieve a Word Error Rate (WER) lower than 25%.

This time only one file was that difficult: a bad-quality phone interview (Byron Smith Interview 111416 - YouTube).
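For readers who want to reproduce this kind of filtering, WER can be computed with the open-source jiwer Python package. The snippet below is a minimal sketch, not our benchmarking code; the text normalization shown is an assumption, since the exact normalization used in the benchmark is not described in this post.

# Minimal WER sketch using the jiwer package (pip install jiwer).
# The normalization below is illustrative; the benchmark's exact
# normalization rules are not specified in this post.
import jiwer

normalize = jiwer.Compose([
    jiwer.ToLowerCase(),
    jiwer.RemovePunctuation(),
    jiwer.RemoveMultipleSpaces(),
    jiwer.Strip(),
])

def wer_percent(reference: str, hypothesis: str) -> float:
    """Word Error Rate as a percentage."""
    return 100.0 * jiwer.wer(normalize(reference), normalize(hypothesis))

# A file stays in the benchmark only if at least one recognizer gets below 25% WER.
reference = "hello world this is a test"
hypotheses = {"engine_a": "hello word this is test", "engine_b": "hello world this is a test"}
best = min(wer_percent(reference, h) for h in hypotheses.values())
keep_file = best < 25.0
print(f"best WER = {best:.2f}%, keep = {keep_file}")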

The Results

You can see box plots with the results above. The chart also reports the average and median Word Error Rate (WER).

All of the recognizers have improved (the Google Video Enhanced model stayed much the same, but Google now has a new recognizer that is better).

Google latest-long, Voicegain, and Amazon are now very close together, while Microsoft is better by about 1%.

Best Recognizer

Let's look at the number of files on which each recognizer was the best one.

  • Microsoft was best on 35 out of the 63 files
  • Amazon was best on 15 files (note that in the October 2021 benchmark Amazon was best on 29 files).
  • Voicegain was close behind Amazon, being best on 12 audio files
  • Google latest-long was best on 4 files
  • Google Video Enhanced wins a participation trophy by being best on 1 file - a very easy one, "The Art of War by Sun Tzu Full" LibriVox audiobook - with a WER of 1.79%

Note: the numbers do not add up to 63 because there were a few files on which two recognizers had identical results (to two decimal places).
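The tally above, ties included, can be produced from a simple table of per-file WER values. Below is a minimal sketch under the assumption that the results are held in a dict keyed by file name; the engine names and numbers are placeholders, not benchmark data.

# Sketch of the "best recognizer per file" tally, counting ties.
# `results` maps file name -> {engine -> WER in %}; the values are placeholders.
from collections import Counter

results = {
    "file_01.wav": {"microsoft": 8.12, "amazon": 9.40, "voicegain": 9.40, "google_latest_long": 10.31},
    "file_02.wav": {"microsoft": 12.50, "amazon": 11.93, "voicegain": 13.08, "google_latest_long": 14.02},
    # ... one entry per benchmark file
}

best_counts = Counter()
for scores in results.values():
    best = min(round(w, 2) for w in scores.values())
    for engine, w in scores.items():
        # Ties to two decimal places are credited to every tied engine,
        # which is why the per-engine counts can sum to more than the file count.
        if round(w, 2) == best:
            best_counts[engine] += 1

print(best_counts.most_common())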

Improvements over time

We have now run the same benchmark 4 times, so we can draw charts showing how each of the recognizers has improved over the last 1 year and 9 months. (Note: for Google, the latest result is from the latest-long model; the earlier Google results are from Video Enhanced.)

You can clearly see that Voicegain and Amazon started quite a bit behind Google and Microsoft but have since caught up.

Google seems to have the longest development cycles, with very little improvement from September 2021 until very recently. Microsoft, on the other hand, releases an improved recognizer every 6 months. Our improved releases are even more frequent than that.

As you can see, the field is very close and you get different results on different files (the average and median do not paint the whole picture). As always, we invite you to review our apps, sign up, and test our accuracy with your data.

Out-of-the-box accuracy is not everything

When you have to select speech recognition/ASR software, there are other factors beyond out-of-the-box recognition accuracy. These factors are, for example:

  • Ability to customize the Acoustic Model - the Voicegain model may be trained on your audio data; we have several blog posts describing both research and real use-case model customization. The improvement can vary from several percent in more generic cases to over 50% in some specific cases, in particular for voicebots.
  • Ease of integration - many Speech-to-Text providers offer limited APIs, especially for developers building applications that require interfacing with telephony or on-premise contact center platforms.
  • Price - Voicegain is 60%-75% less expensive compared to other Speech-to-Text/ASR software providers while offering almost comparable accuracy. This makes it affordable to transcribe and analyze speech in large volumes.
  • Support for On-Premise/Edge Deployment - The cloud Speech-to-Text service providers offer limited support to deploy their speech-to-text software in client data-centers or on the private clouds of other providers. On the other hand, Voicegain can be installed on any Kubernetes cluster - whether managed by a large cloud provider or by the client.

Take Voicegain for a test drive!

1. Click here for instructions to access our live demo site.

2. If you are building a cool voice app and you are looking to test our APIs, click here to sign up for a developer account and receive $50 in free credits.

3. If you want to take Voicegain as your own AI Transcription Assistant to meetings, click here.

Benchmark
Speech-to-Text Accuracy Benchmark - October 2021

[UPDATE 1/23/22: After training on additional data, the Voicegain recognizer now achieves an average WER of 11.89% (an improvement of 0.35%) and a median WER of 10.82% (an improvement of 0.21%) on this benchmark.

Voicegain is now better than Google Enhanced on 44 files (previously 39).

Voicegain is now the most accurate recognizer on 12 of the files (previously 10).

We have additional data on which we will be training soon and will then provide a complete new set of results and comparison.]

It has been over 4 months since we published our last speech recognition accuracy benchmark. Back then, the results were as follows (from most to least accurate): Amazon and Microsoft (a close second), then Google Enhanced and Voicegain (a close fourth), and then, far behind, IBM Watson and Google Standard.

Since then we have tweaked the architecture of our model and trained it on more data. This resulted in a further increase in the accuracy of our model. As far as the other recognizers are concerned, Microsoft improved the accuracy of its model the most, while the accuracy of the others stayed more or less the same.

Methodology

We repeated the test using a methodology similar to before: we used 44 files from the Jason Kincaid data set and 20 files published by rev.ai, and removed all files on which the best recognizer could not achieve a Word Error Rate (WER) lower than 25%. Note: previously we used 20% as the threshold, but this time we decided to keep more low-accuracy files to illustrate the differences between recognizers on that type of file.

Only three files were so difficult that none of the recognizers could achieve 25% WER. The removed files were radio phone interviews with bad recording quality.

Voicegain now better than Google Enhanced

As you can see in the results chart above, Voicegain is now better than Google Enhanced, both on average and on median WER. Looking at the individual files, the results also show that Voicegain's accuracy is better than Google's in most cases:

  • Voicegain was better than Google Enhanced on 39 files
  • Google Enhanced was better on 20 files
  • They were tied on 2 files.

Other results

Key observations about other results:

  • When you consider the average and median WER, Voicegain looks tied with Amazon: Voicegain's median is better by 0.07%, but its average is worse by 0.76% (a small summary sketch follows this list)
  • When you consider the average and median WER, the Microsoft recognizer is better than Amazon, with the average better by 0.49% and the median better by 0.69%
  • When you look at the individual audio files, the best-scoring recognizers were:
    • Amazon - best on 29 files
    • Microsoft - best on 20 files
    • Voicegain - best on 10 files
    • Google Enhanced - best on 2 files
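The averages and medians quoted above can be computed directly from the per-file WER table. Here is a minimal sketch assuming the same dict-of-dicts layout as in the other benchmark post; the engine names and values are placeholders, not benchmark data.

# Sketch of the per-engine average / median WER summary.
# `results` maps file name -> {engine -> WER in %}; values are placeholders.
from statistics import mean, median

results = {
    "file_01.wav": {"amazon": 10.2, "microsoft": 9.7, "voicegain": 10.9, "google_enhanced": 12.4},
    "file_02.wav": {"amazon": 15.8, "microsoft": 14.9, "voicegain": 16.1, "google_enhanced": 17.3},
    # ... one entry per benchmark file
}

engines = sorted({engine for scores in results.values() for engine in scores})
for engine in engines:
    wers = [scores[engine] for scores in results.values()]
    print(f"{engine:16s} avg={mean(wers):6.2f}%  median={median(wers):6.2f}%")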

As you can see, the field is very close and you get different results on different files (the average and median do not paint the whole picture). As always, we invite you to review our apps, sign up, and test our accuracy with your data.

Out-of-the-box accuracy is not everything

When you have to select speech recognition/ASR software, there are other factors beyond out-of-the-box recognition accuracy. These factors are, for example:

  • Ability to customize the Acoustic Model - the Voicegain model may be trained on your audio data; we have demonstrated improvements in accuracy of 7-10%. In fact, for one of our customers with adequate training data and good-quality audio, we were able to achieve a WER of 0.5% (99.5% accuracy).
  • Ease of integration - many Speech-to-Text providers offer limited APIs, especially for developers building applications that require interfacing with telephony or on-premise contact center platforms.
  • Price - Voicegain is 60%-75% less expensive compared to other Speech-to-Text/ASR software providers while offering almost comparable accuracy. This makes it affordable to transcribe and analyze speech in large volumes.
  • Support for On-Premise/Edge Deployment - The cloud Speech-to-Text service providers offer limited support to deploy their speech-to-text software in client data-centers or on the private clouds of other providers. On the other hand, Voicegain can be installed on any Kubernetes cluster - whether managed by a large cloud provider or by the client.

Take Voicegain for a test drive!

1. Click here for instructions to access our live demo site.

2. If you are building a cool voice app and you are looking to test our APIs, click here to sign up for a developer account and receive $50 in free credits.

3. If you want to take Voicegain as your own AI Transcription Assistant to meetings, click here.

Voice Bot
How to build a Voicebot using Voicegain, Twilio, RASA, and AWS Lambda

You can find the complete code (minus the RASA logic - you will have to supply your own) at our github repository.

What does it do?

The setup allows you to call a phone number and then interact with a Voicebot that uses RASA as the dialog logic engine.

How does it work?

The Components

  • Twilio Programmable Voice - We configure a Twilio phone number to point to a TwiML App that has the AWS Lambda function as the callback URL.
  • AWS Lambda function - a single Node.js function with an API Gateway trigger (simple HTTP API type).
  • Voicegain STT API - we use the /asr/transcribe/async API with input via a websocket stream and output via a callback. The callback goes to the same AWS Lambda function; note that the Voicegain callback is a POST, while the Twilio callback is a GET.
  • RASA - dialog logic is provided by a RASA NLU dialog server, which is accessible over the RestInput API.
  • AWS S3 for storing the transcription results at each dialog turn.

November 2021 Update: We do not recommend S3 and AWS Lambda for a production setup. A more up-to-date review of the various options for building a Voice Bot is described here. You should consider replacing the functionality of S3 and AWS Lambda with a web server that is able to maintain state, e.g. Node.js or Python Flask.

The Steps

The sequence diagram is provided below. Basically, the sequence of operations is as follows:

  1. Call a Twilio phone number
  2. Twilio makes an initial callback to the Lambda function
  3. Lambda function sends "Hi" to RASA, and RASA responds with the initial dialog prompt
  4. Lambda function calls Voicegain to start an async transcription session. Voicegain responds with the URL of a websocket for audio streaming
  5. Lambda function responds to Twilio with a TwiML command <Connect><Stream> to open a Media Stream to Voicegain. The command also contains the text of the question prompt (see the sketch after these steps)
  6. Voicegain uses TTS to generate an audio prompt from the text of the RASA question and streams it via the websocket to Twilio for playback
  7. The Caller hears the prompt and says something in response
  8. Twilio streams the caller audio to the Voicegain ASR for speech recognition
  9. Voicegain ASR transcribes the speech to text and makes a callback with the result of the transcription to the Lambda function
  10. Lambda function stores the transcription result in S3
  11. Voicegain closes the websocket session with Twilio
  12. Twilio notices the end of the session with the ASR and makes a callback to the Lambda function to find out what to do next
  13. Lambda function retrieves the result of recognition from S3 and passes it to RASA
  14. RASA processes the answer and generates the next question in the dialogue
  15. The next turn continues in the same way as step 4
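To make steps 2-5 concrete, below is a minimal sketch of the Lambda handler for the initial Twilio callback. It is not the code from our repository: ask_rasa() and start_voicegain_session() are hypothetical helpers, and passing the prompt text as a <Parameter> named "prompt" on the <Stream> is an assumption made for illustration.

# Sketch of the initial Twilio callback handler (steps 2-5); not production code.
# ask_rasa() and start_voicegain_session() are hypothetical helpers, and the
# "prompt" <Parameter> name is an assumption.
from xml.sax.saxutils import escape

def ask_rasa(sender_id: str, text: str) -> str:
    """Hypothetical helper: send text to the RASA REST channel, return the reply text."""
    raise NotImplementedError

def start_voicegain_session(call_sid: str) -> str:
    """Hypothetical wrapper around /asr/transcribe/async; returns the audio websocket URL."""
    raise NotImplementedError

def lambda_handler(event, context):
    # Twilio calls this endpoint with GET, so parameters arrive in the query string.
    params = event.get("queryStringParameters") or {}
    call_sid = params.get("CallSid", "unknown-call")

    prompt = ask_rasa(call_sid, "Hi")            # step 3
    ws_url = start_voicegain_session(call_sid)   # step 4

    # Step 5: TwiML that opens a Media Stream to Voicegain and carries the prompt text.
    safe_prompt = escape(prompt, {'"': "&quot;"})
    twiml = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<Response><Connect>"
        f'<Stream url="{ws_url}">'
        f'<Parameter name="prompt" value="{safe_prompt}"/>'
        "</Stream></Connect></Response>"
    )
    return {"statusCode": 200,
            "headers": {"Content-Type": "text/xml"},
            "body": twiml}

Later callbacks (steps 12-13) follow the same pattern: read the stored transcript from S3, call RASA, and return the next TwiML response.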



Voice Bot
Building Voice-bots using Voicegain Speech-to-Text/ASR

The purpose of this blog post is to further elaborate on other posts in which we described various ways you can build a Voice Bot using Voicegain ASR/Speech-to-Text. We also plan to announce a new feature that will soon make Voice Bot development even easier.

Just a quick recap - what is a Voice Bot? A Voice Bot allows users to speak freely and naturally in response to questions asked by the Bot. It can extract multiple "intents" from what a customer says and can respond intelligently. By implementing Voice Bots, companies can retire their legacy IVRs and also use a unified bot platform to power both chatbots and Voice Bots.

It is important to note that Voicegain ASR/Speech-to-Text only provides the "mouth" and the "ear" of the Voice Bot. For building the bot logic and all the back-end integrations (i.e., the brains), a developer has to select a bot framework like Google Dialogflow, RASA, Kore.ai, Microsoft Azure Bot Service, or AWS Lex.

So here are ways you can build a Voice Bot.

1. Using Twilio and Voicegain Telephony Bot API

This method is described in the blog post: How to build a Voicebot using Voicegain, Twilio, RASA, and AWS Lambda

The important thing to note is that the described setup, using AWS Lambda and S3 to handle the callbacks, is for demo purposes only and is not ideal for a production deployment. The callback server has to be able to handle callbacks from both Twilio and Voicegain and pass information between the two. Because AWS Lambda is stateless, in this example the information is passed via S3, which makes the end-to-end process slow because of the need for polling. That will not provide a fast response time for your Voice Bot.

For a production-ready setup we suggest replacing AWS Lambda and S3 with a proper web server that is able to maintain session state - you could use Node.js or Python Flask for that.
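As a concrete illustration of that suggestion, here is a minimal Flask sketch that keeps per-call state in memory instead of in S3. The endpoint paths and payload fields are assumptions made for illustration; they are not the actual Twilio or Voicegain callback formats.

# Minimal sketch of a stateful callback web server replacing AWS Lambda + S3.
# Endpoint paths and payload fields are illustrative assumptions only.
from flask import Flask, request, jsonify

app = Flask(__name__)

# In-memory per-call session store. Use Redis or a database if you run
# more than one server process.
sessions: dict[str, dict] = {}

@app.route("/telephony-callback", methods=["POST"])
def telephony_callback():
    """Handle a callback from the telephony side and remember the call."""
    call_id = request.values.get("call_id", "unknown")
    sessions.setdefault(call_id, {"turns": []})
    # ... decide on the next prompt/action here and return it ...
    return jsonify({"status": "ok"})

@app.route("/asr-callback", methods=["POST"])
def asr_callback():
    """Receive a transcription result and keep it in the session state."""
    body = request.get_json(force=True)
    call_id = body.get("call_id", "unknown")
    text = body.get("transcript", "")
    sessions.setdefault(call_id, {"turns": []})["turns"].append(text)
    # ... forward `text` to the bot platform (e.g. RASA) and act on the reply ...
    return "", 204

if __name__ == "__main__":
    app.run(port=8080)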

2. Using inbuilt CPaaS of Voicegain Telephony Bot API

This method is described in the blog post: Easy How-To: Build a Voicebot using Voicegain, RASA, and AWS Lambda

This is easier than the method described above. The Voicegain Telephony Bot API uses the Amazon Chime CPaaS to provide the functionality otherwise provided by Twilio, and this is internally integrated with the Voicegain STT API. It uses callbacks, so it needs an intermediate web service to handle the interaction with a bot platform, e.g. RASA. This web service may be stateless because the Telephony Bot API is capable of maintaining state information.

The example described in the above blog post uses SIP trunks and phone numbers provided by Amazon Chime, which is embedded as part of the Voicegain Telephony Bot API. If you would rather retain your CPaaS/telephony provider (e.g. SignalWire, Twilio, Telnyx, or Bandwidth.com), you can do that and connect to the Telephony Bot API using SIP INVITE. This is described in the blog post: SIP INVITE Voicegain from Twilio, SignalWire, Telnyx CPaaS.

3. Using AudioCodes Voice AI Connect

This method is described in the blog post: Voicegain announces integration with AudioCodes VoiceAI Connect.

AudioCodes VoiceAI Connect (VAIC) enables enterprises to connect a bot framework and speech services, such as text-to-speech (TTS) and speech-to-text (STT), to the enterprises’ voice and telephony channels to power Voice Bots, conversational IVRs and Agent Assist use-cases.

AudioCodes provides native integration with Bot Frameworks like Kore.ai, Google Dialogflow and Microsoft Bot Framework.

4. Directly connect from Telephone Bot API to your Bot Platform

This setup allows you to directly specify a Voice Bot endpoint instead of specifying a generic HTTP callback destination. The benefit is that you do not have to provide the callback web service yourself. Note that in this setup any back-end requests from your application logic, e.g. to data services, will need to be made from the bot platform.

The bot platforms that we already support are RASA and Google Dialogflow. We are currently working on integrating with Microsoft Bot Framework, and we hope to have that integration finished in time for the first release of the Voicegain-Bot Platform integration. We also plan to start work on a Kore.ai integration very soon.


Edge
4 ways to integrate FreeSWITCH with Voicegain Speech-to-Text

FreeSWITCH is a very capable telephony platform suitable for building various telephony applications. Some of those applications will rely on speech-to-text conversion, for example: ACDs (automatic call distribution), IVRs, Voice Bots, Real-Time Agent Assist, real-time conference call transcription, call monitoring, etc.

Voicegain Speech-to-Text platform can be used with FreeSWITCH in a variety of ways.

1. mod_unimrcp for IVRs

Voicegain STT platform has supported MRCP (Media Resource Control Protocol) for a long time now. Our ASR can be accessed using MRCP and we support both grammar-based recognition (e.g. GRXML) and large-vocabulary transcription. MRCP is a communication protocol designed to connect telephony based IVRs and Voice Bots with speech recognizers (ASR) and speech synthesizers (TTS).

FreeSWITCH can interact with MRCP-based recognizers using the included mod_unimrcp module. Voicegain STT has been tested with mod_unimrcp and interfaces with it without problems. You can learn more about using Voicegain STT via mod_unimrcp in this blog post.

Voicegain supports MRCP both in the Cloud and on the Edge (on-prem). We will soon release, as open source, a recognizer plugin for the UniMRCP server that will give you even more options for deploying FreeSWITCH with Voicegain and MRCP.

2. Bridge into Voicegain Telephony Bot API

Voicegain provides a Telephony Bot API, which is a callback API similar in style to Twilio TwiML. You can place a call to a Voicegain endpoint either using a phone number obtained from Voicegain or using a SIP endpoint unique to your Voicegain application. When a call arrives you will get a web callback, and the response you provide will determine the actions that the Voicegain platform performs, e.g. play a prompt, recognize speech, detect DTMF, etc.

You can learn more about this API from the following blog posts:

If you have a FreeSWITCH application and you would like to recognize what callers say, you can bridge into the Voicegain SIP endpoint and, in a callback, specify a prompt and the type of speech capture (grammar-based or large vocabulary). Once the recognition finishes you will get a callback, and you can then either issue a disconnect command, which will transfer call flow back to your FreeSWITCH app, or continue with additional questions and recognitions on the Voicegain platform as needed.

Below is an example of a simple interaction with 4 participants:

  • FreeSWITCH
  • Your control logic for FS application, e.g., a Lua script
  • A web service that will handle callbacks from the Voicegain Telephony Bot API. It has to be able to maintain session data.
  • Voicegain Telephony Bot API platform



3. mod_voicegain for using Voicegain ASR from FS apps/scripts

This is still not Generally Available - please contact us if you are interested in testing.

mod_voicegain gives you capabilities similar to using mod_unimrcp with Voicegain, but without the overhead of the MRCP protocol - mod_voicegain talks directly to the Voicegain ASR.

mod_voicegain taps into the FreeSWITCH inbound audio stream and sends the audio data to Voicegain ASR in the Cloud or on the Edge. Voicegain ASR processes the audio according to the invocation parameters specified in the data argument. It then communicates the result of transcription or recognition in an Event.

mod_voicegain installs on FreeSWITCH as an app and can be invoked as such, e.g. from the dialplan:

<action application="vg_asr_start" data=""/>

or from a Lua script:

session:execute("vg_asr_start", "");

Results will always be returned as a FreeSWITCH event, but it is also possible to get the results in a callback to the URL specified in callback.uri.

The FreeSWITCH event will be of custom type (Event-Name: CUSTOM) and the Event-Subclass will be "voicegain_asr_update". The relevant payload will be in the "ASR-Response" field, formatted as JSON.
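One way to consume these events outside the dialplan is over the FreeSWITCH Event Socket. The sketch below uses the standard Python ESL bindings; the connection details are the FreeSWITCH defaults, and reading "ASR-Response" as an event header plus the structure of the JSON inside it are assumptions made for illustration.

# Sketch: listen for mod_voicegain result events over the FreeSWITCH Event Socket.
# Connection details are the FreeSWITCH defaults; the structure of the JSON in
# ASR-Response is not specified here and depends on the ASR session settings.
import json
import ESL  # Python bindings shipped with FreeSWITCH

con = ESL.ESLconnection("127.0.0.1", "8021", "ClueCon")
if not con.connected():
    raise SystemExit("could not connect to the FreeSWITCH event socket")

# Subscribe only to the custom subclass emitted by mod_voicegain.
con.events("plain", "CUSTOM voicegain_asr_update")

while True:
    event = con.recvEvent()
    if event is None:
        continue
    raw = event.getHeader("ASR-Response")
    if raw:
        result = json.loads(raw)
        print("ASR result:", result)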

You can read more about mod_voicegain in this Knowledge Base article.


4. mod_vg_tap for real-time transcription

mod_vg_tap has been developed with applications like Real-Time Agent Assist in mind. These apps need access to the audio stream from a FreeSWITCH call but do not otherwise need to interact with FreeSWITCH (unlike IVR and Voice-Bots).

mod_vg_tap installs as an app and has simple commands to start/stop streaming to Voicegain Speech-to-Text engine.

The start command can specify the following destinations:

  • websocket URL(s) - returned from a POST command that starts new speech-to-text session
  • socket IP:port for socket communication - this is only supported for Voicegain deployed on Edge (on-prem)
  • (on the roadmap) - complete JSON body to start a new speech-to-text session and start streaming to it

The results of transcription are generally not returned to the FreeSWITCH app but are delivered to the destination specified when starting the speech-to-text session - the results can be delivered via websocket, polling, or callback.

If you want more information about any of these methods of integrating Voicegain with FreeSWITCH, please email us at support@voicegain.ai.


Announcement
Onvisource partners with Voicegain for ASR on the Edge powered by NVIDIA GPUs

Dallas, Texas - October 26, 2021: OnviSource, a leading provider of intelligent automation solutions for workforce optimization, contact center operation analytics and automation, customer experience management, and business process automation, announced today a strategic partnership with Voicegain, an innovative Speech-to-Text/ASR company. OnviSource has integrated Voicegain’s deep learning-based speech-to-text platform into its Intellecta™ multichannel analytics solution which utilizes speech-to-text and natural language understanding to analyze customer interactions and audio-based content to discover actionable knowledge and extract business insights.

OnviSource will leverage the Voicegain platform to serve its growing enterprise client base from various industries such as nationwide wireless service providers, banking, financial services, utilities, insurance and others.

“We are pleased to announce this partnership with Voicegain as their AI-driven ASR further augments our AI-driven intelligent automation solutions and our hyper-automation platform that offers integrated AI, conversational AI, RPA, BPA and analytics,” said Ray Naeini, Chairman and CEO of OnviSource. “Our partnership will allow both companies to jointly develop highly sophisticated and customized AI models for various applications and industries in order to deliver unmatched accuracy and performance.”

To achieve high performance, OnviSource deployed the Voicegain ASR Engine on servers with NVIDIA GPUs in its data center. This architecture is referred to as an Edge deployment. While Voicegain also offers a multi-tenant cloud solution, an Edge deployment architecture has two important benefits for OnviSource.

The first major benefit is that it allows OnviSource to meet strict customer contractual commitments related to data privacy, security and control. The second benefit is that it delivers approximately a 75% reduction in costs for OnviSource compared to usage-based pricing models provided by other providers, empowering OnviSource to offer its feature-rich solutions at highly affordable and flexible prices.

“We are excited to be selected by OnviSource for its call center and enterprise speech analytics products. This decision validates the ‘3As’ on which Voicegain differentiates itself in the ASR market – Accuracy, Affordability and Accessibility,” said Arun Santhebennur, Co-founder & CEO of Voicegain. “Our joint product enhancements will deliver highly accurate Speech-to-Text models for complex business applications.”

Selection of the Voicegain product by OnviSource was based on comprehensive trials and pilot programs related to accuracy, performance and applicability of Voicegain’s product, combined with detailed comparative analysis with other products in the market.

Additionally, the Voicegain product offers simplicity in deployment and usage as the entire platform is deployed on a Kubernetes cluster. Its Edge deployment offers a simple script to download and deploy all the packages and dependencies on any server with NVIDIA GPUs.

About OnviSource

For more than a decade, OnviSource has enabled several hundred small-to-large companies across a broad range of industries to cost-effectively manage, automate and improve their customer experience and business processes by offering advanced solutions in multichannel data and media capture, unification, analysis, decision making and automation for their entire enterprise, including their contact centers, back offices and IT organizations.

OnviSource ia.Enterprise Intelligently Automated (IA) solutions offer Workforce Optimization and Workforce Management (WFO/WFM), inclusive Teleservice Customer Engagement Management, Multichannel Customer Engagement Analytics, intelligently automated Customer Survey, Process Automation through Robotic Process Automation (RPA) and Intelligent Process Automation (IPA) and Intelligent Virtual Agent (IVA). The Company delivers its solutions as software products, cloud or Software-as-a-Service (SaaS), managed services, or any combination. OnviSource’s special Advantage Platinum program assures that solutions work for customers’ specific needs by offering a series of customer assistance programs with no obligations. These programs include consultation, proof-of-concept  and hands-on operation assistance. OnviSource is headquartered in Plano, Texas (North Dallas area), with an additional operation center in Oklahoma.

About Voicegain

Voicegain is a deep neural network-based Speech-to-Text platform that is focused on developers of voice applications. Voicegain offers a full suite of APIs, SDKs and SaaS apps on top of its platform to automate and analyze voice-based interactions in contact centers, sales and meetings. To learn more, visit Voicegain.ai or create a free account to get started.

Press Contact:
Voicegain: Arun Santhebennur, CEO

arun@voicegain.ai


OnviSource: Deborah Cromwell, Marketing Manager

deborah.cromwell@onvisource.com

Voice Bot
Easy How-To: Build a Voicebot using Voicegain, RASA, and AWS Lambda

One of our previous blog posts described a Voice Bot built using Twilio, Voicegain, RASA, and AWS Lambda. Twilio was used for telephony (phone numbers, SIP trunking, TwiML for call control), Voicegain provided the ASR/speech recognition, and AWS Lambda coordinated the actions. The setup works, but it is involved. The need to pass the speech recognition results via S3 (as Lambda is stateless and does not keep memory between function calls) may occasionally cause delays in requests and responses.

Simple Inbuilt CPaaS Option

Voicegain now integrates with Amazon Chime Voice Connector to offer a pay-as-you-go SIP trunking service directly from the Voicegain web console. You can also purchase phone numbers and receive inbound calls. Support for making outbound Speech IVR calls is in the works.

Of course, we continue to support developers that use Twilio and SignalWire via a simple SIP INVITE - this blog post describes how.

How does it work?

The Components
  • AWS Lambda function - a single Node.js function with an API Gateway trigger (simple HTTP API type).
  • Voicegain Telephony Bot API - the Telephony Bot API works with web callbacks. For Twilio and SignalWire developers, it is similar to working with Twilio TwiML and SignalWire LaML respectively.
  • RASA - dialog logic is provided by RASA NLU Dialog server which is accessible over RestInput API.
The Steps

The sequence diagram is provided below. It is very simple. Basically, the sequence of operations is as follows:

  1. Call a phone number provided by Voicegain (powered by Amazon Chime Voice Connector)
  2. Voicegain Telephony Bot API makes a call to a callback function on AWS Lambda
  3. Lambda function sends "Hi" to RASA, and RASA responds with the initial dialog prompt (see the sketch after these steps)
  4. Lambda function responds to the Voicegain callback with the prompt received from RASA and tells Voicegain Speech-to-Text to capture the caller's response
  5. Voicegain uses TTS to generate an audio prompt from the text of the RASA question and plays it over the telephone to the caller
  6. The Caller hears the prompt and says something in response
  7. Voicegain ASR transcribes the speech to text and makes a callback with the result of the transcription to the Lambda function
  8. Lambda function invokes RASA and passes to it the text of the response
  9. RASA processes the answer and generates the next question in the dialogue
  10. The next turn continues in the same way as step 4
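Steps 3 and 8 boil down to a single HTTP call to RASA's REST ("RestInput") channel. Below is a minimal sketch; the RASA server address is an assumption made for illustration.

# Sketch of the RASA RestInput call made by the Lambda function (steps 3 and 8).
# The RASA server URL is an assumption made for illustration.
import json
import urllib.request

RASA_URL = "http://rasa.example.com:5005/webhooks/rest/webhook"

def ask_rasa(session_id: str, text: str) -> list[str]:
    """Send one utterance to RASA and return the bot's reply texts."""
    payload = json.dumps({"sender": session_id, "message": text}).encode()
    req = urllib.request.Request(RASA_URL, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return [m.get("text", "") for m in json.loads(resp.read())]

# First turn: "Hi" kicks off the dialog (step 3). Later turns pass the
# transcription text received in the Voicegain callback (step 8).
print(ask_rasa("call-1234", "Hi"))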

The sample code for the Lambda function (in python and node.js versions) is available on our github.



Take Voicegain for a test drive!

1. Click here for instructions to access our live demo site.

2. If you are building a cool voice app and you are looking to test our APIs, click here to sign up for a developer account and receive $50 in free credits.


Contact Us →