This article describes how users on free Zoom plans can get AI-generated meeting transcripts, summaries, and action items.
There are many compelling generative AI powered SaaS offerings for transcription, summarization, and action item extraction for Zoom Meetings. These include companies like Otter, Grain, Read, Fireflies, Krisp, Superhuman and others. However, all these cloud-based Meeting AI SaaS solutions require paid Zoom accounts. This is because they integrate with Zoom Cloud Recording, which is a feature of the paid Zoom plans.
Paid Zoom plans are quite affordable: the Zoom Pro plan (as of the date of this post) is priced at $16/month. However, many businesses, whether a small startup, a mid-size business or an enterprise customer, use free Zoom plans for the vast majority of their employees. Based on conversations with prospective customers, we estimate that in many businesses only 5 to 10% of the employee base has a paid Zoom plan.
Meetings on a free Zoom plan can last up to 40 minutes, which is adequate for most meetings. Hence the free plan works quite well for a large segment of users. But if these meetings need to be transcribed and summarized with a cloud-based Meeting AI solution, users would need to upgrade to a paid plan. For many businesses, since 90%+ of users are on free Zoom plans, upgrading all of them to a paid plan can be a very significant expense.
Voicegain Transcribe is an AI meeting assistant that integrates with Zoom Local Recording. Zoom Local Recording allows users to save the Zoom recording to their local computer instead of Zoom’s Cloud. A big advantage of Zoom Local Recording is that it is available on free Zoom plans, so there is no need to upgrade to a paid Zoom license. Voicegain Transcribe also has a free tier that is good for 5 hours (300 minutes) every month. As a result, users who host or attend up to 10 half-hour Zoom meetings a month can get transcription and LLM-powered insights like summarization and action item extraction for free.
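To make the free-tier arithmetic concrete, here is a minimal sketch in Python. The function name and constants are illustrative only and are not part of any Voicegain API; it simply divides the 300-minute monthly allowance by a meeting length.

```python
# Illustrative arithmetic only; names are hypothetical, not a Voicegain API.
FREE_TIER_MINUTES_PER_MONTH = 300    # 5 hours per month, as stated above
ZOOM_FREE_MEETING_CAP_MINUTES = 40   # meeting length limit on a free Zoom plan


def meetings_covered(meeting_minutes: int,
                     quota_minutes: int = FREE_TIER_MINUTES_PER_MONTH) -> int:
    """Return how many whole meetings of the given length fit in the quota."""
    if meeting_minutes <= 0:
        raise ValueError("meeting_minutes must be positive")
    return quota_minutes // meeting_minutes


if __name__ == "__main__":
    print(meetings_covered(30))                             # half-hour meetings
    print(meetings_covered(ZOOM_FREE_MEETING_CAP_MINUTES))  # full-length free meetings
```

With half-hour meetings the allowance covers 10 meetings a month; if every meeting runs to the full 40-minute free-plan limit, it covers 7.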
Of course, the other major benefit of local recording is data privacy. Many businesses do not like to store sensitive meeting content on Zoom’s Cloud, or for that matter on any other vendor’s cloud, but they are forced to do so for lack of options. Especially in the age of AI and LLMs, there is a lot of concern about proprietary information being used to train AI models.
While any business can start a trial with Voicegain’s multi-tenant cloud SaaS offering, our entire solution can also be deployed as a single-tenant solution in your private cloud. Voicegain Transcribe can operate fully independently, without the need to connect to our cloud for any service.
You can get started and evaluate our offering by clicking here. As shared above, we offer 5 hours (300 minutes) of free transcription and LLM-powered summarization every month.
If you have any questions, please send us an email to email@example.com
This article describes ideas for a business with a speech-enabled IVR to plan its upgrade/transition to a modern generative AI powered conversational Voice Bot on its own timeline and at an affordable cost.
Businesses of all sizes have an IVR system that acts as a front door for all their customer voice conversations. In terms of functionality, these IVR systems vary widely; they can range from performing basic call routing and triaging to automating simple calls, like taking payments, scheduling appointments, or providing account balances. While most of them accept touch-tone/DTMF as input, the more advanced ones also accept natural language speech as input and are hence referred to as speech-enabled IVRs.
However, these IVRs are becoming obsolete, and there is a growing demand to upgrade to a more conversational experience.
Traditionally, Speech IVR applications were deployed on-premise and built on the same platform as the main contact center ACD/Switch. Over time, IVRs were deployed in the cloud too. On-premise IVR vendors include Avaya, Genesys and Cisco, while cloud-based IVR vendors include Five9, RingCentral, Mitel and 8x8.
For speech recognition, the most popular option in the past was Nuance. Nuance’s ASR technology, which gained popularity in the early 2000s, preceded today’s neural-network-based engines. It was pre-Alexa and pre-Siri, and so both the vocabulary (i.e. what the customer could actually say in response to a prompt) and the accuracy were limited compared to today’s neural-network-based Speech-to-Text. In addition, the protocol for communication between Nuance and the telephony stack was MRCP, a protocol that has not been actively developed for many years.
A modern Conversational AI stack for Voice Bots includes a neural ASR/Speech-to-Text engine, neural Text-to-Speech and an NLU-based Bot Framework. It is much more capable than what was available to build directed-dialog Speech IVRs in the past.
Today’s neural ASR/STT engines can transcribe not just a few words or phrases, but entire sentences and they also do it very accurately. As consumers get used to such experiences with their voice assistants at home or in their cars, they expect the same when they contact a business over the phone.
There have also been significant advances in modern no-code NLU Bot frameworks that are used to build the bot logic and conversation flow. These Bot frameworks are also evolving with the advent of generative AI technologies like ChatGPT.
While the above two paragraphs describe good reasons to upgrade IVRs, some key factors are driving a rather rushed timeline for businesses planning this IVR migration:
Companies with on-premise contact centers are increasingly migrating to the cloud. Even the on-premise contact center vendors are focused on migrating their installed base to the cloud. So when an enterprise plans to migrate its contact center platform to the cloud, it needs to migrate its IVRs too.
As explained above, modern AI/neural-network-based ASR/STT engines are more accurate and support a conversational experience. Hence ASR/STT vendors are focused on selling these newer offerings. However, businesses generally cannot use these newer ASRs with their existing telephony stack. Both the protocol support (WebSockets and gRPC vs. MRCP) and the application development method (grammar-based vs. large vocabulary transcription with intent capture) are very different.
In the past, companies built the application logic for the Chatbot and the IVR independently; very often different vendors provided the Chatbot and the Voice Bot. However, given the powerful and flexible Conversational AI platforms now available in the market, businesses want to use the same platform to drive the conversation turns of both a Chatbot interaction and a Voice Bot interaction.
As explained above, migrating from the traditional IVR stack to a modern Conversational AI stack entails rewriting the application logic, and it is also likely to involve moving the infrastructure from on-premise to the cloud. This can be an expensive undertaking.
At Voicegain, we think that companies should be able to do this on their own timeline.
We have developed an ASR that can support both (a) grammar-based recognition using MRCP and (b) large vocabulary transcription of audio streamed using modern protocols like WebSockets. Our platform can also be deployed on-premise or in your VPC. So our platform supports an existing application without any rewrite, while also being capable of supporting a conversational voice bot when one is developed at some point in the future.
As a result, customers can take control of when to migrate/upgrade their IVRs. Most importantly, they are not forced to invest in an upgrade/migration of their entire IVR application just because an existing ASR vendor stops supporting an older version of its software.
If you have any questions, or would like to schedule a discussion to understand your IVR upgrade options, contact us at firstname.lastname@example.org.
To test our MRCP grammar-based ASR or our large vocabulary ASR, please sign up for a free developer account. Instructions are provided here.