Speech Analytics Comparison: NER Capabilities & Accuracy
Updated: Jul 23
This post is the first in a series that compares the performance of Voicegain Speech Analytics against Google and Amazon. It compares the capabilities and accuracy of the recognition/extraction of Named Entities. The Google APIs used for comparison were those under Cloud Natural Language, and the Amazon APIs were those under AWS Comprehend.
Named Entity Recognition (NER), or extraction of Named Entities, is one of the features of the Voicegain Speech Analytics API. Named Entity Recognition locates and classifies named entities in unstructured text, which may be obtained, for example, from the transcription of audio files. Although there is a lot of overlap between Google, Amazon and Voicegain with respect to the classification categories, there are also some significant differences, which are summarized below.
Supported NER Categories
The full spreadsheet linked here shows the named entities extracted by the Voicegain Speech Analytics API and compares them to the named entity categories available in the Google and Amazon Comprehend APIs. Amazon has two NER APIs: Entity and PII Entity.
If you look at the spreadsheet, you will see that the Amazon non-PII Entity API offers little granularity in its named entity categories. For example, it groups many numerical named entities into a single QUANTITY category, and it groups dates and times (of day) into a single DATE category. The PII Entity API, on the other hand, has many fine-grained categories for items that are typically PII-redacted, but it misses many other common entity categories.
The Google API seems to cover the usual categories but misses some entities used in call-center applications, e.g. CC, SSN, EMAIL.
A category that Voicegain does not support is OTHER. This category, which is available in Google and Amazon, requires additional application logic to interpret the string that it matches.
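Because the APIs differ in granularity, a side-by-side comparison typically means collapsing finer categories into the coarser ones first. The following is a minimal Python sketch of that idea, not code from any of the APIs discussed here. The fine-grained labels (MONEY, PERCENT, CARDINAL, TIME) and the function name to_coarse are hypothetical and chosen only for illustration; only QUANTITY and DATE come from the Amazon grouping described above.

```python
# Hypothetical mapping from fine-grained labels to the coarse
# QUANTITY / DATE groups described above. The fine-grained label
# names are illustrative assumptions, not taken from any API.
COARSE_LABEL = {
    "MONEY": "QUANTITY",
    "PERCENT": "QUANTITY",
    "CARDINAL": "QUANTITY",
    "TIME": "DATE",
    "DATE": "DATE",
}


def to_coarse(entities):
    """Map (text, label) pairs onto coarse labels.

    Labels without an entry in COARSE_LABEL are passed through unchanged.
    """
    return [(text, COARSE_LABEL.get(label, label)) for text, label in entities]


if __name__ == "__main__":
    fine = [("5 pm", "TIME"), ("20 dollars", "MONEY"), ("John", "PERSON")]
    print(to_coarse(fine))
```

Note that the mapping only works in one direction: once results have been reduced to QUANTITY or DATE, the finer distinction cannot be recovered without extra application logic.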
We have tested all 4 APIs on a set of call center calls.
The overall results show that Voicegain and the Amazon non-PII API detect similar named entities (with the caveat that the Amazon NER categories are less specific). Compared to these two, the Google NER API misses more entities, but it also marks many additional words as falling into the OTHER category (which is generally not very useful, at least not when analyzing call center calls).
Looking at the Amazon PII Entities we noticed that:
NAME and BANK_ACCOUNT_NUMBER were recognized well
EMAIL and PHONE worked mostly OK, but had some strange false positives
CREDIT_DEBIT_NUMBER had false positives (e.g. from phone numbers) or partial matches
DATE_TIME did not pick up all the phrases that the description said this category should recognize
ADDRESS had mixed success - sometimes it missed clear address text or recognized only part of it
EXPIRY_DATE had many false positives - combinations of 4 digits that clearly were not valid expiry dates
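Some of the false positives listed above could be reduced with simple application-side checks on the matched text. The following is a minimal Python sketch under stated assumptions and is not part of any of the APIs compared here. The function names are hypothetical, expiry dates are assumed to be spoken or written as MMYY or MM/YY with a 20xx year, and card numbers are assumed to be 13 to 19 digits long and to satisfy the standard Luhn checksum.

```python
import re
from datetime import date


def is_plausible_expiry(text, today=None, max_years_ahead=20):
    """Return True if text looks like an MMYY or MM/YY expiry date that is
    not in the past and not implausibly far in the future (assumption)."""
    today = today or date.today()
    match = re.fullmatch(r"\s*(\d{2})\s*/?\s*(\d{2})\s*", text)
    if not match:
        return False
    month = int(match.group(1))
    year = 2000 + int(match.group(2))  # assumes a 20xx year
    if not 1 <= month <= 12:
        return False
    if (year, month) < (today.year, today.month):
        return False
    return year <= today.year + max_years_ahead


def passes_luhn(text):
    """Return True if the digits in text have a card-like length (13 to 19)
    and satisfy the Luhn checksum."""
    digits = [int(ch) for ch in text if ch.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


if __name__ == "__main__":
    reference_day = date(2021, 7, 23)
    print(is_plausible_expiry("1325", today=reference_day))  # month 13 is rejected
    print(passes_luhn("555-123-4567"))  # a 10-digit phone number is rejected
```

Checks like these only filter candidates that an NER API has already returned; they do not help with entities that were missed or only partially matched.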
Where Voicegain has a matching entity category for an AWS PII Entity, it performed the same or better. As you can see, it is difficult to summarize the results because the entities are not directly comparable. If you want to know how Voicegain NER will perform on your data, we suggest you test the Voicegain Speech Analytics API, which includes NER, keyword and phrase detection, sentiment analysis, etc.
For testing, you have two options:
You can create a free developer account on the Voicegain Platform. Here is how you can sign up. Once you sign up, please use the Transcribe+ feature. If you have any questions, please email us at email@example.com
You can also use the beta version of our Speech Analytics app and upload your 2-channel audio recording. To get access, please email us at firstname.lastname@example.org