PII Text and Audio Redaction now available in Speech Analytics API
Our latest release (1.24.0) expands the Voicegain Speech Analytics and Transcription API with the ability to redact sensitive data in both the transcript and the audio. This allows our customers to comply with standards such as HIPAA, GDPR, CCPA, PCI, or PIPEDA.
Any of the following types of Named Entities can be redacted in transcript text and/or the audio file.
ADDRESS - Postal address.
CARDINAL - Numerals that do not fall under another type.
CC - Credit Card
DATE - Absolute or relative dates or periods.
EMAIL - (coming soon) Email address
EVENT - Named hurricanes, battles, wars, sports events, etc.
FAC - Buildings, airports, highways, bridges, etc.
GPE - Countries, cities, states.
NORP - Nationalities or religious or political groups.
MONEY - Monetary values, including unit.
ORDINAL - "first", "second", etc.
ORG - Companies, agencies, institutions, etc.
PERCENT - Percentage, including "%".
PERSON - People, including fictional.
PHONE - (coming soon) Phone number.
QUANTITY - Measurements, as of weight or distance.
SSN - Social Security number
TIME - Times smaller than a day.
ZIP - (coming soon) Zip Code (if not part of an Address)
In the audio, redacted entities are replaced with silence; in the transcript, they are replaced with a string specified when making the API request.
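To make the two replacement behaviors concrete, here is a minimal conceptual sketch in Python. It is not the Voicegain API or its implementation: the `Word` type, the `redact` function, and the `[CC]` placeholder are hypothetical names chosen for illustration, and the sketch assumes that each transcribed word carries start and end times and an optional entity label.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Set, Tuple


@dataclass
class Word:
    """Hypothetical word record: text, timing in milliseconds, entity label."""
    text: str
    start_ms: int
    end_ms: int
    entity: Optional[str] = None  # e.g. "CC", "SSN"; None if not an entity


def redact(
    words: Sequence[Word],
    samples: Sequence[int],
    sample_rate: int,
    entity_types: Set[str],
    placeholder: str,
) -> Tuple[str, List[int]]:
    """Return (redacted transcript, redacted audio samples).

    Words whose entity label is in entity_types are replaced by the
    placeholder in the text and by silence (zeros) in the audio.
    """
    out_samples = list(samples)
    tokens: List[str] = []
    prev_entity: Optional[str] = None
    for w in words:
        if w.entity is not None and w.entity in entity_types:
            # Silence the samples that fall inside this word's time span.
            first = w.start_ms * sample_rate // 1000
            last = min(len(out_samples), w.end_ms * sample_rate // 1000)
            for i in range(first, last):
                out_samples[i] = 0
            # Emit one placeholder per run of words with the same label.
            if prev_entity != w.entity:
                tokens.append(placeholder)
            prev_entity = w.entity
        else:
            tokens.append(w.text)
            prev_entity = None
    return " ".join(tokens), out_samples


# Toy example: a 10 Hz "recording" of 1.4 seconds (14 samples).
words = [
    Word("my", 0, 200),
    Word("card", 200, 500),
    Word("is", 500, 600),
    Word("4111", 600, 1000, "CC"),
    Word("1111", 1000, 1400, "CC"),
]
text, audio = redact(words, [1] * 14, 10, {"CC"}, "[CC]")
print(text)   # my card is [CC]
print(audio)  # first 6 samples untouched, last 8 set to 0
```

In the actual product the entity types and the replacement string are supplied in the API request; the sketch only illustrates the effect on the transcript and the audio.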
This feature is supported both in Cloud and on the Edge (on-prem).
Two typical use cases are:
Enable redaction as part of normal processing, e.g. of call center calls.
Bulk-process previously unredacted audio in storage to achieve compliance. Combined with the low per-minute price of Voicegain APIs, this allows our customers to cost-effectively process large quantities of audio data.