Text classification
Given the following text input:
Please update my records with the following information: Email address: foo@example.com National Provider Identifier: 1245319599 Driver's license: AC333991
The output is a list of findings, organized into the following categories:
- InfoType
- Likelihood
- Offset (where in the string the potential InfoType was found)
Example output is shown in the table below.
| InfoType | Likelihood | Offset |
|---|---|---|
| US_HEALTHCARE_NPI | VERY_LIKELY | 122 |
| EMAIL_ADDRESS | LIKELY | 72 |
| US_DRIVERS_LICENSE_NUMBER | LIKELY | 155 |
| CANADA_BC_PHN | VERY_UNLIKELY | 122 |
| UK_TAXPAYER_REFERENCE | VERY_UNLIKELY | 122 |
| CANADA_PASSPORT | VERY_UNLIKELY | 155 |
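
The same offset can yield several candidate InfoTypes with very different likelihoods, so a consumer of the findings usually keeps only those at or above some threshold. The following is a minimal sketch of that filtering step in plain Python, not the service's client library. The `Finding` tuple, the `at_least` helper, and the numeric values given to each likelihood level are assumptions made for illustration; only the InfoType names, likelihoods, and offsets are taken from the example table.

```python
from enum import IntEnum
from typing import List, NamedTuple


class Likelihood(IntEnum):
    # Numeric values are an assumption; they only need to preserve the ordering.
    VERY_UNLIKELY = 1
    UNLIKELY = 2
    POSSIBLE = 3
    LIKELY = 4
    VERY_LIKELY = 5


class Finding(NamedTuple):
    # Hypothetical container mirroring the three columns of the example table.
    info_type: str
    likelihood: Likelihood
    offset: int


# Findings copied from the example table.
FINDINGS = [
    Finding("US_HEALTHCARE_NPI", Likelihood.VERY_LIKELY, 122),
    Finding("EMAIL_ADDRESS", Likelihood.LIKELY, 72),
    Finding("US_DRIVERS_LICENSE_NUMBER", Likelihood.LIKELY, 155),
    Finding("CANADA_BC_PHN", Likelihood.VERY_UNLIKELY, 122),
    Finding("UK_TAXPAYER_REFERENCE", Likelihood.VERY_UNLIKELY, 122),
    Finding("CANADA_PASSPORT", Likelihood.VERY_UNLIKELY, 155),
]


def at_least(findings: List[Finding], min_likelihood: Likelihood) -> List[Finding]:
    """Keep only findings whose likelihood meets the threshold."""
    return [f for f in findings if f.likelihood >= min_likelihood]


kept = at_least(FINDINGS, Likelihood.LIKELY)
for f in kept:
    print(f.info_type, f.likelihood.name, f.offset)
```

With a threshold of `LIKELY`, the three `VERY_UNLIKELY` candidates that share offsets 122 and 155 are dropped, leaving one finding per sensitive value.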
Automatic text redaction
Instead of returning a list of findings, automatic redaction returns the text with the sensitive data matches removed or replaced by a placeholder.
Example automatic redaction input:
Please update my records with the following information: Email address: foo@example.com National Provider Identifier: 1245319599 Driver's license: AC333991
Example output using a placeholder of "***":
Please update my records with the following information: Email address: *** National Provider Identifier: *** Driver's license: ***
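
Conceptually, placeholder redaction amounts to replacing each matched span of the input with the placeholder string. The sketch below illustrates that idea in plain Python under stated assumptions: the `redact` helper is hypothetical and is not part of the Sensitive Data Protection API, and the spans are located with `str.find` on the known values rather than by a real detector.

```python
from typing import List, Tuple


def redact(text: str, spans: List[Tuple[int, int]], placeholder: str = "***") -> str:
    """Replace each (start, end) span of text with the placeholder.

    Overlapping spans are merged first so that a region matched by several
    detectors is replaced only once.
    """
    merged: List[Tuple[int, int]] = []
    for start, end in sorted(spans):
        if merged and start < merged[-1][1]:
            # Overlaps the previous span: extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))

    pieces = []
    cursor = 0
    for start, end in merged:
        pieces.append(text[cursor:start])  # untouched text before the match
        pieces.append(placeholder)         # the match itself is dropped
        cursor = end
    pieces.append(text[cursor:])           # untouched tail
    return "".join(pieces)


text = (
    "Please update my records with the following information: "
    "Email address: foo@example.com "
    "National Provider Identifier: 1245319599 "
    "Driver's license: AC333991"
)

# Stand-in for detector output: locate the known sensitive values directly.
values = ["foo@example.com", "1245319599", "AC333991"]
spans = [(text.find(v), text.find(v) + len(v)) for v in values]

redacted = redact(text, spans)
print(redacted)
```

Merging overlapping spans before replacing matters because, as the classification example shows, several InfoTypes can be reported for the same region of the text.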
Resources
- For more information about using Sensitive Data Protection to redact text, see Redacting Sensitive Data From Text Content.
- For more information about using Sensitive Data Protection to de-identify sensitive data in text content, see De-identifying sensitive data in text content. De-identification includes "masking" sensitive data, replacing sensitive data with a "token" string, and encrypting and replacing sensitive data using a randomly generated or predetermined key.

