In this article, you will learn more about:
- What are partially matched classifications
- Where to find partially matched classifications
- Partial match reasons
- Examples of partial match reasons
Introduction: What are partially matched classifications
Partially matched classifications appear when a file did not match a classification rule completely or did not reach a threshold. They can reveal potentially sensitive information and provide useful insights for improving your existing data classifications, such as adding new rules to improve accuracy.
Where to find partially matched classifications
Partially matched classifications can be found in the file detail > Classification tab.
Partial match reasons
There are several reasons why a classification is not matched fully:
- Detection trigger not met
- Insufficient match proximity
- Invalid predefined algorithm checksum
- Incomplete match
- Missing context
- Unmatched file type
- Missing existing classification
Examples of partial match reasons
Detection trigger not met
This partial match reason is shown when the detection trigger of a classification rule is not reached.
Example: You have a classification rule with the predefined algorithm Credit card numbers, and you set the detection trigger to 10.
Files that contain 10 or more credit card numbers will be matched.
However, if a file contains fewer than 10 credit card numbers, it will not be matched, but the partial match reason Detection trigger not met will be displayed in the file detail.
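The counting logic described above can be sketched in a few lines of Python. This is only an illustration, not the product's implementation: the pattern CARD_PATTERN, the function check_detection_trigger, and the "no match" result are hypothetical, and treating a count exactly equal to the trigger as reached is an assumption.

```python
import re

# Hypothetical pattern: 16 digits in four groups of four,
# optionally separated by a space or a hyphen.
CARD_PATTERN = re.compile(r"\b\d{4}(?:[ -]?\d{4}){3}\b")


def check_detection_trigger(text: str, trigger: int) -> str:
    """Compare the number of card-like hits with the detection trigger."""
    hits = CARD_PATTERN.findall(text)
    # Assumption: a count equal to the trigger counts as "reached".
    if len(hits) >= trigger:
        return "matched"
    if hits:
        # Some hits were found, but not enough to reach the trigger.
        return "Detection trigger not met"
    return "no match"
```

With a trigger of 10, a file containing three card-like numbers would yield Detection trigger not met in this sketch, while a file with no such numbers would not be reported at all.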
Insufficient match proximity
This partial match reason is shown when all elements of a classification rule are found in the file, but they are not close enough to each other.
Example: You have a classification rule with the keyword “credit card” AND the predefined algorithm Credit card numbers.
Combinations like “credit card 0123 4567 8910 1112”, “credit card number is 0123 4567 8910 1112”, or “credit card XXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXX 0123 4567 8910 1112” will be matched.
However, if the term “credit card” is found somewhere in the document and the credit card number is found too far away, the classification won’t be matched, but the partial match reason Insufficient match proximity will be displayed in the file detail.
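A proximity check of this kind can be pictured as measuring the distance between the keyword and the nearest number. The sketch below is an assumption-based illustration: the names KEYWORD, CARD_PATTERN, span_gap, and check_proximity are hypothetical, and the limit of 100 characters is an arbitrary value chosen for the example, not the distance the product uses.

```python
import re

# Hypothetical patterns for the two rule elements.
KEYWORD = re.compile(r"credit card", re.IGNORECASE)
CARD_PATTERN = re.compile(r"\b\d{4}(?:[ -]?\d{4}){3}\b")


def span_gap(a, b):
    """Characters between two (start, end) spans; 0 if they touch or overlap."""
    return max(a[0] - b[1], b[0] - a[1], 0)


def check_proximity(text, max_gap=100):
    """max_gap is an arbitrary illustrative limit, measured in characters."""
    keywords = [m.span() for m in KEYWORD.finditer(text)]
    numbers = [m.span() for m in CARD_PATTERN.finditer(text)]
    if not keywords or not numbers:
        return "elements missing"
    # Distance between the closest keyword/number pair in the text.
    closest = min(span_gap(k, n) for k in keywords for n in numbers)
    if closest <= max_gap:
        return "matched"
    return "Insufficient match proximity"
```

In this sketch, a short run of filler text between the keyword and the number still matches, while several hundred characters of unrelated text between them produce Insufficient match proximity.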
Invalid predefined algorithm checksum
This partial match reason is shown when a checksum validation for a predefined algorithm fails.
Example: Credit card numbers use the Luhn algorithm as a checksum to distinguish valid numbers from mistyped or otherwise incorrect numbers. If a number has the correct format but fails the checksum, the data classification will not be matched, but the partial match reason Invalid predefined algorithm checksum will be displayed in the file detail.
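For readers who want to see the checksum itself, here is a minimal Python sketch of the Luhn algorithm. The function name luhn_valid is chosen for this example only, and the sketch does not show how the product performs its validation.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    # Ignore spaces, hyphens, and any other non-digit characters.
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    # A valid number has a total that is divisible by 10.
    return total % 10 == 0
```

For example, the widely used test number 4111 1111 1111 1111 passes this check, while 4111 1111 1111 1112 has the correct format but fails it, which is the situation that leads to the partial match reason Invalid predefined algorithm checksum.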
Incomplete match
This partial match reason is shown when some, but not all, elements of a classification rule are found in the file.
Example: You have a classification rule with the keyword “credit card” AND the predefined algorithm Credit card numbers.
Combinations like “credit card 0123 4567 8910 1112” or “credit card number is 0123 4567 8910 1112” will be matched.
However, if the file only contains the credit card number but not the term “credit card”, the data classification will not be matched, but the partial match reason Incomplete match will be displayed in the file detail.
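The AND logic behind this reason can be sketched as follows. This is a simplified illustration under stated assumptions: RULE_ELEMENTS and check_completeness are hypothetical names, the patterns are stand-ins for the real keyword and predefined algorithm, and proximity is ignored here.

```python
import re

# Hypothetical rule: both elements must be present (AND).
RULE_ELEMENTS = {
    "keyword": re.compile(r"credit card", re.IGNORECASE),
    "card_number": re.compile(r"\b\d{4}(?:[ -]?\d{4}){3}\b"),
}


def check_completeness(text: str) -> str:
    """Report whether all, some, or none of the rule elements occur in the text."""
    found = [name for name, pattern in RULE_ELEMENTS.items() if pattern.search(text)]
    if len(found) == len(RULE_ELEMENTS):
        return "matched"
    if found:
        # At least one element is present, but not every element.
        return "Incomplete match"
    return "no match"
```

A file that contains only the number, or only the keyword, would yield Incomplete match in this sketch.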
Missing context
This partial match reason is shown when the conditions from the “If the file was ever transferred from...” section (such as App category, File path, or Website) are not met for the file.
Example: You have a classification rule with the predefined algorithm Credit card numbers AND the App category set to CRM.
If the file contains the credit card number, but was not exported from CRM, the data classification will not be matched, but the partial match reason Missing context will be displayed in the file detail.
Unmatched file type
This partial match reason is shown when the file’s type does not match the file type specified in the classification rule.
Example: You have a classification rule with the keyword “credit card” AND the File type PDF.
All PDF files that contain the term credit card will be matched.
However, if the term credit card is found in an XLSX file instead of a PDF, the data classification will not be matched, but the partial match reason Unmatched file type will be displayed in the file detail.
Missing existing classification
This partial match reason is shown when the file does not have the existing classification (e.g., MIP labels) required by the classification rule.
Example: You have a classification rule with the predefined algorithm Credit card numbers AND the files are classified via MIP labels.
A file that contains a credit card number AND has an MIP label will be matched.
However, if the file contains the credit card number, but not the MIP label, the data classification will not be matched, but the partial match reason Missing existing classification will be displayed in the file detail.
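The last three reasons (Missing context, Unmatched file type, and Missing existing classification) all concern properties of the file rather than its content. The Python sketch below shows one way to picture these checks. It is an illustration only: the classes Rule and FileInfo, their fields, the label name "Confidential", and the function partial_match_reasons are hypothetical, and whether the product reports several reasons at once is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Rule:
    """Hypothetical non-content conditions of a rule; None means 'not required'."""
    app_category: Optional[str] = None
    file_type: Optional[str] = None
    label: Optional[str] = None


@dataclass
class FileInfo:
    """Hypothetical file properties, assuming the content part already matched."""
    source_app_category: Optional[str] = None
    file_type: str = ""
    labels: Set[str] = field(default_factory=set)


def partial_match_reasons(rule: Rule, info: FileInfo) -> List[str]:
    """Return the reasons that apply; an empty list means every condition is met."""
    reasons = []
    if rule.app_category and info.source_app_category != rule.app_category:
        reasons.append("Missing context")
    if rule.file_type and info.file_type.lower() != rule.file_type.lower():
        reasons.append("Unmatched file type")
    if rule.label and rule.label not in info.labels:
        reasons.append("Missing existing classification")
    return reasons
```

For instance, partial_match_reasons(Rule(file_type="pdf"), FileInfo(file_type="xlsx")) returns a list containing only Unmatched file type, which mirrors the XLSX example above.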