Learn how to optimize the detection of sensitive content and get better results during the content scanning process.
Safetica NXT searches the text of documents for sensitive content defined in data categories. Events where a match is found are highlighted in the Data security > Overview > Event overview table with a corresponding data category label.
In this article, you will learn:
- What are definitions and conditions
- How to create a data category
- Can I import a dictionary with a list of keywords?
What are definitions and conditions
Definitions and conditions are two layers that define the scope of a data category. They help to increase accuracy and optimize the content scanning process.
1. Definitions - by adding definitions, you broaden the scope of a data category. A definition may contain one or more conditions.
During content scanning, at least 1 definition must be matched for the data category to be applied to that event (OR relationship between definitions).
2. Conditions - by adding conditions (such as built-in algorithms, keywords, or regular expressions), you refine each definition.
During content scanning, all conditions must be matched for the definition to be applied to that event (AND relationship between conditions).
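The relationship between the two layers can be pictured with a short Python sketch. It only illustrates the OR/AND logic described above and is not how Safetica NXT is implemented; the names `Definition`, `category_matches`, `keyword`, and `regex`, as well as the sample definitions, are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical model: a condition is any check run against the document text.
Condition = Callable[[str], bool]


@dataclass
class Definition:
    name: str
    conditions: List[Condition]

    def matches(self, text: str) -> bool:
        # AND: every condition in the definition must match.
        # A definition without conditions never matches in this sketch.
        return bool(self.conditions) and all(c(text) for c in self.conditions)


def category_matches(definitions: List[Definition], text: str) -> bool:
    # OR: a single matching definition is enough to apply the category.
    return any(d.matches(text) for d in definitions)


def keyword(word: str) -> Condition:
    # Case-insensitive substring check (an assumption made for this sketch).
    return lambda text: word.lower() in text.lower()


def regex(pattern: str) -> Condition:
    compiled = re.compile(pattern)
    return lambda text: compiled.search(text) is not None


# Illustrative category with two definitions.
definitions = [
    Definition(
        "Invoice with IBAN",
        [keyword("invoice"), regex(r"\b[A-Z]{2}\d{2}[A-Z0-9]{10,30}\b")],
    ),
    Definition("Confidential marker", [keyword("confidential")]),
]
```

In this sketch, a text containing only the word "invoice" does not match the first definition, because its second condition fails. The category is applied only if the text also contains an IBAN-like string or if the second definition matches.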
How to create a data category
1. Go to Data security > Sensitive data.
2. Click Add category.
3. Enter the name and description of the data category.
4. Click Add definition, enter a definition name, and specify the threshold.
The threshold determines how many times a definition must be found in a file.
5. Add individual conditions.
You can add 3 types of conditions to recognize specific types of sensitive data and optimize the scanning process: built-in algorithms, custom-defined keywords, and regular expressions.
You can also use our templates for region-specific sensitive data detection.
6. You can also edit or delete existing definitions.
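To get a feel for the threshold set in step 4, here is a minimal Python sketch for a definition with a single regular-expression condition. It assumes that the threshold means "at least this many occurrences"; the function name `meets_threshold` and the sample pattern are hypothetical, and the product's actual counting rules may differ.

```python
import re


def meets_threshold(text: str, pattern: str, threshold: int) -> bool:
    # Count non-overlapping matches of the pattern in the text.
    hits = sum(1 for _ in re.finditer(pattern, text))
    # Assumption: the definition applies once the count reaches the threshold.
    return hits >= threshold


# Illustrative pattern for numbers shaped like 123-45-6789.
sample_pattern = r"\b\d{3}-\d{2}-\d{4}\b"
sample_text = "Records: 123-45-6789 and 987-65-4321."
```

With this sample text, a threshold of 2 is met, while a threshold of 3 is not. A higher threshold can reduce matches on files that mention a sensitive value only once.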
Can I import a dictionary with a list of keywords?
Importing a dictionary is not currently possible.
However, you can create large custom dictionaries by pasting keywords from the clipboard, which lets you add multiple terms at once. Individual keywords must be separated by a space.
You can also use our templates for region-specific sensitive data detection.
Limitations on duplicates and minimum keyword length (3 characters) still apply.
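If your keywords live in a file or spreadsheet, you may want to clean them up before pasting. The following Python sketch shows one way to do that outside the product: it drops keywords shorter than 3 characters and removes duplicates, then joins the rest with spaces. The function name `prepare_keywords` is hypothetical, and treating duplicates case-insensitively is an assumption, not documented behavior.

```python
from typing import List, Tuple


def prepare_keywords(raw: str, min_length: int = 3) -> Tuple[List[str], List[str]]:
    """Split raw text on whitespace and separate usable keywords from rejected ones."""
    accepted: List[str] = []
    rejected: List[str] = []
    seen = set()
    for word in raw.split():
        key = word.casefold()  # assumption: "Salary" and "salary" count as duplicates
        if len(word) < min_length or key in seen:
            rejected.append(word)
            continue
        seen.add(key)
        accepted.append(word)
    return accepted, rejected


accepted, rejected = prepare_keywords("salary  payroll ID Salary bonus\npayroll")
# Space-separated string ready to paste into the keyword field.
paste_ready = " ".join(accepted)
```

Reviewing the `rejected` list before pasting shows which terms would be dropped, so you can decide whether a short keyword needs a longer, more specific form.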
Want to learn more? Read next:
Data security - Sensitive data
How to investigate files with sensitive content
How to edit or delete a data category
Templates for region-specific sensitive data detection