Decide which file types Safetica analyzes for sensitive content to optimize both system performance and the accuracy of results.
In this article, you will find more about:
- Why analyze specific file types
- Where to configure file types for content analysis
- For what file types can you perform content analysis
- More granular file type control
- Where to see the results of content analysis
Introduction: Why analyze specific file types
In Safetica, you have granular control over which file types (such as .docx and .xlsx) are analyzed for sensitive content. This reduces the load on your environment, because all other files are skipped. At the same time, results are more accurate, since they come only from the file types you specified.
Where to configure file types for content analysis
- Open the Safetica console.
- Go to Data Classification and click Content analysis settings.
For what file types can you perform content analysis
Use the Content analysis file types drop-down to specify which types of files are analyzed for sensitive information.
You have 3 options:
- All - Safetica launches content analysis for all files for which it is technically possible. Using this option may impact device performance.
- Recommended - Safetica analyzes the content of selected, best-practice file types: .txt, .xml, .html, .htm, .rtf, .zip, .csv, .pdf, .doc, .docx, .docm, .xls, .xlsx, .xlsm, .ppt, .pptx, .pptm, .pps, .ppsx, .ppsm, .msg, .eml, .one, .odt, .ods, .odp, .md, .epub
- Custom - you can specify file type categories or individual file extensions that will be analyzed for sensitive content. Files of all other types will be skipped.
If OCR is enabled, it will also be applied only to the selected file types (e.g. if you enter .jpeg, OCR will only run on .jpeg files).
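The three options can be pictured as a simple extension filter. The following Python sketch is only an illustration of that selection logic, not Safetica's actual implementation: the function name `is_analyzed`, the mode strings, and the `custom_extensions` parameter are all hypothetical, and the "All" option is simplified to accept every file.

```python
from pathlib import PurePath

# Recommended extensions, copied from the list in this article.
RECOMMENDED = {
    ".txt", ".xml", ".html", ".htm", ".rtf", ".zip", ".csv", ".pdf",
    ".doc", ".docx", ".docm", ".xls", ".xlsx", ".xlsm",
    ".ppt", ".pptx", ".pptm", ".pps", ".ppsx", ".ppsm",
    ".msg", ".eml", ".one", ".odt", ".ods", ".odp", ".md", ".epub",
}


def is_analyzed(filename, mode, custom_extensions=frozenset()):
    """Hypothetical helper: would this file be selected for content analysis?"""
    ext = PurePath(filename).suffix.lower()
    if mode == "all":
        # Simplification: the real product analyzes every file
        # for which analysis is technically possible.
        return True
    if mode == "recommended":
        return ext in RECOMMENDED
    if mode == "custom":
        # Only the extensions entered by the administrator are analyzed.
        return ext in {e.lower() for e in custom_extensions}
    raise ValueError(f"unknown mode: {mode!r}")
```

For example, under these assumptions `is_analyzed("scan.jpeg", "custom", {".jpeg"})` is true, while `is_analyzed("report.pdf", "custom", {".jpeg"})` is false, which mirrors how OCR would run only on the selected .jpeg files.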
More granular file type control
You can control file types even more granularly via data classifications. You can both limit and extend the custom set of file types.
Example: You have entered .pdf and .jpg as custom file types for which to perform content analysis. If you create a data classification that searches files for “credit card numbers” and add a rule that specifies .xlsx files as the file type that should be searched, then all three file types (.pdf, .jpg, and .xlsx) will be searched for sensitive content.
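The extension case in this example behaves like a set union. Here is a minimal Python sketch, assuming the custom file types and the file types named in a data classification rule can each be treated as a set of extensions; `effective_file_types` is a hypothetical name used only for illustration and is not part of Safetica.

```python
def effective_file_types(custom_types, classification_rule_types):
    """Hypothetical helper: combine the custom file types with the
    file types that a data classification rule adds."""
    return set(custom_types) | set(classification_rule_types)


custom = {".pdf", ".jpg"}   # entered as custom file types
rule = {".xlsx"}            # specified in the "credit card numbers" rule

print(sorted(effective_file_types(custom, rule)))
# ['.jpg', '.pdf', '.xlsx']
```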
Where to see the results of content analysis
Results from content analysis are visible in the Data section in the Data classification column.
Hover over a classification label to see the rules that were matched, or click the classification label to display all classification details.
Read next:
Data classification in Safetica
Data classification: What is Safetica unified classification
Data classification: How to create a new data classification