Search for PII in Backups of Files and Emails

With Content Analyzer and entity extraction, you can identify the data in your environment that contains one or more specific types of PII (personally identifiable information). You can then use that knowledge to create more targeted strategies and increase the efficiency of your data management operations.

The Content Analyzer adds semantic identifiers to the content indexing data in your Search Engine. You can use these additional semantic identifiers to create more targeted rules for archiving based on specific types of information present in the content of the data.

Features that Support Entity Extraction

You can use Content Analyzer and Entity Extraction with the following Commvault features:

  • Compliance Search

    You can add entity information to your content indexed data to support searching for specific entities from Compliance Search. For more information, see Searching for Entities in Compliance Search.

  • Data Cube

    Some Data Cube connectors support entity extraction. With entity extraction for Data Cube, you can search for and identify different types of data across many types of data repositories in your environment, whether or not they are protected by a Commvault agent. For more information, see Configuring a Content Analyzer Cloud for Data Cube Entity Extraction.

    Note

    Entity extraction for Data Cube requires a Content Analyzer Cloud. The Content Analyzer Cloud links a client that has Content Analyzer to an Index Server that is configured for Data Cube.

  • Database Sensitivity Report

    On the Database Sensitivity report, you can view which tables in your database contain sensitive information. To set up the report, download the Database FS Converter tool from the Commvault Store and follow the instructions in the readme file.

How Entity Extraction Works

After data has been content indexed, the content indexing information is stored in the Search Engine. The Content Analyzer package reads the information in the Search Engine and looks for the entities that you want to identify, such as telephone numbers, social security numbers, and other entity types. When the Content Analyzer encounters a specified type of entity, it tags the data with additional identifying metadata. Other Commvault features, such as data archiving, can then use this entity information, so that you can create more targeted rules for managing your data.
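
The tagging step described above can be illustrated with a minimal sketch. This is not the Content Analyzer implementation or any Commvault API. The patterns, the `tag_entities` function, and the document layout are all hypothetical, and simple regular expressions stand in for whatever detection the product actually performs.

```python
import re

# Hypothetical patterns for two entity types (assumed US formats).
ENTITY_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def tag_entities(document):
    """Return a copy of an indexed document with an added 'entities' field.

    'document' is an assumed dict with a 'content' string, standing in
    for one record of content indexing data.
    """
    found = {}
    for name, pattern in ENTITY_PATTERNS.items():
        matches = pattern.findall(document["content"])
        if matches:
            found[name] = matches
    # The original record is left unchanged; metadata is added to a copy.
    return {**document, "entities": found}


indexed = {"id": "doc-1", "content": "Call 555-867-5309 about SSN 123-45-6789."}
tagged = tag_entities(indexed)
```

In this sketch, the tagged record keeps its original content and gains an `entities` field keyed by entity type, which is the kind of additional metadata that a downstream rule could read.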

Benefits of Entity Extraction

With entity extraction, you can:

  • Locate specific types of important information contained within your data.

  • Create rules to archive data based on the types of entities within the data content.

  • Increase the efficiency and reliability of your automated data management operations.
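
As a rough illustration of an entity-based archiving rule, the sketch below selects documents whose metadata contains at least one sensitive entity type. The `should_archive` function, the `entities` field, and the entity type names are assumptions made for this example and do not reflect how archiving rules are configured in Commvault.

```python
def should_archive(document, sensitive_types):
    """Hypothetical rule: archive a document tagged with any sensitive type."""
    entities = document.get("entities", {})
    return any(entity_type in entities for entity_type in sensitive_types)


# Assumed sample records that already carry entity metadata.
docs = [
    {"id": "doc-1", "entities": {"ssn": ["123-45-6789"]}},
    {"id": "doc-2", "entities": {}},
]

to_archive = [d["id"] for d in docs if should_archive(d, {"ssn", "credit_card"})]
```

Here only the first record is selected, because it is the only one tagged with a type in the sensitive set.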
