Creating Data Classification Plans

Create a data classification plan to define file indexing requirements.

For more complex deployments, you can also separate data across different index servers and content analyzer clouds.

Before You Begin

If you are using the data classification plan for Activate, determine whether you need to search metadata only or both metadata and content:

  • Metadata only: You need entity detection, but you do not need to search the contents of the files.

  • Metadata and content: You need entity detection, and you need to search the contents of the files. For example, when you review a data source for risks, you can search on keywords contained within the file.

Important

Entity detection works with either option.

Procedure

  1. From the navigation pane, expand Configuration, and then click Plans.

    The Plans page appears.

  2. Click Create plan, and then click Data classification.

    The Create data classification plan page appears.

  3. On the General page, in the Plan name box, enter a unique name for the plan.

  4. Next to Index Server, choose the index server:

    • To create a new index server, click Create a new index server.

    • To use an existing index server, click Use existing index server, and then from the Select index server list, click the index server.

  5. Click Next.

    The Content search page appears.

  6. To enable content search, select the Enable check box, and then choose one of the following options:

    • To collect metadata for files and folders, next to Search, click Metadata.

    • To collect metadata and to index the content of files and folders, next to Search, click Metadata and content.

    • If Metadata and content is enabled, to generate faster email previews, select the Pre-generate previews check box, and then enter the storage location information.

      The Pre-generate previews option is applicable when user mailboxes are used as a data source.

      1. In the Preview storage location box, enter the UNC path.

        This is the location where the preview data is stored.

      2. In the user name and password boxes, enter the credentials for a user account that can access the UNC path.

  7. Click Next.

    The Content analysis page appears.

  8. To have the software detect entity types in the end-user data, do the following:

    1. Select the Entity detection check box.

    2. From the Content Analyzer list, select a content analyzer cloud.

    3. From the Entities list, select one or more entity types.

      The software detects only the selected entities for end-user data requests and projects that use this plan. For information about built-in entity types, see Personally Identifiable Information (PII) Entity Types. For information about creating custom entity types, see Creating a Custom Entity Type in Entity Manager.

      Tip

      You can use the search bar to filter the entities, and use the selection controls to select all of the entity types.

  9. If your end-user data includes scanned documents and you want to include the scanned documents in content indexing or entity detection, select the Extract text from image check box.

    For information about scanned document support, see OCR Support for Scanned Documents. To enable content indexing and entity detection for images in additional file types, see Adding OCR Support for Additional File Types.

  10. Click Next.

    The Advanced options page appears. This page shows the file extensions that are selected for file indexing, and the paths for which file indexing is not performed.

  11. Optional: On the Advanced options page, customize which files are content indexed:

    • Under Include file types, you can update the list of file types to include in content indexing and entity detection:

      • To add a file extension, type the extension in the Enter file extension box using the format *.ext, and then click Add.

      • To remove a file extension from the list of file types that are indexed, click the x next to a file extension.

    • Under Exclude paths, you can update the list of directories that are skipped for indexing operations.

      • To add a directory path, type the path in the Enter folder path or pattern box, and then click Add.

        You can include wildcard expressions in the directory path. For example, to exclude all of the files in a temporary directory, enter */temp.

      • To remove a directory from the list of folders that are skipped, click the x next to a path.

  12. Under File size, enter values for Minimum and Maximum to specify the range of file sizes that are included in indexing operations.

  13. Click Save.
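
To reason about how the Advanced options settings interact, the following minimal Python sketch models the three filters (included file types, excluded paths, and file size range) as a single check. It is an illustration only, not the product's implementation: the IndexingFilters class, its field names, the default values, the byte-based size unit, and the assumption that a file is skipped when any parent directory matches an exclude pattern are all hypothetical. The actual wildcard matching rules used by the software might differ.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


@dataclass
class IndexingFilters:
    """Hypothetical stand-in for the settings on the Advanced options page."""
    # File extensions in the *.ext format (example values, not product defaults).
    include_types: list = field(default_factory=lambda: ["*.docx", "*.pdf"])
    # Directory patterns that are skipped; wildcards are allowed.
    exclude_paths: list = field(default_factory=lambda: ["*/temp"])
    # Size range in bytes (the unit is an assumption for this sketch).
    min_size: int = 1
    max_size: int = 50 * 1024 * 1024

    def is_indexed(self, path: str, size: int) -> bool:
        p = PurePosixPath(path)
        # The file name must match one of the *.ext patterns (case-insensitive here).
        name = p.name.lower()
        if not any(fnmatchcase(name, pat.lower()) for pat in self.include_types):
            return False
        # Assumed semantics: skip the file if any parent directory matches a pattern.
        for parent in p.parents:
            if any(fnmatchcase(str(parent), pat) for pat in self.exclude_paths):
                return False
        # The file size must fall inside the configured range (inclusive).
        return self.min_size <= size <= self.max_size


filters = IndexingFilters()
for candidate, size in [
    ("/data/docs/report.pdf", 1000),      # passes all three filters
    ("/data/temp/report.pdf", 1000),      # parent directory matches */temp
    ("/data/docs/setup.exe", 1000),       # extension is not in the include list
    ("/data/docs/big.pdf", 100 * 1024 * 1024),  # larger than the maximum size
]:
    print(candidate, filters.is_indexed(candidate, size))
```

Under these assumptions, a file is indexed only when it passes all three checks, so widening the include list has no effect on files that sit under an excluded path or fall outside the size range.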
