Adding File System Data Sources to a Project

You can add file system data sources to a project. After a data source is crawled, the personally identifiable information (PII) entities defined in the data classification plan associated with the project are discovered in the data source.

Note: The approach mentioned on this page is applicable to file servers (Windows, UNIX, and NAS) and endpoints.

Before You Begin

Gather the information that applies to the status of the file server used as the data source. For a list of statuses and the information needed for each status, see Required Information for File System Data Sources.
If you are analyzing data that is not backed up and the operating system of the data source differs from the operating system of the Index Server, then the operating system of the access node must match the operating system of the data source and must have a Commvault package installed. For example, if you want to analyze a UNC share, but the Index Server is on a Linux computer, you must use a Windows computer as the access node.
If your file system data source is a Windows computer or a NetApp filer, you can enable monitoring so that all users who accessed, modified, deleted, or renamed a file are captured. Modifying a file includes creating and changing a file. Before you enable monitoring, see the considerations for file monitoring.
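The access node rule above can be sketched as a small decision function. This is only an illustration of the rule as stated on this page, not part of the product: the function name, parameter names, and the operating system strings are hypothetical, and the sketch returns None for cases this page does not describe.

```python
from typing import Optional


def required_access_node_os(source_os: str, index_server_os: str,
                            backed_up: bool) -> Optional[str]:
    """Return the OS the access node must run, or None if the rule does not apply.

    Hypothetical helper: models only the rule stated on this page for data
    that is not backed up.
    """
    source = source_os.strip().lower()
    index_server = index_server_os.strip().lower()

    # The rule covers only data that is not backed up.
    if backed_up:
        return None

    # If both operating systems match, the rule places no constraint.
    if source == index_server:
        return None

    # Otherwise the access node must match the data source's OS
    # (and must have a Commvault package installed).
    return source


# Example from this page: a UNC share analyzed with a Linux Index Server
# needs a Windows access node.
print(required_access_node_os("Windows", "Linux", backed_up=False))
```
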

Procedure

From the navigation pane, go to Activate.

The Activate page appears.
Under Data Governance, click Sensitive data governance.

The Sensitive data governance page appears.
Under Quick start, click File system.

The Quick start page appears.
On the Project tab, under Select project, from the Project list, select a project.
Click Next.

The Add file server page appears.

Tip

If you do not see the server that you want to analyze, in the upper-right corner of the page, click Refresh inventory.
Next to the server you want to add, select the check box.
Tip

To refine the list, you can perform the following actions:
- Select one or more facets to the left of the list.
- Perform a keyword search. If the server is not found, you can click Add server to add the server to the inventory.
- Control pagination with the controls at the bottom of the table.
Click Next.

On the Configuration tab, add the information required to complete the configuration.


Agent Status

Steps

Agent installed
Content indexing enabled

In the Display name box, enter a name for the data source.
From the Data classification plan list, select a data classification plan.

The data classification plan identifies the index server to use.
From the Country name list, select the country where the server is located.
Determine the type of data to use:
- To crawl data from a local directory that is not content indexed or backed up, click Analyze from source, and in the Directory path box, enter the path to the data on the server you want to analyze.
  
  Tip
  
  You can click Browse to view the file system of the server and select the path you want to crawl from the file system view.
- To use the data collected from a content indexing job or a backup job, click Analyze from backup.
  
  Note
  
  If content indexing is enabled, all of the content-indexed data from the server is used. If an agent is installed and content indexing is not enabled, all of the backed-up data from the server is used.

Agent not installed

In the Display name box, enter a name for the data source.
From the Data classification plan list, select a data classification plan.

The data classification plan identifies the index server to use.
From the Country name list, select the country where the server is located.
In the User name and Password boxes, enter the credentials for a user with write access to the server.
In the Directory path box, enter the path to the data on the server you want to analyze.
If the operating system of the data source differs from the operating system of the Index Server, under Advanced settings, from the Access node list, select an access node with the same operating system as the data source.
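As a pre-flight checklist, the fields required for each agent status can be sketched as follows. This is a minimal illustration that mirrors the steps listed above. It does not call any Commvault API, and the class, field, and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataSourceConfig:
    """Hypothetical container for the values entered on the Configuration tab."""
    display_name: str = ""
    data_classification_plan: str = ""
    country_name: str = ""
    agent_or_content_indexing: bool = False  # True for the first agent status
    analyze_from_source: bool = False        # False means Analyze from backup
    directory_path: str = ""
    user_name: str = ""
    password: str = ""
    access_node: str = ""


def missing_fields(cfg: DataSourceConfig, os_differs: bool = False) -> List[str]:
    """List the UI fields that are still empty for the given agent status."""
    missing = []

    # These three fields are required for both agent statuses.
    common = [("Display name", cfg.display_name),
              ("Data classification plan", cfg.data_classification_plan),
              ("Country name", cfg.country_name)]
    missing += [label for label, value in common if not value]

    if cfg.agent_or_content_indexing:
        # A directory path is needed only when analyzing from source.
        if cfg.analyze_from_source and not cfg.directory_path:
            missing.append("Directory path")
    else:
        # Without an agent, credentials and a path are always needed.
        extra = [("User name", cfg.user_name),
                 ("Password", cfg.password),
                 ("Directory path", cfg.directory_path)]
        missing += [label for label, value in extra if not value]
        # An access node is needed when the data source OS differs
        # from the Index Server OS.
        if os_differs and not cfg.access_node:
            missing.append("Access node")

    return missing
```

For example, calling `missing_fields` on a configuration with no agent, only the three common fields filled in, and `os_differs=True` reports User name, Password, Directory path, and Access node as still missing.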

Optional: If the file server is a Windows computer or a NetApp file server, to capture additional file information, move the Enable monitoring toggle key to the right.

When Enable monitoring is enabled, all users who accessed, modified, deleted, or renamed a file are captured. Modifying a file includes creating and changing a file.
Click Finish.

The server appears in the Data sources table, and a data collection job runs to crawl and to analyze the data in the data source.

What to Do Next

After you create a project and define data sources, you can create requests to handle user requests to export or delete data that contains personally identifiable information (PII). For more information, see Adding a Request in Request Manager.

After you add the data source to the project, a data collection job runs to crawl and to analyze the data in the data source. If at a later time you want to update the data collected from the data source, you can run a data collection job from the data source details page. For more information, see Collecting Data from Data Sources.
