Adding File System Data Sources to a Project

You can add file system data sources to a project. After a data source is crawled, the entities (personally identifiable information, or PII) defined in the data classification plan associated with the project are discovered in the data source.

Note: The approach mentioned on this page is applicable to file servers (Windows, UNIX, and NAS) and endpoints.

Before You Begin

Gather the information that applies to the status of the file server used as the data source. For a list of statuses and the information needed for each status, see Required Information for File System Data Sources.
If you are analyzing data that is not backed up and the operating system of the data source differs from the operating system of the Index Server, then the operating system of the access node must match the operating system of the data source and must have a Commvault package installed. For example, if you want to analyze a UNC share, but the Index Server is on a Linux computer, you must use a Windows computer as the access node.
If your file system data source is a Windows computer or a NetApp filer, you can enable monitoring so that all users who accessed, modified, deleted, or renamed a file are captured. Modifying a file includes creating and changing a file. Before you enable monitoring, see the considerations for file monitoring.
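
The access node requirement above can be read as a small decision rule. The following is a minimal Python sketch of that rule only. It is not part of the product or its API: the function name `required_access_node_os` and the plain-string operating system labels are hypothetical and chosen purely for illustration.

```python
from typing import Optional


def required_access_node_os(
    data_source_os: str, index_server_os: str, backed_up: bool
) -> Optional[str]:
    """Return the OS the access node must run, or None if no constraint is stated.

    Illustrative only: the names and OS labels are assumptions, not a product API.
    """
    # The rule covers only data that is not backed up.
    if backed_up:
        return None
    # When both operating systems match, the rule states no constraint.
    if data_source_os.lower() == index_server_os.lower():
        return None
    # Otherwise the access node must match the data source's operating system.
    # (It must also have a Commvault package installed, which is not modeled here.)
    return data_source_os


# The example from the text: a UNC share (Windows) with an Index Server on Linux.
print(required_access_node_os("Windows", "Linux", backed_up=False))  # prints "Windows"
```

The sketch returns `None` whenever the text states no constraint, so a caller can distinguish "any access node" from a specific operating system requirement.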

Start the Configuration Wizard

From the navigation pane, go to Data Insights.

The Data Insights page appears.
Under Risk Analysis, click Sensitive data governance.

The Sensitive data governance page appears.
In the right area of the page, click Add > File system.

The Add file server configuration wizard appears.

Project

From the Projects list, select a project.

Tip

To add a project, click the add button, and then enter the required details. For more information, see Creating a Project.
Click Next.

The File Server page of the configuration wizard appears.

File Server

In the Select file server section, select a server. By default, the servers associated with the system default inventory are displayed.
Tip
- To choose a server associated with a domain, select one or more Active Directory identity servers from the Identity server list.
- If you do not see a server that you want to analyze in the system default inventory, click Synchronize to sync with the CommCell environment.
- To choose a server from a custom inventory that you created, select your inventory from the Inventory list on the left of the page.
- If you do not see the server that you want to analyze in the selected custom inventory, in the upper-right corner of the page, click Refresh inventory.
- Perform a keyword search. If the server is not found, you can click Add server to add the server to the inventory. The Add server option is available only if a custom inventory is selected in the Inventory list on the left of the page.
Click Next.

The Configuration page of the configuration wizard appears.

Configuration

Add the information required to complete the configuration.

Agent status: Agent installed

In the Display name box, enter a name for the data source.
From the Data classification plan list, select a data classification plan. The data classification plan identifies the index server to use.
From the Country name list, select the country where the server is located.
Determine the type of data to use:
- To crawl data from a local directory that is not content indexed or backed up, click Analyze from source, and in the Source path box, enter the path to the data on the server you want to analyze.
  Tip: You can click Browse to view the file system of the server and select the path you want to crawl from the file system view.
- To use the data collected from a content indexing job or a backup job, click Analyze from backup.
  Note: If content indexing is enabled, all of the content indexed data from the server is used. If an agent is installed and content indexing is not enabled, all of the backed up data from the server is used. The data classification plan and the client selected in the Select file server section must have the same storage pool.

Agent status: Content indexing enabled

In the Display name box, enter a name for the data source.
From the Data classification plan list, select a data classification plan. The data classification plan identifies the index server to use.
From the Country name list, select the country where the server is located.
In the User name and Password boxes, enter the credentials for a user with read access to the server.
- To use a saved credential, select the Saved credentials check box, and then from the Credential list, select the credential.
- To add a new credential, do the following:
  - Click + beside the Credential list. The Add credential dialog box appears.
  - In Account type, select the type of account that you want to create the credential entity for.
  - In Credential Vault, the BUILT_IN credential vault is selected by default.
  - In the Credential name box, enter the name of the credential.
  - In the User account box, enter the name of the user account.
  - In the Password box, enter the password.
  - Click Save.
From the Access node list, select a server from where the directory can be accessed.
Note: The User name and Password boxes and the Access node list are displayed only if you click the Analyze from source option.
Determine the type of data to use:
- To crawl data from a local directory that is not content indexed or backed up, click Analyze from source, and in the Source path box, enter the path to the data on the server you want to analyze.
  Tip: You can click Browse to view the file system of the server and select the path you want to crawl from the file system view.
- To use the data collected from a content indexing job or a backup job, click Analyze from backup.
  Note: If content indexing is enabled, all of the content indexed data from the server is used. If an agent is installed and content indexing is not enabled, all of the backed up data from the server is used. The data classification plan and the client selected in the Select file server section must have the same storage pool.

Agent status: Agent not installed

In the Display name box, enter a name for the data source.
From the Data classification plan list, select a data classification plan. The data classification plan identifies the index server to use.
From the Country name list, select the country where the server is located.
In the User name and Password boxes, enter the credentials for a user with read access to the server.
- To use a saved credential, select the Saved credentials check box, and then from the Credential list, select the credential.
- To add a new credential, do the following:
  - Click + beside the Credential list. The Add credential dialog box appears.
  - In Account type, select the type of account that you want to create the credential entity for.
  - In Credential Vault, the BUILT_IN credential vault is selected by default.
  - In the Credential name box, enter the name of the credential.
  - In the User account box, enter the name of the user account.
  - In the Password box, enter the password.
  - Click Save.
From the Access node list, select a server from where the directory can be accessed.
In the Source path box, enter the path to the data on the server you want to analyze.
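
As a quick way to compare the three agent statuses above, the following Python sketch lists which inputs each status asks for. It is a reading aid under stated assumptions, not a product API: the `AgentStatus` enum, the `required_fields` function, and the field-name strings are hypothetical and simply mirror the labels used in the steps above.

```python
from enum import Enum
from typing import List


class AgentStatus(Enum):
    # Hypothetical labels mirroring the three statuses described above.
    AGENT_INSTALLED = "Agent installed"
    CONTENT_INDEXING_ENABLED = "Content indexing enabled"
    AGENT_NOT_INSTALLED = "Agent not installed"


# Inputs that every status asks for.
COMMON_FIELDS = ["Display name", "Data classification plan", "Country name"]
# Access inputs; a saved credential can stand in for the user name and password.
ACCESS_FIELDS = ["User name", "Password", "Access node"]


def required_fields(status: AgentStatus, analyze_from_source: bool) -> List[str]:
    """List the inputs the wizard asks for, as described in the steps above."""
    fields = list(COMMON_FIELDS)
    if status is AgentStatus.AGENT_NOT_INSTALLED:
        # The steps for this status offer no Analyze from backup choice,
        # so analyze_from_source is ignored here.
        return fields + ACCESS_FIELDS + ["Source path"]
    if not analyze_from_source:
        # Analyze from backup: no source path or access inputs are listed.
        return fields
    if status is AgentStatus.CONTENT_INDEXING_ENABLED:
        # Access inputs are displayed only with Analyze from source.
        fields += ACCESS_FIELDS
    return fields + ["Source path"]


for status in AgentStatus:
    print(status.value, "->", required_fields(status, analyze_from_source=True))
```

Keeping the common inputs in one list makes the differences between statuses easy to see: only the access inputs and the source path vary.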

Optional: If the file server is a Windows computer or a NetApp file server, to capture the users who accessed, modified, deleted, or renamed a file, move the Enable monitoring toggle to the right. Modifying a file includes creating and changing a file.
Click Create.

The server appears in the Data sources table, and a data collection job runs to crawl and to analyze the data in the data source.

What to Do Next

After you create a project and define data sources, you can create requests to handle user requests to export or delete data that contains personally identifiable information (PII). For more information, see Adding a Request in Request Manager.

After you add the data source to the project, a data collection job runs to crawl and to analyze the data in the data source. If at a later time you want to update the data collected from the data source, you can run a data collection job from the data source details page. For more information, see Collecting Data from Data Sources.
