Adding a File System Data Source to Data Cube
You can add file system data sources in Data Cube. After the data source is added, you can restructure the data. For example, you can combine data from two or more data sources into a single data source. For information about the available data restructuring options, see Restructuring Data in Data Cube.
Before You Begin
- You must log in to the Web Console as a user who has permission to access Data Cube and to create data sources.
For more information, see Permissions and Associations for Data Cube Users.
Procedure
- From the Data Cube dashboard, next to File System, click Add New.
Tip: Alternatively, click File System to open the Data Sources (File System) page, and then click Add File System in the upper-right corner.
The New Data Source (File System) page appears.
- In the Data Source Name section, enter the following information:
Index Server
The Index Server that you want to use for the data source.
- Click the Index Server list, and then select an Index Server.
Data Source Name
The name of the data source as you want it to appear in Data Cube.
- Enter a name for the data source.
Data Source Description
A short description of the data source.
- Optional: Enter a description for the data source.
- Click Next, and then in the Source Details section, enter the following information:
Access Node
The access node is a MediaAgent in your CommCell environment that can connect to the file system data source.
- Click the Access Node list and select a MediaAgent.
Paths to the directories and files that you want to include in the data source.
- Local directory on the access node:
- Shared UNC directories:
- Enter the path to the file system directories or files that you want to crawl.
Tip: To add multiple paths, enter each path on a separate line.
The user name of an account that has read access to all of the locations in the source details.
Note: If you are using local paths on the access node, you can leave the user name and password blank.
- Enter your user name to access the data source locations.
Note: If you entered multiple locations in Source Details, enter a user name that has access to all of the locations.
The password for the user name to access the data source.
- Enter the password for the user name to access the data source.
Incremental crawls collect only new data and data that changed since the previous crawl. When incremental crawling is disabled, each crawl completely overwrites the data from the previous crawl. Incremental crawling is enabled by default.
- Optional: Click the slider to disable incremental crawling.
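The difference between the two crawl modes can be illustrated with a short Python sketch. It is only a model: the function name, the use of modification times to detect changes, and the dictionaries that stand in for the source are assumptions made for illustration, not a description of how Data Cube detects changes internally.

```python
from typing import Dict, List

def files_to_collect(previous: Dict[str, float],
                     current: Dict[str, float],
                     incremental: bool = True) -> List[str]:
    """Return the paths that a crawl would collect.

    previous and current map a file path to its last-modified time
    (a hypothetical stand-in for the state of the source at each crawl).
    """
    if not incremental:
        # Full crawl: everything is collected again and replaces the old data.
        return sorted(current)
    # Incremental crawl: only files that are new or changed since the last crawl.
    return sorted(
        path for path, mtime in current.items()
        if path not in previous or mtime > previous[path]
    )

# Example state: b.txt changed and c.txt is new since the previous crawl.
previous = {"a.txt": 100.0, "b.txt": 100.0}
current = {"a.txt": 100.0, "b.txt": 200.0, "c.txt": 150.0}

print(files_to_collect(previous, current))                     # ['b.txt', 'c.txt']
print(files_to_collect(previous, current, incremental=False))  # ['a.txt', 'b.txt', 'c.txt']
```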
Use this option to filter the data from the source that you want to exclude from Data Cube. You can exclude data in a specific directory or based on matching wildcard expressions in the file name or directory path.
The following are some examples of filter patterns:
- To exclude text files, enter *.txt.
- To exclude different types of multimedia files, enter *avi, *mpg, *mp3, *mp4, *mov.
- To exclude all of the files in a temporary directory, enter */temp.
- Optional: Enter a path or pattern to exclude data from the data source.
Tip: To add multiple filters, enter the filters as a comma-separated list.
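To preview which files a set of exclusion patterns is likely to catch before you enter it, you can approximate the matching with a short Python sketch. The sketch assumes shell-style wildcard matching from Python's fnmatch module, case-sensitive comparison, and a pattern that is tested against the file path and each of its parent directories. The exact matching rules that Data Cube applies may differ, and the helper names are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import List

def parse_filters(text: str) -> List[str]:
    """Split a comma-separated filter list into individual patterns."""
    return [part.strip() for part in text.split(",") if part.strip()]

def is_excluded(path: str, patterns: List[str]) -> bool:
    """Return True if the path or any of its parent directories matches a pattern."""
    p = PurePosixPath(path)
    # Check the file itself and every directory above it, so that a
    # directory pattern such as */temp also covers the files inside it.
    candidates = [str(p)] + [str(parent) for parent in p.parents]
    return any(fnmatchcase(c, pat) for c in candidates for pat in patterns)

patterns = parse_filters("*.txt, *avi, */temp")

for sample in ["/data/notes.txt", "/media/clip.avi",
               "/data/temp/cache.bin", "/data/report.pdf"]:
    print(sample, "excluded" if is_excluded(sample, patterns) else "kept")
```

In this model, only /data/report.pdf is kept. The other three sample paths match a text-file, multimedia, or temporary-directory pattern.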
- Click Next.
- To include the file content in the file system data source, do the following in the Content Search section:
- To enable content search, click the slider.
- To specify the types of files and the sizes of files that you want to be searchable, configure the Include Filter and File Size options.
Note: By default, the file system connector only collects metadata, such as file name and modified date, from source files. When content search is enabled, you can search the content of your data and enable features like entity extraction.
- Click Submit.
The Configuration page for the new data source appears.
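As a planning aid, the settings gathered in this procedure can be written down as a simple record and checked before you open the Web Console. The following Python sketch is a checklist only and does not call any Data Cube interface. The class name, field names, and the rule that recognizes a UNC path by its leading double backslash are assumptions made for illustration. The defaults mirror the ones described above: incremental crawling is enabled and content search is disabled.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class FileSystemDataSource:
    """Hypothetical record of the values entered on the New Data Source page."""
    index_server: str
    name: str
    access_node: str
    paths: List[str]
    description: str = ""               # optional
    user_name: str = ""                 # may be blank for local paths
    password: str = ""
    incremental_crawl: bool = True      # enabled by default
    exclude_filters: List[str] = field(default_factory=list)
    content_search: bool = False        # metadata only by default

    def problems(self) -> List[str]:
        """Return a list of settings that still need attention."""
        issues = []
        if not self.index_server:
            issues.append("Select an Index Server")
        if not self.name:
            issues.append("Enter a data source name")
        if not self.access_node:
            issues.append("Select an access node")
        if not self.paths:
            issues.append("Enter at least one path")
        # Assumption: a path that starts with two backslashes is a UNC share.
        has_unc = any(p.startswith("\\\\") for p in self.paths)
        if has_unc and not (self.user_name and self.password):
            issues.append("UNC paths need a user name and password with read access")
        return issues

source = FileSystemDataSource("index01", "Team shares", "ma01", [r"\\filer\projects"])
print(source.problems())
```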
What to Do Next
Last modified: 10/19/2018 6:30:58 PM