V11 SP8

Connecting to a CSV File with Data Cube

You can use Data Cube to collect, organize, and mine the data residing in various repositories across your enterprise.

Uploading Multiple CSV Files

With Data Cube, you can upload multiple CSV files to the same data source, either by selecting multiple files or by adding the files to a ZIP folder. When you add multiple files to the same data source, the data structure in the files must be identical for the data to be crawled successfully. If the structure of the CSV files is different, then you must use a new data source to crawl the data.
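As a rough pre-upload check, the following Python sketch compares the header row of each CSV file against the first file and then packs the matching files into a ZIP archive. It is not part of Data Cube. The function names (read_header, find_mismatched_files, build_zip) and the sample file names are hypothetical, and the sketch assumes that "identical structure" means identical column headers in the same order.

```python
import csv
import io
import zipfile


def read_header(csv_text, delimiter=","):
    """Return the first row of a CSV document as a list of column names."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    return next(reader, [])


def find_mismatched_files(files, delimiter=","):
    """Return names of files whose header differs from the first file's header.

    `files` maps a file name to its CSV text. Treating the header row as the
    file's "structure" is an assumption made for this sketch.
    """
    names = list(files)
    if not names:
        return []
    reference = read_header(files[names[0]], delimiter)
    return [n for n in names[1:] if read_header(files[n], delimiter) != reference]


def build_zip(files):
    """Pack the CSV files into an in-memory ZIP archive and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, text in files.items():
            archive.writestr(name, text)
    return buffer.getvalue()


# Hypothetical sample data: two files share a structure, one does not.
sample_files = {
    "sales_q1.csv": "id,region,amount\n1,east,100\n",
    "sales_q2.csv": "id,region,amount\n2,west,250\n",
    "returns.csv": "id,reason\n7,damaged\n",
}
mismatched = find_mismatched_files(sample_files)
print(mismatched)  # ['returns.csv']
```

Files reported as mismatched would go to a separate data source. The remaining files can be passed to build_zip and uploaded as a single ZIP folder.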

Before You Begin

  • You must be able to log in to the Web Console to view Data Cube. See Accessing the Web Console.
  • Only users assigned a role with the Data Connectors permission at the MediaAgent level can access Data Cube in the Analytics section of the Web Console. The associated MediaAgent must have been configured with Analytics Engine for Data Cube.

Procedure

  1. In a Web browser, log in to the Web Console and then click Analytics > CSV.
  2. Under Data Connectors next to CSV, click Add New.
  3. On the New Data Source (CSV) page, under Data Source Name:
    • Click the Analytics Engine list and select an Analytics Engine.
    • In Data Source Name, enter a name for the data source. The name cannot contain spaces.
    • Click Next to proceed to the next section.
  4. Under CSV File Details:
    • Upload the CSV files as follows:

      To upload files from a local computer:

      1. Select Upload CSV File(s) and click Upload.
      2. Select one or more local CSV files. You can also select multiple CSV files included in a ZIP folder. See Uploading Multiple CSV Files.
      3. Click Open.

        A box containing upload progress information appears in the lower-right corner of the page.

      To upload files from a shared path or local path on the Analytics Engine selected for this data source:

      1. Select Specify CSV File(s) Location.
      2. In Folder Path, enter the local folder path or share path to the CSV file.

        Note: The location of the CSV file is relative to the Analytics Engine selected for this data source.

      3. If you entered a share path, enter the User Name and Password to access the share.
    • Under Column Separator, select or enter the character used to separate the items in the CSV file(s).
    • If the first row of the CSV file(s) contains column heading names, select First row has column name. If you uploaded multiple CSV files, then the column names of the first CSV file are added and the first rows of the other CSV files are skipped.
    • Optional: In the Columns box, enter column heading names for the CSV file(s) as comma-separated values. If you selected First row has column name, the values in the Columns box will overwrite the values in the first row of the CSV file(s). If you did not select the First row has column name option, then the first row of the CSV file(s) will be added as values under your custom column names.
    • Click Next to proceed to the next section.
    • Under CSV File Preview:
      • To enable Data Cube to automatically detect the data type for each column, enable Detect Data Type.

        If Detect Data Type is enabled, a clickable list of data types appears under each column in the data preview. The data type is detected from the first row of the data source. You can click the list and change the data type for each column as needed.

        Notes:  

        • Detect Data Type is only available for uploaded data sources.
        • Automatic data type detection works only when you create a data source. If you enable Detect Data Type when editing an existing data source, then the current data types of the data source are shown. You can select different data types for each column, but the software will not automatically detect the data types.
        • If you do not enable this feature, then all of the data will be configured as a string data type.
      • Optional: Select Incremental Crawl to perform one full crawl of the data and subsequently crawl only the data that has changed from the previous crawl.
      • If the CSV file has a primary key, enter the column name of the primary key in Primary Key.

        Note: The primary key must contain unique entries. If two primary key entries are the same, then only the last entry will be crawled.

      • Click Next to proceed to the next section.
    • Under Entity Extraction:
      • To enable entity extraction for the data source, select Enable Entity Extraction.
      • Click the Content Analyzer list and select the content analyzer cloud that you want to use.
      • Click the Entities to Extract list and select the check boxes next to the types of entities that you want to include during crawls.

        Tip: Use the search bar to search for the entities that you want to select.

        For more information about the entity types, see Entity Extraction Types for Data Cube.

      • In Fields to Extract Entities From, select or type the fields that you want to use for entity extraction.

        Note: If you uploaded the CSV file, you can select from a list of the data fields. If you did not upload the CSV file, you can enter the data fields as a comma-separated list.

      • Click Next to proceed to the next section.
    • Under Data Blending:
      • If you want to configure data blending, select Enable Data Blending and configure the data blending options.

        For more information, see Configuring Data Blending in Data Cube.

      • Select Start Crawling Now to start crawling the data source after the data source is saved.
    • When finished, click Save.

The data source appears on the Data Source (CSV) page.
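Because only the last entry is crawled when two primary key entries are the same, it can help to check a CSV file for repeated keys before you create the data source. The following Python sketch is not part of Data Cube. The function name last_entry_per_key and the sample data are hypothetical, and the sketch assumes a CSV file whose first row contains column names.

```python
import csv
import io


def last_entry_per_key(csv_text, primary_key, delimiter=","):
    """Return (surviving_rows, duplicate_keys) for a CSV with a header row.

    Mimics the documented rule that, for repeated primary key values,
    only the last entry is kept. This is an illustration, not the
    product's actual crawl logic.
    """
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    surviving = {}
    duplicates = []
    for row in reader:
        key = row[primary_key]
        if key in surviving and key not in duplicates:
            duplicates.append(key)
        surviving[key] = row  # later rows overwrite earlier ones
    return list(surviving.values()), duplicates


# Hypothetical sample data: the key "1" appears twice.
sample = "id,name\n1,alpha\n2,beta\n1,gamma\n"
rows, duplicates = last_entry_per_key(sample, "id")
print(duplicates)                 # ['1']
print([r["name"] for r in rows])  # ['gamma', 'beta']
```

A non-empty duplicates list indicates rows that would be dropped in favor of a later row with the same key, so you may want to choose a different primary key column or clean the data first.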