When connecting to a database with Data Cube, you can select Incremental Crawling to perform incremental crawling of the data source. When you select Incremental Crawling, only the values in the database that changed after the previous crawl are updated in the data source.
If you select Incremental Crawling, then you must provide the following additional information.
Option |
Description |
---|---|
Incremental Crawling |
Select this option to crawl only updated values in the database. If you select this option, the first crawl of the database uses the query in SQL Query. Subsequent crawls use the queries in Delta SQL Query and Delta Import SQL Query. If you don't select this option, then all crawls use the query in SQL Query. |
Primary Key |
The column that is the primary key in the database. If you select Incremental Crawling, then you must enter this value. |
Delta SQL Query |
A SQL query that defines the criteria for determining the fields that were updated after the previous crawl. If you select Incremental Crawling, then you must enter this value. |
Delta Import SQL Query |
A SQL query that selects the data to import from the database for the rows that match the criteria in Delta SQL Query. If you select Incremental Crawling, then you must enter this value. |
Example Queries
The following examples are queries for incremental crawling of database sources in Data Cube.
For information about connecting to a database, see Connecting to a Database with Data Cube.
Example 1: Crawling Order Details That Changed After the Previous Crawl
Note: The following example uses a data field named last_index_time, which is the time when the previous crawl completed. This field has a strongly typed datetime data type in the local time zone of the Index Server, formatted as YYYY-MM-DD HH:MM:SS (for example, 2016-09-07 08:11:42).
The following example contains configuration information to update the data for orders that changed after the previous crawl of a database table named ORDERITEM.
Field |
Query |
---|---|
SQL Query |
SELECT OrderItemID, ProductID, OrderRevievedTime, Quantity, Price, ShippingPrice |
Primary Key |
OrderItemID |
Delta SQL Query |
SELECT OrderItemID FROM ORDERITEM WHERE CONVERT(VARCHAR, CAST(OrderRevievedTime AS datetime), 20) > '${dih.last_index_time}' |
Delta Import Query |
SELECT OrderItemID, ProductID, OrderRevievedTime, Quantity, Price, ShippingPrice FROM ORDERITEM WHERE OrderItemID = '${dih.delta.OrderItemID}' |
Example 2: Crawling Order Details from the Previous 24 Hours
The following example contains configuration information to update the data for orders that occurred in the previous 24 hours for a database table named ORDERITEM.
Field |
Query |
---|---|
SQL Query |
SELECT OrderItemID, ProductID, OrderRevievedTime, Quantity, Price, ShippingPrice |
Primary Key |
OrderItemID |
Delta SQL Query |
SELECT OrderItemID |
Delta Import Query |
SELECT OrderItemID, ProductID, OrderRevievedTime, Quantity, Price, ShippingPrice |