Complete Guide to Lookup Table Fields and Settings

Modified on Mon, 14 Jul at 11:00 PM

In this article, we'll show complete guide to lookup table fields and settings

Let’s understand what it contains and how each column is used.

Look Up Table Columns:

TABLE_SRC
TABLE_DEST
SCHEMA_DEST
PRIMARY_KEY
WATERMARK_COLUMN
WATERMARK_COLUMN_FORMAT
FREQUENCY_IN_DAYS
REFRESH_TYPE
LAST_RUN_DATE

Understanding the columns in Lookup Table

Source and Destination Information

1. TABLE_SRC
The source table for the pipeline, which refers to the table name defined in the BI Connector from which the data will be pulled.

2. TABLE_DEST
The name of the destination table in the target database where the data will be loaded.

3. SCHEMA_DEST
The schema in the destination database where the TABLE_DEST exists.

Primary Key and Uniqueness

4. PRIMARY_KEY
The primary key column(s) of the source data.

Used to uniquely identify rows and track changes.
Can be a single column. Example: ERROR_ID
Or multiple columns separated by commas. Example: PERSON_ID,EFFECTIVE_START_DATE,EFFECTIVE_END_DATE

Watermarking (Incremental Load Logic)

5. WATERMARK_COLUMN
The column(s) used to detect new or updated records since the last run. Typically a timestamp or numeric column.

Simply provide the name of the column used to track new or updated records
Example: START_DATE

6. WATERMARK_COLUMN_FORMAT
Specifies the format of the datetime column of the WATERMARK_COLUMN field. It ensures that ADF can correctly interpret timestamp values for incremental data filtering.

Required only if the watermark column is date/time column.
Not required (can be left empty) if the column is not date/time column.
Example: YYYY-MM-DD"T"HH24:MI:SS

Watermark Column Configuration Scenarios using example columns

Scenarios	watermark_column	watermark_column_format
Water mark column (Date/Time Column)	START_DATE	YYYY-MM-DD"T"HH24:MI:SS
Water mark column (Not a Date/Time Column)	RUN_RESULT_ID	Not required (can be left empty)

Execution Control
7. FREQUENCY_IN_DAYS
Indicates how often this pipeline should run.

Example: 1 = daily, 7 = weekly.

8. REFRESH_TYPE
Defines the type of data load:

FULL_LOAD :

Set the REFRESH_TYPE to FULL_LOAD in the lookup table for the tables you want to do full load.
Make sure the LAST_RUN_DATE is NULL for the rows configured for full load.
To set FULL_LOAD for all rows in the lookup table, you can run the following SQL command:

UPDATE <your_lookup_table_name>
SET 
  REFRESH_TYPE = 'FULL_LOAD',
  LAST_RUN_DATE = NULL;

To set FULL_LOAD for specific rows in the lookup table, you can run the following SQL command:

UPDATE <your_lookup_table_name>
SET 
  REFRESH_TYPE = 'FULL_LOAD',
  LAST_RUN_DATE = NULL
WHERE 
  TABLE_SRC IN ('BIC_PJB_ERRORS', 'BIC_PER_ALL_PEOPLE_F', 'BIC_PAY_RUN_BALANCES');

5. This query will update rows for the tables listed in the IN clause.
Note: In the IN clause, these are BI Connector table names (that you created in BI Connector) inserted within '' and separated by commas.

IF NOT FULL_LOAD:

First , it will check for the maximum value of the watermark column in the destination table.
If the Destination Table is empty or the maximum value is empty, then it will do full load
Else, It will do an incremental refresh. It does this by comparing each source row's watermark value (like LAST_UPDATED_DATE) against the maximum value already present in the destination.Only the records with newer values — meaning data that was added or updated since the last load — are fetched and loaded.

9. LAST_RUN_DATE
Timestamp when the pipeline was last executed successfully.

It is used to track the last execution and determine when the pipeline should run next, based on the value of FREQUENCY_IN_DAYS
For example, LAST_RUN_DATE= 2024-12-20 00:00:00
FREQUENCY_IN_DAYS=2
Today’s date: 2024-12-22, the pipeline will execute since 2 days have passed since the last run