Merger
Purpose
The merger resource defines how records are combined into golden records: how each record is weighted, and which record's values win when records disagree.
Configuration
Merge Types
| Merge Type | Description | Use Case |
|---|---|---|
| RANDOM | Random selection by ID hash | No preference |
| CONSTANT | Select record with specific value | Trust specific source |
| DATE | Most recent record wins | Prefer latest data |
| NUMBER | Highest number wins | Use numeric priority |
| SCRIPT | Custom weight calculation | Complex logic |
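One way to read the table is that each merge type turns a record into a numeric weight. The Python sketch below illustrates that reading only. The function `record_weight`, its parameters, and the field names `id`, `source`, `updated_at`, and `priority` are hypothetical and are not part of the merger resource. The actual weighting logic may differ.

```python
import hashlib
from datetime import datetime
from typing import Any, Callable, Dict, Optional

Record = Dict[str, Any]


def record_weight(
    record: Record,
    merge_type: str,
    column: Optional[str] = None,
    constant: Any = None,
    script: Optional[Callable[[Record], float]] = None,
) -> float:
    """Return an illustrative numeric weight for one record."""
    if merge_type == "RANDOM":
        # Stable pseudo-random weight in [0, 1) derived from a hash of the ID.
        digest = hashlib.sha256(str(record["id"]).encode("utf-8")).digest()
        return int.from_bytes(digest[:6], "big") / 2**48
    if merge_type == "CONSTANT":
        # Records carrying the trusted value outrank all others.
        return 1.0 if record.get(column) == constant else 0.0
    if merge_type == "DATE":
        # Later dates give larger weights.
        # Assumes timezone-naive ISO 8601 strings such as "2024-03-01".
        parsed = datetime.fromisoformat(record[column])
        return (parsed - datetime(1970, 1, 1)).total_seconds()
    if merge_type == "NUMBER":
        return float(record[column])
    if merge_type == "SCRIPT":
        if script is None:
            raise ValueError("SCRIPT merge type needs a callable")
        return float(script(record))
    raise ValueError(f"unknown merge type: {merge_type}")


crm = {"id": "a", "source": "crm", "updated_at": "2024-03-01", "priority": 5}
web = {"id": "b", "source": "web", "updated_at": "2023-03-01", "priority": 9}
newer_wins = record_weight(crm, "DATE", column="updated_at") > record_weight(
    web, "DATE", column="updated_at"
)
```

Under these assumptions, `newer_wins` is true because the `crm` record has the later `updated_at` value.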
Merge Sort
HIGHEST_WEIGHT - Prefer values from highest-weighted records
LOWEST_WEIGHT - Prefer values from lowest-weighted records
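The two sort options can be pictured as an ordering over weighted records, with the first record in the order being the preferred one. The sketch below is a minimal Python illustration of that idea. The helper `order_by_merge_sort` and the `(weight, record)` pair layout are assumptions made for the example, not the product's API.

```python
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def order_by_merge_sort(
    weighted_records: List[Tuple[float, Record]], merge_sort: str
) -> List[Record]:
    """Order records so that the preferred record comes first."""
    if merge_sort not in ("HIGHEST_WEIGHT", "LOWEST_WEIGHT"):
        raise ValueError(f"unknown merge sort: {merge_sort}")
    # Sort on the weight only, so the record dicts are never compared.
    ranked = sorted(
        weighted_records,
        key=lambda pair: pair[0],
        reverse=(merge_sort == "HIGHEST_WEIGHT"),
    )
    return [record for _, record in ranked]


pairs = [(1.0, {"id": "a"}), (3.0, {"id": "b"}), (2.0, {"id": "c"})]
highest_first = [r["id"] for r in order_by_merge_sort(pairs, "HIGHEST_WEIGHT")]
lowest_first = [r["id"] for r in order_by_merge_sort(pairs, "LOWEST_WEIGHT")]
```

How ties between equal weights are broken is not specified here. This sketch simply keeps the input order for ties.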
Merge Behavior
Single-value columns:
Select the value from the highest- or lowest-weighted record, depending on the merge sort strategy
Array columns:
Union the unique values from all records, so no data is lost
Nested datasets:
Match nested rows by identity
Merge each matched group recursively
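The three behaviors can be sketched together in a few lines of Python. This is an illustration under stated assumptions, not the actual merge engine. Records are assumed to arrive already ordered by preference. Null values are assumed to be skipped. Nested datasets are assumed to be lists of dicts matched on a hypothetical `id` key. The function name `merge_records` is invented for the example.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def merge_records(ordered: List[Record], identity_key: str = "id") -> Record:
    """Merge records that are already ordered from most to least preferred."""
    golden: Record = {}
    nested: Dict[str, Dict[Any, List[Record]]] = {}
    for record in ordered:
        for column, value in record.items():
            if value is None:
                continue  # assumption: nulls never win
            if isinstance(value, list) and value and all(
                isinstance(item, dict) for item in value
            ):
                # Nested dataset: group child rows by their identity.
                groups = nested.setdefault(column, {})
                for child in value:
                    groups.setdefault(child.get(identity_key), []).append(child)
            elif isinstance(value, list):
                # Array column: union of unique values, first-seen order.
                merged = golden.setdefault(column, [])
                for item in value:
                    if item not in merged:
                        merged.append(item)
            elif column not in golden:
                # Single-value column: the first (preferred) record wins.
                golden[column] = value
    for column, groups in nested.items():
        # Apply the same rules recursively to each matched group.
        golden[column] = [
            merge_records(children, identity_key) for children in groups.values()
        ]
    return golden


golden = merge_records([
    {"id": "c1", "email": "new@example.com", "phone": None, "tags": ["vip"],
     "addresses": [{"id": "home", "city": "Berlin"}]},
    {"id": "c1", "email": "old@example.com", "phone": "555-0100",
     "tags": ["vip", "beta"],
     "addresses": [{"id": "home", "city": "Munich", "zip": "80331"},
                   {"id": "work", "city": "Hamburg"}]},
])
```

In this sketch the golden record keeps `new@example.com` from the preferred record, fills `phone` from the second record, unions `tags`, and merges the two `home` addresses into one row.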
Example Merger Configuration
{
  "type": "merger",
  "id": "customer_merger",
  "dataset": "customer_dataset",
  "mergeType": "DATE",
  "mergeSort": "HIGHEST_WEIGHT",
  "weights": [
    {
      "column": "email",
      "weight": 2.0
    },
    {
      "column": "full_name",
      "weight": 1.5
    }
  ]
}
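A configuration like the one above can be sanity-checked before it is deployed. The Python sketch below checks `mergeType` and `mergeSort` against the values listed in this section. The function `validate_merger` is hypothetical. Treating `type`, `id`, `dataset`, `mergeType`, and `mergeSort` as required keys is an assumption drawn from the example, not a documented rule.

```python
import json
from typing import Any, Dict, List

MERGE_TYPES = {"RANDOM", "CONSTANT", "DATE", "NUMBER", "SCRIPT"}
MERGE_SORTS = {"HIGHEST_WEIGHT", "LOWEST_WEIGHT"}
# Assumption: these keys are required because the example uses them.
REQUIRED_KEYS = ("type", "id", "dataset", "mergeType", "mergeSort")


def validate_merger(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    errors = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]
    if "type" in config and config["type"] != "merger":
        errors.append(f"unexpected type: {config['type']}")
    if "mergeType" in config and config["mergeType"] not in MERGE_TYPES:
        errors.append(f"unknown mergeType: {config['mergeType']}")
    if "mergeSort" in config and config["mergeSort"] not in MERGE_SORTS:
        errors.append(f"unknown mergeSort: {config['mergeSort']}")
    for index, entry in enumerate(config.get("weights", [])):
        if not isinstance(entry.get("column"), str):
            errors.append(f"weights[{index}]: column must be a string")
        weight = entry.get("weight")
        if isinstance(weight, bool) or not isinstance(weight, (int, float)):
            errors.append(f"weights[{index}]: weight must be a number")
    return errors


config = json.loads("""
{
  "type": "merger",
  "id": "customer_merger",
  "dataset": "customer_dataset",
  "mergeType": "DATE",
  "mergeSort": "HIGHEST_WEIGHT",
  "weights": [{"column": "email", "weight": 2.0}]
}
""")
problems = validate_merger(config)
```

A check like this catches misspelled enum values early. It does not verify that the referenced dataset or columns exist.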
Best Practices
Trust recency - Use DATE merge type for frequently updated data
Consider source - Use CONSTANT to prefer specific sources
Array preservation - Array columns automatically keep every unique value
Test merges - Review a sample of golden records to check merge quality