
User Guide

Welcome to the Trazadera Golden User Guide. This guide helps you understand how to use Golden for your day-to-day data quality and master data management tasks.

Who This Guide Is For

This guide is designed for:

| Role | Primary Tasks |
| --- | --- |
| Data Stewards | Review duplicates, merge records, maintain data quality |
| Business Analysts | Query data, generate reports, monitor metrics |
| Data Managers | Configure entities, set up workflows, manage users |
| Operations Teams | Monitor tasks, troubleshoot issues, track performance |


Quick Navigation

| I want to... | Go to... |
| --- | --- |
| Understand the deduplication process | Understanding Deduplication |
| Review and merge duplicate records | Working with Buckets |
| Search for specific records | Finding Records |
| Monitor data quality | Monitoring and Reports |
| See a complete workflow example | Examples |


Understanding Deduplication

Golden uses a multi-step process to identify and resolve duplicate records:

The Deduplication Pipeline

Source Data → Load → Index → Classify → Review → Merge → Golden Record

| Step | What Happens | Your Role |
| --- | --- | --- |
| Load | Data is imported from source systems | Configure sources |
| Index | Similar records are grouped into "buckets" | Define matching rules |
| Classify | System determines if records are duplicates | Set classification thresholds |
| Review | Uncertain matches are flagged for review | Review and decide |
| Merge | Confirmed duplicates are merged | Approve or customize |
| Golden Record | Best data is selected as the master | Define selection rules |
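
The Index step can be pictured as grouping records that share a normalized key. The following is a minimal sketch in Python, not Golden's implementation: the `blocking_key` rule (a trimmed, lower-cased email) and the record layout are assumptions made for illustration, and the matching rules you define in Golden may work quite differently.

```python
from collections import defaultdict


def blocking_key(record):
    # Hypothetical matching rule: trimmed, lower-cased email address.
    return record["email"].strip().lower()


def index_records(records):
    """Group records that share a blocking key into candidate buckets."""
    buckets = defaultdict(list)
    for record in records:
        buckets[blocking_key(record)].append(record)
    return dict(buckets)


records = [
    {"id": 1, "email": "Ann@Example.com "},
    {"id": 2, "email": "ann@example.com"},
    {"id": 3, "email": "bob@example.com"},
]
buckets = index_records(records)
```

Records 1 and 2 land in the same bucket because their emails normalize to the same key; record 3 gets a bucket of its own.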

Key Concepts

| Concept | Description |
| --- | --- |
| Entity | A deduplication project (e.g., "Customers", "Products") |
| Bucket | A group of potentially duplicate records |
| Golden Record | The authoritative master record created from merged data |
| Classification | System's confidence level (DUPLICATES, REVIEW, UNIQUE) |


Working with Buckets

Buckets are the heart of the deduplication workflow. Each bucket contains records that the system believes might be duplicates.

Bucket Classifications

| Classification | Meaning | Action Required |
| --- | --- | --- |
| DUPLICATES | High confidence these are duplicates | Auto-merged (or review if preferred) |
| REVIEW | Medium confidence - needs human review | Manual review required |
| UNIQUE | System believes this is not a duplicate | No action needed |
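
One way to think about the three classifications is as bands on a match score. The sketch below assumes a score between 0 and 1 and two hypothetical thresholds (0.9 and 0.6); the actual scoring scale and threshold values are whatever you configure for your entity.

```python
def classify(score, duplicate_threshold=0.9, review_threshold=0.6):
    """Map a match score in [0, 1] to a bucket classification.

    Both threshold defaults are illustrative assumptions.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= duplicate_threshold:
        return "DUPLICATES"
    if score >= review_threshold:
        return "REVIEW"
    return "UNIQUE"
```

Raising `duplicate_threshold` sends more buckets to manual review; lowering `review_threshold` shrinks the set treated as unique.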

Reviewing Buckets

When reviewing a bucket, you'll see:

  1. All records in the bucket side-by-side

  2. Match scores showing why records were grouped

  3. Field comparisons highlighting differences

  4. Golden record preview showing merged result

Making Decisions

| Decision | When to Use | Result |
| --- | --- | --- |
| Merge | Records are confirmed duplicates | Creates single golden record |
| Disconnect | Records are NOT duplicates | Separates into individual buckets |
| Skip | Need more information | Leaves for later review |
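
The effect of each decision on a bucket can be sketched as follows. This is an illustration only: the dictionary layout and the `MERGED` status label are assumptions, not Golden's data model.

```python
def apply_decision(bucket, decision):
    """Return the buckets that result from a review decision."""
    if decision == "MERGE":
        # All records stay together and feed a single golden record.
        return [{"records": list(bucket["records"]), "status": "MERGED"}]
    if decision == "DISCONNECT":
        # Each record moves into its own bucket.
        return [{"records": [record], "status": "UNIQUE"}
                for record in bucket["records"]]
    if decision == "SKIP":
        # Nothing changes; the bucket stays in the review queue.
        return [bucket]
    raise ValueError(f"Unknown decision: {decision}")


bucket = {"records": ["crm-1", "shop-7"], "status": "REVIEW"}
```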

Best Practices for Review

  • Start with highest-confidence REVIEW buckets

  • Use consistent decision criteria

  • Document unusual decisions with comments

  • Take breaks during long review sessions to maintain accuracy


Finding Records

Golden provides multiple ways to search for records.

Search Methods

| Method | Best For | Example |
| --- | --- | --- |
| Quick Search | Known identifier (email, ID) | [email protected] |
| Advanced Search | Multiple criteria | Name + City + Date range |
| Bucket Search | Finding specific bucket | Bucket ID B-12345 |
| Filter by Status | Workflow management | All REVIEW buckets |

Search Tips

  • Exact match: Use quotes for exact phrases: "John Smith"

  • Partial match: Use wildcards: john* or *@example.com

  • Multiple fields: Combine criteria: email:john* AND city:Boston
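
To make the wildcard and AND semantics above concrete, here is a small Python sketch that evaluates such a query against one record. It is not Golden's search engine: the `matches` helper is hypothetical, and case-insensitive matching is an assumption made for this example.

```python
from fnmatch import fnmatchcase


def matches(record, query):
    """Check a record against clauses like 'email:john* AND city:Boston'."""
    for clause in query.split(" AND "):
        field, _, pattern = clause.partition(":")
        value = str(record.get(field.strip(), ""))
        # Assumption: comparison ignores case; '*' matches any run of characters.
        if not fnmatchcase(value.lower(), pattern.strip().lower()):
            return False
    return True


record = {"email": "john.smith@example.com", "city": "Boston"}
```

With this record, `email:john* AND city:Boston` and `email:*@example.com` both match, while `email:john* AND city:Chicago` does not.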


Working with Golden Records

Golden Records are the authoritative, deduplicated master records.

What Makes a Golden Record?

The Golden Record is created by:

  1. Selecting the best value for each field from source records

  2. Applying business rules (e.g., prefer most recent, prefer specific source)

  3. Tracking lineage back to original source records
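
The three steps above can be sketched in a few lines of Python. The rule used here (take the most recent non-empty value for each field) is just one example of a business rule, and the record layout with `id` and `updated_at` is an assumption for illustration.

```python
def build_golden_record(source_records, fields):
    """Pick the most recent non-empty value per field and record its origin."""
    # ISO-8601 date strings sort chronologically, so newest comes first.
    ordered = sorted(source_records, key=lambda r: r["updated_at"], reverse=True)
    golden, lineage = {}, {}
    for field in fields:
        for record in ordered:
            value = record.get(field)
            if value:
                golden[field] = value
                lineage[field] = record["id"]  # lineage back to the source record
                break
    return golden, lineage


sources = [
    {"id": "crm-1", "updated_at": "2024-01-10",
     "name": "A. Lee", "email": "ann@example.com", "phone": "555-0100"},
    {"id": "shop-7", "updated_at": "2024-03-02",
     "name": "Ann Lee", "email": "ann@example.com", "phone": ""},
]
golden, lineage = build_golden_record(sources, ["name", "email", "phone"])
```

Here the name and email come from the newer `shop-7` record, while the phone falls back to `crm-1` because the newer record has no phone value.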

Golden Record Lifecycle

| Status | Description |
| --- | --- |
| Active | Current master record |
| Updated | Modified by new source data or manual edit |
| Merged | Combined with another golden record |
| Archived | Soft-deleted, can be restored |

Viewing Golden Record Details

Each Golden Record shows:

  • Master data: The consolidated field values

  • Source records: All contributing records with lineage

  • History: Changes over time

  • Quality scores: Data completeness and confidence


Monitoring and Reports

Key Metrics Dashboard

| Metric | What It Shows | Target |
| --- | --- | --- |
| Total Records | Records loaded from sources | Varies |
| Unique Records | Confirmed non-duplicates | Higher is better |
| Duplicate Rate | Percentage of duplicates found | Depends on data quality |
| Review Queue | Buckets awaiting manual review | Lower is better |
| Merge Rate | Daily/weekly merges completed | Track trends |

Common Reports

| Report | Purpose | Frequency |
| --- | --- | --- |
| Duplicate Summary | Overview of deduplication results | Daily/Weekly |
| Source Quality | Data quality by source system | Weekly |
| Steward Activity | Review decisions by user | Weekly |
| Trend Analysis | Duplicate rates over time | Monthly |

Setting Up Alerts

Configure alerts for:

  • Review queue exceeding threshold

  • Unusual spike in duplicates

  • Task failures

  • Data quality drops below threshold
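
The alert conditions above amount to comparing current metrics against limits. A minimal sketch, assuming hypothetical metric and threshold names (none of these keys come from Golden's configuration):

```python
def check_alerts(metrics, thresholds):
    """Return the alert messages triggered by the current metrics."""
    alerts = []
    if metrics["review_queue"] > thresholds["max_review_queue"]:
        alerts.append("Review queue exceeds threshold")
    # A "spike" is modeled here as exceeding a multiple of a baseline rate.
    spike_limit = thresholds["duplicate_rate_baseline"] * thresholds["spike_factor"]
    if metrics["duplicate_rate"] > spike_limit:
        alerts.append("Unusual spike in duplicates")
    if metrics["failed_tasks"] > 0:
        alerts.append("Task failures")
    if metrics["quality_score"] < thresholds["min_quality_score"]:
        alerts.append("Data quality below threshold")
    return alerts


metrics = {"review_queue": 120, "duplicate_rate": 0.08,
           "failed_tasks": 0, "quality_score": 0.91}
thresholds = {"max_review_queue": 100, "duplicate_rate_baseline": 0.05,
              "spike_factor": 2.0, "min_quality_score": 0.85}
alerts = check_alerts(metrics, thresholds)
```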


Common Use Cases

Building a Customer MDM System

Goal: Create a single view of each customer across all systems

Steps:

  1. Create customer dataset with required fields (name, email, phone, address)

  2. Set up sources (CRM, e-commerce, support systems)

  3. Configure indexer for email, phone, name matching

  4. Define classifier with weighted comparisons

  5. Create merger strategy (prefer most recent, most complete)

  6. Configure entity with AUTO_DUPLICATES mode, leaving REVIEW buckets for steward review

  7. Schedule automatic synchronization

  8. Monitor duplicate resolution progress

Relevant Guides: Entities, Resources, Golden Records
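
Step 4 mentions weighted comparisons. As a rough illustration of the idea, the sketch below scores a pair of customer records by combining per-field similarities with weights. The weights, the choice of fields, and the use of Python's `difflib.SequenceMatcher` for names are assumptions for this example, not the comparison functions Golden provides.

```python
from difflib import SequenceMatcher

# Hypothetical weights: email counts most, then phone, then name.
WEIGHTS = {"email": 0.5, "phone": 0.3, "name": 0.2}


def field_similarity(field, a, b):
    """Similarity in [0, 1]; missing values never count as a match."""
    if not a or not b:
        return 0.0
    a, b = a.strip().lower(), b.strip().lower()
    if field == "name":
        return SequenceMatcher(None, a, b).ratio()  # fuzzy comparison
    return 1.0 if a == b else 0.0  # exact comparison


def match_score(left, right):
    """Weighted average of per-field similarities."""
    weighted = sum(weight * field_similarity(field, left.get(field), right.get(field))
                   for field, weight in WEIGHTS.items())
    return weighted / sum(WEIGHTS.values())
```

A score like this would then be compared against the classification thresholds to decide whether a pair is DUPLICATES, REVIEW, or UNIQUE.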


Implementing Data Integration Pipeline

Goal: Load and clean data from external sources

Steps:

  1. Define source dataset matching source schema

  2. Create source resource (JDBC, file, or API)

  3. Define target dataset for Golden Core

  4. Create transformation mapping fields

  5. Create pipeline for data cleaning

  6. Configure table for storage

  7. Execute load operation

  8. Monitor task completion

Relevant Guides: Resources, Tables, Tasks
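
Steps 4 and 5 (field mapping and cleaning) can be illustrated with a short Python sketch. The source column names (`cust_name`, `mail`, `tel`) and the cleaning rules are hypothetical; in Golden these are defined through transformations and pipelines rather than hand-written code.

```python
# Hypothetical mapping from source columns to target dataset fields.
FIELD_MAPPING = {"cust_name": "name", "mail": "email", "tel": "phone"}


def transform(row):
    """Rename source columns to the target field names."""
    return {target: row.get(source) for source, target in FIELD_MAPPING.items()}


def clean(record):
    """Apply simple normalization to each field."""
    cleaned = dict(record)
    if cleaned.get("name"):
        cleaned["name"] = " ".join(cleaned["name"].split())  # collapse whitespace
    if cleaned.get("email"):
        cleaned["email"] = cleaned["email"].strip().lower()
    if cleaned.get("phone"):
        cleaned["phone"] = "".join(ch for ch in cleaned["phone"] if ch.isdigit())
    return cleaned


row = {"cust_name": "  Ann   Lee ", "mail": " Ann@Example.COM", "tel": "(555) 010-0199"}
record = clean(transform(row))
```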


Setting Up User Access Control

Goal: Ensure appropriate access to data and functions

Steps:

  1. Configure authentication method (SSO or internal)

  2. Define custom roles for your organization

  3. Assign appropriate permissions to roles

  4. Create user accounts with assigned roles

  5. Generate access tokens for integrations

  6. Configure entitlement filters for data privacy

  7. Monitor and audit access regularly

Relevant Guides: Security


Manual Duplicate Review Process

Goal: Review and resolve uncertain duplicate matches

Steps:

  1. Synchronize entity to create/update buckets

  2. Query REVIEW-classified buckets

  3. Examine records in each bucket

  4. Make merge or disconnect decisions

  5. Track progress via statistics

  6. Generate reports on data quality improvements

See detailed walkthrough: Example: Manual Deduplication Workflow
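
Steps 2 and 5 of this process can be sketched as a filter plus a progress counter. The bucket dictionaries, the `score` field, and the `decision` field are assumptions for illustration; in practice you would query buckets and statistics through Golden itself.

```python
def review_queue(buckets):
    """Undecided REVIEW buckets, highest match score first."""
    pending = [b for b in buckets
               if b["classification"] == "REVIEW" and b.get("decision") is None]
    return sorted(pending, key=lambda b: b["score"], reverse=True)


def progress(buckets):
    """Return (decided, total) for REVIEW-classified buckets."""
    review = [b for b in buckets if b["classification"] == "REVIEW"]
    decided = sum(1 for b in review if b.get("decision") in ("MERGE", "DISCONNECT"))
    return decided, len(review)


buckets = [
    {"id": "B-1", "classification": "REVIEW", "score": 0.72, "decision": None},
    {"id": "B-2", "classification": "DUPLICATES", "score": 0.95, "decision": None},
    {"id": "B-3", "classification": "REVIEW", "score": 0.85, "decision": None},
    {"id": "B-4", "classification": "REVIEW", "score": 0.65, "decision": "MERGE"},
]
queue = review_queue(buckets)
```

Sorting by score puts the highest-confidence REVIEW buckets first, which matches the review best practice of starting with those.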


Examples

Detailed step-by-step examples to help you get started:

| Example | Description | Skill Level |
| --- | --- | --- |
| Customer Data Integration | Complete ETL pipeline from CRM to Golden | Intermediate |
| Manual Deduplication Workflow | Step-by-step guide for reviewing duplicates | Beginner |
