
Resources

Introduction

Resources are the building blocks of data integration in Golden Core. They define how data flows through the system: reading external sources, transforming data structures, and writing to target destinations.

Resources are reusable, configurable components that can be mixed and matched to create complex data workflows.

Resource Types Overview

Resource Type    Purpose                           Used For
Dataset          Define data structure (schema)    Tables, sources, sinks, transformations
Source           Read data from external systems   Data ingestion, ETL loads
Sink             Write data to external systems    Data export, integration
Transformation   Map between data structures       Schema mapping, field renaming
Pipeline         Multi-stage data processing       Data cleaning, validation, enrichment
Indexer          Define search indexes             Entity search, duplicate detection
Classifier       Compare records for similarity    Duplicate detection
Merger           Create golden records             Master data management

This section covers Datasets, Sources, Sinks, Transformations, and Pipelines. See the Entities Guide for indexers, classifiers, and mergers.


Resource Management

List Resources

Endpoint: GET /resources
Permission: resource.list

Get Resource Details

Endpoint: GET /resources/id/{id}

Permission: resource.view

Create or Update Resource

Endpoint: POST /resources
Permission: resource.save

JSON
{
  "test": false,
  "resource": {
    /* resource configuration */
  }
}

Set test: true to validate without saving.
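The save request above can be sketched as a small client-side helper. This is a minimal illustration, assuming a Python client; the resource fields shown are hypothetical, and only the `test`/`resource` envelope comes from the endpoint documented here.

```python
import json

def build_save_request(resource: dict, test: bool = False) -> str:
    """Build the JSON body for POST /resources.

    Pass test=True to ask the server to validate the resource
    without saving it (a dry run).
    """
    return json.dumps({"test": test, "resource": resource})

# Dry-run validation of a hypothetical dataset resource:
body = build_save_request({"id": "customer_dataset", "type": "DATASET"}, test=True)
```

Sending `body` to POST /resources with `test: true` reports validation errors without persisting anything, which is useful in CI checks before a real deployment.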

Delete Resource

Endpoint: DELETE /resources/id/{id}

Permission: resource.delete

Resources in use by entities or other resources cannot be deleted.

Duplicate Resource

Endpoint: PUT /resources/duplicate/{id}/{newId}

Permission: resource.save

Rename Resource

Endpoint: PUT /resources/rename/{id}/{newId}

Permission: resource.save


Import and Export

Export Resources

Endpoint: POST /resources/export

JSON
{
  "ids": ["customer_dataset", "crm_source", "normalize_customer"]
}

Returns a JSON package suitable for backup or migration.

Import Resources

Endpoint: POST /resources/import

JSON
{
  "resources": [
    { /* resource configuration */ }
  ]
}

Validates and imports the resources. Dependencies are resolved automatically.
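A backup/restore round trip between the two endpoints can be sketched as follows. This assumes the export package contains a `resources` array matching the import body shown above; that package shape is an assumption, not confirmed by this page.

```python
def build_export_request(ids: list[str]) -> dict:
    """Body for POST /resources/export."""
    return {"ids": ids}

def build_import_request(export_package: dict) -> dict:
    """Body for POST /resources/import, reusing an exported package.

    Assumes the export response carries a 'resources' array
    (hypothetical; adjust to the actual export format).
    """
    return {"resources": export_package.get("resources", [])}

# Hypothetical exported package being restored on another environment:
exported = {"resources": [{"id": "customer_dataset"}, {"id": "crm_source"}]}
restore_body = build_import_request(exported)
```

Because the import endpoint resolves dependencies automatically, the order of entries in the `resources` array should not matter.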


ETL Workflows

Resources power ETL operations through three primary patterns:

LOAD Operation

Extract from source and load into table:

CODE
Source → (Transformation) → (Pipeline) → Target Table

Example:

BASH
POST /tables/load
{
  "source": "crm_source",
  "transformation": "normalize_customer",
  "pipeline": "customer_cleaning",
  "sinkTable": "customer_table",
  "operation": "UPSERT"
}

TRANSFORM Operation

Apply transformations to existing data:

CODE
Source Table → (Transformation/Pipeline) → Same Table

Example:

BASH
POST /tables/transform
{
  "source": "customer_table",
  "transformation": "enrich_customer",
  "maxRecords": 10000
}

EXPORT Operation

Export table data to external systems:

CODE
Source Table → (Transformation) → Sink

Example:

BASH
POST /tables/export
{
  "source": "customer_table",
  "transformation": "format_export",
  "maxRecords": 5000
}

Best Practices

Dataset Design

  • Start with core fields - Add optional fields later

  • Use meaningful keys - Clear column names

  • Enable lookups - Index frequently queried fields

  • Nested datasets - Use for complex hierarchical data

  • Validate early - Use mandatory and validation rules

Source Configuration

  • Incremental loading - Enable timestamp filtering for large datasets

  • Custom queries - Optimize with custom SQL for complex joins

  • Connection pooling - Configure JDBC properties for performance

  • Error handling - Test connections before production use

Transformation Strategy

  • Modular mappings - Create reusable transformations

  • Carry over carefully - Consider carryOverData implications

  • Test thoroughly - Validate mappings with sample data

  • Script sparingly - Use COLUMN/CONCATENATE when possible

Pipeline Design

  • Logical ordering - Clean before transform

  • Single responsibility - One purpose per processor

  • Reusable components - Reference transformations instead of duplicating

  • Monitor performance - Complex pipelines can slow processing


Troubleshooting

Resource Validation Fails

Check:

  • Referenced resources exist (datasets, credentials)

  • JDBC URLs are well-formed

  • File patterns are valid paths

  • Transformation source/target datasets match
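A quick client-side sanity check for the JDBC URL item above can be sketched with a regular expression. This pattern is illustrative only: real drivers accept more variants than it covers (e.g. Oracle thin-style or SQL Server semicolon-delimited URLs), so treat a non-match as a prompt to double-check, not a hard failure.

```python
import re

# Rough shape of common host/port JDBC URLs, e.g.
# jdbc:postgresql://db.example.com:5432/crm
JDBC_URL = re.compile(r"^jdbc:[a-z0-9]+://[^/\s:]+(:\d+)?(/[^\s]*)?$")

def looks_like_jdbc_url(url: str) -> bool:
    """Heuristic well-formedness check before submitting a resource."""
    return JDBC_URL.match(url) is not None
```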

Source Connection Errors

Check:

  • Network connectivity to source system

  • Credentials are correct and not expired

  • Database/API permissions granted

  • JDBC drivers available for database type

Transformation Mapping Errors

Check:

  • Source columns exist in source dataset

  • Target columns exist in target dataset

  • Data types are compatible

  • Scripts have no syntax errors

Pipeline Processing Errors

Check:

  • Processors execute in logical order

  • Referenced transformations exist

  • Scripts handle null values

  • Dataset matches pipeline configuration
