Sources
Purpose
Sources read data from external systems (files and relational databases) or from existing Golden tables and load it into Golden tables.
Source Types
SourceFile - File-Based Sources
Read data from CSV, JSON, or NDJSON files.
{
  "type": "source-file",
  "id": "customer_csv_source",
  "description": "Customer data from CSV files",
  "dataset": "customer_dataset",
  "inputPattern": "/data/customers/*.csv",
  "format": "CSV",
  "separator": ",",
  "header": true,
  "ignoreQuotes": false
}
Configuration:
inputPattern - File path pattern (supports the * wildcard)
format - CSV, JSON, or NDJSON
separator - Delimiter for CSV (default: ";")
header - Whether the CSV has a header row
ignoreQuotes - Quote handling for CSV
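Because separator defaults to ";", a semicolon-delimited file can presumably be read without setting it. The following is a minimal sketch, assuming that a field with a documented default may be omitted; the id, dataset name, and path are hypothetical.

```json
{
  "type": "source-file",
  "id": "supplier_csv_source",
  "description": "Hypothetical example: semicolon-delimited files read with the default separator",
  "dataset": "supplier_dataset",
  "inputPattern": "/data/suppliers/*.csv",
  "format": "CSV",
  "header": true,
  "ignoreQuotes": false
}
```

Setting separator explicitly, as in the comma-delimited example above, is the safer choice when the delimiter of incoming files is not guaranteed.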
SourceJdbc - Database Sources
Read data from relational databases via JDBC.
{
  "type": "source-jdbc",
  "id": "orders_postgres_source",
  "description": "Orders from PostgreSQL",
  "dataset": "order_dataset",
  "url": "jdbc:postgresql://localhost:5432/orders",
  "credentials": "postgres_credentials",
  "table": "public.orders",
  "timestampFilteringFlag": true,
  "timestampFilteringColumn": "created_at",
  "readIdFlag": false,
  "properties": {
    "ssl": "true"
  }
}
Configuration:
url - JDBC connection string
credentials - Reference to a credentials resource
table - Database table (format: catalog.schema.table)
timestampFilteringFlag - Enable date range filtering
timestampFilteringColumn - Column used for incremental loads
customScripts - Enable custom SQL queries
queryScript - Custom SELECT query
countQueryScript - Custom COUNT query
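The custom query options do not appear in the example above. A sketch of how they might be combined follows, assuming that customScripts is a boolean flag and that queryScript and countQueryScript take SQL strings; the id, column names, and status predicate are hypothetical, and whether table is still required when custom scripts are enabled is not stated here.

```json
{
  "type": "source-jdbc",
  "id": "shipped_orders_source",
  "description": "Hypothetical example: custom query; value types of the script fields are assumed",
  "dataset": "order_dataset",
  "url": "jdbc:postgresql://localhost:5432/orders",
  "credentials": "postgres_credentials",
  "table": "public.orders",
  "customScripts": true,
  "queryScript": "SELECT id, customer_id, total, created_at FROM public.orders WHERE status = 'shipped'",
  "countQueryScript": "SELECT COUNT(*) FROM public.orders WHERE status = 'shipped'"
}
```

Both queries use the same WHERE clause so that the count describes the same rows the SELECT returns.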
Incremental Loading:
When timestampFilteringFlag is enabled, only records modified since the last execution are loaded, as determined by the column named in timestampFilteringColumn.
SourceTable - Internal Table Sources
Read data from Golden tables.
{
  "type": "source-table",
  "id": "staging_table_source",
  "table": "staging_customers"
}
Common Source Properties
All sources support:
dataset - Data structure definition
fromFilter - Start timestamp for filtering
toFilter - End timestamp for filtering
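As an illustration, the filter properties might bound a JDBC load to a single month. This is only a sketch: the ISO 8601 timestamp format is an assumption, as is setting the bounds directly in the source definition, and whether each bound is inclusive is not specified here. The id is hypothetical.

```json
{
  "type": "source-jdbc",
  "id": "orders_january_source",
  "description": "Hypothetical example: the timestamp format of fromFilter and toFilter is assumed",
  "dataset": "order_dataset",
  "url": "jdbc:postgresql://localhost:5432/orders",
  "credentials": "postgres_credentials",
  "table": "public.orders",
  "timestampFilteringFlag": true,
  "timestampFilteringColumn": "created_at",
  "fromFilter": "2024-01-01T00:00:00Z",
  "toFilter": "2024-02-01T00:00:00Z"
}
```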