Transformation Concepts
Understand the NATIS data transformation model: lakehouse zones, transformation layers, and the pipeline DAG.
NATIS organizes transformed data across three lakehouse zones: Raw (as-ingested), Silver (cleaned and standardized), and Gold (business-ready aggregates). Transformation pipelines move data progressively through these zones using a directed acyclic graph (DAG) execution model.
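As a rough illustration of the DAG execution model, the sketch below orders a small pipeline so that every step runs only after the steps it depends on. The step names and the `execution_order` and `is_acyclic` helpers are hypothetical and not part of NATIS; the example uses Python's standard `graphlib` module only to show the idea.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each key is a step, each value is the set of
# steps that must finish before it can run.
pipeline = {
    "raw_orders": set(),
    "raw_customers": set(),
    "silver_orders": {"raw_orders"},
    "silver_customers": {"raw_customers"},
    "gold_revenue_by_customer": {"silver_orders", "silver_customers"},
}


def execution_order(dag):
    """Return one valid run order for the steps in the graph."""
    return list(TopologicalSorter(dag).static_order())


def is_acyclic(dag):
    """Return True if the graph has no dependency cycles."""
    try:
        TopologicalSorter(dag).prepare()
        return True
    except CycleError:
        return False


order = execution_order(pipeline)
print(order)
```

Because the graph is acyclic, a valid order always exists. A cycle (for example, two steps that depend on each other) would make the pipeline impossible to schedule, which is why `is_acyclic` is worth checking before a run.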
Lakehouse Zones
| Zone | Also Known As | Data Quality | Typical Use |
| --- | --- | --- | --- |
| Raw Zone | Bronze Layer | As-ingested, no changes | Data retention, audit, replay |
| Silver Zone | Refined Layer | Cleaned, deduplicated, typed | Exploration, ad-hoc SQL |
| Gold Zone | Curated Layer | Aggregated, business-logic applied | BI dashboards, ML features |
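To make the progression between zones concrete, here is a minimal sketch in plain Python. The records, field names, and the `to_silver` and `to_gold` functions are invented for illustration; in NATIS these steps would be built with one of the supported transformation methods rather than hand-written like this.

```python
from collections import defaultdict

# Hypothetical Raw zone records: as-ingested strings, including a
# duplicate row and a row with an unparseable amount.
raw_rows = [
    {"order_id": "1001", "customer": " Acme ", "amount": "25.50"},
    {"order_id": "1001", "customer": " Acme ", "amount": "25.50"},
    {"order_id": "1002", "customer": "acme", "amount": "10.00"},
    {"order_id": "1003", "customer": "Globex", "amount": "n/a"},
]


def to_silver(rows):
    """Clean, type, and deduplicate raw rows (first row per order_id wins)."""
    seen = set()
    silver = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            # Skipped here for brevity; a real pipeline would more likely
            # route bad rows to a quarantine table.
            continue
        order_id = int(row["order_id"])
        if order_id in seen:
            continue
        seen.add(order_id)
        silver.append({
            "order_id": order_id,
            "customer": row["customer"].strip().lower(),
            "amount": amount,
        })
    return silver


def to_gold(silver):
    """Aggregate Silver rows into total order amount per customer."""
    totals = defaultdict(float)
    for row in silver:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


silver_rows = to_silver(raw_rows)
gold_totals = to_gold(silver_rows)
print(gold_totals)
```

The Raw records are never modified, which matches their role in retention, audit, and replay: Silver and Gold can always be rebuilt from them.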
Transformation Methods
All transformations run on NATIS-managed clusters, and compute costs are attributed to the pipeline owner's workspace budget. Use the Cost Estimator (available in the Pipeline Editor) before scheduling high-volume jobs. The following transformation methods are available:
- SQL Transformations — drag-and-drop SQL nodes in the Pipeline Editor; supports CTEs, window functions, UDFs
- Spark Notebooks — PySpark or Scala for complex transformations; full Spark 3.5 API available
- dbt Integration — deploy and run dbt models natively within NATIS pipelines
- Low-Code Builder — visual column mapping, type casting, filter, join, and pivot widgets
- Python Scripts — general-purpose transformation scripts with pandas, polars, or PySpark
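The kind of query a SQL transformation node might hold can be sketched as follows: a CTE combined with a window function that keeps only the latest record per user, a common Raw-to-Silver deduplication step. The table and column names are hypothetical, and the query runs against an in-memory SQLite database (version 3.25 or later, for window function support) purely as a stand-in; the SQL dialect NATIS uses is not specified here and may differ.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_events (user_id INTEGER, status TEXT, updated_at TEXT);
    INSERT INTO raw_events VALUES
        (1, 'trial',  '2024-01-01'),
        (1, 'active', '2024-02-01'),
        (2, 'active', '2024-01-15');
""")

# CTE + window function: rank each user's rows by recency, keep the newest.
latest_per_user = """
    WITH ranked AS (
        SELECT user_id, status, updated_at,
               ROW_NUMBER() OVER (
                   PARTITION BY user_id
                   ORDER BY updated_at DESC
               ) AS rn
        FROM raw_events
    )
    SELECT user_id, status, updated_at
    FROM ranked
    WHERE rn = 1
    ORDER BY user_id
"""

rows = conn.execute(latest_per_user).fetchall()
print(rows)
```

The same logic could be written in a Spark notebook or a dbt model; which method fits best depends on who maintains the pipeline and how complex the transformation is.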