Data Transformation

Pipeline Orchestration & Scheduling

Schedule, monitor, and manage pipeline runs with NATIS Workflow Orchestrator.

7 min read · Updated May 2025

NATIS Workflow Orchestrator (NWO) is a built-in DAG-based orchestration engine compatible with Apache Airflow 2.x operators. You can define pipelines visually in the Pipeline Editor or as code using the NATIS Pipeline SDK.

Scheduling Options

  • Cron Expression — Standard 5-field cron (e.g., 0 2 * * * for daily at 2AM)
  • Preset Schedules — Every 15min, Hourly, Daily, Weekly, Monthly
  • Event-Based Triggers — Trigger on file arrival, table update, or API webhook
  • Dependency Chains — Start after upstream pipeline succeeds
  • Manual / Ad-Hoc — Run on demand from UI, CLI, or API

Pipeline YAML Definition

YAML
# pipeline-daily-sales.yaml
name: daily_sales_pipeline
description: Ingest and transform daily sales data
schedule: "0 6 * * *"  # Run daily at 6AM UTC
timezone: Asia/Ho_Chi_Minh
tags: [sales, daily, production]

tasks:
  - id: ingest_raw_sales
    type: ingestion
    source: postgresql://prod-db/sales
    destination: catalog.raw.sales
    incremental_key: updated_at
    on_failure: retry(3, delay=5m)

  - id: transform_silver
    type: sql
    depends_on: [ingest_raw_sales]
    query: |
      INSERT INTO catalog.silver.sales_cleaned
      SELECT * FROM catalog.raw.sales
      WHERE amount > 0 AND currency IS NOT NULL

  - id: build_gold_aggregates
    type: notebook
    depends_on: [transform_silver]
    notebook_path: /notebooks/sales/daily-aggregates
    cluster_size: medium

  - id: refresh_dashboard
    type: dashboard_refresh
    depends_on: [build_gold_aggregates]
    dashboard_id: dash_abc123

notifications:
  on_failure:
    - email: data-team@company.com
    - slack: "#data-alerts"

Monitoring Pipeline Runs

View all pipeline runs under Pipelines → Run History. Each run shows: execution status, start/end time, duration, rows processed, compute cost, and a Gantt chart of task execution. Click any failed task to see the full error log and stack trace.

Was this page helpful?

Thanks for your feedback!