Ingestion Overview
Understand NATIS ingestion patterns — batch, micro-batch, streaming, and CDC — and how to choose the right approach.
Data ingestion in NATIS is handled by the Ingestion Engine, a scalable subsystem that supports batch loads, real-time streaming, and change data capture from databases. All ingested data lands in the Raw Zone of your Delta Lake lakehouse and is automatically catalogued in Unity Catalog.
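Change data capture, mentioned above, ultimately reduces to applying an ordered stream of row-level change events to a target table. As a minimal sketch of that idea only (the event shape, field names, and `apply_cdc_events` helper below are illustrative assumptions, not the NATIS wire format):

```python
# Sketch of applying CDC events to a target keyed by primary key.
# The "op"/"pk"/"row" event fields are assumed for illustration.

def apply_cdc_events(table: dict, events: list[dict]) -> dict:
    """Apply insert/update/delete change events in order."""
    for event in events:
        key = event["pk"]
        if event["op"] in ("insert", "update"):
            table[key] = event["row"]   # upsert the new row image
        elif event["op"] == "delete":
            table.pop(key, None)        # drop the row if present
    return table

# Example: replicate three ordered changes from a source database.
events = [
    {"op": "insert", "pk": 1, "row": {"id": 1, "name": "Ada"}},
    {"op": "update", "pk": 1, "row": {"id": 1, "name": "Ada L."}},
    {"op": "delete", "pk": 1, "row": None},
]
print(apply_cdc_events({}, events))  # → {}
```

Because events are applied in commit order, replaying the full stream always reproduces the source table's final state.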
Ingestion Patterns
| Pattern | Latency | Throughput | Best For |
| --- | --- | --- | --- |
| Full Batch | Hours | Very High (TB+) | Initial loads, historical backfill |
| Incremental Batch | Minutes–Hours | High | Daily/hourly data updates |
| Micro-batch (Structured Streaming) | Seconds–Minutes | High | Near-real-time analytics |
| Change Data Capture (CDC) | Milliseconds–Seconds | Medium | Database replication, event sourcing |
| Real-Time Streaming (Kafka) | Milliseconds | Very High | Event streams, IoT, clickstream |
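The incremental-batch row in the table hinges on a high-water mark: each run extracts only rows changed since the last run's maximum timestamp. A minimal sketch of that pattern, using an in-memory SQLite table (the `orders` table and column names are made up for the example):

```python
import sqlite3

# Toy source table standing in for a relational source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, updated_at INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 100), (2, 200), (3, 300)])

def incremental_extract(conn, watermark: int) -> tuple[list, int]:
    """Pull rows changed since `watermark`; return them plus the new watermark."""
    rows = conn.execute(
        "SELECT id, updated_at FROM orders"
        " WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()
    new_watermark = rows[-1][1] if rows else watermark
    return rows, new_watermark

rows, wm = incremental_extract(conn, 100)  # first run, watermark = 100
print(rows, wm)  # → [(2, 200), (3, 300)] 300
```

Persisting the returned watermark between runs is what keeps each batch small; a rerun with the new watermark returns nothing until the source changes again.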
Supported Source Categories
NATIS maintains pre-built connectors for 200+ data sources. Custom connectors can be built using the NATIS Connector SDK (Python or Java). See the API & SDK section for details.
- Relational Databases — PostgreSQL, MySQL, Oracle, SQL Server, MariaDB
- Cloud Data Warehouses — Snowflake, BigQuery, Redshift, Synapse
- File Storage — HDFS, S3, Azure Blob, Google Cloud Storage, SFTP
- SaaS Applications — Salesforce, HubSpot, Stripe, Shopify, SAP
- Message Queues — Apache Kafka, Azure Event Hub, Amazon Kinesis, RabbitMQ
- APIs — REST, GraphQL, SOAP with custom connector framework
- NoSQL Databases — MongoDB, Cassandra, DynamoDB, Elasticsearch
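When no pre-built connector fits, a custom connector generally reduces to an iterator of records fetched page by page. The sketch below illustrates that shape only; the `BaseConnector` class and `read` signature are hypothetical and do not reflect the actual NATIS Connector SDK API:

```python
from abc import ABC, abstractmethod
from typing import Iterator

class BaseConnector(ABC):
    """Hypothetical connector interface; the real SDK names may differ."""
    @abstractmethod
    def read(self) -> Iterator[dict]:
        """Yield source records one at a time."""

class InMemoryConnector(BaseConnector):
    """Toy connector that pages through a list, simulating a paginated API."""
    def __init__(self, records: list[dict], page_size: int = 2):
        self.records = records
        self.page_size = page_size

    def read(self) -> Iterator[dict]:
        # Fetch one "page" at a time, as a REST connector would.
        for start in range(0, len(self.records), self.page_size):
            yield from self.records[start:start + self.page_size]

rows = list(InMemoryConnector([{"id": i} for i in range(5)]).read())
print(len(rows))  # → 5
```

Yielding records lazily keeps memory flat regardless of source size, which is why connector interfaces are typically iterator-based.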