Feature Store Guide
Create, share, and reuse ML features across models using the NATIS Feature Store.
The NATIS Feature Store is a centralized repository for ML features that enables feature reuse across models, ensures consistency between training and serving environments, and maintains a full audit trail of feature definitions and values.
Creating a Feature Table
```python
from natis.feature_store import FeatureStoreClient
from pyspark.sql import functions as F

fs = FeatureStoreClient()

# Restrict orders to the trailing 90 days so the *_90d features
# actually reflect the 90-day window named in the table description
recent_orders = spark.table("catalog.silver.orders").where(
    F.col("order_date") >= F.date_sub(F.current_date(), 90)
)

# Compute features from raw data
customer_features = (
    spark.table("catalog.silver.customers")
    .join(recent_orders, "customer_id", "left")
    .groupBy("customer_id")
    .agg(
        F.count("order_id").alias("total_orders_90d"),
        F.sum("order_value").alias("total_spend_90d"),
        F.avg("order_value").alias("avg_order_value"),
        F.max("order_date").alias("last_order_date"),
        F.datediff(F.current_date(), F.max("order_date")).alias("days_since_last_order"),
    )
    .withColumn(
        "clv_tier",
        F.when(F.col("total_spend_90d") > 5000, "high")
        .when(F.col("total_spend_90d") > 1000, "medium")
        .otherwise("low"),
    )
)

# Create the feature table (write to the Feature Store)
fs.create_table(
    name="catalog.features.customer_features_v2",
    primary_keys=["customer_id"],
    df=customer_features,
    description="Customer behavioral features for churn & CLV models, computed from 90-day order history",
    tags={"team": "ds-retention", "model": "churn", "version": "2"},
)
```
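The `clv_tier` bucketing above can be sketched in plain Python to sanity-check the thresholds outside of Spark. Note that `F.when` uses strict `>` comparisons, so a spend of exactly 1000 lands in the `low` tier (the function name and thresholds here mirror the feature definition; this is an illustrative sketch, not part of the NATIS API):

```python
def clv_tier(total_spend_90d: float) -> str:
    """Mirror of the Spark F.when(...) chain: strict > at each cutoff."""
    if total_spend_90d > 5000:
        return "high"
    if total_spend_90d > 1000:
        return "medium"
    return "low"
```

For example, `clv_tier(5000)` returns `"medium"` and `clv_tier(1000)` returns `"low"`, because both boundary values fail the strict comparison at their own cutoff.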
Serving Features for Inference
The Feature Store supports both offline serving (batch inference using Spark) and online serving (real-time inference via a REST API, with low-latency feature lookup from a Redis or DynamoDB backend).

| Serving Mode | Latency | Use Case | Backend |
| --- | --- | --- | --- |
| Offline (batch) | Minutes | Batch scoring, training dataset generation | Delta Lake |
| Online (real-time) | < 10 ms | API-based real-time inference | Redis / DynamoDB |
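At its core, the online path is a key-value lookup: the offline pipeline publishes precomputed feature rows keyed by primary key, and the serving layer reads them back at inference time. A minimal in-memory sketch of that pattern (illustrative only — `InMemoryOnlineStore`, `publish`, and `lookup` are hypothetical names, not the NATIS client API, and a real deployment would use the configured Redis or DynamoDB backend):

```python
from typing import Any, Dict, Optional


class InMemoryOnlineStore:
    """Toy stand-in for an online backend (Redis/DynamoDB), keyed by primary key."""

    def __init__(self) -> None:
        self._rows: Dict[str, Dict[str, Any]] = {}

    def publish(self, key: str, features: Dict[str, Any]) -> None:
        """Push a precomputed feature row (normally done by the offline pipeline)."""
        self._rows[key] = dict(features)

    def lookup(self, key: str) -> Optional[Dict[str, Any]]:
        """Low-latency read at inference time; returns None for unseen keys."""
        return self._rows.get(key)


store = InMemoryOnlineStore()
store.publish("cust_42", {"total_orders_90d": 7, "clv_tier": "medium"})
row = store.lookup("cust_42")
```

The key design point this illustrates is that online serving never recomputes features: it only reads rows the batch pipeline already wrote, which is what keeps lookups in the sub-10 ms range.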