Model Deployment & Serving
Deploy ML models as REST endpoints or batch jobs using NATIS Model Serving.
NATIS Model Serving deploys registered MLflow models as production REST endpoints with auto-scaling, A/B testing, traffic splitting, and built-in monitoring. No Kubernetes or container expertise required.
Deploying a Model via the UI
1. Navigate to AI/ML → Model Registry and select your model.
2. Click Create Serving Endpoint in the top-right corner.
3. Give your endpoint a name (e.g., customer-churn-v1).
4. Select the model version to deploy (a version in the Staging or Production stage; see the sketch after this list for promoting a version).
5. Configure compute: CPU Tiny (1 core), Small (4 cores), Medium (8 cores), or GPU (1 × NVIDIA T4).
6. Set the auto-scaling limits: Min Provisioned Concurrency and Max Concurrency.
7. Click Create Endpoint. The endpoint is typically live in about 5 minutes.
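Step 4 lists versions by their MLflow Model Registry stage. If the version you want isn't showing under Staging or Production yet, you can promote it with the standard MLflow client first. A minimal sketch; the model name customer-churn and version "3" are placeholders for your own registry entries:
PYTHON
from mlflow.tracking import MlflowClient

client = MlflowClient()

# Promote a registered model version to the Production stage so it
# appears as a deployable version in step 4. Name and version are
# placeholders for your own registry entries.
client.transition_model_version_stage(
    name="customer-churn",
    version="3",
    stage="Production",
    archive_existing_versions=True,  # retire older Production versions
)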
Calling the REST Endpoint
Tip: use the A/B Testing configuration (Endpoint Settings → Traffic Splitting) to route a percentage of traffic to a new model version before fully promoting it. A sketch of what this might look like programmatically follows the example below.
Once the endpoint is live, send a POST request with your input records to the endpoint's invocations URL:
PYTHON
import requests

endpoint_url = "https://app.natis.vn/serving-endpoints/customer-churn-v1/invocations"
token = "dapiXXXXXXXX"  # Personal Access Token

# One input record in MLflow's dataframe_records format
payload = {
    "dataframe_records": [
        {
            "customer_id": "cust_123",
            "total_orders_90d": 12,
            "total_spend_90d": 2450.00,
            "avg_order_value": 204.17,
            "days_since_last_order": 7,
            "clv_tier": "medium"
        }
    ]
}

response = requests.post(
    endpoint_url,
    headers={"Authorization": f"Bearer {token}"},
    json=payload,  # requests sets the Content-Type: application/json header
)
response.raise_for_status()  # fail fast on auth or malformed-input errors

result = response.json()
print(f"Churn probability: {result['predictions'][0]:.2%}")
# Output: Churn probability: 23.40%
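The tip above points at the UI path (Endpoint Settings → Traffic Splitting). If you manage endpoint configuration over the REST API instead, a canary rollout might look roughly like the sketch below. This is hypothetical: the /config route, the traffic_config payload shape, and the field names are assumptions modeled on common model-serving APIs, not a confirmed NATIS contract; check the NATIS API reference before relying on it.
PYTHON
import requests

token = "dapiXXXXXXXX"  # Personal Access Token

# HYPOTHETICAL: assumes a PUT /serving-endpoints/{name}/config route
# and this payload shape; verify against the NATIS API reference.
config_url = "https://app.natis.vn/serving-endpoints/customer-churn-v1/config"
traffic_config = {
    "traffic_config": {
        "routes": [
            {"served_model_version": "3", "traffic_percentage": 90},
            {"served_model_version": "4", "traffic_percentage": 10},  # canary
        ]
    }
}

response = requests.put(
    config_url,
    headers={"Authorization": f"Bearer {token}"},
    json=traffic_config,
)
response.raise_for_status()
Once the new version looks healthy under partial traffic, raise its percentage to 100 and archive the old version.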