MLflow Experiment Tracking

Track experiments, log metrics, and compare model runs using MLflow on NATIS.

8 min read · Updated May 2025

MLflow is deeply integrated into NATIS and automatically tracks all notebook runs. You can explicitly log parameters, metrics, artifacts, and models using the mlflow Python client.

Basic Experiment Logging

PYTHON
import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
import pandas as pd

# Set experiment (auto-creates if not exists)
mlflow.set_experiment("/Users/analyst@company.com/churn-prediction")

# Load features from Feature Store
from natis.feature_store import FeatureStoreClient
fs = FeatureStoreClient()
features_df = fs.get_feature_table("catalog.features.customer_features")

X = features_df.drop(columns=["customer_id", "churn_label"])
y = features_df["churn_label"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the model and log parameters, metrics, and the model explicitly
with mlflow.start_run(run_name="rf-v3-tuned") as run:
    mlflow.log_param("n_estimators", 200)
    mlflow.log_param("max_depth", 8)
    mlflow.log_param("feature_set", "v2.1")
    
    model = RandomForestClassifier(n_estimators=200, max_depth=8, random_state=42)
    model.fit(X_train, y_train)
    
    y_pred = model.predict(X_test)
    y_prob = model.predict_proba(X_test)[:, 1]
    
    mlflow.log_metric("accuracy", accuracy_score(y_test, y_pred))
    mlflow.log_metric("roc_auc", roc_auc_score(y_test, y_prob))
    
    # Log model to registry
    mlflow.sklearn.log_model(
        model, 
        artifact_path="model",
        registered_model_name="customer-churn-classifier"
    )
    
    print(f"Run ID: {run.info.run_id}")

Comparing Experiment Runs

Navigate to AI/ML → Experiments and click on your experiment name to see all runs. You can compare runs side-by-side, plot metric curves over time, and filter by parameters. Use the Parallel Coordinates chart to visualize the relationship between hyperparameters and performance metrics.
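If you prefer to compare runs programmatically, mlflow.search_runs() returns runs as a pandas DataFrame with flattened params.* and metrics.* columns. A minimal sketch, reusing the experiment path and metric names from the example above (the filter values are illustrative):

PYTHON
import mlflow

# Query runs for the experiment used in the logging example above.
runs = mlflow.search_runs(
    experiment_names=["/Users/analyst@company.com/churn-prediction"],
    filter_string="metrics.roc_auc > 0.8 and params.feature_set = 'v2.1'",
    order_by=["metrics.roc_auc DESC"],
    max_results=10,
)

# Each row is one run; params and metrics are flattened into columns,
# so side-by-side comparison is a plain DataFrame selection.
print(runs[["run_id", "params.n_estimators", "params.max_depth", "metrics.roc_auc"]])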

Model Staging and Promotion

  1. In the Experiments UI, find the best run and click Register Model (or use the Python API).
  2. A new version is created in the Model Registry under the model name you specified.
  3. Review model performance, feature importance, and validation metrics in the Model Registry.
  4. Transition the model from Staging → Production using the Stage dropdown or MlflowClient.transition_model_version_stage() (see the sketch after this list).
  5. Set up webhooks to trigger downstream tasks (e.g., A/B test deployment) when a model is promoted to Production.
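A minimal sketch of step 4 with the Python API, assuming the model name registered in the logging example above and the stage-based registry workflow (API details may vary with your MLflow version):

PYTHON
from mlflow.tracking import MlflowClient

client = MlflowClient()

# Find the newest version currently in Staging for the registered model.
latest_staging = client.get_latest_versions(
    "customer-churn-classifier", stages=["Staging"]
)[0]

# Promote it to Production; archive_existing_versions retires the
# previous Production version so only one version serves at a time.
client.transition_model_version_stage(
    name="customer-churn-classifier",
    version=latest_staging.version,
    stage="Production",
    archive_existing_versions=True,
)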
