diff --git a/AI/README.md b/AI/README.md
index b4a1f6e22..e938a5fd8 100644
--- a/AI/README.md
+++ b/AI/README.md
@@ -28,6 +28,10 @@ We are particularly interested in examples that are:
 * Modular and showcase best practices.
 * Cover a diverse range of tools and MLOps stages.

-## Current Status
+## Available Examples

-_This section is currently being populated. Check back soon for our first set of AI/ML examples!_
+| Example | Description | GPU Required |
+|---|---|---|
+| [Model Inference with Scikit-Learn](model-inference-sklearn/) | Minimal inference API (FastAPI + scikit-learn) with Kubernetes best practices (probes, resource limits, security context) | No |
+| [TensorFlow Model Serving](model-serving-tensorflow/) | Deploy TensorFlow Serving with PersistentVolumes and Ingress | No (CPU mode) |
+| [vLLM Inference Server](vllm-deployment/) | Serve large language models (Gemma) with vLLM and optional HPA | Yes |
diff --git a/AI/model-inference-sklearn/README.md b/AI/model-inference-sklearn/README.md
new file mode 100644
index 000000000..b8908493c
--- /dev/null
+++ b/AI/model-inference-sklearn/README.md
@@ -0,0 +1,290 @@
+# Minimal AI Model Inference on Kubernetes (Scikit-Learn + FastAPI)
+
+## Purpose / What You'll Learn
+
+This example demonstrates how to deploy a lightweight AI/ML model for
+real-time inference on Kubernetes -- without GPUs, specialized hardware, or
+heavy ML platforms. You'll learn how to:
+
+- Train and package a [scikit-learn](https://scikit-learn.org/) model inside a
+  container image.
+- Serve predictions through a [FastAPI](https://fastapi.tiangolo.com/)
+  REST API.
+- Deploy the inference server to Kubernetes using a
+  [Deployment](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/)
+  with **resource requests/limits**, **liveness and readiness probes**, and
+  **security hardening** (non-root user, read-only filesystem).
+- Expose the server with a
+  [Service](https://kubernetes.io/docs/concepts/services-networking/service/).
+- Send test predictions and verify the setup.
+
+This is the simplest possible "AI on Kubernetes" pattern -- ideal for learning
+the fundamentals before moving to GPU-accelerated serving solutions such as
+[TensorFlow Serving](../model-serving-tensorflow/) or
+[vLLM](../vllm-deployment/).
+
+---
+
+## Table of Contents
+
+- [Prerequisites](#prerequisites)
+- [Quick Start / TL;DR](#quick-start--tldr)
+- [Detailed Steps & Explanation](#detailed-steps--explanation)
+  - [1. Build the Container Image](#1-build-the-container-image)
+  - [2. Deploy to Kubernetes](#2-deploy-to-kubernetes)
+  - [3. Expose the Service](#3-expose-the-service)
+- [Verification / Seeing it Work](#verification--seeing-it-work)
+- [Configuration Customization](#configuration-customization)
+- [Cleanup](#cleanup)
+- [Troubleshooting](#troubleshooting)
+- [Further Reading / Next Steps](#further-reading--next-steps)
+
+---
+
+## Prerequisites
+
+| Requirement | Details |
+|---|---|
+| Kubernetes cluster | v1.27 or later (tested with v1.31) |
+| `kubectl` | Configured and in your `PATH` |
+| Container runtime | Docker or a compatible builder (Podman, etc.) |
+| Container registry | Any registry your cluster can pull from |
+| `curl` | For sending test requests |
+
+> **Note:** This example does **not** require GPUs. It runs on any standard
+> CPU node, making it easy to try on Minikube, kind, or a managed cluster.
+
+---
+
+## Quick Start / TL;DR
+
+```shell
+# 1. Clone the repo and build the image (replace <your-registry>)
+git clone --depth 1 https://github.com/kubernetes/examples.git
+cd examples/AI/model-inference-sklearn
+docker build -t <your-registry>/sklearn-inference:v1.0.0 image/
+docker push <your-registry>/sklearn-inference:v1.0.0
+
+# 2. Update the image reference in deployment.yaml, then apply all manifests
+# Replace <your-registry> with your actual registry, e.g. docker.io/myuser
+kubectl apply -f deployment.yaml -f service.yaml -f pdb.yaml
+
+# 3. Wait for rollout and test
+kubectl wait --for=condition=Available deployment/sklearn-inference --timeout=120s
+kubectl port-forward service/sklearn-inference 8080:80 &
+curl -s -X POST http://localhost:8080/predict \
+  -H "Content-Type: application/json" \
+  -d '{"instances": [[5.1, 3.5, 1.4, 0.2]]}'
+```
+
+---
+
+## Detailed Steps & Explanation
+
+### 1. Build the Container Image
+
+The `image/` directory contains everything needed to build the inference
+server:
+
+```
+image/
+- Dockerfile          # Multi-stage build: train -> serve
+- app.py              # FastAPI inference server
+- train_model.py      # Trains & saves the scikit-learn model
+- requirements.txt    # Pinned Python dependencies
+```
+
+**Multi-stage Dockerfile explained:**
+
+- **Stage 1 (builder):** installs Python dependencies, runs
+  `train_model.py` to train a Random Forest classifier on the
+  [Iris dataset](https://scikit-learn.org/stable/auto_examples/datasets/plot_iris_dataset.html),
+  and saves the model as `iris_model.joblib`.
+- **Stage 2 (runtime):** copies only the application code and the trained
+  model into a slim image, creates a non-root user, and starts the
+  FastAPI server.
+
+Clone the repository and build from the `image/` directory:
+
+```shell
+git clone --depth 1 https://github.com/kubernetes/examples.git
+cd examples/AI/model-inference-sklearn
+docker build -t <your-registry>/sklearn-inference:v1.0.0 image/
+docker push <your-registry>/sklearn-inference:v1.0.0
+```
+
+### 2. Deploy to Kubernetes
+
+Before applying, update the `image` field in `deployment.yaml` to point to
+your registry:
+
+```yaml
+# In deployment.yaml -> spec.template.spec.containers[0]
+image: <your-registry>/sklearn-inference:v1.0.0
+```
+
+Then apply the manifests:
+
+```shell
+kubectl apply -f deployment.yaml -f service.yaml -f pdb.yaml
+```
+
+**What the manifests provide:**
+
+| Feature | How |
+|---|---|
+| Replicas | `replicas: 2` for basic availability |
+| Resource governance | CPU/memory requests **and** limits |
+| Readiness probe | `GET /readyz` -- traffic is routed only after the model loads |
+| Liveness probe | `GET /healthz` -- container restarts if the process hangs |
+| Security | `runAsNonRoot`, `readOnlyRootFilesystem`, all capabilities dropped |
+| Writable temp storage | `emptyDir` mounted at `/tmp` for Python runtime |
+| Disruption budget | [PodDisruptionBudget](https://kubernetes.io/docs/tasks/run-application/configure-pdb/) keeps at least 1 replica during node drains |
+| Pod spreading | `topologySpreadConstraints` distributes replicas across nodes |
+
+Wait for the rollout:
+
+```shell
+kubectl wait --for=condition=Available deployment/sklearn-inference --timeout=120s
+```
+
+Check pod status:
+
+```shell
+kubectl get pods -l app=sklearn-inference
+```
+
+Expected output:
+
+```
+NAME                                 READY   STATUS    RESTARTS   AGE
+sklearn-inference-6f9b8d7c5f-abc12   1/1     Running   0          30s
+sklearn-inference-6f9b8d7c5f-def34   1/1     Running   0          30s
+```
+
+### 3. Expose the Service
+
+The included `service.yaml` creates a `ClusterIP` service on port 80.
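+
+If you first want to check the service from inside the cluster (no
+port-forward required), a throwaway test pod works. This is a quick sketch
+that assumes the `default` namespace and the public `busybox:1.36` image:
+
+```shell
+kubectl run tmp-client --rm -it --restart=Never --image=busybox:1.36 -- \
+  wget -qO- http://sklearn-inference/healthz
+```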
+Access it from your workstation via port-forward:
+
+```shell
+kubectl port-forward service/sklearn-inference 8080:80
+```
+
+---
+
+## Verification / Seeing it Work
+
+With the port-forward running, send a prediction request:
+
+```shell
+curl -s -X POST http://localhost:8080/predict \
+  -H "Content-Type: application/json" \
+  -d '{"instances": [[5.1, 3.5, 1.4, 0.2], [6.7, 3.0, 5.2, 2.3]]}'
+```
+
+Expected output:
+
+```json
+{
+  "predictions": [
+    {
+      "label": "setosa",
+      "probability": 1.0
+    },
+    {
+      "label": "virginica",
+      "probability": 0.96
+    }
+  ]
+}
+```
+
+You can also verify the health endpoints:
+
+```shell
+# Liveness
+curl -s http://localhost:8080/healthz
+```
+
+```json
+{"status": "alive"}
+```
+
+```shell
+# Readiness
+curl -s http://localhost:8080/readyz
+```
+
+```json
+{"status": "ready"}
+```
+
+Check the container logs:
+
+```shell
+kubectl logs -l app=sklearn-inference --tail=20
+```
+
+Expected output:
+
+```
+INFO:inference-server:Loading model from /model/iris_model.joblib ...
+INFO:inference-server:Model loaded successfully.
+INFO:     Started server process [1]
+INFO:     Waiting for application startup.
+INFO:     Application startup complete.
+INFO:     Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)
+```
+
+---
+
+## Configuration Customization
+
+| Parameter | How to Change |
+|---|---|
+| **Model** | Replace `train_model.py` with your own training script. Update `app.py` to match the new model's input/output schema. Rebuild the image. |
+| **Replicas** | Edit `spec.replicas` in `deployment.yaml`. |
+| **Resource limits** | Adjust `resources.requests` and `resources.limits` in `deployment.yaml` to match your model's footprint. |
+| **Port** | Set the `PORT` environment variable in `deployment.yaml` and update the `containerPort` accordingly. |
+| **External access** | Change the Service `type` to `LoadBalancer` or add an [Ingress](https://kubernetes.io/docs/concepts/services-networking/ingress/) resource. |
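+
+For example, a minimal Ingress for this Service (not shipped with this
+example) might look like the sketch below. It assumes an NGINX ingress
+controller is installed and uses a hypothetical hostname:
+
+```yaml
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+  name: sklearn-inference
+spec:
+  ingressClassName: nginx          # assumption: an "nginx" IngressClass exists
+  rules:
+  - host: sklearn.example.com      # hypothetical hostname
+    http:
+      paths:
+      - path: /
+        pathType: Prefix
+        backend:
+          service:
+            name: sklearn-inference
+            port:
+              number: 80
+```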
+
+---
+
+## Cleanup
+
+Remove all resources created by this example:
+
+```shell
+kubectl delete -f pdb.yaml -f service.yaml -f deployment.yaml
+```
+
+---
+
+## Troubleshooting
+
+| Symptom | Likely Cause | Fix |
+|---|---|---|
+| Pod stays in `CrashLoopBackOff` | Model file missing or corrupt | Rebuild the image and verify `iris_model.joblib` exists at `/model/` |
+| Readiness probe fails | Application hasn't started yet | Increase `initialDelaySeconds` in the readiness probe |
+| `ImagePullBackOff` | Wrong image reference or registry auth | Verify the `image` field and ensure your cluster has pull access |
+| `curl` returns connection refused | Port-forward not active | Re-run `kubectl port-forward service/sklearn-inference 8080:80` |
+
+---
+
+## Further Reading / Next Steps
+
+- [Kubernetes Deployments](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/)
+- [Configure Liveness, Readiness and Startup Probes](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/)
+- [Resource Management for Pods and Containers](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/)
+- [Pod Security Standards](https://kubernetes.io/docs/concepts/security/pod-security-standards/)
+- [PodDisruptionBudgets](https://kubernetes.io/docs/tasks/run-application/configure-pdb/)
+- [scikit-learn Documentation](https://scikit-learn.org/stable/)
+- [FastAPI Documentation](https://fastapi.tiangolo.com/)
+- More AI examples in this repo:
+  [TensorFlow Serving](../model-serving-tensorflow/) |
+  [vLLM Inference](../vllm-deployment/)
+
+---
+
+**Last Validated Kubernetes Version:** v1.31
diff --git a/AI/model-inference-sklearn/deployment.yaml b/AI/model-inference-sklearn/deployment.yaml
new file mode 100644
index 000000000..96544f148
--- /dev/null
+++ b/AI/model-inference-sklearn/deployment.yaml
@@ -0,0 +1,106 @@
+# Deployment for the scikit-learn Iris inference API.
+#
+# Key best practices demonstrated:
+# - Explicit resource requests and limits
+# - Liveness and readiness probes
+# - Non-root security context
+# - Read-only root filesystem
+# - Specific image tag (no ":latest")
+#
+# For more on Deployments see:
+# https://kubernetes.io/docs/concepts/workloads/controllers/deployment/
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: sklearn-inference
+  labels:
+    app: sklearn-inference
+spec:
+  replicas: 2
+  selector:
+    matchLabels:
+      app: sklearn-inference
+  template:
+    metadata:
+      labels:
+        app: sklearn-inference
+    spec:
+      # Security best practice: do not run containers as root.
+      # https://kubernetes.io/docs/concepts/security/pod-security-standards/
+      securityContext:
+        runAsNonRoot: true
+        runAsUser: 10001
+        runAsGroup: 10001
+        seccompProfile:
+          type: RuntimeDefault
+      containers:
+      - name: inference-server
+        # ---------------------------------------------------------------
+        # IMPORTANT: Replace the image reference below with your own
+        # registry and tag after building the container image.
+        #
+        # To build and push:
+        #   docker build -t <your-registry>/sklearn-inference:v1.0.0 image/
+        #   docker push <your-registry>/sklearn-inference:v1.0.0
+        # ---------------------------------------------------------------
+        image: <your-registry>/sklearn-inference:v1.0.0
+        ports:
+        - containerPort: 8080
+          name: http
+          protocol: TCP
+        # Environment variables consumed by the application.
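+        # MODEL_PATH and PORT mirror the defaults baked into the image's ENV
+        # instructions (see image/Dockerfile); override them here only if you
+        # rebuild the image with a different model location or port.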
+        env:
+        - name: MODEL_PATH
+          value: "/model/iris_model.joblib"
+        - name: PORT
+          value: "8080"
+        # Resource requests and limits ensure predictable scheduling
+        # and protect against runaway resource consumption.
+        # https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
+        resources:
+          requests:
+            cpu: "250m"
+            memory: "256Mi"
+          limits:
+            cpu: "500m"
+            memory: "512Mi"
+        # Readiness probe: Kubernetes will not route traffic to the pod
+        # until the model is loaded and the /readyz endpoint returns 200.
+        # https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
+        readinessProbe:
+          httpGet:
+            path: /readyz
+            port: http
+          initialDelaySeconds: 5
+          periodSeconds: 10
+          failureThreshold: 3
+        # Liveness probe: Kubernetes will restart the container if
+        # /healthz stops responding.
+        livenessProbe:
+          httpGet:
+            path: /healthz
+            port: http
+          initialDelaySeconds: 10
+          periodSeconds: 15
+          failureThreshold: 3
+        # Container-level security context.
+        securityContext:
+          allowPrivilegeEscalation: false
+          readOnlyRootFilesystem: true
+          capabilities:
+            drop:
+            - ALL
+        volumeMounts:
+        - name: tmp
+          mountPath: /tmp
+      volumes:
+      - name: tmp
+        emptyDir: {}
+      # Spread pods across nodes to ensure high availability.
+      topologySpreadConstraints:
+      - maxSkew: 1
+        topologyKey: kubernetes.io/hostname
+        whenUnsatisfiable: ScheduleAnyway
+        labelSelector:
+          matchLabels:
+            app: sklearn-inference
diff --git a/AI/model-inference-sklearn/image/Dockerfile b/AI/model-inference-sklearn/image/Dockerfile
new file mode 100644
index 000000000..6419f1c50
--- /dev/null
+++ b/AI/model-inference-sklearn/image/Dockerfile
@@ -0,0 +1,57 @@
+# ---------------------------------------------------------
+# Multi-stage Dockerfile for the scikit-learn inference API.
+#
+# Stage 1 -- "builder": installs dependencies and trains
+#                       the model so the artifact is baked
+#                       into the image (no external storage
+#                       required for this demo).
+# Stage 2 -- "runtime": copies only what is needed to serve.
+# ---------------------------------------------------------
+
+# ---------- Stage 1: build & train ----------
+FROM python:3.12-slim AS builder
+
+WORKDIR /build
+
+COPY requirements.txt .
+RUN pip install --no-cache-dir -r requirements.txt
+
+COPY train_model.py .
+RUN python train_model.py
+
+# ---------- Stage 2: runtime ----------
+FROM python:3.12-slim
+
+ARG APP_UID=10001
+ARG APP_GID=10001
+
+LABEL org.opencontainers.image.description="Minimal scikit-learn inference server (Iris model)" \
+      org.opencontainers.image.source="https://github.com/kubernetes/examples/tree/master/AI/model-inference-sklearn"
+
+# Run as non-root for security best practices.
+RUN groupadd -g ${APP_GID} -r appuser && \
+    useradd -u ${APP_UID} -r -g appuser -d /app -s /sbin/nologin appuser
+
+WORKDIR /app
+
+COPY requirements.txt .
+RUN pip install --no-cache-dir -r requirements.txt
+
+# Copy application code and the pre-trained model from the builder stage.
+COPY app.py .
+COPY --from=builder /build/iris_model.joblib /model/iris_model.joblib
+
+# Environment variables with sensible defaults.
+ENV MODEL_PATH="/model/iris_model.joblib" \
+    PORT="8080" \
+    PYTHONDONTWRITEBYTECODE="1" \
+    PYTHONUNBUFFERED="1" \
+    TMPDIR="/tmp"
+
+# Switch to non-root user.
+USER appuser
+
+EXPOSE 8080
+
+# Start the FastAPI server via uvicorn.
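+# app.py invokes uvicorn.run() itself, so running it with the Python
+# interpreter is sufficient. An equivalent alternative (same module and app
+# object, just started through the uvicorn CLI) would be:
+#   CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8080"]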
+CMD ["python", "app.py"]
diff --git a/AI/model-inference-sklearn/image/app.py b/AI/model-inference-sklearn/image/app.py
new file mode 100644
index 000000000..5359783a4
--- /dev/null
+++ b/AI/model-inference-sklearn/image/app.py
@@ -0,0 +1,156 @@
+"""
+FastAPI inference server for a scikit-learn Iris classification model.
+
+This lightweight API loads a pre-trained model at startup and exposes
+a /predict endpoint that accepts feature vectors and returns predicted
+class labels with probabilities.
+
+Endpoints:
+    GET  /healthz  - Liveness probe (always returns 200).
+    GET  /readyz   - Readiness probe (returns 200 once the model is loaded).
+    POST /predict  - Accepts feature vectors and returns predictions.
+"""
+
+import logging
+import os
+from contextlib import asynccontextmanager
+from typing import List
+
+import joblib
+import numpy as np
+from fastapi import FastAPI, HTTPException
+from pydantic import BaseModel, Field
+
+# ---------------------------------------------------------------------------
+# Configuration
+# ---------------------------------------------------------------------------
+MODEL_PATH = os.environ.get("MODEL_PATH", "/model/iris_model.joblib")
+PORT = int(os.environ.get("PORT", "8080"))
+
+logging.basicConfig(level=logging.INFO)
+logger = logging.getLogger("inference-server")
+
+# ---------------------------------------------------------------------------
+# Application state
+# ---------------------------------------------------------------------------
+model = None
+class_names: List[str] = []
+
+
+@asynccontextmanager
+async def lifespan(application: FastAPI):
+    """Load the model once at startup and release on shutdown."""
+    global model, class_names
+    logger.info("Loading model from %s ...", MODEL_PATH)
+    try:
+        model = joblib.load(MODEL_PATH)
+        # Iris class names corresponding to label indices 0, 1, 2.
+        class_names = ["setosa", "versicolor", "virginica"]
+        logger.info("Model loaded successfully.")
+    except Exception as exc:
+        logger.error("Failed to load model: %s", exc)
+        raise
+    yield
+    logger.info("Shutting down inference server.")
+
+
+app = FastAPI(
+    title="Scikit-Learn Inference Server",
+    version="1.0.0",
+    lifespan=lifespan,
+)
+
+# ---------------------------------------------------------------------------
+# Request / response schemas
+# ---------------------------------------------------------------------------
+
+
+class PredictRequest(BaseModel):
+    """Input payload for the /predict endpoint.
+
+    Each entry in *instances* is a list of four numerical features
+    corresponding to sepal length, sepal width, petal length, and petal width.
+    """
+
+    instances: List[List[float]] = Field(
+        ...,
+        min_length=1,
+        json_schema_extra={
+            "example": [[5.1, 3.5, 1.4, 0.2], [6.7, 3.0, 5.2, 2.3]]
+        },
+    )
+
+
+class Prediction(BaseModel):
+    label: str
+    probability: float
+
+
+class PredictResponse(BaseModel):
+    predictions: List[Prediction]
+
+
+# ---------------------------------------------------------------------------
+# Health endpoints (used by Kubernetes probes)
+# ---------------------------------------------------------------------------
+
+
+@app.get("/healthz", status_code=200)
+def liveness():
+    """Liveness probe -- the process is alive."""
+    return {"status": "alive"}
+
+
+@app.get("/readyz", status_code=200)
+def readiness():
+    """Readiness probe -- the model is loaded and ready to serve."""
+    if model is None:
+        raise HTTPException(status_code=503, detail="Model not loaded yet")
+    return {"status": "ready"}
+
+
+# ---------------------------------------------------------------------------
+# Prediction endpoint
+# ---------------------------------------------------------------------------
+
+
+@app.post("/predict", response_model=PredictResponse)
+def predict(request: PredictRequest):
+    """Return class predictions and confidence for each input instance."""
+    if model is None:
+        raise HTTPException(status_code=503, detail="Model not loaded yet")
+
+    try:
+        data = np.array(request.instances)
+        if data.ndim != 2 or data.shape[1] != 4:
+            raise HTTPException(
+                status_code=422,
+                detail="Each instance must have exactly 4 features.",
+            )
+
+        predictions = model.predict(data)
+        probabilities = model.predict_proba(data)
+
+        results = []
+        for idx, label_idx in enumerate(predictions):
+            results.append(
+                Prediction(
+                    label=class_names[label_idx],
+                    probability=round(float(probabilities[idx][label_idx]), 4),
+                )
+            )
+        return PredictResponse(predictions=results)
+    except HTTPException:
+        raise
+    except Exception as exc:
+        logger.exception("Prediction failed")
+        raise HTTPException(status_code=500, detail=str(exc))
+
+
+# ---------------------------------------------------------------------------
+# Entrypoint (used by the container CMD)
+# ---------------------------------------------------------------------------
+if __name__ == "__main__":
+    import uvicorn
+
+    uvicorn.run(app, host="0.0.0.0", port=PORT)
diff --git a/AI/model-inference-sklearn/image/requirements.txt b/AI/model-inference-sklearn/image/requirements.txt
new file mode 100644
index 000000000..192b140d4
--- /dev/null
+++ b/AI/model-inference-sklearn/image/requirements.txt
@@ -0,0 +1,6 @@
+fastapi==0.115.12
+uvicorn==0.34.2
+scikit-learn==1.6.1
+joblib==1.4.2
+numpy==2.2.3
+pydantic==2.10.6
diff --git a/AI/model-inference-sklearn/image/train_model.py b/AI/model-inference-sklearn/image/train_model.py
new file mode 100644
index 000000000..0fdb67bdd
--- /dev/null
+++ b/AI/model-inference-sklearn/image/train_model.py
@@ -0,0 +1,45 @@
+"""
+Train a simple scikit-learn Random Forest classifier on the Iris dataset
+and persist it as a joblib file.
+
+The resulting model file (iris_model.joblib) is embedded in the container
+image so the inference server can load it at startup without any external
+storage dependency.
+
+Usage:
+    python train_model.py
+"""
+
+import joblib
+from sklearn.datasets import load_iris
+from sklearn.ensemble import RandomForestClassifier
+from sklearn.model_selection import train_test_split
+
+# ------------------------------------------------------------------
+# 1. Load the built-in Iris dataset
+# ------------------------------------------------------------------
+iris = load_iris()
+X, y = iris.data, iris.target  # 4 features, 3 classes
+
+# ------------------------------------------------------------------
+# 2. Split into training and test sets
+# ------------------------------------------------------------------
+X_train, X_test, y_train, y_test = train_test_split(
+    X, y, test_size=0.2, random_state=42
+)
+
+# ------------------------------------------------------------------
+# 3. Train a lightweight Random Forest classifier
+# ------------------------------------------------------------------
+clf = RandomForestClassifier(n_estimators=50, random_state=42)
+clf.fit(X_train, y_train)
+
+accuracy = clf.score(X_test, y_test)
+print(f"Test accuracy: {accuracy:.4f}")
+
+# ------------------------------------------------------------------
+# 4. Persist the trained model
+# ------------------------------------------------------------------
+model_path = "iris_model.joblib"
+joblib.dump(clf, model_path)
+print(f"Model saved to {model_path}")
diff --git a/AI/model-inference-sklearn/pdb.yaml b/AI/model-inference-sklearn/pdb.yaml
new file mode 100644
index 000000000..a8a0ce4b5
--- /dev/null
+++ b/AI/model-inference-sklearn/pdb.yaml
@@ -0,0 +1,14 @@
+# PodDisruptionBudget ensures that at least one replica is always available
+# during voluntary disruptions (e.g., node drains, cluster upgrades).
+#
+# For more on PDBs see:
+# https://kubernetes.io/docs/tasks/run-application/configure-pdb/
+apiVersion: policy/v1
+kind: PodDisruptionBudget
+metadata:
+  name: sklearn-inference-pdb
+spec:
+  minAvailable: 1
+  selector:
+    matchLabels:
+      app: sklearn-inference
diff --git a/AI/model-inference-sklearn/service.yaml b/AI/model-inference-sklearn/service.yaml
new file mode 100644
index 000000000..96770a453
--- /dev/null
+++ b/AI/model-inference-sklearn/service.yaml
@@ -0,0 +1,23 @@
+# Service that exposes the scikit-learn inference Deployment inside the
+# cluster on port 80 (mapped to container port 8080).
+#
+# To access the service from outside the cluster you can use:
+#   kubectl port-forward service/sklearn-inference 8080:80
+#
+# For more on Services see:
+# https://kubernetes.io/docs/concepts/services-networking/service/
+apiVersion: v1
+kind: Service
+metadata:
+  name: sklearn-inference
+  labels:
+    app: sklearn-inference
+spec:
+  selector:
+    app: sklearn-inference
+  type: ClusterIP
+  ports:
+  - name: http
+    protocol: TCP
+    port: 80
+    targetPort: http  # refers to the named port on the container
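+# A quick way to confirm the Service has matched the Deployment's pods
+# (standard kubectl commands; no changes to this manifest are required):
+#   kubectl get endpoints sklearn-inference
+#   kubectl describe service sklearn-inference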