587: Add minimal AI model inference example (scikit-learn + FastAPI) by NIRANJAN0125 · Pull Request #591 · kubernetes/examples

NIRANJAN0125 · 2026-02-08T08:21:56Z

Closes #587

What This PR Adds

A new example under AI/model-inference-sklearn/ demonstrating how to deploy
a lightweight AI/ML model for real-time inference on Kubernetes -- without GPUs,
specialized hardware, or heavy ML platforms.

This covers the core deployment pattern requested in #587:

- Lightweight trained model: scikit-learn Random Forest classifier on the
  Iris dataset, trained during the container build and baked into the image.
- Small inference API: FastAPI server with /predict, /healthz, and
  /readyz endpoints.
- Kubernetes manifests: Deployment, Service, and PodDisruptionBudget.
- Basic best practices: resource requests/limits, liveness and readiness
  probes, non-root user, read-only root filesystem, seccomp profile,
  capabilities dropped, topology spread constraints.
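
For readers who want to see how the listed best practices map onto manifest fields, here is a rough sketch that builds a Deployment as a Python dict and prints it as JSON (which kubectl accepts alongside YAML). It is not the PR's deployment.yaml: the names, image, port 8080, UID 1000, replica count, and resource sizes are all assumptions chosen for illustration.

```python
import json

# All concrete values below (names, image, port 8080, UID 1000, resource
# sizes) are illustrative assumptions, not taken from the PR's manifests.
APP_LABELS = {"app": "sklearn-inference"}


def probe(path, port=8080):
    """Build an HTTP GET probe against the given path."""
    return {"httpGet": {"path": path, "port": port}, "periodSeconds": 10}


deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "sklearn-inference", "labels": APP_LABELS},
    "spec": {
        "replicas": 2,
        "selector": {"matchLabels": APP_LABELS},
        "template": {
            "metadata": {"labels": APP_LABELS},
            "spec": {
                # Pod-level security: non-root user and default seccomp profile.
                "securityContext": {
                    "runAsNonRoot": True,
                    "runAsUser": 1000,
                    "seccompProfile": {"type": "RuntimeDefault"},
                },
                # Prefer spreading replicas across nodes.
                "topologySpreadConstraints": [{
                    "maxSkew": 1,
                    "topologyKey": "kubernetes.io/hostname",
                    "whenUnsatisfiable": "ScheduleAnyway",
                    "labelSelector": {"matchLabels": APP_LABELS},
                }],
                "containers": [{
                    "name": "inference",
                    "image": "example.com/sklearn-inference:dev",
                    "ports": [{"containerPort": 8080}],
                    "resources": {
                        "requests": {"cpu": "100m", "memory": "128Mi"},
                        "limits": {"cpu": "500m", "memory": "256Mi"},
                    },
                    "livenessProbe": probe("/healthz"),
                    "readinessProbe": probe("/readyz"),
                    # Container-level hardening.
                    "securityContext": {
                        "allowPrivilegeEscalation": False,
                        "readOnlyRootFilesystem": True,
                        "capabilities": {"drop": ["ALL"]},
                    },
                    # A read-only root filesystem needs a writable /tmp.
                    "volumeMounts": [{"name": "tmp", "mountPath": "/tmp"}],
                }],
                "volumes": [{"name": "tmp", "emptyDir": {}}],
            },
        },
    },
}

if __name__ == "__main__":
    print(json.dumps(deployment, indent=2))
```

Saved under a hypothetical name such as deployment_sketch.py, the output could be piped into `kubectl apply -f -` on a test cluster.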

Files

AI/model-inference-sklearn/
├── README.md              # Full walkthrough with prerequisites, quick start,
│                          # detailed steps, verification, cleanup, and
│                          # troubleshooting
├── deployment.yaml        # Deployment with probes, resource limits, security
│                          # context, topology spreading, and /tmp emptyDir
├── service.yaml           # ClusterIP Service
├── pdb.yaml               # PodDisruptionBudget (minAvailable: 1)
└── image/
    ├── Dockerfile         # Multi-stage: train in stage 1, serve in stage 2
    ├── app.py             # FastAPI inference server
    ├── train_model.py     # Trains and saves the scikit-learn model
    └── requirements.txt   # Pinned Python dependencies

Also updates AI/README.md to add an "Available Examples" table listing
this example alongside the existing TensorFlow Serving and vLLM examples.
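
As orientation for the train_model.py entry above, a training script of this kind might look roughly like the following. This is a sketch, not the file from the PR: the use of joblib for serialization, the model.joblib path, the 80/20 split, and the hyperparameters are all assumptions.

```python
import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

MODEL_PATH = "model.joblib"  # hypothetical output path


def train_and_save(path=MODEL_PATH):
    """Train a Random Forest on Iris, save it, and return test accuracy."""
    X, y = load_iris(return_X_y=True)
    # Hold out 20% of the samples for a quick sanity check (assumed split).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y
    )
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)
    accuracy = model.score(X_test, y_test)
    # Persist the fitted model so the serving stage can load it at startup.
    joblib.dump(model, path)
    return accuracy


if __name__ == "__main__":
    print(f"test accuracy: {train_and_save():.3f}")
```

Running a script like this in the first Dockerfile stage and copying only the saved model file into the serving stage keeps training-time artifacts out of the final image.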

Verified Locally

- train_model.py produces a valid model (test accuracy: 1.0).
- FastAPI server starts, loads the model, and serves all endpoints correctly.
- /healthz returns 200, /readyz returns 200 after model load.
- /predict returns correct class labels and probabilities.
- Error cases (wrong feature count, empty input, invalid JSON) all return
  appropriate 422 responses.
- All YAML manifests parse as valid Kubernetes resources.
- Cross-file consistency verified: labels, ports, UIDs, probe paths, and
  model paths all match across Deployment, Service, PDB, Dockerfile, and
  application code.
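
The PR does not say how the cross-file consistency check was performed. One way such a check could be automated is sketched below, covering only labels and ports. The function name check_consistency and the example manifests are hypothetical, and loading the real YAML files (which would need a YAML parser) is left out.

```python
def check_consistency(deployment, service, pdb):
    """Return a list of mismatches; an empty list means consistent."""
    problems = []
    pod = deployment["spec"]["template"]
    pod_labels = pod["metadata"]["labels"]

    # Collect every port number and port name exposed by the containers.
    exposed = set()
    for container in pod["spec"]["containers"]:
        for port in container.get("ports", []):
            exposed.add(port["containerPort"])
            if "name" in port:
                exposed.add(port["name"])

    # Each selector must be a subset of the pod template labels.
    selectors = {
        "Deployment": deployment["spec"]["selector"]["matchLabels"],
        "Service": service["spec"]["selector"],
        "PodDisruptionBudget": pdb["spec"]["selector"]["matchLabels"],
    }
    for kind, selector in selectors.items():
        for key, value in selector.items():
            if pod_labels.get(key) != value:
                problems.append(
                    f"{kind} selector {key}={value} does not match pod labels"
                )

    # A Service targetPort defaults to the Service port when omitted.
    for port in service["spec"]["ports"]:
        target = port.get("targetPort", port["port"])
        if target not in exposed:
            problems.append(
                f"Service targetPort {target} is not exposed by any container"
            )
    return problems


# Hypothetical, heavily trimmed manifests used only to exercise the check.
labels = {"app": "sklearn-inference"}
deployment = {"spec": {
    "selector": {"matchLabels": labels},
    "template": {
        "metadata": {"labels": labels},
        "spec": {"containers": [{"ports": [{"containerPort": 8080}]}]},
    },
}}
service = {"spec": {"selector": labels,
                    "ports": [{"port": 80, "targetPort": 8080}]}}
pdb = {"spec": {"minAvailable": 1, "selector": {"matchLabels": labels}}}

print(check_consistency(deployment, service, pdb) or "manifests are consistent")
```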

k8s-ci-robot · 2026-02-08T08:22:04Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: NIRANJAN0125
Once this PR has been reviewed and has the lgtm label, please assign soltysh for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot · 2026-02-08T08:22:05Z

Welcome @NIRANJAN0125!

It looks like this is your first PR to kubernetes/examples 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/examples has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

Champbreed · 2026-02-13T09:54:30Z

Hi @NIRANJAN0125, thanks for this PR.

One thought for the 'Further Reading' section: Since baking the model into the image works so well here for the Iris dataset, it might be worth adding a small note about using InitContainers or PersistentVolumes for users who are planning to scale this pattern to multi-gigabyte models. It would help them transition from this example to production-scale AI.
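
To make the suggestion above concrete, a pod spec fragment using an initContainer could look roughly like this sketch, written as a Python dict that serializes to JSON. Everything specific here is a placeholder rather than part of the PR: the images, MODEL_URL, the /models mount path, the MODEL_PATH environment variable, and the helper name model_volume.

```python
import json

# Placeholder values: none of these names or URLs come from the PR.
MODEL_URL = "https://example.com/models/model.joblib"
MODEL_DIR = "/models"


def model_volume(pvc_name=None):
    """Return the volume holding the model.

    With no claim name the model lives in an emptyDir and is fetched on
    every pod start; with a claim name it lives on a PersistentVolume.
    """
    if pvc_name is None:
        return {"name": "model-store", "emptyDir": {}}
    return {"name": "model-store",
            "persistentVolumeClaim": {"claimName": pvc_name}}


def pod_spec(pvc_name=None):
    """Pod spec fragment: fetch the model first, then start the server."""
    return {
        "initContainers": [{
            "name": "fetch-model",
            # Any image that ships curl would do; this tag is hypothetical.
            "image": "example.com/tools/curl:stable",
            "command": ["curl", "-fsSL", "-o",
                        f"{MODEL_DIR}/model.joblib", MODEL_URL],
            "volumeMounts": [{"name": "model-store",
                              "mountPath": MODEL_DIR}],
        }],
        "containers": [{
            "name": "inference",
            "image": "example.com/sklearn-inference:dev",
            # Assumes the server reads its model location from MODEL_PATH.
            "env": [{"name": "MODEL_PATH",
                     "value": f"{MODEL_DIR}/model.joblib"}],
            "volumeMounts": [{"name": "model-store",
                              "mountPath": MODEL_DIR,
                              "readOnly": True}],
        }],
        "volumes": [model_volume(pvc_name)],
    }


if __name__ == "__main__":
    print(json.dumps(pod_spec(), indent=2))
```

Calling pod_spec("model-pvc") with a hypothetical claim name switches the same layout to a PersistentVolume, so the download does not have to repeat on every pod start.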

587: Add minimal AI model inference example (scikit-learn + FastAPI)

16920cd

k8s-ci-robot added the "cncf-cla: yes" label (indicates the PR's author has signed the CNCF CLA) on Feb 8, 2026

k8s-ci-robot requested review from kow3ns and soltysh on February 8, 2026 08:22

k8s-ci-robot added the "size/XL" label (denotes a PR that changes 500-999 lines, ignoring generated files) on Feb 8, 2026

NIRANJAN0125 wants to merge 1 commit into kubernetes:master from NIRANJAN0125:niranjan0125_587_ai_example


3 participants
