A modular Python library for deploying ML models to production using the Open Inference Protocol. Built on MLServer, designed for enterprise deployment.
aiSSEMBLE Inference is a toolkit for the full ML deployment lifecycle - from packaging models to consuming them in applications. It leverages MLServer as the inference runtime across all environments, with optional KServe integration for serverless Kubernetes deployments.
```
┌─────────────────────────────────────────────────────────────────┐
│                      aiSSEMBLE Inference                        │
│    Deployment Tooling (inference deploy) + Client Library       │
└──────────────────────────────┬──────────────────────────────────┘
                               │ generates / speaks OIP to
                               ▼
┌─────────────────────────────────────────────────────────────────┐
│                           MLServer                              │
│    Lightweight Python inference server - works everywhere       │
│    Local → Docker → Kubernetes → KServe (optional)              │
└─────────────────────────────────────────────────────────────────┘
```
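The "speaks OIP" arrow refers to the Open Inference Protocol (V2) JSON format that MLServer serves over HTTP. As a rough sketch of the wire format the client library abstracts away, here is a minimal V2 inference request body; the model name, tensor name, and shape are illustrative assumptions, not values from this repository:

```python
import json

# A minimal Open Inference Protocol (V2) request body -- the wire format
# MLServer accepts. The tensor name/shape and model name are hypothetical.
payload = {
    "inputs": [
        {
            "name": "image",
            "shape": [1, 3, 640, 640],       # batch, channels, height, width
            "datatype": "FP32",
            "data": [0.0] * (1 * 3 * 640 * 640),  # flattened pixel values
        }
    ]
}

# MLServer exposes inference at POST /v2/models/<model-name>/infer
endpoint = "http://localhost:8080/v2/models/my-model/infer"
body = json.dumps(payload)
print(len(payload["inputs"][0]["data"]))  # 1228800 values for one 640x640 RGB image
```

Hand-building and parsing these flat tensors is exactly the boilerplate the client library's task-specific APIs replace.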
| Layer | What It Does |
|---|---|
| `inference deploy` | Generates deployment configs for MLServer across environments |
| Client library | Abstracts tensor complexity into task-specific APIs |
| Target | Infrastructure | Use Case |
|---|---|---|
| `local` | MLServer | Development |
| `docker` | MLServer + Docker | Containerized deployment |
| `kubernetes` | MLServer + K8s | Production Kubernetes |
| `kserve` | MLServer + KServe | Serverless ML (autoscaling, scale-to-zero) |
You don't need aiSSEMBLE Inference if you already have deployment workflows and OIP client code you're happy with - MLServer and KServe are excellent tools on their own.
Work with domain objects, not raw tensors:
```python
# Traditional OIP: manual tensor parsing
outputs = response.json()["outputs"]
bbox_tensor = next(o for o in outputs if o["name"] == "bboxes")
bboxes = bbox_tensor["data"]  # Is this [N,4] or [1,N,4]? What coordinate system?

# aiSSEMBLE Inference: typed domain objects
client = InferenceClient(adapter, endpoint)
result = client.detect_object().image("dog.jpg").confidence(0.5).run()
for detection in result.detections:
    print(f"{detection.label} at {detection.bbox}")
```
Generate deployment configs for multiple targets from a single model:
```shell
pip install aissemble-inference-deploy
inference deploy init --target local --target docker --target kubernetes --target kserve
```
| Target | Description |
|---|---|
| `local` | MLServer scripts for development |
| `docker` | Multi-stage Dockerfile + Docker Compose |
| `kubernetes` | Kustomize manifests with dev/prod overlays |
| `kserve` | ServingRuntime + InferenceService with scale-to-zero |
See aissemble-inference-deploy/README.md for details.
```shell
# Core library
pip install aissemble-inference-core

# Model modules (install as needed)
pip install aissemble-inference-yolo   # YOLO object detection
pip install aissemble-inference-sumy   # Text summarization

# Deployment tooling
pip install aissemble-inference-deploy
```
| Module | Description |
|---|---|
| `aissemble-inference-core` | Base abstractions (OipAdapter, Translator, Predictor) |
| `aissemble-inference-deploy` | Deployment config generation (Local, Docker, K8s, KServe) |
| `aissemble-inference-yolo` | YOLO model family (v5, v8, v11) |
| `aissemble-inference-sumy` | Text summarization (TextRank, LSA, LexRank) |
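To make the Translator abstraction concrete, here is a sketch of what a translator conceptually does for object detection: map flat OIP output tensors back into typed domain objects. The `Detection` class, the `decode` signature, and the tensor/label names are illustrative assumptions, not the library's actual interface:

```python
from dataclasses import dataclass

# Conceptual sketch of a Translator's job. Everything below (Detection,
# decode(), the "bboxes"/"scores"/"classes" tensor names, the xyxy pixel
# convention) is a hypothetical illustration, not aiSSEMBLE's real API.

@dataclass
class Detection:
    label: str
    confidence: float
    bbox: tuple  # (x1, y1, x2, y2), assumed pixel coordinates

def decode(oip_response: dict, labels: list) -> list:
    """Turn flat OIP output tensors into Detection objects."""
    outputs = {o["name"]: o for o in oip_response["outputs"]}
    boxes = outputs["bboxes"]["data"]       # flattened [N, 4]
    scores = outputs["scores"]["data"]      # [N]
    class_ids = outputs["classes"]["data"]  # [N]
    detections = []
    for i, (score, cls) in enumerate(zip(scores, class_ids)):
        x1, y1, x2, y2 = boxes[i * 4 : i * 4 + 4]
        detections.append(Detection(labels[int(cls)], score, (x1, y1, x2, y2)))
    return detections

# Example OIP-style response carrying one detection
response = {
    "outputs": [
        {"name": "bboxes", "data": [10.0, 20.0, 110.0, 220.0]},
        {"name": "scores", "data": [0.92]},
        {"name": "classes", "data": [0]},
    ]
}
print(decode(response, ["dog"]))
# [Detection(label='dog', confidence=0.92, bbox=(10.0, 20.0, 110.0, 220.0))]
```

The client's fluent API (`detect_object(...).run()`) invokes the registered translator for you, so application code never touches the raw tensors.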
Modules auto-register via Python entry points:
```python
from aissemble_inference_core.client import InferenceClient, ModuleRegistry

# Discover installed modules
print(ModuleRegistry.instance().list_available())
# {'runtimes': ['yolo', 'sumy'], 'translators': ['yolo', 'sumy', 'object_detection'], ...}

# Use object detection with fluent API
client = InferenceClient(adapter, endpoint)
result = client.detect_object("yolo").image("photo.jpg").confidence(0.5).run()

# Text summarization
summary = client.summarize("sumy").text("Long article...").max_length(100).run()
```
- Object Detection: `aissemble-inference-examples/aissemble-object-detection-example/`
- Text Summarization: `aissemble-inference-examples/aissemble-summarization-example/`
Apache 2.0