TUD Anomaly Detection Model (ONNX)

Model Info

This repository contains a trained Autoencoder-based anomaly detection model developed in the context of the MLSysOps project (Machine Learning for Autonomic System Operation in the Heterogeneous Edge-Cloud Continuum), funded by the European Union’s Horizon Europe research and innovation programme under grant agreement No. 101092912.

The model is exported in ONNX format for efficient inference on edge or cloud devices.

Purpose

This model performs unsupervised anomaly detection on node/VM telemetry metrics by learning to reconstruct normal observations.

Input: A feature vector of telemetry metrics (float values), normalized with Min-Max scaling.
Output: The reconstructed feature vector.
Anomaly score: RMSE between input and reconstruction.
Decision rule: anomaly if RMSE > threshold (threshold stored in model_config.json).

Repository Structure

The repository provides the trained model and its configuration for easy deployment.

.
├── demo.py                  # Inference script (ONNXRuntime)
├── model/
│   ├── autoencoder.onnx     # ONNX model
│   └── model_config.json    # Model configuration (features, normalization, threshold)
├── requirements.txt         # Python dependencies
└── README.md                # Documentation

Training Data

The model was trained on telemetry data representing normal system behavior. The training dataset is part of this Zenodo record only if it appears among the uploaded files.

Important: The inference input must use the same feature ordering as the training data.

Features Used (Feature Order)

The expected feature order (last dimension of the input tensor) is:

cpu_0_idle
cpu_0_iowait
cpu_0_irq
cpu_0_nice
cpu_0_softirq
cpu_0_steal
cpu_0_system
cpu_0_user
cpu_1_idle
cpu_1_iowait
cpu_1_irq
cpu_1_nice
cpu_1_softirq
cpu_1_steal
cpu_1_system
cpu_1_user
cpu_2_idle
cpu_2_iowait
cpu_2_irq
cpu_2_nice
cpu_2_softirq
cpu_2_steal
cpu_2_system
cpu_2_user
cpu_3_idle
cpu_3_iowait
cpu_3_irq
cpu_3_nice
cpu_3_softirq
cpu_3_steal
cpu_3_system
cpu_3_user
memory_used_bytes
node_memory_Buffers_bytes
node_memory_Cached_bytes
node_memory_MemAvailable_bytes
node_memory_MemFree_bytes
node_memory_MemTotal_bytes

(These names must match model/model_config.json.)
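
The list above follows a regular pattern (4 CPUs with 8 modes each, then 6 memory metrics), so it can be rebuilt and used to arrange incoming metrics in the right order. The following is a minimal sketch; the helper name `to_feature_vector` and the dictionary input format are illustrative assumptions, not part of the repository.

```python
CPU_MODES = ["idle", "iowait", "irq", "nice", "softirq", "steal", "system", "user"]
MEMORY_FEATURES = [
    "memory_used_bytes",
    "node_memory_Buffers_bytes",
    "node_memory_Cached_bytes",
    "node_memory_MemAvailable_bytes",
    "node_memory_MemFree_bytes",
    "node_memory_MemTotal_bytes",
]

# Rebuild the documented order: cpu_0..cpu_3 (8 modes each), then memory metrics.
FEATURE_ORDER = [
    f"cpu_{cpu}_{mode}" for cpu in range(4) for mode in CPU_MODES
] + MEMORY_FEATURES


def to_feature_vector(metrics):
    """Arrange a {feature_name: value} mapping into the expected order.

    Hypothetical helper: raises KeyError if any expected feature is missing.
    """
    missing = [name for name in FEATURE_ORDER if name not in metrics]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return [float(metrics[name]) for name in FEATURE_ORDER]


print(len(FEATURE_ORDER))  # 38
```

In practice the authoritative list is the one in model/model_config.json; a hard-coded copy like this is only useful as a cross-check.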

Model Architecture

This model is a fully-connected Autoencoder with ReLU activations:

Encoder dims: feature_size -> int(0.75*feature_size) -> int(0.5*feature_size) -> int(0.25*feature_size) -> int(0.1*feature_size)
Decoder dims: symmetric back to feature_size
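
As a worked example, the layer widths for the 38-feature input can be computed directly from the formula above. This sketch assumes that "symmetric" means the decoder mirrors the encoder widths in reverse; the function names are illustrative.

```python
def encoder_dims(feature_size):
    """Layer widths of the encoder, starting from the input width."""
    fractions = (0.75, 0.5, 0.25, 0.1)
    # int() truncates toward zero, e.g. int(0.75 * 38) = int(28.5) = 28.
    return [feature_size] + [int(f * feature_size) for f in fractions]


def decoder_dims(feature_size):
    """Assumed mirror image of the encoder, ending at the input width."""
    return encoder_dims(feature_size)[::-1]


print(encoder_dims(38))  # [38, 28, 19, 9, 3]
print(decoder_dims(38))  # [3, 9, 19, 28, 38]
```

Under these assumptions the bottleneck for this model has 3 units.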

Model Specification

Inputs

The model accepts a single tensor representing the telemetry feature vector.

| Input name | Shape | Type | Description |
|---|---|---|---|
| `x` | `[batch_size, 38]` | float32 | Min-Max normalized feature vector |

Preprocessing

Min-Max normalization is applied using per-feature min and max values stored in model/model_config.json:

x_norm = (x - min) / (max - min)

If a feature has max == min (constant during training), this formula divides by zero. Recommended behavior (used in the provided demo script): set that normalized feature to 0.0.

x_norm can optionally be clamped to [0, 1] (configurable via model_config.json).
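
The preprocessing rules above can be sketched in plain Python as follows. This is not the code from demo.py: the function name and the `clamp` parameter are assumptions for illustration, and the per-feature `mins` and `maxs` lists are expected to come from model/model_config.json (whose exact key names are not shown here).

```python
def min_max_normalize(x, mins, maxs, clamp=False):
    """Min-Max normalize one feature vector.

    Constant features (max == min) are mapped to 0.0 to avoid division by zero.
    """
    if not (len(x) == len(mins) == len(maxs)):
        raise ValueError("x, mins and maxs must have the same length")
    normalized = []
    for value, lo, hi in zip(x, mins, maxs):
        if hi == lo:
            norm = 0.0
        else:
            norm = (value - lo) / (hi - lo)
        if clamp:
            # Optional: restrict the result to the [0, 1] training range.
            norm = min(1.0, max(0.0, norm))
        normalized.append(norm)
    return normalized


print(min_max_normalize([5.0, 7.0, 30.0], [0.0, 7.0, 10.0], [10.0, 7.0, 20.0]))
# [0.5, 0.0, 2.0]
```

Without clamping, values outside the training range produce results outside [0, 1], as the third feature in the example shows.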

Outputs

The ONNX graph outputs the reconstructed input vector.

| Output name | Shape | Type | Description |
|---|---|---|---|
| `reconstruction` | `[batch_size, 38]` | float32 | Reconstructed feature vector |
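
Given the input and output names listed above, a call through ONNX Runtime might look like the sketch below. The helper names `load_session` and `reconstruct` are illustrative and not taken from demo.py; the CPU execution provider is an assumption.

```python
def load_session(model_path="model/autoencoder.onnx"):
    """Create an ONNX Runtime session for the model (CPU provider assumed)."""
    # Imported lazily so that `reconstruct` can be used without onnxruntime.
    import onnxruntime as ort

    return ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])


def reconstruct(session, x_norm):
    """Run the autoencoder on a normalized batch of shape [batch_size, 38].

    `session` is any object with an ONNX Runtime style `run` method.
    """
    # Request the `reconstruction` output and feed the batch as input `x`.
    outputs = session.run(["reconstruction"], {"x": x_norm})
    return outputs[0]
```

With a real session, `x_norm` should be a NumPy float32 array, for example `numpy.asarray(rows, dtype=numpy.float32)`, since the model declares its input as float32.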

Post-processing (Anomaly Detection)

The anomaly score is computed outside the ONNX graph:

rmse = sqrt(mean((x_norm - reconstruction)^2)) per sample
anomaly = 1 if rmse > threshold else 0
threshold is stored in model/model_config.json
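
The scoring and decision rule above can be sketched per sample in plain Python. The function names are illustrative, and the threshold values in the example are made up; the real threshold comes from model/model_config.json.

```python
import math


def rmse(x_norm, reconstruction):
    """Root mean squared error between one input vector and its reconstruction."""
    if len(x_norm) != len(reconstruction):
        raise ValueError("vectors must have the same length")
    squared_errors = [(a - b) ** 2 for a, b in zip(x_norm, reconstruction)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def is_anomaly(x_norm, reconstruction, threshold):
    """Return 1 if the RMSE strictly exceeds the threshold, else 0."""
    return 1 if rmse(x_norm, reconstruction) > threshold else 0


x = [0.0, 0.0, 0.0, 0.0]
recon = [0.5, 0.5, 0.5, 0.5]
print(rmse(x, recon))             # 0.5
print(is_anomaly(x, recon, 0.4))  # 1
```

Because the comparison is strict, a sample whose RMSE equals the threshold is labeled normal.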

Limitations

Feature order & dimension are fixed: Inputs must have exactly 38 features in the specified order.
Normalization is training-dependent: Min/Max parameters are derived from the training data distribution; out-of-distribution inputs may yield unreliable anomaly scores.
Constant features: Features with max == min require special handling during normalization (avoid division by zero).
ONNX output is reconstruction only: The anomaly score/label is computed in the inference script.
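
One way to notice inputs that fall outside the training distribution is to compare raw values against the stored per-feature min and max before normalizing. The sketch below is an optional check that is not described as part of demo.py; the function name is hypothetical.

```python
def out_of_range_features(x, mins, maxs, names):
    """Return the names of features whose raw value lies outside [min, max]."""
    return [
        name
        for value, lo, hi, name in zip(x, mins, maxs, names)
        if value < lo or value > hi
    ]


print(out_of_range_features([5, 25, -1], [0, 0, 0], [10, 10, 10], ["a", "b", "c"]))
# ['b', 'c']
```

A non-empty result suggests treating the resulting anomaly score with caution.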

Usage Demo

1. Setup Environment

Create a virtual environment and install dependencies:

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Example requirements.txt:

numpy
onnxruntime

2. Run Inference Script

The demo script loads the ONNX model, applies preprocessing, runs inference, and outputs RMSE + anomaly label.

Run with a CSV input:

python demo.py --model model/autoencoder.onnx --config model/model_config.json --csv telemetry.csv --row 0

CSV Format Requirements

CSV must include a header row.
The CSV should contain numeric columns only; if other columns are present, the numeric columns must match the 38 features exactly.
Column order must match the feature list in this README and model_config.json.
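
The requirements above can be checked before inference with a small reader like the one below. It is a sketch using only the standard library and is stricter than the text requires: it assumes the header must equal the expected feature list exactly. The function name `read_row` is hypothetical; its `row` argument mirrors the `--row` option of the demo command.

```python
import csv
import io


def read_row(csv_file, feature_order, row=0):
    """Read one data row as floats after validating the header order."""
    reader = csv.reader(csv_file)
    header = next(reader, None)
    if header != list(feature_order):
        raise ValueError("CSV header does not match the expected feature order")
    for index, values in enumerate(reader):
        if index == row:
            # float() raises ValueError for non-numeric cells.
            return [float(v) for v in values]
    raise IndexError(f"row {row} not found")


# Tiny two-feature example; the real file would have the 38 documented columns.
example = io.StringIO("cpu_0_idle,cpu_0_iowait\n0.9,0.01\n0.7,0.05\n")
print(read_row(example, ["cpu_0_idle", "cpu_0_iowait"], row=1))  # [0.7, 0.05]
```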

If --csv is not provided, the script may run on a random normalized sample (sanity check only).

Citation

If you wish to cite this model, please use the citation generated by Zenodo (located in the right sidebar of this record).

Acknowledgement & Funding

This work is part of the MLSysOps project, funded by the European Union’s Horizon Europe research and innovation programme under grant agreement No. 101092912.

More information about the project is available at https://mlsysops.eu/
