This repository contains a trained Autoencoder-based anomaly detection model developed in the context of the MLSysOps project (Machine Learning for Autonomic System Operation in the Heterogeneous Edge-Cloud Continuum), funded by the European Union’s Horizon Europe research and innovation programme under grant agreement No. 101092912.
The model is exported in ONNX format for efficient inference on edge or cloud devices.
This model performs unsupervised anomaly detection on node/VM telemetry metrics by learning to reconstruct normal observations.
- Input: A feature vector of telemetry metrics (float values), normalized with Min-Max scaling.
- Output: The reconstructed feature vector.
- Anomaly score: RMSE between input and reconstruction.
- Decision rule: anomaly if `RMSE > threshold` (threshold stored in `model_config.json`).
The repository provides the trained model and its configuration for easy deployment.
.
├── demo.py # Inference script (ONNXRuntime)
├── model/
│ ├── autoencoder.onnx # ONNX model
│ └── model_config.json # Model configuration (features, normalization, threshold)
├── requirements.txt # Python dependencies
└── README.md # Documentation
The model was trained on telemetry data representing normal system behavior. The training dataset is not included in this Zenodo record unless explicitly provided in the uploaded files.
Important: The inference input must use the same feature ordering as the training data.
The expected feature order (last dimension of the input tensor) is:
- cpu_0_idle
- cpu_0_iowait
- cpu_0_irq
- cpu_0_nice
- cpu_0_softirq
- cpu_0_steal
- cpu_0_system
- cpu_0_user
- cpu_1_idle
- cpu_1_iowait
- cpu_1_irq
- cpu_1_nice
- cpu_1_softirq
- cpu_1_steal
- cpu_1_system
- cpu_1_user
- cpu_2_idle
- cpu_2_iowait
- cpu_2_irq
- cpu_2_nice
- cpu_2_softirq
- cpu_2_steal
- cpu_2_system
- cpu_2_user
- cpu_3_idle
- cpu_3_iowait
- cpu_3_irq
- cpu_3_nice
- cpu_3_softirq
- cpu_3_steal
- cpu_3_system
- cpu_3_user
- memory_used_bytes
- node_memory_Buffers_bytes
- node_memory_Cached_bytes
- node_memory_MemAvailable_bytes
- node_memory_MemFree_bytes
- node_memory_MemTotal_bytes
(These names must match model/model_config.json.)
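Because the list follows a regular pattern (8 CPU modes for each of 4 CPUs, then 6 memory metrics), it can be regenerated and checked in code. The following is a minimal sketch based only on the list in this README; the helper names `expected_feature_names` and `check_feature_order` are hypothetical and are not part of the repository.

```python
# Sketch only: rebuilds the 38-feature order listed in this README.
# Helper names are illustrative and not taken from demo.py.
CPU_MODES = ["idle", "iowait", "irq", "nice", "softirq", "steal", "system", "user"]
MEMORY_FEATURES = [
    "memory_used_bytes",
    "node_memory_Buffers_bytes",
    "node_memory_Cached_bytes",
    "node_memory_MemAvailable_bytes",
    "node_memory_MemFree_bytes",
    "node_memory_MemTotal_bytes",
]


def expected_feature_names(num_cpus=4):
    """Return the feature names in the order documented above."""
    cpu_names = [f"cpu_{i}_{mode}" for i in range(num_cpus) for mode in CPU_MODES]
    return cpu_names + MEMORY_FEATURES


def check_feature_order(names):
    """Raise ValueError if `names` differs from the documented order."""
    expected = expected_feature_names()
    if list(names) != expected:
        raise ValueError("feature names or their order differ from the documented list")


print(len(expected_feature_names()))  # 38
```

A check like this could be run against the column names of an input file before inference, since a reordered input still has the right shape and would otherwise go unnoticed.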
This model is a fully-connected Autoencoder with ReLU activations:
- Encoder dims: `feature_size -> int(0.75*feature_size) -> int(0.5*feature_size) -> int(0.25*feature_size) -> int(0.1*feature_size)`
- Decoder dims: symmetric back to `feature_size`
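For the 38-feature input used here, the layer widths implied by these ratios can be worked out directly. The sketch below assumes the decoder mirrors the encoder exactly, which is how "symmetric" is read here; the function name `layer_dims` is hypothetical.

```python
# Sketch only: derives layer widths from the ratios stated above.
# Assumes the decoder is an exact mirror of the encoder.
def layer_dims(feature_size):
    ratios = (0.75, 0.5, 0.25, 0.1)
    # int() truncates toward zero, e.g. int(0.75 * 38) == int(28.5) == 28.
    encoder = [feature_size] + [int(r * feature_size) for r in ratios]
    decoder = encoder[::-1]
    return encoder, decoder


encoder, decoder = layer_dims(38)
print(encoder)  # [38, 28, 19, 9, 3]
print(decoder)  # [3, 9, 19, 28, 38]
```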
The model accepts a single tensor representing the telemetry feature vector.
| Input name | Shape | Type | Description |
|---|---|---|---|
| `x` | `[batch_size, 38]` | float32 | Min-Max normalized feature vector |
Min-Max normalization is applied using per-feature min and max values stored in model/model_config.json:
- `x_norm = (x - min) / (max - min)`
- If a feature has `max == min` (constant feature in training), normalization must avoid division by zero. Recommended behavior (used in the provided demo script): set that normalized feature to `0.0`.
Optionally clamp x_norm to [0, 1] if desired (configurable via model_config.json).
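The normalization rules above can be sketched with NumPy as follows. This is an illustration, not the code from `demo.py`: the function name `min_max_normalize` and the `clamp` argument are assumptions, and the per-feature `feat_min` and `feat_max` arrays are expected to be read from `model/model_config.json`, whose exact key names are not documented here.

```python
import numpy as np


def min_max_normalize(x, feat_min, feat_max, clamp=False):
    """Min-Max normalize x (shape [batch_size, n] or [n]) feature by feature.

    Features with max == min are set to 0.0 to avoid division by zero.
    """
    x = np.asarray(x, dtype=np.float32)
    feat_min = np.asarray(feat_min, dtype=np.float32)
    feat_max = np.asarray(feat_max, dtype=np.float32)

    span = feat_max - feat_min
    constant = span == 0
    # Replace zero spans by 1 so the division is always defined.
    safe_span = np.where(constant, 1.0, span)
    x_norm = (x - feat_min) / safe_span
    # Constant features carry no information: force them to 0.0.
    x_norm = np.where(constant, 0.0, x_norm)

    if clamp:
        x_norm = np.clip(x_norm, 0.0, 1.0)
    return x_norm.astype(np.float32)


# Toy example with 3 features; the second one is constant (min == max).
sample = [[5.0, 7.0, 20.0]]
print(min_max_normalize(sample, [0.0, 7.0, 0.0], [10.0, 7.0, 10.0]))
print(min_max_normalize(sample, [0.0, 7.0, 0.0], [10.0, 7.0, 10.0], clamp=True))
```

Without clamping, a value outside the training range (the third feature in the toy example) normalizes to a value outside `[0, 1]`, which tends to increase the reconstruction error.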
The ONNX graph outputs the reconstructed input vector.
| Output name | Shape | Type | Description |
|---|---|---|---|
| `reconstruction` | `[batch_size, 38]` | float32 | Reconstructed feature vector |
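Running the graph with ONNX Runtime could look like the sketch below. It uses the input name `x` and output name `reconstruction` from the tables in this README; the helper names `load_session` and `reconstruct` are hypothetical, and the CPU execution provider is an assumption chosen for portability.

```python
import numpy as np

NUM_FEATURES = 38


def load_session(model_path="model/autoencoder.onnx"):
    """Create an ONNX Runtime session (CPU provider assumed)."""
    import onnxruntime as ort  # imported here so the helpers below work without it

    return ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])


def reconstruct(session, x_norm):
    """Run the autoencoder on normalized inputs and return the reconstruction."""
    x_norm = np.asarray(x_norm, dtype=np.float32)
    if x_norm.ndim == 1:
        # Promote a single sample to a batch of size 1.
        x_norm = x_norm[np.newaxis, :]
    if x_norm.ndim != 2 or x_norm.shape[1] != NUM_FEATURES:
        raise ValueError(f"expected shape [batch_size, {NUM_FEATURES}], got {x_norm.shape}")
    outputs = session.run(["reconstruction"], {"x": x_norm})
    return outputs[0]
```

Typical use would be `reconstruct(load_session(), x_norm)`, where `x_norm` has already been Min-Max normalized.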
The anomaly score is computed outside the ONNX graph:
- `rmse = sqrt(mean((x_norm - reconstruction)^2))` per sample
- `anomaly = 1 if rmse > threshold else 0`
- `threshold` is stored in `model/model_config.json`
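The scoring step can be sketched in a few lines of NumPy. The function names are hypothetical, and the threshold `0.5` in the example is an illustrative value only, not the one shipped in `model/model_config.json`.

```python
import numpy as np


def anomaly_scores(x_norm, reconstruction):
    """Per-sample RMSE between normalized input and reconstruction."""
    x_norm = np.asarray(x_norm, dtype=np.float64)
    reconstruction = np.asarray(reconstruction, dtype=np.float64)
    # Average the squared error over the feature axis (last dimension).
    return np.sqrt(np.mean((x_norm - reconstruction) ** 2, axis=-1))


def anomaly_labels(rmse, threshold):
    """Return 1 where rmse > threshold, else 0."""
    return (np.asarray(rmse) > threshold).astype(int)


# Toy example with 4 features: a perfect and a poor reconstruction.
x = np.array([[0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]])
x_hat = np.zeros_like(x)
rmse = anomaly_scores(x, x_hat)
print(rmse)                       # [0. 1.]
print(anomaly_labels(rmse, 0.5))  # [0 1]; 0.5 is an illustrative threshold
```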
- Feature order & dimension are fixed: Inputs must have exactly 38 features in the specified order.
- Normalization is training-dependent: Min/Max parameters are derived from the training data distribution; out-of-distribution inputs may yield unreliable anomaly scores.
- Constant features: Features with `max == min` require special handling during normalization (avoid division by zero).
- ONNX output is reconstruction only: The anomaly score/label is computed in the inference script.
Create a virtual environment and install dependencies:
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Example requirements.txt:
numpy
onnxruntime
The demo script loads the ONNX model, applies preprocessing, runs inference, and outputs RMSE + anomaly label.
Run with a CSV input:
python demo.py --model model/autoencoder.onnx --config model/model_config.json --csv telemetry.csv --row 0
- CSV must include a header row.
- Numeric columns only (or ensure the numeric columns match the 38 features exactly).
- Column order must match the feature list in this README and `model_config.json`.
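One way to enforce these CSV requirements before inference is sketched below with the standard `csv` module. It is not the parsing logic of `demo.py`: the function name `read_feature_row` is hypothetical, and requiring the header to equal the feature list exactly is a deliberately strict assumption.

```python
import csv
import io


def read_feature_row(csv_file, feature_names, row=0):
    """Read one data row (0-based, header excluded) as a list of floats.

    Raises ValueError if the header does not match `feature_names` exactly.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    if header != list(feature_names):
        raise ValueError("CSV header does not match the expected feature order")
    for index, values in enumerate(reader):
        if index == row:
            return [float(v) for v in values]
    raise IndexError(f"row {row} not found in CSV")


# Toy example with two columns instead of the real 38.
toy_csv = io.StringIO("a,b\n1,2\n3,4\n")
print(read_feature_row(toy_csv, ["a", "b"], row=1))  # [3.0, 4.0]
```

For a real file, the same function can be called on an open file handle, for example `open("telemetry.csv", newline="")`, with the 38 feature names listed in this README.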
If --csv is not provided, the script may run on a random normalized sample (sanity check only).
If you wish to cite this model, please use the citation generated by Zenodo (located in the right sidebar of this record).
This work is part of the MLSysOps project, funded by the European Union’s Horizon Europe research and innovation programme under grant agreement No. 101092912.
More information about the project is available at https://mlsysops.eu/