Machine learning model for anomaly detection on VM telemetry metrics. Developed by TU Delft for the EU MLSysOps project.

mlsysops-eu/model-anomaly-detection

TUD Anomaly Detection Model (ONNX)

Model Info

This repository contains a trained Autoencoder-based anomaly detection model developed in the context of the MLSysOps project (Machine Learning for Autonomic System Operation in the Heterogeneous Edge-Cloud Continuum), funded by the European Union’s Horizon Europe research and innovation programme under grant agreement No. 101092912.

The model is exported in ONNX format for efficient inference on edge or cloud devices.

Purpose

This model performs unsupervised anomaly detection on node/VM telemetry metrics by learning to reconstruct normal observations.

  • Input: A feature vector of telemetry metrics (float values), normalized with Min-Max scaling.
  • Output: The reconstructed feature vector.
  • Anomaly score: RMSE between input and reconstruction.
  • Decision rule: anomaly if RMSE > threshold (threshold stored in model_config.json).

Repository Structure

The repository provides the trained model and its configuration for easy deployment.

.
├── demo.py                  # Inference script (ONNXRuntime)
├── model/
│   ├── autoencoder.onnx     # ONNX model
│   └── model_config.json    # Model configuration (features, normalization, threshold)
├── requirements.txt         # Python dependencies
└── README.md                # Documentation

Training Data

The model was trained on telemetry data representing normal system behavior. The training dataset is not included in this Zenodo record unless explicitly provided in the uploaded files.

Important: The inference input must use the same feature ordering as the training data.

Features Used (Feature Order)

The expected feature order (last dimension of the input tensor) is:

  1. cpu_0_idle
  2. cpu_0_iowait
  3. cpu_0_irq
  4. cpu_0_nice
  5. cpu_0_softirq
  6. cpu_0_steal
  7. cpu_0_system
  8. cpu_0_user
  9. cpu_1_idle
  10. cpu_1_iowait
  11. cpu_1_irq
  12. cpu_1_nice
  13. cpu_1_softirq
  14. cpu_1_steal
  15. cpu_1_system
  16. cpu_1_user
  17. cpu_2_idle
  18. cpu_2_iowait
  19. cpu_2_irq
  20. cpu_2_nice
  21. cpu_2_softirq
  22. cpu_2_steal
  23. cpu_2_system
  24. cpu_2_user
  25. cpu_3_idle
  26. cpu_3_iowait
  27. cpu_3_irq
  28. cpu_3_nice
  29. cpu_3_softirq
  30. cpu_3_steal
  31. cpu_3_system
  32. cpu_3_user
  33. memory_used_bytes
  34. node_memory_Buffers_bytes
  35. node_memory_Cached_bytes
  36. node_memory_MemAvailable_bytes
  37. node_memory_MemFree_bytes
  38. node_memory_MemTotal_bytes

(These names must match model/model_config.json.)
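Because the list follows a regular pattern (eight counters for each of four CPU cores, then six memory metrics), it can be generated rather than typed out. The sketch below is illustrative only: `FEATURE_ORDER` and `to_feature_vector` are hypothetical names, not part of the repository, and model/model_config.json remains the authoritative source for the order.

```python
# Hypothetical helper: rebuilds the 38-feature order listed in this README.
CPU_FIELDS = ["idle", "iowait", "irq", "nice", "softirq", "steal", "system", "user"]
MEMORY_FEATURES = [
    "memory_used_bytes",
    "node_memory_Buffers_bytes",
    "node_memory_Cached_bytes",
    "node_memory_MemAvailable_bytes",
    "node_memory_MemFree_bytes",
    "node_memory_MemTotal_bytes",
]

# cpu_0_idle ... cpu_3_user (32 names), followed by the 6 memory metrics.
FEATURE_ORDER = [
    f"cpu_{core}_{field}" for core in range(4) for field in CPU_FIELDS
] + MEMORY_FEATURES


def to_feature_vector(sample):
    """Order a {feature_name: value} mapping according to FEATURE_ORDER."""
    missing = [name for name in FEATURE_ORDER if name not in sample]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return [float(sample[name]) for name in FEATURE_ORDER]
```

Building the vector from a name-keyed mapping makes an ordering mistake fail loudly instead of silently feeding the model misaligned values.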

Model Architecture

This model is a fully-connected Autoencoder with ReLU activations:

  • Encoder dims: feature_size -> int(0.75*feature_size) -> int(0.5*feature_size) -> int(0.25*feature_size) -> int(0.1*feature_size)
  • Decoder dims: symmetric back to feature_size
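To make the layer widths concrete, the following sketch evaluates the formulas above for a given feature_size. The function names are hypothetical, and the decoder widths assume that "symmetric" means the encoder widths in reverse order.

```python
def encoder_dims(feature_size):
    """Layer widths of the encoder, input layer first."""
    fractions = (0.75, 0.5, 0.25, 0.1)
    return [feature_size] + [int(f * feature_size) for f in fractions]


def decoder_dims(feature_size):
    """Assumed mirror image of the encoder, ending at feature_size."""
    return encoder_dims(feature_size)[::-1]


print(encoder_dims(38))  # [38, 28, 19, 9, 3]
print(decoder_dims(38))  # [3, 9, 19, 28, 38]
```

For the 38 features used here, the bottleneck therefore has 3 units.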

Model Specification

Inputs

The model accepts a single tensor representing the telemetry feature vector.

| Input name | Shape | Type | Description |
|---|---|---|---|
| x | [batch_size, 38] | float32 | Min-Max normalized feature vector |

Preprocessing

Min-Max normalization is applied using per-feature min and max values stored in model/model_config.json:

  • x_norm = (x - min) / (max - min)
  • If a feature has max == min (constant feature in training), normalization must avoid division by zero. Recommended behavior (used in the provided demo script): set that normalized feature to 0.0.

Optionally, clamp x_norm to [0, 1]; this behavior is configurable via model_config.json.
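The preprocessing rules above can be sketched in plain Python as follows. This is a minimal illustration, not the code from demo.py: the function name and the `clamp` parameter are hypothetical, and the per-feature `mins` and `maxs` are assumed to have been read from model/model_config.json already (its key names are not shown here).

```python
def min_max_normalize(x, mins, maxs, clamp=False):
    """Apply per-feature Min-Max scaling to one sample.

    Features with max == min are mapped to 0.0 to avoid division by zero.
    """
    if not (len(x) == len(mins) == len(maxs)):
        raise ValueError("x, mins and maxs must have the same length")
    normalized = []
    for value, lo, hi in zip(x, mins, maxs):
        if hi == lo:
            norm = 0.0  # constant feature in training
        else:
            norm = (value - lo) / (hi - lo)
        if clamp:
            norm = min(1.0, max(0.0, norm))  # restrict to [0, 1]
        normalized.append(norm)
    return normalized
```

Without clamping, a value outside the training range maps outside [0, 1], which tends to raise the reconstruction error; whether that is desirable depends on the deployment.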

Outputs

The ONNX graph outputs the reconstructed input vector.

| Output name | Shape | Type | Description |
|---|---|---|---|
| reconstruction | [batch_size, 38] | float32 | Reconstructed feature vector |
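A forward pass with ONNX Runtime might look like the sketch below. The tensor names `x` and `reconstruction` come from the specification above; the helper name `reconstruct` is hypothetical. The function accepts any object with a `run(output_names, input_feed)` method, which is the signature of `onnxruntime.InferenceSession.run`.

```python
def reconstruct(session, batch):
    """Run one forward pass and return the reconstructed batch.

    session: an onnxruntime.InferenceSession (or any object with the same
             run(output_names, input_feed) method).
    batch:   float32 array of shape [batch_size, 38], already normalized.
    """
    outputs = session.run(["reconstruction"], {"x": batch})
    return outputs[0]


# Typical usage (requires numpy, onnxruntime and the model file):
#   import numpy as np
#   import onnxruntime as ort
#   session = ort.InferenceSession("model/autoencoder.onnx")
#   batch = np.zeros((1, 38), dtype=np.float32)
#   recon = reconstruct(session, batch)
```

Note that ONNX Runtime expects the input dtype to match the graph, so a float64 array should be cast to float32 first.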

Post-processing (Anomaly Detection)

The anomaly score is computed outside the ONNX graph:

  • rmse = sqrt(mean((x_norm - reconstruction)^2)), computed per sample with the mean taken over the 38 features
  • anomaly = 1 if rmse > threshold else 0
  • threshold is stored in model/model_config.json
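The scoring and decision rule above can be sketched for a single sample as follows. The function names are hypothetical, and `threshold` is assumed to have been read from model/model_config.json beforehand.

```python
import math


def rmse(x_norm, reconstruction):
    """Root mean squared error over the features of one sample."""
    if len(x_norm) != len(reconstruction) or not x_norm:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - b) ** 2 for a, b in zip(x_norm, reconstruction)]
    return math.sqrt(sum(squared) / len(squared))


def is_anomaly(x_norm, reconstruction, threshold):
    """Return 1 if the RMSE strictly exceeds the threshold, else 0."""
    return 1 if rmse(x_norm, reconstruction) > threshold else 0
```

The comparison is strict, so a sample whose RMSE equals the threshold is labeled normal.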

Limitations

  • Feature order & dimension are fixed: Inputs must have exactly 38 features in the specified order.
  • Normalization is training-dependent: Min/Max parameters are derived from the training data distribution; out-of-distribution inputs may yield unreliable anomaly scores.
  • Constant features: Features with max == min require special handling during normalization (avoid division by zero).
  • ONNX output is reconstruction only: The anomaly score/label is computed in the inference script.

Usage Demo

1. Setup Environment

Create a virtual environment and install dependencies:

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Example requirements.txt:

numpy
onnxruntime

2. Run Inference Script

The demo script loads the ONNX model, applies preprocessing, runs inference, and prints the RMSE and the anomaly label.

Run with a CSV input:

python demo.py --model model/autoencoder.onnx --config model/model_config.json --csv telemetry.csv --row 0

CSV Format Requirements

  • CSV must include a header row.
  • The file should contain numeric columns only; if other columns are present, the numeric columns must correspond exactly to the 38 features.
  • Column order must match the feature list in this README and model_config.json.

If --csv is not provided, the script may run on a random normalized sample (sanity check only).
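The CSV requirements above can be checked before inference with a short standard-library sketch. This is not how demo.py parses its input; `read_feature_row` is a hypothetical helper, and it assumes the strictest reading of the rules: the header must equal the expected feature list exactly, in order.

```python
import csv
import io


def read_feature_row(csv_file, feature_order, row=0):
    """Validate the header and return one data row as a list of floats.

    csv_file: an open text file or any iterable of CSV lines.
    row:      zero-based index of the data row (header excluded).
    """
    reader = csv.reader(csv_file)
    header = next(reader, None)
    if header is None:
        raise ValueError("CSV is empty; a header row is required")
    if [name.strip() for name in header] != list(feature_order):
        raise ValueError("CSV header does not match the expected feature order")
    for index, values in enumerate(reader):
        if index == row:
            return [float(v) for v in values]  # raises on non-numeric cells
    raise IndexError(f"row {row} not found")


# Small in-memory example with two made-up columns.
example = io.StringIO("a,b\n1,2\n3.5,4\n")
print(read_feature_row(example, ["a", "b"], row=1))  # [3.5, 4.0]
```

With a real file, the second argument would be the 38 feature names in the order given in this README.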

Citation

If you wish to cite this model, please use the citation generated by Zenodo (located in the right sidebar of this record).

Acknowledgement & Funding

This work is part of the MLSysOps project, funded by the European Union’s Horizon Europe research and innovation programme under grant agreement No. 101092912.

More information about the project is available at https://mlsysops.eu/
