
altrup/model-ablater


AI Model Ablater

A program that allows you to ablate and inspect AI models easily

  • The plan is for this to render the value of every node when the model is given a prompt, and then let you weaken the weights that cause those nodes to be "activated"
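The idea of weakening the weights behind an activated node can be illustrated with a minimal sketch. It is not taken from this repository: the function name `weaken_active_inputs`, the `threshold` and `factor` parameters, and the single dense layer without bias are all assumptions made for illustration.

```python
def weaken_active_inputs(weights, inputs, threshold=0.5, factor=0.5):
    """Scale down the incoming weights of every node whose activation exceeds `threshold`.

    Hypothetical helper: weights[j][i] is the weight from input i to node j.
    """
    # Activation of each node is the weighted sum of the inputs (no bias, no nonlinearity).
    activations = [sum(w * x for w, x in zip(row, inputs)) for row in weights]
    # Nodes above the threshold get their incoming weights multiplied by `factor`;
    # all other rows are copied unchanged.
    new_weights = [
        [w * factor for w in row] if act > threshold else list(row)
        for row, act in zip(weights, activations)
    ]
    return activations, new_weights


weights = [[1.0, 0.0], [0.25, 0.25]]
inputs = [1.0, 1.0]
activations, weakened = weaken_active_inputs(weights, inputs)
print(activations)  # [1.0, 0.5]
print(weakened)     # [[0.5, 0.0], [0.25, 0.25]]
```

Only the first node exceeds the threshold here, so only its row of weights is halved. A `factor` of 0 would remove the node's input entirely rather than weaken it.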

Example Image

Image of Token Activation Viewer with Values

NOTE: Currently, this has only been tested on Llama-3.2-3B-Instruct

Setup

Guide to installing the repository and required packages

Prerequisites

  • Python 3.13.2

Project Installation

  • Clone repository

    git clone https://github.com/altrup/model-ablater.git
  • Enter newly created folder

    cd model-ablater
  • Create Python virtual environment

    python -m venv .venv

  • Activate Python virtual environment

    • Linux/macOS

      source .venv/bin/activate
    • Windows

      .venv\Scripts\activate
  • Download packages

    pip install -r requirements.txt

Example Usage

Prerequisites

Installing Models

  • The Hugging Face model will be installed into ./model/model_name

    python install.py --model-id "meta-llama/Llama-3.2-3B-Instruct"

Generating Images

NOTE: Add the -h option to any script for more info

  • Generate the tensors from a sample text

    python get_tensors.py
  • Generate mappings (optional)

    python gen_mappings.py
  • Generate images

    python gen_images.py
  • View activations (click on pixels to select them)

    python view_activations.py
  • Run ablated model (selected pixels will be set to 0)

    python test_model.py
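As a rough illustration of the last step, setting selected pixels to 0 amounts to zeroing the matching activation values. The sketch below assumes each pixel corresponds to one cell of a 2D activation grid; the `ablate` function and the `(row, col)` selection format are hypothetical and may not match how test_model.py works.

```python
def ablate(activations, selected):
    """Return a copy of a 2D activation grid with the selected (row, col) cells set to 0.

    Hypothetical helper: assumes one pixel in the viewer maps to one activation value.
    """
    # Copy each row so the original grid is left untouched.
    ablated = [list(row) for row in activations]
    for row, col in selected:
        ablated[row][col] = 0.0
    return ablated


grid = [[0.2, 0.9], [0.7, 0.1]]
selected_pixels = {(0, 1), (1, 0)}  # e.g. pixels clicked in the viewer
result = ablate(grid, selected_pixels)
print(result)  # [[0.2, 0.0], [0.0, 0.1]]
```

Returning a copy keeps the original activations available, so the ablated and unablated outputs can be compared side by side.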
