Adversarial perturbation system for images, aimed at image scrapers that collect data for unauthorized AI use. Applies imperceptible perturbations via a PGD attack targeting the CLIP vision model, constrained by an LPIPS perceptual loss and a saliency mask to preserve visual fidelity.

DISCLAIMER: THIS STILL DOES NOT WORK, BUT IT'S GETTING THERE, I GUESS (v1.2)

New shit:

  • Output constrained to 224×224 -> Now supports ANY resolution via tiled processing (theoretically)
  • Add VAE-based attack later -> VAE attack added
  • Write mathematical explanation in LaTeX -> See report.tex (if I made any mistakes, feel free to correct me)

Remaining limitations:

  • Not effective against multimodal big boys like ChatGPT (yet)
  • Only shows promising effects on weaker models so far
  • Needs to be more robust (fk you SD) and transferable (fk all of you image gen AIs)
  • Must resist detoxification attempts, e.g.: https://github.com/huzpsb/DeTox/
  • Needs to work against screenshots and similar workarounds
  • Maybe add batch mode?
  • train train train train train

Why did I make this: for fun, and to stop the AI slop BS.

Also note that some models may have been adversarially trained to defend against PGD attacks like this one, so the perturbation won't affect them.

However, that defense comes at a cost: such a model loses accuracy on general tasks -> still a win for me.

💊 PANACEA

Perturbation-based Adversarial Noise Attack for Copyright Enforcement and Authorship
Invisible to Humans • Hostile to Models • Transferable by Design
Engineered for maximum model damage at minimal visual cost

Panacea is a tool for protecting images from AI models through imperceptible adversarial perturbations. Similar to Nightshade and Glaze, it modifies images in ways that are invisible to humans but disrupt AI understanding.


✨ Features

Two Attack Modes

| Mode | Purpose | How It Works |
| --- | --- | --- |
| Targeted (Offense) | Data poisoning | Optimizes perturbations so inputs with a trigger are mapped toward a specific target class chosen by the attacker, causing controlled misclassification. |
| Untargeted (Defense) | Image cloaking / evasion | Maximizes loss on the true class so the input exits the correct decision region, preventing reliable recognition without enforcing any specific false label. |
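
For intuition, the two objectives against CLIP embeddings look roughly like the sketch below. This is illustrative, not the repo's attacks.py API: the function names and tensor arguments are placeholders, and the hybrid mode shown later simply adds the two terms together.

import torch
import torch.nn.functional as F

def targeted_loss(image_emb: torch.Tensor, target_text_emb: torch.Tensor) -> torch.Tensor:
    """Pull the image embedding toward the attacker-chosen target text embedding.
    Written so that gradient descent on this loss implements the attack."""
    return -F.cosine_similarity(image_emb, target_text_emb, dim=-1).mean()

def untargeted_loss(image_emb: torch.Tensor, true_text_emb: torch.Tensor) -> torch.Tensor:
    """Push the image embedding away from the true label's text embedding."""
    return F.cosine_similarity(image_emb, true_text_emb, dim=-1).mean()

def hybrid_loss(image_emb, true_text_emb, target_text_emb):
    """Push away from the true label and pull toward the target at the same time."""
    return untargeted_loss(image_emb, true_text_emb) + targeted_loss(image_emb, target_text_emb)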

Initial (v1.1)

  • LPIPS Perceptual Loss: Uses VGG-based similarity to keep perturbations invisible (~35dB PSNR)
  • Saliency Masking: Reduces perturbations on edges and important features
  • Hybrid Attack: Combined push-and-pull for maximum disruption
  • CLIP-based: Targets the backbone of modern AI art generators (Stable Diffusion, DALL-E, Midjourney)

In v1.2

  • Full Resolution Processing: No more 224×224 limitation! Tiled processing preserves original resolution (kinda lmao)
  • VAE-based Attack: Latent space perturbations for more natural adversarial examples (not sure if it works)
  • LaTeX Report: Mathematical foundations in report.tex (NeurIPS format, but it's a mess rn)

Installation

# Clone or download the repository
cd Panacea

# Install dependencies
pip install -r requirements.txt

Requirements

  • Python 3.8+
  • PyTorch 2.0+
  • CUDA-capable GPU (recommended for faster processing)

Usage

Quick Demo

python main.py demo

Targeted Attack (Offense)

Make AI think your dog photo is "abstract art":

python main.py attack -i dog.png -o poisoned.png -m targeted -t "abstract art"

Untargeted Attack (Defense)

Cloak your portrait so AI can't recognize it:

python main.py attack -i portrait.png -o cloaked.png -m untargeted -l "human face portrait"

Hybrid Attack

Push away from "dog" and pull toward "cat":

python main.py attack -i dog.png -o hybrid.png -m hybrid -l "dog" -t "cat"

Adjust Visual Quality

# Higher perceptual weight = more invisible, weaker attack (default: 0.3)
python main.py attack -i img.png -o out.png -m targeted -t "cat" -p 0.5

# Disable perceptual loss for faster processing
python main.py attack -i img.png -o out.png -m targeted -t "cat" --no-perceptual

Analyze & Compare

# Check how CLIP perceives an image
python main.py analyze -i image.png -l "cat" -l "dog" -l "abstract art"

# Measure perturbation visibility
python main.py compare -o original.png -p perturbed.png
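
Under the hood, the analyze step amounts to CLIP zero-shot scoring. Here is a hedged sketch using the openai/CLIP package directly; the repo's models.py wrapper may do this differently, and the labels and file name below are just examples.

import clip   # pip install git+https://github.com/openai/CLIP.git
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

labels = ["cat", "dog", "abstract art"]                      # example labels
image = preprocess(Image.open("image.png")).unsqueeze(0).to(device)
text = clip.tokenize([f"a photo of {l}" for l in labels]).to(device)

with torch.no_grad():
    logits_per_image, _ = model(image, text)                 # scaled cosine similarities
    probs = logits_per_image.softmax(dim=-1).squeeze(0)

for label, p in zip(labels, probs.tolist()):
    print(f"{label}: {p:.3f}")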

Resolution Preservation

Process images at ANY resolution using tiled processing:

# Attack a high-res image (e.g., 4000x3000)
python main.py attack-fullres -i highres.jpg -o protected.jpg -m targeted -t "abstract art"

# Limit max dimension for faster processing
python main.py attack-fullres -i huge.png -o out.png -m untargeted -l "portrait" --max-size 2048

How it works:

  1. Splits image into overlapping 224×224 tiles (32px overlap by default)
  2. Applies attack to each tile independently
  3. Blends tiles with linear interpolation at overlaps
  4. Output retains original resolution
  5. If this fucks the image up, I blame ChatGPT
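
In code, the tiling and blending roughly amount to the sketch below. It is illustrative rather than the repo's full_resolution.py: attack_tile stands in for any single-tile attack, the image is assumed to be at least 224 px on each side, and the blend weights are an assumption.

import torch

def attack_full_resolution(image, attack_tile, tile=224, overlap=32):
    """Illustrative tiled attack. `image` is a (3, H, W) tensor in [0, 1] with H, W >= tile;
    `attack_tile` is a stand-in for any function mapping a (3, tile, tile) patch to a
    perturbed patch of the same shape. Overlaps are blended with a linear ramp."""
    _, H, W = image.shape
    out = torch.zeros_like(image)
    weight = torch.zeros(1, H, W)

    # Linear ramp for blending; kept strictly positive so image borders retain weight.
    ramp = torch.ones(tile)
    ramp[:overlap] = torch.linspace(1.0 / overlap, 1.0, overlap)
    ramp[-overlap:] = torch.linspace(1.0, 1.0 / overlap, overlap)
    mask = ramp[:, None] * ramp[None, :]             # (tile, tile) blend mask

    stride = tile - overlap
    ys = list(range(0, H - tile, stride)) + [H - tile]
    xs = list(range(0, W - tile, stride)) + [W - tile]
    for y in ys:
        for x in xs:
            patch = image[:, y:y + tile, x:x + tile]
            out[:, y:y + tile, x:x + tile] += attack_tile(patch) * mask
            weight[:, y:y + tile, x:x + tile] += mask
    return out / weight                              # weighted average over overlapping tiles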

VAE-based Attack (hopefully better?)

Perturb latent space for more natural adversarial examples:

# VAE-based targeted attack
python main.py vae-attack -i photo.png -o output.png -m targeted -t "abstract art"

# VAE-based untargeted attack (cloaking)
python main.py vae-attack -i portrait.png -o cloaked.png -m untargeted -l "human face"
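
The rough idea: optimize a perturbation on the VAE latent and decode it, so changes follow the image's natural structure instead of looking like pixel noise. Below is a hedged sketch under assumptions, not the actual vae_attack.py interface: vae is any model exposing encode()/decode(), and clip_loss is an attack objective like the ones sketched earlier.

import torch

def vae_latent_attack(vae, clip_loss, image, steps=100, step_size=0.01):
    """Sketch of a latent-space attack: perturb the VAE latent rather than pixels.
    `clip_loss` maps a decoded image to a scalar where lower = more adversarial."""
    with torch.no_grad():
        z = vae.encode(image)                        # clean latent code
    delta = torch.zeros_like(z, requires_grad=True)  # perturbation in latent space

    for _ in range(steps):
        adv = vae.decode(z + delta).clamp(0, 1)      # decoded adversarial image
        loss = clip_loss(adv)
        loss.backward()
        with torch.no_grad():
            delta -= step_size * delta.grad.sign()   # signed gradient step on the latent
            delta.grad.zero_()
    return vae.decode(z + delta).clamp(0, 1).detach()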

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| --epsilon, -e | 0.05 | Max perturbation magnitude (L∞ bound). Higher = more effective but more visible. |
| --iterations, -n | 100 | Number of PGD optimization steps. More iterations = better attack. |
| --step-size, -s | 0.01 | Step size per iteration. |
| --perceptual-weight, -p | 0.3 | Weight for LPIPS loss (0-1). Higher = more invisible, weaker attack. |
| --no-perceptual | - | Disable LPIPS perceptual loss for faster processing. |
| --no-saliency | - | Disable saliency-based masking. |
| --device, -d | auto | cuda or cpu. Auto-detects GPU. |

How It Works

Panacea uses Projected Gradient Descent (PGD) with LPIPS perceptual constraints:

For each iteration:
    1. Compute CLIP embedding similarity
    2. Compute LPIPS perceptual loss
    3. Combine losses with perceptual weight
    4. Calculate gradients w.r.t. input pixels
    5. Apply saliency-weighted gradient update
    6. Project perturbation onto ε-ball (L∞ constraint)
    7. Clamp to valid pixel range [0, 1]
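
A minimal PyTorch sketch of that loop, assuming clip_loss, lpips_loss, and saliency_mask are provided (placeholders for what attacks.py and perceptual.py actually implement; the real code may differ):

import torch

def pgd_attack(image, clip_loss, lpips_loss, saliency_mask,
               epsilon=0.05, step_size=0.01, iterations=100, perceptual_weight=0.3):
    """Sketch of the loop above. `image` is a (1, 3, H, W) tensor in [0, 1];
    `clip_loss(x)` returns the attack objective (lower = more adversarial),
    `lpips_loss(x, ref)` the perceptual distance, and `saliency_mask` is a
    (1, 1, H, W) weight in [0, 1]."""
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(iterations):
        adv = (image + delta).clamp(0, 1)
        loss = clip_loss(adv) + perceptual_weight * lpips_loss(adv, image)  # steps 1-3
        loss.backward()                                                     # step 4
        with torch.no_grad():
            delta -= step_size * saliency_mask * delta.grad.sign()          # step 5
            delta.clamp_(-epsilon, epsilon)                                 # step 6: L-inf projection
            delta.grad.zero_()
    return (image + delta).clamp(0, 1).detach()                             # step 7: valid pixel range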

Why CLIP?

CLIP is the backbone of most modern AI image generators:

  • Stable Diffusion uses CLIP for text-image alignment
  • DALL-E uses CLIP for image ranking
  • Midjourney uses CLIP-like models

Perturbations that are effective against CLIP therefore have a good chance of transferring to these downstream models.

Quality Metrics

  • PSNR (Peak Signal-to-Noise Ratio): Higher = less visible perturbation

    • >40 dB: Virtually invisible
    • 30-40 dB: Imperceptible to most viewers ← Panacea v1.1 achieves ~35dB
    • 20-30 dB: Subtle differences may be visible
  • LPIPS: Lower = more perceptually similar (less visible)

  • L∞ norm: Maximum pixel change. Bounded by epsilon parameter.
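
All three metrics are cheap to compute yourself. A sketch, assuming the lpips package and images given as (1, 3, H, W) tensors in [0, 1]:

import torch
import lpips

def quality_metrics(original, perturbed):
    """Return (PSNR in dB, LPIPS distance, L-infinity norm) for two images in [0, 1]."""
    mse = torch.mean((original - perturbed) ** 2)
    psnr = 10 * torch.log10(1.0 / mse)                         # MAX = 1.0 for [0, 1] images
    linf = (original - perturbed).abs().max()                  # max per-pixel change
    loss_fn = lpips.LPIPS(net="vgg")                           # VGG backbone, as used above
    lpips_dist = loss_fn(original * 2 - 1, perturbed * 2 - 1)  # LPIPS expects [-1, 1]
    return psnr.item(), lpips_dist.item(), linf.item()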

📁 Project Structure

Panacea/
├── main.py                 # Entry point
├── report.tex              # LaTeX report (NeurIPS format)
├── requirements.txt        # Dependencies
├── README.md              # This file
└── panacea/
    ├── __init__.py        # Package initialization
    ├── models.py          # CLIP model wrapper
    ├── attacks.py         # PGD attack with perceptual loss
    ├── perceptual.py      # LPIPS and saliency masking
    ├── full_resolution.py # Tile-based full-res processing (v1.2)
    ├── vae_attack.py      # VAE latent space attacks (v1.2)
    ├── utils.py           # Image I/O and metrics
    └── cli.py             # Command line interface

Improving Attack Effectiveness

Training the VAE

The VAE can be trained on your own image dataset for better reconstruction:

from panacea.vae_attack import SimpleVAE, train_vae
from torch.utils.data import DataLoader
from torchvision.datasets import ImageFolder
import torchvision.transforms as T

# Prepare dataset
transform = T.Compose([T.Resize(224), T.CenterCrop(224), T.ToTensor()])
dataset = ImageFolder("path/to/images", transform=transform)
loader = DataLoader(dataset, batch_size=32, shuffle=True)

# Train VAE
vae = SimpleVAE()
trained_vae = train_vae(vae, loader, epochs=50, device="cuda")

# Use trained VAE for attacks
from panacea import VAEAttack, load_clip_model
clip = load_clip_model()
attacker = VAEAttack(clip, vae=trained_vae)

Other Ways to Improve Attacks

| Method | Description | Difficulty |
| --- | --- | --- |
| Ensemble attacks | Optimize against multiple CLIP models (ViT-B/32, ViT-L/14, RN50) | Medium |
| Longer optimization | More iterations (500+) with smaller step size | Easy |
| Lower perceptual weight | Trade visibility for stronger attack | Easy |
| Train VAE on target domain | Better latent space for specific image types | Medium |
| Diffusion-based attacks | Use SD's U-Net gradients directly | Hard |
| Everything else I haven't found yet | It takes me two hours on average just to read one paper | Impossible |
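
Of these, the ensemble idea is the most mechanical to bolt on: average the attack loss over several CLIP variants so the perturbation cannot overfit a single embedding space. A hedged sketch, not part of the current codebase; it assumes the input has already been preprocessed consistently for every model in the list.

import torch
import torch.nn.functional as F

def ensemble_targeted_loss(models, adv_image, target_text_embs):
    """Average a targeted CLIP loss over several image encoders to improve transfer.
    `models` is a list of objects exposing encode_image(); `target_text_embs` holds the
    matching target text embedding for each model. Note: real CLIP variants expect
    different input resolutions/normalizations, which this sketch glosses over."""
    losses = []
    for model, text_emb in zip(models, target_text_embs):
        img_emb = model.encode_image(adv_image)
        losses.append(-F.cosine_similarity(img_emb, text_emb, dim=-1).mean())
    return torch.stack(losses).mean()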

Future Directions

  • Batch processing mode for multiple images
  • Frequency-domain attacks (DCT/FFT perturbations)
  • Multi-model ensemble (CLIP + DINO + DINOv2)
  • Anti-screenshot robustness (survive JPEG/resize)
  • DeTox resistance testing
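
The anti-screenshot item usually means optimizing over an expectation of transformations (EOT): degrade the image randomly inside the attack loop so the perturbation survives resizing and recompression. A hedged sketch of the transform sampling, not something the repo implements yet; real JPEG is non-differentiable, so a proxy is used.

import random
import torch
import torch.nn.functional as F

def random_degradation(x):
    """Differentiable stand-in for screenshot/re-upload damage:
    random downscale-upscale plus mild noise."""
    _, _, h, w = x.shape
    scale = random.uniform(0.5, 0.9)
    small = F.interpolate(x, scale_factor=scale, mode="bilinear", align_corners=False)
    restored = F.interpolate(small, size=(h, w), mode="bilinear", align_corners=False)
    return (restored + 0.01 * torch.randn_like(restored)).clamp(0, 1)

# Inside the PGD loop, evaluate the attack loss on the degraded image instead of
# the clean one, e.g.:
#     loss = clip_loss(random_degradation((image + delta).clamp(0, 1))) + ...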

⚠️ Ethical Considerations

This tool is designed for legitimate defensive purposes:

Appropriate Uses:

  • Protecting your own artwork from unauthorized AI training
  • Research into adversarial robustness
  • Understanding AI model vulnerabilities

Inappropriate Uses:

  • Applying to images you don't own
  • Maliciously poisoning public datasets (I will find you and I will beat the shit out of you, personally)
  • Bypassing content moderation systems

References

  • Anti-DreamBooth - Data poisoning tool from VinAI
  • Mist - Data poisoning tool developed by PhD students from the US and China
  • Nightshade - Data poisoning tool from UChicago
  • Glaze - Style mimicry protection
  • CLIP - OpenAI's vision-language model
  • LPIPS - Learned Perceptual Image Patch Similarity
  • PGD Attack - Madry et al., "Towards Deep Learning Models Resistant to Adversarial Attacks"
  • A bunch of other very famous and helpful papers that I don't have the space to list here; it helps if you've read about DDPMs, VAEs, GANs, and all that epic shit

License

This project is licensed under the GNU General Public License v3.0 (GPLv3).

You are free to:

  • Use, study, and modify the source code
  • Redistribute modified versions under the same license

Under the following conditions:

  • Derivative works must remain open-source under GPLv3
  • Attribution is required
  • No warranty is provided

Ethical Use Clause (repeated because this is important)

This software is intended for defensive, research, and self-protection purposes only.

You must not:

  • Use Panacea to poison datasets you do not own or control
  • Deploy it at scale against public or community datasets
  • Weaponize it for harassment, sabotage, or model vandalism

If you do, that’s on you - legally, ethically, and karmically.
The author disclaims responsibility for misuse.

Use responsibly.


👤 Author


Nguyen Dan Vu
Lone Wolf
Professional AI Art Hater

and no I don't fap to AI-generated Mirko anymore, screw you mfs
