Coragon42/watermarking

AI Watermarking Research – ERSP 2024

This repository contains code and resources for our research on watermarking AI-generated content, conducted under the guidance of Professor Ananth as part of the Early Research Scholars Program (ERSP) at UCSB.

📌 Overview

This project explores various LLM watermarking techniques to embed and detect watermarks in generated text while preserving semantic integrity. The repository includes:

  • SynthID – Experimenting with Google’s SynthID watermarking method.
  • Soft Watermarking – Revisiting the pioneering key-based watermark.
  • Unigram Watermarking – Evaluating a similar watermark with a fixed green list for comparison.
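To make the contrast between the last two items concrete, here is a toy sketch of green-list selection and detection over integer token IDs. It is not the code from the notebooks: the function names, the SHA-256 seeding, and the `gamma` default are assumptions chosen for illustration. Passing `context=True` reseeds the green list from the previous token (in the spirit of the soft watermark), while `context=False` uses one fixed list for every position (in the spirit of the unigram watermark).

```python
import hashlib
import math
import random


def green_list(vocab_size, key, gamma=0.5, prev_token=None):
    """Return a set of 'green' token IDs covering a gamma fraction of the vocabulary.

    With prev_token=None the list depends only on the key (fixed list);
    otherwise it is reseeded from the key and the previous token.
    """
    material = f"{key}:{'' if prev_token is None else prev_token}".encode()
    seed = int.from_bytes(hashlib.sha256(material).digest()[:8], "big")
    ids = list(range(vocab_size))
    random.Random(seed).shuffle(ids)  # deterministic permutation for this seed
    return set(ids[: int(gamma * vocab_size)])


def z_score(tokens, vocab_size, key, gamma=0.5, context=False):
    """One-proportion z-test on how many scored tokens fall in their green list."""
    scored = tokens[1:] if context else tokens  # first token has no predecessor
    if not scored:
        raise ValueError("need at least one scored token")
    hits = 0
    for i, tok in enumerate(scored):
        prev = tokens[i] if context else None  # tokens[i] precedes scored[i]
        if tok in green_list(vocab_size, key, gamma, prev):
            hits += 1
    n = len(scored)
    return (hits - gamma * n) / math.sqrt(n * gamma * (1 - gamma))
```

A large positive z-score suggests the text favors green tokens far more often than unwatermarked text would. A real detector would work on tokenizer output and cache the fixed list instead of rebuilding it per token.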

📂 Repository Contents

👥 Contributors

  • Zeel Patel
  • Brian Sen
  • Siddhi Mundhra
  • Emerson Yu

Notes

This repository primarily consists of collaborative notebooks and code used in our research.

Environment setup for all three watermarks, after creating and activating an environment (python=3.11.11):

  • python -m pip install "synthid-text[notebook]" notebook absl-py mock pytest "tensorflow-datasets>=4.9.3" SentencePiece "accelerate>=0.26.0" "safetensors>=0.4.1" bitsandbytes tf-keras
  • python -m pip install --upgrade jax jaxlib flax transformers optax
  • after installing CUDA Toolkit, install PyTorch accordingly, e.g.:
  • python -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
  • python -m pip install gradio nltk
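After running the commands above, a quick check like the following can confirm that the main packages resolve in the active environment. This is a minimal sketch: the module names in `MODULES` are assumed import names for the packages listed above (for example, `synthid_text` for `synthid-text`), and the helper `missing_modules` is hypothetical, not part of this repository.

```python
import importlib.util

# Assumed top-level import names for the packages installed above.
MODULES = ["synthid_text", "jax", "flax", "transformers", "optax",
           "torch", "gradio", "nltk"]


def missing_modules(names=MODULES):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("missing:", ", ".join(missing) if missing else "none")
```

This only checks that each module can be found; it does not verify versions or that PyTorch was built against the installed CUDA Toolkit.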
