35 changes: 34 additions & 1 deletion README.md
@@ -54,17 +54,50 @@ pip install -r requirements.txt

### 4. Configure environment

To run the SEAL framework, you will need an OpenAI API key. The key is used to access the GPT models that generate self-edits.

Create a `.env` file in the project root and add your OpenAI API key to it:

```env
OPENAI_API_KEY=your_openai_api_key_here
```

The `.env` file stores environment variables specific to your local setup; the SEAL framework loads them from this file automatically.
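The loading step can be illustrated with a minimal standard-library sketch (the repository itself may use a library such as `python-dotenv`; the parsing rules here are simplified):

```python
import os

def load_env(path: str = ".env") -> None:
    """Minimal .env loader: reads KEY=VALUE lines, skipping blanks and '#' comments.

    Existing environment variables are not overwritten.
    """
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```

After calling `load_env()`, the key is available as `os.environ["OPENAI_API_KEY"]`.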

### 5. SLURM users

Before running any shell scripts, make sure to update the SLURM directives at the top of each `.sh` file to match your system configuration. All experiments can be run with 2 A100/H100 GPUs. Other setups may require refactoring and/or changing model sizes.
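A typical SLURM header for such a script might look like the following; the job name, partition, and resource values are placeholders to adapt to your cluster, not values from the repository:

```shell
#!/bin/bash
#SBATCH --job-name=seal-experiment    # placeholder name
#SBATCH --gres=gpu:2                  # experiments assume 2 A100/H100 GPUs
#SBATCH --cpus-per-task=16            # adjust to your node
#SBATCH --mem=128G
#SBATCH --time=24:00:00
#SBATCH --partition=your_partition    # adjust to your cluster's partition
```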


## 📂 Project Structure

The project is organized as follows:

```
.
├── few-shot/
│ ├── README.md
│ ├── ...
├── general-knowledge/
│ ├── README.md
│ ├── ...
├── LICENSE
├── README.md
└── requirements.txt
```

* `few-shot/`: code and data for the few-shot learning experiments.
* `general-knowledge/`: code and data for the general-knowledge experiments.
* `LICENSE`: the license for the project.
* `README.md`: the main README file for the project.
* `requirements.txt`: the Python dependencies for the project.


## 📄 License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.


## 📄 Citation

If you found this work useful, please cite:
14 changes: 14 additions & 0 deletions few-shot/README.md
@@ -1,8 +1,22 @@
# SEAL - few-shot

This document provides the commands to reproduce the SEAL few-shot learning experiments on the ARC dataset. The goal of these experiments is to demonstrate how SEAL can adapt to new tasks from only a few examples.

The experiments proceed in several steps, including training the base model, evaluating the trained models, and running a baseline evaluation. Following this guide reproduces the few-shot results from the paper.

Code is adapted from: [Ekin's Repo](https://github.com/ekinakyurek/marc/tree/main)

## Environment Variables

Before running the experiments, set the following environment variables:

* `DATA_DIR`: the directory where the ARC dataset is located.
* `TTI_DIR`: the directory where the training text-to-image data is located.
* `LORA_DIR`: the directory where the LoRA checkpoints will be saved.

Make sure these point to the correct paths in your environment before running the commands.
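For example, with hypothetical paths (adjust to your own layout):

```shell
export DATA_DIR=/path/to/arc-data           # ARC dataset location (placeholder)
export TTI_DIR=/path/to/tti-data            # training data location (placeholder)
export LORA_DIR=/path/to/lora-checkpoints   # LoRA checkpoint output (placeholder)
```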

## SEAL RL Iteration 1

### 1. Training on 12 Problems (Iteration 1)
13 changes: 13 additions & 0 deletions few-shot/eval-self-edits.py
@@ -1,3 +1,16 @@
"""
This script evaluates the performance of the self-edited models.

The script takes as input a set of tasks and a set of LoRA checkpoints, and it
evaluates the performance of each model on each task. The script can be used to
evaluate the performance of the baseline model, as well as the performance of
the self-edited models.

The script is designed to be run from the command line and takes several
arguments, including the path to the data and solution files, the path to the
pretrained checkpoint, the path to the LoRA checkpoints folder, and the number
of self-edits to evaluate.
"""
import argparse
import glob
import json
37 changes: 36 additions & 1 deletion few-shot/self-edit.py
@@ -1,3 +1,23 @@
"""
This script implements the self-editing process for the SEAL project.

The script is divided into two main phases:

Phase 1: Generate configs using a self-edit model.
In this phase, the script uses a language model to generate a set of "self-edit"
configurations for each task. These configurations specify how to augment the
training data and what training parameters to use.

Phase 2: Train models using the generated configs.
In this phase, the script uses the generated configurations to train a set of
LoRA models. Each model is trained on a different augmented version of the
training data with different training parameters.

The script is designed to be run from the command line and takes several
arguments, including the name of the experiment, the path to the challenge and
solution files, the name of the model to use, and the number of tasks and
self-edits to perform.
"""
import os
import re
import json
@@ -268,10 +288,24 @@ def format_and_filter(formatter, tokenizer, task, train_on_input: False):
def get_test_time_train_data(
original_task: Task, augmenters: List[Augmenter], n: int = 1, permute_n: int = 1, seed: int = 0
) -> List[Task]:
"""
Generates a set of training tasks for a given original task.

This function creates a set of training tasks by applying a series of
augmentations to the original task. The augmentations include basic
transformations like rotations and flips, as well as more complex
augmentations like increasing the resolution of the images.

The function also creates new training tasks by leaving out one or more
of the original training examples. This is a form of leave-one-out
cross-validation and helps to improve the generalization of the model.
"""
rng = np.random.RandomState(seed)
train_examples = original_task.train_examples.copy()
initial_tasks = []
N = len(train_examples)

# Hold out each training example in turn as the test example; subsets of
# the remaining examples become the training examples.
for i in range(len(train_examples)):
examples = train_examples.copy()
indices = set(range(N)) - {i}
@@ -283,6 +317,7 @@ def get_test_time_train_data(
Task(name="", train_examples=[examples[j] for j in comb], test_example=examples[i])
)

# Apply augmentations to the initial tasks.
augmented_tasks = []
for augmenter in augmenters:
for task in initial_tasks:
@@ -294,8 +329,8 @@

augmented_tasks = list(set(augmented_tasks + initial_tasks))

# Permute the colors and examples of the augmented tasks.
color_and_permute_augmented_tasks = []

for _ in range(permute_n):
for task in augmented_tasks:
if len(augmenters) != 0:
12 changes: 11 additions & 1 deletion general-knowledge/README.md
@@ -1,6 +1,16 @@
# SEAL - general-knowledge

This document provides instructions for reproducing the SEAL experiments in the *general knowledge incorporation* setting, where the goal is to update or integrate new information from a passage into the language model's weights.

The experiments involve several steps, including creating synthetic data, running a training and evaluation server, and performing Reinforcement Learning (RL) training. By following this guide, you will be able to reproduce the results from the paper.

## Prerequisites

Before you begin, ensure you have the following:

* **SLURM:** These experiments are designed to run on a SLURM cluster. You will need access to one and familiarity with submitting jobs via `sbatch`.
* **ZMQ:** The experiments use [ZMQ](https://zeromq.org/) for communication between the different components of the system. Ensure it is installed on your system.
* **Python dependencies:** Install all Python dependencies listed in the `requirements.txt` file in the root of the project.
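To sanity-check the ZMQ installation, the request/reply pattern used for inter-component communication can be sketched as follows (this assumes the `pyzmq` Python bindings; the endpoint name is illustrative, not one used by the repository):

```python
import zmq

def ping_server() -> str:
    """One request/reply round trip over an in-process ZMQ transport."""
    ctx = zmq.Context.instance()
    server = ctx.socket(zmq.REP)
    server.bind("inproc://seal-demo")   # illustrative endpoint name
    client = ctx.socket(zmq.REQ)
    client.connect("inproc://seal-demo")

    client.send_string("ping")          # client issues a request
    request = server.recv_string()      # server receives it
    server.send_string("pong" if request == "ping" else "error")
    reply = client.recv_string()        # client reads the reply

    client.close()
    server.close()
    return reply
```

In the actual experiments the request and reply sockets live in separate processes (e.g. a training server and its clients), but the socket pattern is the same.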

## Usage
