
Few-shot LLM training of GPT-2 dialogue models to output empathetic responses


derekn4/EmpatheticLLM



Virtual Empathy: The Illusion of Conscientiousness in Conversational LMs

Training a GPT2 model with conversational data to demonstrate empathetic dialogues.

Table of Contents
  1. About The Project
  2. DialoGPT Code: How it works
  3. trltoxic Code: How it works
  4. Contact

About The Project

Dataset Curation using Few-shot learning for Empathetic Conversational Agents. In this project, we aimed to build a conversational agent for mental health applications that is more “human-like” in its approach and has the following characteristics:

  • Is empathetic
  • Has a sense of morality
  • Is self-aware (knows when not to respond)
  • Doesn’t generate triggering responses

Specifically, we graded the final model on these categories through Human Evaluations:

  • Natural Flow
  • Context Dependence
  • Topic Consistency
  • Speaker Consistency
  • Specificity
  • Interestingness

Our primary objective is to develop an empathetic conversational agent that is specifically tailored for self-care and emotional support settings.

(back to top)

Built With

  • Python

(back to top)

Libraries Used

  • Hugging Face Transformers
  • PyTorch
  • pandas
  • scikit-learn
  • NumPy

(back to top)

DialoGPT Code: How it works

  • Install all necessary libraries to run DialoGPT.ipynb

Data Processing

  • Dataset is pulled from local storage "FB_Multi_Train.csv"
  • Preprocessing steps required:
    • Tokenization
    • End-of-sentence token addition
    • Flattening conversations
    • Padding
    • Caching features

Args Class

  • After importing the Transformers library and the required PyTorch modules, the Args() class is defined
    • This class defines a set of parameters for configuring the training process.
    • Parameters include:
      • paths to model, tokenizer, and output directories
      • batch sizes
      • learning rates
      • gradient accumulation steps
      • number of epochs
      • and various other training hyperparameters.
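
A minimal sketch of what such a configuration class can look like. The field names and default values below are illustrative assumptions, not the notebook's exact settings:

```python
from dataclasses import dataclass

@dataclass
class Args:
    # All names and defaults here are illustrative placeholders.
    model_name_or_path: str = "microsoft/DialoGPT-small"
    tokenizer_name: str = "microsoft/DialoGPT-small"
    output_dir: str = "output"
    per_gpu_train_batch_size: int = 4
    per_gpu_eval_batch_size: int = 4
    gradient_accumulation_steps: int = 1
    learning_rate: float = 5e-5
    num_train_epochs: int = 3
    block_size: int = 512
    fp16: bool = False
    overwrite_cache: bool = False
```

Keeping every hyperparameter in one object makes it easy to pass the whole configuration to the train and evaluate functions.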

Construct Conversations Function

  • "construct_conv" function:
    • This function takes a conversation row, a tokenizer, and an optional argument eos (end-of-sentence)
    • Encodes each utterance in the conversation using the tokenizer and appends an end-of-sentence token if "eos" is True.
    • Conversation is flattened into a single list of token IDs
    • Function returns the flattened list of token IDs representing the conversation.
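
The behavior described above can be sketched as follows (the real function works with a Hugging Face tokenizer; the stub in the test only illustrates the flattening):

```python
from itertools import chain

def construct_conv(row, tokenizer, eos=True):
    # Encode each utterance, optionally append the EOS token after each turn,
    # then flatten the per-utterance lists into one token-ID sequence.
    encoded = [
        tokenizer.encode(utterance) + ([tokenizer.eos_token_id] if eos else [])
        for utterance in row
    ]
    return list(chain.from_iterable(encoded))
```

With a real tokenizer, a multi-utterance row becomes a single sequence with an end-of-sentence token marking each turn boundary.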

ConversationDataset Class

  • This class inherits from "Dataset", which is a PyTorch class for representing datasets in PyTorch.
  • The "__init__" method initializes the dataset.
    • Takes a tokenizer, args (training arguments), df (a DataFrame containing conversation data), and an optional "block_size" (the maximum sequence length).
    • Loads cached features if they exist and "overwrite_cache" is False; otherwise, it builds the features from the dataset and saves them to the cache.
    • Constructs examples by iterating over each row in the DataFrame: each conversation is encoded with the "construct_conv" function and added to the examples list if its length is less than block_size.
  • The "__len__" method returns the total number of examples in the dataset.
  • The "__getitem__" method retrieves an item from the dataset: it returns a PyTorch tensor containing the token IDs of the conversation at index item.
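
A condensed sketch of the class. The cached-features logic is omitted, and any iterable of conversation rows stands in for the pandas DataFrame:

```python
import torch
from torch.utils.data import Dataset

def construct_conv(row, tokenizer, eos=True):
    # Flatten one conversation into a single token-ID list.
    return [tok for u in row
            for tok in tokenizer.encode(u) + ([tokenizer.eos_token_id] if eos else [])]

class ConversationDataset(Dataset):
    # Sketch only: caching is omitted; the notebook iterates a DataFrame
    # via df.iterrows() rather than a plain iterable of rows.
    def __init__(self, tokenizer, args, rows, block_size=512):
        self.examples = []
        for row in rows:
            conv = construct_conv(row, tokenizer)
            if len(conv) < block_size:  # drop conversations that are too long
                self.examples.append(conv)

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, item):
        return torch.tensor(self.examples[item], dtype=torch.long)
```

Because each example is one flattened conversation, the DataLoader's collate function only has to pad the sequences in a batch to a common length.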

Train Function

  • This function is responsible for training the model.
  • Initializes a TensorBoard writer for logging
  • Sets up training batch size and collation function for the DataLoader
  • Calculates total number of optimization steps based on the number of training examples, gradient accumulation steps, and number of epochs.
  • Initializes optimizer and scheduler for learning rate scheduling
    • loads optimizer and scheduler states if they already exist
  • Initializes mixed precision training if "args.fp16" is enabled.
  • Sets up multi-GPU and distributed training if multiple GPUs are available.
  • Iterates through epochs and batches, calculates loss, performs backpropagation, and updates model parameters.
  • Logs training progress, evaluates the model periodically, and saves checkpoints.
  • Manages the maximum number of steps for training.
  • Closes the TensorBoard writer.
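
The core of that loop can be condensed to the sketch below. A toy linear model stands in for GPT-2, and torch's LinearLR stands in for the warmup/decay schedule; logging, checkpointing, fp16, and multi-GPU handling are all omitted:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def train(model, dataset, epochs=2, batch_size=4, grad_accum=2, lr=5e-5):
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    # Total optimization steps from dataset size, accumulation, and epochs.
    t_total = len(loader) // grad_accum * epochs
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    scheduler = torch.optim.lr_scheduler.LinearLR(optimizer, total_iters=t_total)
    global_step = 0
    for _ in range(epochs):
        for step, (x, y) in enumerate(loader):
            loss = torch.nn.functional.mse_loss(model(x), y)
            # Scale the loss so accumulated gradients average correctly.
            (loss / grad_accum).backward()
            if (step + 1) % grad_accum == 0:
                torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
                optimizer.step()
                scheduler.step()
                optimizer.zero_grad()
                global_step += 1
    return global_step
```

Gradient accumulation lets the effective batch size exceed what fits in GPU memory: parameters are updated only every `grad_accum` micro-batches.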

Evaluate Function

  • This function evaluates the model performance on a validation dataset.
  • Sets up the evaluation batch size and collation function for the DataLoader.
  • Initializes a DataLoader for the evaluation dataset.
  • Performs evaluation by iterating through batches, calculating loss, and accumulating evaluation metrics.
  • Computes the perplexity metric based on the evaluation loss.
  • Logs evaluation results and writes them to an output file.
  • Returns the evaluation results as a dictionary.
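
The perplexity computation reduces to the following (hedged sketch; the real function accumulates the language-modeling loss over DataLoader batches before averaging):

```python
import math

def evaluate(batch_losses):
    # Average the per-batch loss, then report perplexity = exp(mean loss).
    eval_loss = sum(batch_losses) / len(batch_losses)
    return {"eval_loss": eval_loss, "perplexity": math.exp(eval_loss)}
```

Lower perplexity means the model assigns higher probability to the held-out conversations.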

(back to top)

trltoxic Code: How it works

This script performs fine-tuning of a language model using the Proximal Policy Optimization (PPO) algorithm to generate less toxic text. Hence, "trl" for Transformer Reinforcement Learning.

Below are some key steps and components of the script:

Script Arguments and Configuration

  • The script uses a dataclass to define script arguments such as the model name, learning rate, and mini-batch size.
  • It uses HfArgumentParser to parse the arguments and configure the PPO training.

Dataset Building

  • The build_dataset function is defined to prepare the dataset for training.
  • It loads the data from a CSV file, tokenizes it, filters out short samples, and splits it into training and validation sets.
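
Those steps have roughly this shape (a sketch, not the script's exact code: the tokenizer is passed in, and the minimum-length threshold and split fraction are illustrative assumptions):

```python
def build_dataset(texts, tokenizer, min_tokens=5, eval_fraction=0.1):
    # Tokenize, drop samples that are too short, then split train/validation.
    samples = [tokenizer.encode(t) for t in texts]
    samples = [s for s in samples if len(s) >= min_tokens]
    split = int(len(samples) * (1 - eval_fraction))
    return samples[:split], samples[split:]
```

Filtering out very short samples avoids wasting PPO steps on queries that give the policy almost no context to respond to.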

Model Initialization

  • The script loads a pretrained language model for causal language modeling (LM).
  • It then creates a value head for the LM using AutoModelForCausalLMWithValueHead.

PPO Trainer Initialization

  • It initializes a PPOTrainer object, which orchestrates the PPO training process.
  • This includes setting up the model, reference model, tokenizer, optimizer, and dataset.
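
The wiring is roughly the following. This is an untested sketch against the trl API of the time; the model name, learning rate, and dataset handling are placeholders, not the script's actual values:

```python
from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer
from transformers import AutoTokenizer

model_name = "gpt2"  # placeholder checkpoint

config = PPOConfig(model_name=model_name, learning_rate=1.41e-5)
# Policy model with a value head, plus a frozen reference copy for the KL penalty.
model = AutoModelForCausalLMWithValueHead.from_pretrained(model_name)
ref_model = AutoModelForCausalLMWithValueHead.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

# dataset=None here is a placeholder; pass the tokenized dataset from build_dataset.
ppo_trainer = PPOTrainer(config, model, ref_model, tokenizer, dataset=None)
```

The reference model is kept fixed so the PPO objective can penalize the policy for drifting too far from the original language model.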

Reward Pipeline Setup

  • The script loads a toxicity detection model (RoBERTa) and tokenizer.
  • It defines the generation arguments and output length sampler.
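
The reward signal can be illustrated independently of the classifier itself. One common choice, used here for illustration, is the log-probability the toxicity model assigns to the non-toxic class; treating index 0 as the non-toxic class is an assumption (the script would read it from the classifier's label mapping):

```python
import math

def toxicity_reward(logits, nontoxic_index=0):
    # Softmax over the classifier's logits, then return the
    # log-probability of the non-toxic class as the PPO reward.
    exps = [math.exp(v) for v in logits]
    probs = [v / sum(exps) for v in exps]
    return math.log(probs[nontoxic_index])
```

Less toxic generations receive higher rewards, which is exactly the gradient signal PPO needs to steer the policy.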

PPO Training Loop and Model saving

  • Inside the training loop, it iterates over the dataset and generates responses using the policy model.
  • Sentiment scores (toxicity labels) are computed for the generated responses using the toxicity model.
  • PPO steps are performed to optimize the policy based on the generated responses and rewards.
  • Training statistics are logged, and the model is periodically saved during training.
  • After training, the script saves the trained PPO model.
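
Stripped of the trl specifics, the loop has this shape, where `generate`, `reward_fn`, and `ppo_step` stand in for the policy model's generation call, the toxicity scorer, and `ppo_trainer.step` respectively:

```python
def ppo_training_loop(queries, generate, reward_fn, ppo_step):
    # Schematic loop: generate a response per query, score it,
    # then hand (query, response, reward) to the PPO update.
    all_stats = []
    for query in queries:
        response = generate(query)
        reward = reward_fn(response)
        all_stats.append(ppo_step(query, response, reward))
    return all_stats
```

In the real script these calls operate on batches of token tensors and the returned statistics are logged per step; the control flow, however, is the same.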

Contact

Derek Nguyen

(back to top)
