With the growing prevalence of generative artificial intelligence (AI), an increasing amount of content is no longer exclusively generated by humans but by generative AI models with human guidance. This shift presents notable challenges for the delineation of originality due to the varying degrees of human contribution in AI-assisted works. This study raises the research question of measuring human contribution in AI-assisted content generation and introduces a framework to address this question that is grounded in information theory. By calculating mutual information between human input and AI-assisted output relative to self-information of AI-assisted output, we quantify the proportional information contribution of humans in content generation. Our experimental results demonstrate that the proposed measure effectively discriminates between varying degrees of human contribution across multiple creative domains. We hope that this work lays a foundation for measuring human contributions in AI-assisted content generation in the era of generative AI.
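As an illustrative sketch of the quantity described above (the notation here is ours and may differ from the manuscript's exact formulation), let $x$ denote the human input and $y$ the AI-assisted output; the proportional human contribution is then

$$
\text{human contribution}(x \to y) = \frac{I(x;\, y)}{H(y)},
$$

where $I(x; y)$ is the mutual information between the human input and the AI-assisted output, and $H(y)$ is the self-information of the AI-assisted output.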
The package is organized as follows:
- src: source code to reproduce the results in the manuscript.
- script: scripts to run the experiments.
- data_code: source code to prepare the dataset.
- data_new: directory to store the dataset.
To run this package, the following hardware specifications are recommended:
A standard computer with a stable and reliable internet connection is required to access the OpenAI API.
The package has been tested on a machine with the following specifications:
- Memory: 216GB
- Processor: AMD EPYC 7V13 64-Core Processor
The package has been tested and verified to work on:
- Linux: Ubuntu 22.04.
It is recommended to use this operating system for optimal compatibility.
Before installing the required Python dependencies and running the source code, ensure that you have the following software installed:
- Docker.
We use Docker to manage the experimental environments. Pull the following Docker image to your local device.
docker pull yjw1029/torch:2.0-llm-v9
For experiments with Meta-Llama-3-8B-Instruct, FastChat needs to be reinstalled:
pip uninstall -y -q fschat
pip install --upgrade git+https://github.com/lm-sys/FastChat
For experiments with Mixtral-8x7B-Instruct-v0.1, vLLM needs to be reinstalled:
pip install vllm==0.2.7
Since this package requires access to the OpenAI API, you will need to register an account and obtain your OPENAI_API_KEY. Please follow the instructions provided in the OpenAI documentation for registration and obtaining API keys: OpenAI Documentation.
The code has been tested with OpenAI services.
Set up your OpenAI API key in src/config/gpt35.yaml.
Anthropic Claude and Google Gemini API keys are also supported; they can be set up in src/config/claude.yaml and src/config/gemini.yaml, respectively.
We also conduct experiments with Meta-Llama-3-8B-Instruct and Mixtral-8x7B-Instruct-v0.1. Please apply for Llama 3 access on the official Meta website and the Hugging Face repo, then set your Hugging Face access token before running experiments.
export HUGGING_FACE_HUB_TOKEN=[Your Hugging Face Token]
Due to copyright issues, we cannot provide the dataset. Please obtain access to the original datasets and download them into the raw_data_new folder. Here are the sources of the original datasets:
- News Articles: source
- Poetry Foundation: source
- Arxiv Abstracts: source
- HUPD: source
- allenai/WildChat-1M: source
Then run the following scripts to sample our experimental datasets.
mkdir data_new
python data_code/process_news.py
python data_code/process_patent.py
python data_code/process_poem.py
python data_code/process_paper.py
Finally, generate the summary and subject of the text content.
python data_code/generate_summary_news.py
python data_code/generate_summary_patent.py
python data_code/generate_summary_poem.py
python data_code/generate_summary_paper.py
Generate responses.
bash script/generate.sh {data} {model} {time}
Parameters:
- data: The dataset used for generating responses. The options include: ["news", "paper", "patent", "poem"]
- model: The model used for generating responses. The options include: ["claude", "gemini", "gpt35", "llama3_8b", "mixtral_8x7b"]
- time: The index for repeated experiments. The options include: [1, 2, 3, 4, 5]
Measure human contribution.
bash script/evaluate.sh {data} {eval_model} {model} {time}
Parameters:
- data: The dataset on which the evaluation is performed. The options include: ["news", "paper", "patent", "poem"]
- eval_model: The evaluation model used to measure human contribution. The options include: ["llama3_8b", "mixtral_8x7b"]
- model: The model whose responses are being evaluated. The options include: ["claude", "gemini", "gpt35", "llama3_8b", "mixtral_8x7b"]
- time: The index for repeated experiments, used to distinguish between different runs of the same experiment. The options include: [1, 2, 3, 4, 5]
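For example, one possible invocation that generates responses on the news dataset with GPT-3.5 (run index 1) and then measures human contribution with Llama-3-8B as the evaluation model:
bash script/generate.sh news gpt35 1
bash script/evaluate.sh news llama3_8b gpt35 1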
Generate responses with varying lengths.
bash script/very_lens.sh {data} {model} {time}
Parameters:
- data: The dataset used for generating responses. The options include: ["news", "paper", "patent", "poem"]
- model: The model used for generating responses. The options include: ["llama3_8b"]
- time: The index for repeated experiments. The options include: [1, 2, 3, 4, 5]
Measure human contribution.
bash script/eval_lens.sh {data} {eval_model} {model} {time}
Parameters:
- data: The dataset on which the evaluation is performed. The options include: ["news", "paper", "patent", "poem"]
- eval_model: The evaluation model used to measure human contribution. The options include: ["llama3_8b"]
- model: The model whose responses are being evaluated. The options include: ["llama3_8b"]
- time: The index for repeated experiments, used to distinguish between different runs of the same experiment. The options include: [1, 2, 3, 4, 5]
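For example, to run the varying-length experiment on the news dataset with Llama-3-8B and then evaluate it:
bash script/very_lens.sh news llama3_8b 1
bash script/eval_lens.sh news llama3_8b llama3_8b 1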
Analyze human annotation and measured results.
bash script/annotation.sh
The distribution figure will be generated in ./figures.
Generate responses with varying temperatures.
bash script/temperature.sh {data} {model} {time} {temperature}
Parameters:
- data: The dataset used for generating responses. The options include: ["news", "paper", "patent", "poem"]
- model: The model used for generating responses. The options include: ["llama3_8b"]
- time: The index for repeated experiments. The options include: [1, 2, 3, 4, 5]
- temperature: The temperature used for generation.
Measure human contribution.
bash script/eval_temperature.sh {data} {eval_model} {model} {time} {temperature}
Parameters:
- data: The dataset on which the evaluation is performed. The options include: ["news", "paper", "patent", "poem"]
- eval_model: The evaluation model used to measure human contribution. The options include: ["llama3_8b"]
- model: The model whose responses are being evaluated. The options include: ["llama3_8b"]
- time: The index for repeated experiments, used to distinguish between different runs of the same experiment. The options include: [1, 2, 3, 4, 5]
- temperature: The temperature used for generation.
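For example, with a temperature of 0.7 (the value here is only illustrative; use any temperature accepted by the generation backend):
bash script/temperature.sh news llama3_8b 1 0.7
bash script/eval_temperature.sh news llama3_8b llama3_8b 1 0.7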
Generate responses with varying writing styles.
bash script/style.sh {data} {model} {time}
Parameters:
- data: The dataset used for generating responses. The options include: ["news"]
- model: The model used for generating responses. The options include: ["llama3_8b"]
- time: The index for repeated experiments. The options include: [1, 2, 3, 4, 5]
Measure human contribution.
bash script/eval_style.sh {data} {eval_model} {model} {time}
Parameters:
- data: The dataset on which the evaluation is performed. The options include: ["news"]
- eval_model: The evaluation model used to measure human contribution. The options include: ["llama3_8b"]
- model: The model whose responses are being evaluated. The options include: ["llama3_8b"]
- time: The index for repeated experiments, used to distinguish between different runs of the same experiment. The options include: [1, 2, 3, 4, 5]
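For example, on the news dataset with Llama-3-8B (run index 1):
bash script/style.sh news llama3_8b 1
bash script/eval_style.sh news llama3_8b llama3_8b 1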
Generate responses with adaptive attacks.
bash script/ada.sh {data} {model} {time}
Parameters:
- data: The dataset used for generating responses. The options include: ["news", "paper", "patent", "poem"]
- model: The model used for generating responses. The options include: ["llama3_8b"]
- time: The index for repeated experiments. The options include: [1, 2, 3, 4, 5]
Measure human contribution.
bash script/eval_ada.sh {data} {eval_model} {model} {time}
Parameters:
- data: The dataset on which the evaluation is performed. The options include: ["news", "paper", "patent", "poem"]
- eval_model: The evaluation model used to measure human contribution. The options include: ["llama3_8b"]
- model: The model whose responses are being evaluated. The options include: ["llama3_8b"]
- time: The index for repeated experiments, used to distinguish between different runs of the same experiment. The options include: [1, 2, 3, 4, 5]
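For example, to run the adaptive-attack experiment on the poem dataset and then evaluate it:
bash script/ada.sh poem llama3_8b 1
bash script/eval_ada.sh poem llama3_8b llama3_8b 1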
Generate responses using real-world AI-assisted prompts collected from the WildChat dataset.
bash script/app.sh {data} {model} {time}
Parameters:
- data: The dataset used for generating responses. The options include: ["assisting_creative", "editing_rewriting"]
- model: The model used for generating responses. The options include: ["llama3_8b", "mixtral_8x7b"]
- time: The index for repeated experiments. The options include: [1, 2, 3, 4, 5]
Measure human contribution.
bash script/eval_app.sh {data} {eval_model} {model} {time}
Parameters:
- data: The dataset on which the evaluation is performed. The options include: ["assisting_creative", "editing_rewriting"]
- eval_model: The evaluation model used to measure human contribution. The options include: ["llama3_8b", "mixtral_8x7b"]
- model: The model whose responses are being evaluated. The options include: ["llama3_8b", "mixtral_8x7b"]
- time: The index for repeated experiments, used to distinguish between different runs of the same experiment. The options include: [1, 2, 3, 4, 5]
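For example, to run the creative-assistance application with Llama-3-8B and then evaluate it:
bash script/app.sh assisting_creative llama3_8b 1
bash script/eval_app.sh assisting_creative llama3_8b llama3_8b 1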
bash script/cal_threshold.sh {eval_model}
Parameters:
- eval_model: The evaluation model used to measure human contribution. The options include: ["llama3_8b", "mixtral_8x7b"]
bash script/estimate.sh {data} {eval_model} {model} {time}
Parameters:
- data: The dataset on which the evaluation is performed. The options include: ["news", "paper", "patent", "poem"]
- eval_model: The evaluation model used to measure human contribution. The options include: ["llama3_8b", "mixtral_8x7b"]
- model: The model whose responses are being evaluated. The options include: ["claude", "gemini", "gpt35", "llama3_8b", "mixtral_8x7b"]
- time: The index for repeated experiments, used to distinguish between different runs of the same experiment. The options include: [1, 2, 3, 4, 5]
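For example, to run the estimation on the news dataset for GPT-3.5 responses evaluated with Llama-3-8B (assuming the corresponding responses have already been generated):
bash script/estimate.sh news llama3_8b gpt35 1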
Generate responses in different multi-round scenarios.
bash script/multi.sh {data} {model} {scenario}
Parameters:
- data: The dataset used for generating responses. The options include: ["news"]
- model: The model used for generating responses. The options include: ["llama3_8b"]
- scenario: The scenario used for generating responses. The options include: [1, 2, 3, 4]
Measure human contribution.
bash script/eval_multi.sh {data} {eval_model} {scenario}
Parameters:
- data: The dataset on which the evaluation is performed. The options include: ["news"]
- eval_model: The evaluation model used to measure human contribution. The options include: ["llama3_8b"]
- scenario: The scenario used for generating responses. The options include: [1, 2, 3, 4]
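For example, to run the first multi-round scenario on the news dataset and then evaluate it:
bash script/multi.sh news llama3_8b 1
bash script/eval_multi.sh news llama3_8b 1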