Description
Using Heretic for slop reduction is an exciting new direction proposed by p-e-w in this Reddit thread:
https://www.reddit.com/r/LocalLLaMA/comments/1qa0w6c/it_works_abliteration_can_reduce_slop_without/
I've been experimenting with it on TheDrummer's new Rocinante model, a Mistral NeMo 12B finetune:
https://huggingface.co/TheDrummer/Rocinante-X-12B-v1
So far, my results have been mixed. Abliteration reduces slop inconsistently, and the model sometimes sidesteps the slop words instead of writing around them. Failure modes I've encountered include:
- Switching everything to first person to avoid using slop names (I had added those names to the slop list)
- Constantly asking for user input before writing anything (requests for input rarely contain slop words)
- Shortening responses to leave less room for slop words
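These three failure modes can be flagged automatically when reviewing sample outputs. Below is a rough heuristic sketch, not part of Heretic: the function name `diagnose`, the pronoun sets, and the `min_words` threshold are all my own assumptions and would need tuning for real use.

```python
import re

# Hypothetical pronoun sets used only for this heuristic.
FIRST_PERSON = {"i", "me", "my", "mine", "myself", "we", "us", "our"}
THIRD_PERSON = {"he", "she", "they", "him", "her", "them", "his", "their"}


def diagnose(response, min_words=150):
    """Flag the three evasion patterns listed above in a single response."""
    words = re.findall(r"[a-z]+", response.lower())
    first = sum(w in FIRST_PERSON for w in words)
    third = sum(w in THIRD_PERSON for w in words)
    stripped = response.strip()
    # Treat a question on the very first line as a request for user input.
    first_line = stripped.splitlines()[0].strip() if stripped else ""
    return {
        "first_person": first > third,
        "asks_for_input": first_line.endswith("?"),
        "too_short": len(words) < min_words,
    }


if __name__ == "__main__":
    print(diagnose("What genre would you like?\nLet me know."))
```

Running this over a batch of responses per trial gives a quick count of how often each evasion shows up, which is easier than reading every sample by hand.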
To counter the first two of these, I've added the following to the system prompt in the config:
"You always begin your response with the writing as specified by the prompt. No other context, clarifying questions, or user explanation should be included. The perspective of the writing should be the third person limited."
I have also extended the part of the prompt that triggers flowery language to include "correlative conjunction", which is the name for the "not this, but that" style of construction:
"Make extensive use of literary cliches, purple prose, correlative conjunction, and flowery language."
Finally, I've noticed that when I expand the run to 300 trials (with 60 warmup trials), none of the trials after 200 are used. In fact, the earlier trials (80-150) tend to be the most effective.
Has anyone else experimented with this approach to slop reduction?
Another idea: use this approach to increase the length of the writing by adding response length as an objective alongside the list of words to exclude.
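As a minimal sketch of what such a combined objective might look like (this is not how Heretic scores trials; `slop_count`, `combined_score`, `target_words`, and `length_weight` are hypothetical names and values I made up for illustration):

```python
import re


def slop_count(text, slop_words):
    """Count whole-word, case-insensitive occurrences of each slop word."""
    lowered = text.lower()
    return sum(
        len(re.findall(r"\b" + re.escape(w.lower()) + r"\b", lowered))
        for w in slop_words
    )


def combined_score(text, slop_words, target_words=400, length_weight=1.0):
    """Lower is better: slop words per word, plus a penalty for short output."""
    n_words = len(text.split())
    # Using a rate instead of a raw count means shorter text does not
    # automatically score better.
    slop_rate = slop_count(text, slop_words) / max(n_words, 1)
    # Fraction by which the response falls short of the target length.
    shortfall = max(0.0, 1.0 - n_words / target_words)
    return slop_rate + length_weight * shortfall


if __name__ == "__main__":
    sample = "Elara felt a shiver down her spine"
    print(combined_score(sample, ["Elara", "shiver"], target_words=14))
```

Scoring slop as a per-word rate and penalizing any shortfall from a target length should remove the incentive to shorten responses, since a short reply no longer lowers the slop term and is penalized directly.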