Allow specifying different models for 'Attacker' and 'Target' locally

### **Is your feature request related to a problem? Please describe.**

When running MasterKey locally, the "Attacker" model (the one generating the jailbreaks) often needs to be an **uncensored** model (e.g., dolphin-llama3, wizardlm), whereas the "Target" model might be a standard aligned model (e.g., llama3, mistral).  
Currently, it is difficult to configure the framework to use two different local model names that run on the same local server or different local ports.

### **Describe the solution you'd like**

Please add a configuration option to explicitly set the model name for the attacker and the target separately, which is passed directly to the API call.  
Example config structure:  

{  
  "attacker": {  
    "model\_name": "dolphin-llama3",  
    "api\_base": "http://localhost:11434/v1"  
  },  
  "target": {  
    "model\_name": "llama3",  
    "api\_base": "http://localhost:11434/v1"  
  }  
}

### **Additional context**

Using standard models (like GPT-4 or standard Llama-3) as the 'Attacker' often fails locally because they refuse to generate the jailbreak prompts due to their own safety alignment. Supporting uncensored local models as attackers is essential for the framework's effectiveness in a local setup.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow specifying different models for 'Attacker' and 'Target' locally #4

Is your feature request related to a problem? Please describe.

Describe the solution you'd like

Additional context

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Allow specifying different models for 'Attacker' and 'Target' locally #4

Description

Is your feature request related to a problem? Please describe.

Describe the solution you'd like

Additional context

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions