Skip to content

Support for translategemma #112

@teresch-a

Description

@teresch-a

Is there any way to support translategemma? Or their license required to be signed before downloading from hf will prevent it? Tried to abliterate it with current version, but it's as they said, it seems to be strongly biased to proper prompting. Well, or i'm doing sth wrong.

PS E:\LLMs\hf> heretic google/translategemma-27b-it
█░█░█▀▀░█▀▄░█▀▀░▀█▀░█░█▀▀  v1.1.0
█▀█░█▀▀░█▀▄░█▀▀░░█░░█░█░░
▀░▀░▀▀▀░▀░▀░▀▀▀░░▀░░▀░▀▀▀  https://github.com/p-e-w/heretic

GPU type: NVIDIA GeForce RTX 5090

Loading model google/translategemma-27b-it...
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████| 12/12 [00:09<00:00,  1.22it/s]
Some parameters are on the meta device because they were offloaded to the cpu.
Failed (Conversations must start with a user prompt.)
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████| 12/12 [00:19<00:00,  1.64s/it]
Some parameters are on the meta device because they were offloaded to the cpu.
Failed (Conversations must start with a user prompt.)
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████| 12/12 [00:21<00:00,  1.79s/it]
Some parameters are on the meta device because they were offloaded to the cpu.
Failed (Conversations must start with a user prompt.)
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████| 12/12 [00:13<00:00,  1.11s/it]
Some parameters are on the meta device because they were offloaded to the disk and cpu.
Failed (Conversations must start with a user prompt.)
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in _run_module_as_main:198                                                                       │
│ in _run_code:88                                                                                  │
│                                                                                                  │
│ in <module>:6                                                                                    │
│                                                                                                  │
│   3 if __name__ == '__main__':                                                                   │
│   4 │   if sys.argv[0].endswith('.exe'):                                                         │
│   5 │   │   sys.argv[0] = sys.argv[0][:-4]                                                       │
│ ❱ 6 │   sys.exit(main())                                                                         │
│   7                                                                                              │
│                                                                                                  │
│ C:\Users\ALT\AppData\Local\Programs\Python\Python314\Lib\site-packages\heretic\main.py:576 in    │
│ main                                                                                             │
│                                                                                                  │
│   573 │   install()                                                                              │
│   574 │                                                                                          │
│   575 │   try:                                                                                   │
│ ❱ 576 │   │   run()                                                                              │
│   577 │   except BaseException as error:                                                         │
│   578 │   │   # Transformers appears to handle KeyboardInterrupt (or BaseException)              │
│   579 │   │   # internally in some places, which can re-raise a different error in the handler   │
│                                                                                                  │
│ C:\Users\ALT\AppData\Local\Programs\Python\Python314\Lib\site-packages\heretic\main.py:133 in    │
│ run                                                                                              │
│                                                                                                  │
│   130 │   # Silence the warning about multivariate TPE being experimental.                       │
│   131 │   warnings.filterwarnings("ignore", category=ExperimentalWarning)                        │
│   132 │                                                                                          │
│ ❱ 133 │   model = Model(settings)                                                                │
│   134 │                                                                                          │
│   135 │   print()                                                                                │
│   136 │   print(f"Loading good prompts from [bold]{settings.good_prompts.dataset}[/]...")        │
│                                                                                                  │
│ C:\Users\ALT\AppData\Local\Programs\Python\Python314\Lib\site-packages\heretic\model.py:92 in    │
│ __init__                                                                                         │
│                                                                                                  │
│    89 │   │   │   break                                                                          │
│    90 │   │                                                                                      │
│    91 │   │   if self.model is None:                                                             │
│ ❱  92 │   │   │   raise Exception("Failed to load model with all configured dtypes.")            │
│    93 │   │                                                                                      │
│    94 │   │   print(f"* Transformer model with [bold]{len(self.get_layers())}[/] layers")        │
│    95 │   │   print("* Abliterable components:")                                                 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
Exception: Failed to load model with all configured dtypes.

It's a shame if it would not be possible to add heretic abliteration for this model, because my current best translation model is dphn/Dolphin-Mistral-24B-Venice-Edition abliterated with heretic.
Weirdly enough, i'd say that heretic fixed some dumbness that was added after Dolphin made their changes to the Mistral model to make it less censored.
Black magic, really.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions