
Releases: ModelCloud/Tokenicer

Tokenicer v0.0.6

09 Feb 08:27
92bdf47


What's Changed

  • [FIX] avoid proxying call to inner tokenizer (ChatGLMTokenizer compatibility) by @ZX-ModelCloud in #42
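The fix concerns attribute proxying. As an illustrative sketch of the general pattern (assumed names and logic, not the actual Tokenicer source): a wrapper that forwards lookups to an inner tokenizer via __getattr__ only proxies names the wrapper itself does not define, which keeps wrapper methods from being shadowed or re-entered when the inner tokenizer (e.g. ChatGLMTokenizer) resolves methods on itself.

```python
# Minimal proxy sketch (assumption, not the actual Tokenicer fix).
class Wrapper:
    def __init__(self, inner):
        # Bypass attribute machinery when storing the inner object.
        object.__setattr__(self, "_inner", inner)

    def __getattr__(self, name):
        # Only invoked for attributes NOT found on the wrapper itself,
        # so wrapper-defined methods are never proxied to the inner tokenizer.
        return getattr(object.__getattribute__(self, "_inner"), name)

class FakeTokenizer:  # hypothetical stand-in for a real tokenizer class
    def encode(self, text):
        return list(text.encode("utf-8"))

wrapped = Wrapper(FakeTokenizer())
print(wrapped.encode("ab"))  # → [97, 98], forwarded to the inner tokenizer
```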

Full Changelog: v0.0.5...v0.0.6

Toke(n)icer v0.0.5

04 Sep 05:11
63e3008


What's Changed

Full Changelog: v0.0.4...v0.0.5

Toke(n)icer v0.0.4

21 Feb 09:36
dd95bdf


What's Changed

⚡ A Tokenicer instance now dynamically inherits the native tokenizer.__class__ of the tokenizer passed in or loaded via the Tokenicer.load() API.
⚡ CI now tests tokenizers from 64 models
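The dynamic inheritance above can be sketched in pure Python (an illustrative pattern with hypothetical stand-in classes, not the actual Tokenicer implementation): building a new class at runtime that subclasses both the wrapper and the native tokenizer's class means isinstance() checks against the native class still succeed.

```python
class NativeTokenizer:  # hypothetical stand-in for e.g. a HF tokenizer class
    def encode(self, text):
        return [ord(c) for c in text]

class Tokenicer:  # simplified stand-in for the wrapper
    @classmethod
    def load(cls, native):
        # Create a subclass of both the wrapper and the native tokenizer's
        # class at runtime, then copy over the native instance's state.
        merged_cls = type(f"Tokenicer_{type(native).__name__}",
                          (cls, type(native)), {})
        instance = merged_cls()
        instance.__dict__.update(native.__dict__)
        return instance

tok = Tokenicer.load(NativeTokenizer())
assert isinstance(tok, NativeTokenizer)  # native class checks still pass
assert isinstance(tok, Tokenicer)        # wrapper API is also present
```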

Full Changelog: v0.0.2...v0.0.4

Toke(n)icer v0.0.3

21 Feb 07:18
b0b2591


What's Changed

A Tokenicer instance now dynamically inherits the native tokenizer.__class__ of the tokenizer passed in or loaded via the Tokenicer.load() API.

Full Changelog: v0.0.2...v0.0.3

Toke(n)icer v0.0.2

10 Feb 13:41
efc81a2


What's Changed

⚡ Auto-fix models that do not set a padding_token.
⚡ Auto-fix models released with the wrong padding_token: many models incorrectly reuse eos_token as pad_token, which leads to subtle, hidden errors in post-training and inference whenever batching is used (which is almost always).
⚡ Compatible with all tokenizers recognized by HF Transformers.
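As a hedged sketch of the padding-token auto-fix idea (assumed heuristics, not Tokenicer's actual logic; fix_pad_token and the '<|pad|>' fallback are hypothetical): when pad_token is unset or aliases eos_token, substituting a distinct token keeps padded positions distinguishable from real end-of-sequence tokens during batched training and inference.

```python
from types import SimpleNamespace

def fix_pad_token(tok):
    """If pad_token is missing or equal to eos_token, pick a distinct token,
    preferring an existing unk_token; '<|pad|>' is a hypothetical fallback."""
    if tok.pad_token is None or tok.pad_token == tok.eos_token:
        if tok.unk_token and tok.unk_token != tok.eos_token:
            tok.pad_token = tok.unk_token
        else:
            tok.pad_token = "<|pad|>"
    return tok

# A model shipped without a pad_token gets a distinct one assigned.
broken = SimpleNamespace(pad_token=None, eos_token="</s>", unk_token="<unk>")
fixed = fix_pad_token(broken)
assert fixed.pad_token == "<unk>" and fixed.pad_token != fixed.eos_token
```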

Full Changelog: https://github.com/ModelCloud/Tokenicer/commits/v0.0.2