Labels: enhancement (New feature or request), good first issue (Good for newcomers), story (Story derived from bug or feature request)
Description
LLaMA should include quantisation. This introduces a dilemma between two options:
- Quantisation is done by invoking the current Python interpreter available on the path to convert model state dicts to ggml; the library itself then only handles the conversion from ggml to quantised ggml.
- None of it is handled by the library; instead, the conversion scripts are packed with the model so that quantisation stays consistent.
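The first option could be sketched roughly as below. This is only an illustration under assumptions: the conversion script name (`convert-pth-to-ggml.py`) and the `quantize` binary are placeholders for whatever actually ships with the model, not confirmed names from this project.

```python
import shutil
import subprocess
import sys

def build_convert_command(model_dir: str,
                          script: str = "convert-pth-to-ggml.py") -> list[str]:
    """Build the command that invokes the Python interpreter found on the
    path to convert a model state dict to ggml. The script name is a
    hypothetical placeholder."""
    # Prefer an interpreter on PATH; fall back to the running interpreter.
    python = shutil.which("python3") or shutil.which("python") or sys.executable
    return [python, script, model_dir]

def quantise(ggml_path: str, out_path: str) -> None:
    """Sketch of the ggml -> quantised-ggml step the library would own.
    `quantize` stands in for the actual quantisation executable."""
    subprocess.run(["quantize", ggml_path, out_path], check=True)
```

Under this split, only `quantise` would live in the library proper; everything before it is delegated to whichever Python environment the user has on the path.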
Done
- Create any relevant shims or implementation for quantisation.
- Create documentation for quantisation.