Hello,
First of all, thank you for this great work! I'm trying to build on your idea and need a .safetensors file for use with the Hugging Face ecosystem.
I've run the quantization process, and it seems to complete successfully. However, the output is a directory with a specific structure, and I'm unsure how to get from there to a single .safetensors file or a compatible Hugging Face model.
This is the directory structure I get as a result:
└── 📁quantized_model_spqr
    ├── 📁0
    │   ├── fc1
    │   ├── fc2
    │   ├── self_attn.k_proj
    │   ├── self_attn.out_proj
    │   ├── self_attn.q_proj
    │   └── self_attn.v_proj
    ├── 📁1
    │   ├── fc1
    │   ├── fc2
    │   ├── self_attn.k_proj
    │   ├── self_attn.out_proj
    │   ├── self_attn.q_proj
    │   └── self_attn.v_proj
    ├── 📁... (and so on for other layers)
    ├── args.pt
    └── not_quantized_weights.pt
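In case it helps with reproducing this, here is a minimal sketch of how I check the layout. It only lists file names and does not load or interpret any tensors. The function name `summarize_spqr_output` is my own and is not part of the repository, and I'm assuming the layer folders are the ones with purely numeric names.

```python
from pathlib import Path


def summarize_spqr_output(root):
    """Map each numeric layer directory to its sorted sublayer file names.

    Hypothetical helper: assumes layer folders are named "0", "1", ...
    and that everything else at the top level (args.pt,
    not_quantized_weights.pt) is not a layer.
    """
    root = Path(root)
    layers = {}
    for entry in root.iterdir():
        # Keep only directories whose name is a layer index.
        if entry.is_dir() and entry.name.isdigit():
            layers[int(entry.name)] = sorted(p.name for p in entry.iterdir())
    # Return the layers ordered by index rather than by filesystem order.
    return dict(sorted(layers.items()))


if __name__ == "__main__":
    output_dir = Path("quantized_model_spqr")
    if output_dir.is_dir():
        for index, sublayers in summarize_spqr_output(output_dir).items():
            print(index, sublayers)
```

Running it on my output shows the same six entries under every layer folder, which matches the tree above.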
I don't understand how to convert this output into a standard Hugging Face model format. Could you please provide some guidance or steps on how to achieve this?
Thank you for your help.
@Vahe1994