The SD.Next Quantizer (SDNQ) was recently split out into its own repository. It supports Nunchaku-style SVDQuant, which gives quality close to full precision at uint4 and makes Flux inference feasible within the ~16 GB of VRAM that Colab provides.
SDNQ is published on PyPI, so it should be easy to integrate into stablepy. A range of prequantized checkpoints is also available on Hugging Face for immediate use.
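
To make the VRAM point concrete, here is a rough back-of-the-envelope estimate of the weight memory for the Flux transformer at 16-bit versus 4-bit precision. It is only a sketch: the ~12 billion parameter count is an approximation, the helper name `weight_memory_gib` is made up for illustration, and the figures cover weights only (no text encoders, VAE, activations, or quantization metadata such as scales and the SVD low-rank branch).

```python
# Rough weight-only memory estimate. Not an SDNQ API; plain arithmetic.
GIB = 1024 ** 3  # bytes per GiB


def weight_memory_gib(num_params: int, bits_per_weight: int) -> float:
    """Return the memory needed to store the weights, in GiB."""
    return num_params * bits_per_weight / 8 / GIB


# Assumption: the Flux transformer has roughly 12 billion parameters.
FLUX_TRANSFORMER_PARAMS = 12_000_000_000

bf16_gib = weight_memory_gib(FLUX_TRANSFORMER_PARAMS, 16)
uint4_gib = weight_memory_gib(FLUX_TRANSFORMER_PARAMS, 4)

print(f"bf16 weights:  {bf16_gib:.1f} GiB")   # about 22.4 GiB
print(f"uint4 weights: {uint4_gib:.1f} GiB")  # about 5.6 GiB
```

Under these assumptions the 16-bit transformer alone would not fit in ~16 GB, while the uint4 version leaves room for the rest of the pipeline.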