A PyTorch-native inference engine with caching, parallelism, and quantization for Diffusion Transformers.
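One common form of caching in diffusion-transformer inference is reusing a block's output across nearby denoising steps instead of recomputing it every step. The sketch below is a generic illustration of that idea, assuming a hypothetical `BlockCache` wrapper and `refresh_every` policy; it is not this engine's actual API.

```python
# Hypothetical sketch of timestep feature caching for a diffusion
# transformer: recompute a block only every N denoising steps and
# reuse the cached output in between. All names are illustrative.

class BlockCache:
    def __init__(self, refresh_every=2):
        self.refresh_every = refresh_every
        self.cached = None

    def run(self, step, block, x):
        if self.cached is None or step % self.refresh_every == 0:
            self.cached = block(x)   # full compute on refresh steps
        return self.cached           # reuse the stale features otherwise

calls = []
def heavy_block(x):
    calls.append(x)                  # count how often we really compute
    return [v * 2 for v in x]

cache = BlockCache(refresh_every=2)
outs = [cache.run(step, heavy_block, [step]) for step in range(6)]
print(len(calls))  # block executed on steps 0, 2, 4 only -> 3 calls
```

The trade-off is fidelity: larger `refresh_every` values skip more compute but let the cached features drift further from what a full forward pass would produce.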
An XML-RPC server that exposes a GPU-accelerated colorization pipeline for black-and-white images and video frames. Built on top of the [Nunchaku](https://github.com/mit-han-lab/nunchaku) SVDQuant FP4/INT4 transformer and the `Qwen-Image-Edit-2511` diffusion model.
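Calling an XML-RPC service like this from Python needs only the standard library. The sketch below stands up a stub server in-process and invokes it; the method name `colorize_frame` and its payload shape are assumptions for illustration, not the repo's actual API, and the stub merely replicates grayscale values rather than running a GPU pipeline.

```python
# Minimal sketch of an XML-RPC round trip, assuming a hypothetical
# `colorize_frame` method. The server side here is a stub, not the
# real GPU-accelerated colorization pipeline.
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

def colorize_frame(gray_pixels):
    # Stand-in for the colorizer: expand each grayscale value
    # into an (R, G, B) triple.
    return [[v, v, v] for v in gray_pixels]

# Port 0 lets the OS pick a free port.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(colorize_frame, "colorize_frame")
threading.Thread(target=server.serve_forever, daemon=True).start()

host, port = server.server_address
proxy = xmlrpc.client.ServerProxy(f"http://{host}:{port}")
rgb = proxy.colorize_frame([0, 128, 255])
print(rgb)  # [[0, 0, 0], [128, 128, 128], [255, 255, 255]]
server.shutdown()
```

XML-RPC keeps the client side language-agnostic, which suits exposing a heavyweight GPU pipeline to thin clients that only ship pixel data back and forth.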
W4A4 and INT8 KV-cache quantization for Infinity VAR models. Optimized for high-fidelity generative AI deployment on edge GPUs (e.g. NVIDIA Jetson).
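INT8 KV-cache quantization typically maps each cached tensor to 8-bit integers plus a floating-point scale. The sketch below shows generic symmetric per-tensor quantization in plain Python as an illustration of the idea; it is not the repo's actual kernel, and real implementations work on GPU tensors with per-channel or per-token scales.

```python
# Sketch of symmetric per-tensor INT8 quantization, the kind of
# scheme commonly applied to KV-cache entries. Generic illustration,
# not this project's actual quantizer.

def quantize_int8(values):
    """Map floats to int8 codes with a single shared scale."""
    scale = max(abs(v) for v in values) / 127.0 or 1.0  # avoid zero scale
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize_int8(q, scale):
    return [x * scale for x in q]

kv = [0.5, -1.25, 3.0, -0.01]
q, scale = quantize_int8(kv)
restored = dequantize_int8(q, scale)
err = max(abs(a - b) for a, b in zip(kv, restored))
print(q)  # [21, -53, 127, 0]
```

Storing the cache as int8 cuts its memory footprint to a quarter of FP32 (half of FP16), which is what makes long-context generation feasible on memory-constrained edge GPUs.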