Sandermage

🏠

Working from home

5 followers · 1 following

Achievements

Popular repositories

genesis-vllm-patches Public

Runtime patches for vLLM — Qwen3.6 (27B int4 / 35B-A3B FP8) on consumer Ampere. 50+ patches: TurboQuant KV, MTP / DFlash / ngram spec-decode, FULL cudagraph, 256K-320K context. v7.64: P67 non-pow-2…

Python · 42 stars · 2 forks

vllm Public

Forked from vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python