Source Reddit r/LocalLLaMA reddit.com Forum
July 5, 2026 ai_technology community

Qwen 3.6 27B - VLLM Performance Benchmark Results (BF16, FP8, NVFP4)

Summary

Sharing some testing of Qwen 3.6 27B using vLLM across the popular quants on my development system. I used llama benchy to generate the results, then fed them into an LLM to format the tables for readability. While NVFP4 is blazing fast, I have had looping issues in Copilot that I don't get with BF16, and its responses in agent mode generally seem less thorough than those from the higher-precision quants. Based on these results, FP8 seems to be the right choice. Some of the parameters can be furt
