GLM-5 has 744B parameters and scores worse on MMLU-Pro than a 9B model
Summary
Tier lists make S-tier and D-tier feel like entirely different categories of thing: red box at the top, blue box at the bottom. I actually plotted named models by parameter count against MMLU-Pro score instead of trusting the tier labels, and the picture is a lot messier than "bigger tier = bigger gap." Qwen3.5-9B, a 9B model, scores 82.5% on MMLU-Pro. GLM-5, at 744B parameters (roughly 83x the size), scores 70.4%. That's not a diminishing-returns curve, that's negative returns; the 9B
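For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It uses only the two data points quoted in the post; the dictionary layout and variable names are my own, and the numbers are taken as stated rather than independently verified.

```python
# Figures as quoted in the post; not independently verified.
models = {
    "Qwen3.5-9B": {"params_b": 9, "mmlu_pro": 82.5},
    "GLM-5": {"params_b": 744, "mmlu_pro": 70.4},
}

small = models["Qwen3.5-9B"]
large = models["GLM-5"]

# How many times larger the big model is, by parameter count.
size_ratio = large["params_b"] / small["params_b"]

# How many percentage points the smaller model leads by on MMLU-Pro.
score_gap = small["mmlu_pro"] - large["mmlu_pro"]

print(f"size ratio: {size_ratio:.1f}x")
print(f"score gap: {score_gap:.1f} points in favor of the smaller model")
```

Two points don't make a curve, so this only confirms the size ratio and the gap between these two models, not the overall trend.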