Independent benchmark shows big drops on Claude Fable 5 after its relaunch: here's the actual context

Summary

Saw this chart from BridgeMind going around. They reran BridgeBench (a coding benchmark covering debugging, refactoring, and hallucination detection) comparing the July 1 relaunch of Fable 5 to the original June 12 version (scores shown as original → relaunch):

Debugging: 86.2 → 25.9
Refactoring: 73.6 → 38.4
Hallucination: 75.9 → 61.7

Some context worth having before jumping to conclusions: Fable 5 and Mythos 5 got pulled on June 12 due to a Commerce Department export control order, tied to a reported jailbreak that got the model to



