Tested 4 brand new frontier models (2 Chinese, 1 diffusion, 1 agent-focused) with a riddle that has no logical shortcut. One of them fabricated sources four times in a row.
Summary
I've been running the same weird test on every new model that ships: a riddle that can't be solved by pattern-matching or web search, only by actually connecting two unrelated things. This time I added a second riddle and ran both against four models that all shipped in the last few weeks: MiMo-V2.5-Pro (Xiaomi), MiniMax M3, Mercury 2 (Inception Labs, diffusion-based), and LongCat-2.0 (Meituan). Rules: no web search, no context given beforehand, up to 3 hints only if requested, same prom