PSA: Upscaling Gemma 4 requires a proportional layer_scalar adjustment
Original post: reddit.com
A lot of people seem to be confused or mystified about this, so I figured I'd spell it out. I played around with RYS and realized that it broke Gemma 4 models. It turns out there's a `layer_scalar` value that is applied at each layer. If you don't adjust it so that the copies of a duplicated layer together apply "the same amount" of scaling as the original layer did, you break the model. Since it's multiplicative, you have to use `s^(1/N)`, where `s` is the original scalar and `N` is the number of times the layer occurs (duplications + 1 fo
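To make the arithmetic concrete: if a layer with scalar `s` ends up appearing `N` times, giving each copy `s^(1/N)` means the copies multiply back to `(s^(1/N))^N = s`. Here is a minimal sketch of that calculation in plain Python. The function names, the `layer_order` list, and the example scalar values are all made up for illustration; this does not touch real Gemma 4 weights or any library API, and it assumes the scalars are positive and purely multiplicative, as described above.

```python
import math
from collections import Counter


def adjusted_layer_scalar(s: float, n: int) -> float:
    """Return s^(1/n), so that n copies multiply back to the original s.

    Hypothetical helper: assumes s is a positive, purely multiplicative scalar.
    """
    if n < 1:
        raise ValueError("n must be at least 1 (the original layer counts as one)")
    if s <= 0:
        raise ValueError("this sketch only handles positive scalars")
    return s ** (1.0 / n)


def rescale_for_upscale(layer_order, scalars):
    """Compute the adjusted scalar for every position in an upscaled stack.

    layer_order: source layer index per position in the new model,
                 e.g. [0, 1, 2, 1, 2, 3] repeats layers 1 and 2 once each.
    scalars:     mapping from source layer index to its original scalar.
    """
    counts = Counter(layer_order)  # N per source layer (duplications + 1)
    return [adjusted_layer_scalar(scalars[i], counts[i]) for i in layer_order]


# Made-up example values, not taken from any real checkpoint.
order = [0, 1, 2, 1, 2, 3]
original = {0: 0.9, 1: 0.25, 2: 0.64, 3: 1.0}
adjusted = rescale_for_upscale(order, original)

# Sanity check: per source layer, the product over all its copies
# should recover the original scalar.
for layer, s in original.items():
    product = math.prod(a for a, i in zip(adjusted, order) if i == layer)
    assert math.isclose(product, s)
```

Layers that are not duplicated have `N = 1`, so their scalar comes out unchanged. Only the repeated layers get the root applied.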