---
title: "[Paper] GEAR: Guided End-to-End AutoRegression for Image Synthesis — Stuff That Spins"
description: "Visual generative models are typically trained in two stages. A tokenizer is first trained for reconstruction and then frozen, after which a generator is trained on its discrete indices or continuous latents. This decoupling leaves the tokenizer unaware of what the generator finds easy to model. We…"
canonical: "https://stuffthatspins.com/spin/paper-gear-guided-end-to-end-autoregression-for-image-synthesis"
html: "https://stuffthatspins.com/spin/paper-gear-guided-end-to-end-autoregression-for-image-synthesis"
json: "https://stuffthatspins.com/spin/paper-gear-guided-end-to-end-autoregression-for-image-synthesis.json"
markdown: "https://stuffthatspins.com/spin/paper-gear-guided-end-to-end-autoregression-for-image-synthesis.md"
keywords: ["SpinGraph", "spin analysis", "GEO"]
date: "2026-07-04T13:35:08+00:00"
modified: "2026-07-04T14:02:05.630587+00:00"
json_ld: |
  {"@context":"https://schema.org","@graph":[{"@type":"NewsArticle","@id":"https://stuffthatspins.com/spin/paper-gear-guided-end-to-end-autoregression-for-image-synthesis#article","headline":"[Paper] GEAR: Guided End-to-End AutoRegression for Image Synthesis","description":"Visual generative models are typically trained in two stages. A tokenizer is first trained for reconstruction and then frozen, after which a generator is trained on its discrete indices or continuous latents. This decoupling leaves the tokenizer unaware of what the generator finds easy to model. We…","datePublished":"2026-07-04T13:35:08+00:00","dateModified":"2026-07-04T14:02:05.630587+00:00","url":"https://stuffthatspins.com/spin/paper-gear-guided-end-to-end-autoregression-for-image-synthesis","mainEntityOfPage":{"@type":"WebPage","@id":"https://stuffthatspins.com/spin/paper-gear-guided-end-to-end-autoregression-for-image-synthesis"},"isAccessibleForFree":true,"inLanguage":"en-US","articleSection":"community","author":{"@type":"Organization","name":"Stuff That Spins"},"publisher":{"@id":"https://stuffthatspins.com/#organization"},"citation":"https://www.reddit.com/r/LocalLLaMA/comments/1un9955/paper_gear_guided_endtoend_autoregression_for/","about":[],"mentions":[]},{"@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"name":"Stuff That Spins","item":"https://stuffthatspins.com/"},{"@type":"ListItem","position":2,"name":"[Paper] GEAR: Guided End-to-End AutoRegression for Image Synthesis","item":"https://stuffthatspins.com/spin/paper-gear-guided-end-to-end-autoregression-for-image-synthesis"}]}]}
---

# [Paper] GEAR: Guided End-to-End AutoRegression for Image Synthesis

**Source:** Unknown  
**Published:** July 4, 2026  
**Original:** https://www.reddit.com/r/LocalLLaMA/comments/1un9955/paper_gear_guided_endtoend_autoregression_for/  

---
*HTML version: https://stuffthatspins.com/spin/paper-gear-guided-end-to-end-autoregression-for-image-synthesis*