Considering Buying Another RTX 3090 - Benefits?
Currently running dual RTX 3090s and I'm happy with the setup. But never satisfied lol :) I know I've basically maxed out my single-stream TPS (140+ on standard benchmarks now). But I only have 48GB VRAM, so I can only do two concurrent requests @ 256k context length; any more and my KV cache will cause OOM errors. So I am considering adding a 3rd RTX 3090 and putting it in pipeline parallel with the other two. That way I don't lose performance due to bandwidth bottlenecks, but get more
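For anyone wanting to sanity-check the KV cache math, here is a rough back-of-the-envelope sketch in Python. The post doesn't say which model or KV cache precision is in use, so every number below (layer count, KV heads, head dim, 8-bit cache, 30 GiB of weights) is a made-up placeholder, and the function names are just for illustration. It also treats VRAM as one pool and ignores activations, framework overhead, and fragmentation, so real headroom will be lower.

```python
GIB = 1024 ** 3

def kv_cache_bytes(tokens, n_layers, n_kv_heads, head_dim, bytes_per_elem):
    """Approximate KV cache size for one request.

    The factor of 2 covers the separate K and V tensors stored per layer.
    """
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * tokens

def max_concurrent(total_vram, weight_bytes, per_request_bytes):
    """How many full-context requests fit in the VRAM left after weights."""
    free = total_vram - weight_bytes
    return max(free // per_request_bytes, 0)

# Hypothetical model config, NOT taken from the post.
per_request = kv_cache_bytes(
    tokens=256 * 1024,   # 256k context
    n_layers=32,         # assumed
    n_kv_heads=4,        # assumed (grouped-query attention)
    head_dim=128,        # assumed
    bytes_per_elem=1,    # assumed 8-bit KV cache
)
weights = 30 * GIB       # assumed size of the quantized weights

print(f"KV cache per request: {per_request / GIB:.1f} GiB")
for n_gpus in (2, 3):
    total = n_gpus * 24 * GIB  # RTX 3090 = 24 GB each
    print(f"{n_gpus} GPUs: {max_concurrent(total, weights, per_request)} concurrent requests")
```

Swap in the real values from the model's config file and the actual weight footprint to see roughly how many extra 256k-context slots a third card would buy.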