Considering Buying Another RTX 3090 - Benefits?

Summary

Currently running dual RTX 3090s and happy with them. But never satisfied lol :) I know I've basically maxed out my single-stream TPS (140+ on standard benchmarks now). But I only have 48GB of VRAM, so I can only do two concurrent requests at 256k context length; any more and my KV cache will cause OOM errors. So I'm considering adding a 3rd RTX 3090 and putting it in pipeline parallel with the other two. That way I don't lose performance to bandwidth bottlenecks, but get more
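For anyone wanting to sanity-check the KV cache math, here's a rough back-of-envelope sketch. The post doesn't say which model or quantization is in use, so every model number below (layer count, KV heads, head dim, 8-bit KV cache, 20 GiB of weights) is a made-up placeholder, and it ignores activations and framework overhead. Plug in your own values.

```python
GIB = 2**30

def kv_cache_bytes(num_layers, num_kv_heads, head_dim, context_len, bytes_per_elem):
    """Approximate KV cache size for one request at full context.

    The leading 2 accounts for storing both keys and values per layer.
    """
    return 2 * num_layers * num_kv_heads * head_dim * context_len * bytes_per_elem

def max_concurrent(vram_bytes, weights_bytes, per_request_bytes):
    """How many full-context requests fit in the VRAM left after the weights."""
    free = vram_bytes - weights_bytes
    return max(free // per_request_bytes, 0)

# HYPOTHETICAL model shape, not taken from the post.
per_request = kv_cache_bytes(
    num_layers=48,
    num_kv_heads=4,
    head_dim=128,
    context_len=256 * 1024,  # 256k tokens
    bytes_per_elem=1,        # assumes an 8-bit KV cache
)
weights = 20 * GIB           # HYPOTHETICAL quantized weight footprint

two_cards = max_concurrent(2 * 24 * GIB, weights, per_request)
three_cards = max_concurrent(3 * 24 * GIB, weights, per_request)

print(f"KV cache per 256k request: {per_request / GIB:.1f} GiB")
print(f"Concurrent requests on 2x 3090: {two_cards}")
print(f"Concurrent requests on 3x 3090: {three_cards}")
```

With these placeholder numbers the extra 24GB goes almost entirely to KV cache, since the weights are already resident. Real headroom will be lower once activations and runtime overhead are counted, and how the cache is split across cards depends on the serving stack.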

More from Reddit r/LocalLLaMA