In hindsight, I don't know that the second GPU was worth the spend. The C++ tooling is doing a very good job right now of splitting work between GPU VRAM and main RAM while still being fast enough. Even ~4-5 tokens a second is fast enough that it doesn't feel like you're waiting.
I'd suggest skipping the second card, which drops the price quite a bit (~2100 vs ~2900), unless you want to fine-tune or train models.
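
For a rough feel of how a VRAM/RAM split plays out on a single card, here's a back-of-the-envelope sketch. It isn't taken from any particular tool: the function name, the even per-layer size, and the reserve for cache and scratch buffers are all assumptions for illustration, and real layer sizes and overheads will differ.

```python
def layers_on_gpu(model_gb, n_layers, vram_gb, reserve_gb=1.5):
    """Estimate how many model layers fit in VRAM; the rest stay in main RAM.

    Assumes (hypothetically) that every layer is the same size and that
    `reserve_gb` of VRAM is held back for cache and scratch buffers.
    """
    if model_gb <= 0 or n_layers <= 0:
        raise ValueError("model_gb and n_layers must be positive")
    per_layer_gb = model_gb / n_layers          # even split across layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)  # VRAM left after the reserve
    return min(n_layers, int(usable_gb // per_layer_gb))


if __name__ == "__main__":
    # Made-up example numbers: a 40 GB model with 80 layers on one 24 GB card.
    on_gpu = layers_on_gpu(model_gb=40.0, n_layers=80, vram_gb=24.0, reserve_gb=2.0)
    print(f"{on_gpu} layers on the GPU, {80 - on_gpu} in main RAM")
```

The more layers that end up in main RAM, the slower generation gets, so this only tells you whether a model can load on one card, not how many tokens a second you'll see.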