A 32GB RTX 4080 From China Costs €1,300. It Actually Works.
Chinese manufacturers are doubling the VRAM on RTX 4080 SUPERs and selling them for €1,300. The local LLM community is intrigued — and cautious.

For €1,300 plus shipping, you can now buy an RTX 4080 with 32GB of VRAM from China. The card is a triple-fan unit with twice the memory of the stock model, and at least one r/LocalLLaMA user says it works. The post drew 369 upvotes and 63 comments in two days.
What Happened
Chinese manufacturers are taking stock RTX 4080 SUPERs — which ship with 16GB of GDDR6X — and doubling the onboard memory by soldering on twice as many chips. The result is a 32GB card with the same CUDA core count and compute performance as the original, but with enough VRAM to actually run serious local models.
This isn't a factory product. There's no NVIDIA warranty, no official driver support for the expanded memory, and no guarantee the extra chips will stay stable under sustained loads. TweakTown confirmed the cards exist and noted they're clearly aimed at AI workloads rather than gaming.
Key specs of the modded card:
- VRAM: 32GB GDDR6X (doubled from 16GB)
- Cooling: Triple-fan aftermarket design
- Price: ~€1,300 + shipping from China
- Warranty: None (unofficial modification)
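Given the stability caveat, a sensible first step for any buyer is confirming the card really exposes 32GB and holds up when most of it is in use. The sketch below is one way to do that, assuming PyTorch with a CUDA build; the fill fraction and duration are arbitrary illustrations, not a validated burn-in procedure.

```python
# Minimal sketch, not a validated burn-in: fills most of the reported VRAM
# with FP16 tensors, then runs sustained matmuls so that flaky memory shows
# up as CUDA errors or non-finite results. Assumes PyTorch with CUDA.
import time
import torch

def vram_stress_test(fill_fraction=0.85, minutes=10):
    assert torch.cuda.is_available(), "needs a CUDA device"
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 2**30:.1f} GiB reported")

    # Pin down most of the VRAM in 512 MiB chunks (256M fp16 elements each).
    budget = int(props.total_memory * fill_fraction)
    chunk_elems = 256 * 1024 * 1024
    pool = []
    while budget > chunk_elems * 2:  # 2 bytes per fp16 element
        pool.append(torch.randn(chunk_elems, dtype=torch.float16, device="cuda"))
        budget -= chunk_elems * 2

    # Hammer the card with matmuls for the requested window.
    a = torch.randn(8192, 8192, dtype=torch.float16, device="cuda")
    b = torch.randn(8192, 8192, dtype=torch.float16, device="cuda")
    deadline = time.time() + minutes * 60
    while time.time() < deadline:
        c = a @ b
        if not torch.isfinite(c).all():
            raise RuntimeError("non-finite result: possible VRAM instability")
    print(f"{len(pool)} chunks held; stress window completed without errors")

if __name__ == "__main__":
    vram_stress_test()
```

Watching nvidia-smi alongside for temperatures and clocks would also catch thermal throttling from chips the cooler was never designed around.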
Why This Matters
32GB is the magic number for local LLM inference. It's enough to run 27B-class models at 8-bit precision with room for context, and to squeeze in 70B models at aggressive ~3-bit quantizations — the workloads that actually matter to the r/LocalLLaMA crowd. The stock 16GB on the RTX 4080 SUPER falls short of that threshold, which has frustrated builders who want CUDA performance without paying RTX 4090 or 5090 prices.
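A quick back-of-envelope check makes those thresholds concrete. The figures below count weights only (KV cache, activations, and runtime overhead come on top), and the bits-per-weight values are rough averages for common quantization formats, not exact numbers.

```python
# Approximate VRAM needed for model weights alone; illustrative math only.
def weight_vram_gib(params_billion, bits_per_weight):
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30

for label, params, bits in [
    ("27B @ 8-bit", 27, 8.0),
    ("70B @ Q4 (~4.5 bpw)", 70, 4.5),
    ("70B @ ~3-bit", 70, 3.2),
]:
    print(f"{label}: ~{weight_vram_gib(params, bits):.0f} GiB")

# 27B @ 8-bit:          ~25 GiB  -> fits in 32GB with context to spare
# 70B @ Q4 (~4.5 bpw):  ~37 GiB  -> does not fit in 32GB
# 70B @ ~3-bit:         ~26 GiB  -> squeezes in
```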
The competitive landscape makes these mods interesting. Intel's Arc Pro B70 launched at $949 with 32GB, but its compute throughput is significantly weaker and the software ecosystem is still catching up. NVIDIA's own 32GB option — the RTX 5090 — sells for roughly $3,500 on the street. A modded 4080 at €1,300 slots neatly between those two: stronger compute than Intel, a fraction of the 5090's price.
The community reaction was predictably split. Some users called it a "no-brainer for local inference," pointing out that even two used RTX 3090s at $650 each give you 48GB but require dealing with tensor parallelism overhead. Others flagged real risks: VRAM stability under extended inference loads, potential thermal issues from chips that weren't part of the original PCB layout, and zero recourse if the card dies in six months.
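For a sense of what that overhead involves, here is a minimal sketch of the two-card route using vLLM's tensor parallelism. The model ID is just an illustrative choice that fits across 2x24GB; a 70B checkpoint would additionally need a ~4-bit quantized variant.

```python
# Sketch of the dual-3090 alternative: vLLM shards the model's weights
# across both cards, which adds inter-GPU communication on every forward
# pass -- the "tensor parallelism overhead" commenters mention.
from vllm import LLM

llm = LLM(
    model="Qwen/Qwen2.5-14B-Instruct",  # example model; fits across 2 x 24GB
    tensor_parallel_size=2,             # one weight shard per GPU
)
out = llm.generate("Summarize tensor parallelism in one sentence.")
print(out[0].outputs[0].text)
```

On a single 32GB card the same workload runs with tensor_parallel_size=1, trading the inter-GPU synchronization cost for a hard VRAM ceiling.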
What's Next
This is part of a broader pattern. Chinese modders have been pushing GPU boundaries for the local AI market, filling gaps that NVIDIA's official lineup doesn't address. As long as NVIDIA keeps pricing 32GB+ cards above $2,000, demand for unofficial alternatives will grow. Whether these mods hold up over thousands of hours of inference remains an open question — but for builders willing to accept the risk, €1,300 for 32GB of CUDA compute is hard to ignore.

