The Nvidia RTX 3060 12GB brings a new level of performance to the mainstream market, sort of. Officially, the RTX 3060 launches today with prices starting at just $329. Realistic? You’re just as likely to find one at that price as an RTX 3060 Ti for $399, an RTX 3070 for $499, or an RTX 3080 for $699: perhaps not entirely impossible, but highly unlikely. Nvidia’s Ampere architecture now powers many of the best graphics cards, and they’re all seeing huge demand from gamers and cryptocurrency miners alike. Nvidia has added firmware and driver code to detect Ethereum mining, which should help a bit. But with people willing to pay extreme scalper prices on eBay, even for cards like the GTX 1660 Super and RTX 2060, nearly everything in our GPU benchmarks hierarchy is currently sold out. Nvidia is even working with partners to bring back previous-generation Turing and Pascal cards.
None of that makes this a bad GPU, but we expect the RTX 3060 to be as hard to come by as any other modern GPU. Eventually, the current Ethereum mining boom will disappear, but it could be a year or more before we see the end of chip shortages. That shouldn’t surprise anyone at this point, but if you were hoping for a reasonably priced gaming PC upgrade, this is a depressing situation.
Unlike the previous Ampere GPUs, Nvidia doesn’t offer an RTX 3060 Founders Edition, so we’re looking at a third-party card. Nvidia sent us the EVGA GeForce RTX 3060 XC for this launch review, a fairly compact and relatively modest card. There’s no metal (or even plastic) backplate, no RGB lighting, and two custom 87mm fans for cooling with a 2.0-slot form factor. The card measures 202x110x38mm and weighs 653g, which is quite a change of pace compared to the other third-party Ampere cards we’ve reviewed so far.
There are of course reasons for this. Taking a regular card and equipping it with all the bells and whistles costs money, and we think most gamers shopping for good value would be much better served by modest designs with good performance. There will certainly be extreme variants of the RTX 3060, and some will be priced higher than the budget RTX 3060 Ti options. Let’s be clear: even the fastest RTX 3060 won’t beat a 3060 Ti in most situations, even with 12GB of VRAM. That’s because memory capacity isn’t a big factor once you get above 8GB, while the extra memory bandwidth from the wider memory bus gives the 3060 Ti a big advantage. The 3060 Ti also has 35% more GPU cores.
| Graphics Card | RTX 3060 Ti | RTX 3060 | RTX 2060 Super | RTX 2060 |
| --- | --- | --- | --- | --- |
| Process technology | Samsung 8N | Samsung 8N | TSMC 12FFN | TSMC 12FFN |
| Die size (mm^2) | 392.5 | 276 | 445 | 445 |
| Base Clock (MHz) | 1410 | 1320 | 1470 | 1410 |
| Boost Clock (MHz) | 1665 | 1777 | 1650 | 1680 |
| VRAM Speed (Gbps) | 14 | 15 | 14 | 14 |
| VRAM bus width (bits) | 256 | 192 | 256 | 192 |
| TFLOPS FP32 (Boost) | 16.2 | 12.7 | 7.2 | 6.5 |
| TFLOPS FP16 (Tensor) | 65 (130) | 51 (102) | 57 | 52 |
| Launch date | Dec 2020 | Feb 2021 | Jul 2019 | Jan 2019 |
Here’s how things break down, comparing the RTX 3060 to its closest Ampere sibling and its Turing predecessors. The RTX 2060 and 2060 Super show how much has changed for the 60-series cards between Turing and Ampere. Ampere gives you a lot more shader cores, potentially much higher compute performance, and a small improvement in memory bandwidth for the 12GB card. It also doubles the VRAM capacity (at least until the expected RTX 3060 6GB appears, although Nvidia might just leave that to the RTX 3050 line) and includes improvements in the RT and Tensor cores, as well as the memory subsystem, all of which lead to better performance. Power consumption remains comparable, with a 170W TGP (Total Graphics Power), a significant step down from the 220W TGP of the RTX 3060 Ti.
Interestingly, this is the first time Nvidia has used 15 Gbps GDDR6 memory. The RTX 20-series cards all used 14 Gbps memory, except for the RTX 2080 Super, which was equipped with 15.5 Gbps VRAM. The faster memory narrows the bandwidth gap between the 3060 and 3060 Ti a bit, although the extra 64 bits of interface width still give the GA104 cards a distinct advantage. GA106 also has no advantage in ROPs (render outputs), because it only has 48, the same as the RTX 2060.
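To put numbers on that bandwidth gap, here is a minimal sketch that derives peak memory bandwidth from the bus width and VRAM speed listed in the table above. The function name `bandwidth_gbs` is our own, and the formula (bus width in bits, divided by eight, times the per-pin data rate) is the usual back-of-the-envelope calculation rather than a measured figure.

```python
def bandwidth_gbs(bus_width_bits: int, speed_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: (bus width / 8 bits per byte) * per-pin rate."""
    return bus_width_bits / 8 * speed_gbps


# Bus width (bits) and VRAM speed (Gbps) taken from the spec table above.
cards = {
    "RTX 3060 Ti": (256, 14),
    "RTX 3060": (192, 15),
    "RTX 2060 Super": (256, 14),
    "RTX 2060": (192, 14),
}

for name, (bus, speed) in cards.items():
    print(f"{name}: {bandwidth_gbs(bus, speed):.0f} GB/s")

# How much more bandwidth the 3060 Ti has than the 3060, in percent.
gap = bandwidth_gbs(256, 14) / bandwidth_gbs(192, 15) - 1
print(f"3060 Ti advantage over 3060: {gap * 100:.0f}%")
```

By this estimate the 15 Gbps memory gives the RTX 3060 a modest edge over the RTX 2060 on the same 192-bit bus, while the 3060 Ti keeps a sizable lead thanks to its 256-bit interface.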
However, the differences between Turing and Ampere GPUs are not always reflected in specification tables like the one above. Theoretically, the RTX 3060 has up to 95% more FP32 performance and 97% more FP16 Tensor core performance than the RTX 2060. In practice, the actual performance difference is much smaller, because half of the FP32 pipelines share processing resources with the INT32 pipelines. The 3060 should never be slower for gaming purposes, but it will usually only be about 20-25% faster.
This is the first desktop card to use Nvidia’s GA106 processor. At the top level, there are three GPCs (Graphics Processing Clusters), each with up to 10 SMs and 16 ROPs (shown in Nvidia’s block diagram as two blocks of eight blue rectangles at the bottom of each GPC). The full chip has 30 SMs, while the 3060 disables two and ends up with 28 SMs, but everything else is left alone. (Note that the mobile RTX 3060 has all 30 SMs enabled, although it only comes with 6GB of memory, which is also clocked lower than on the desktop card.)
Each SM contains 64 dedicated FP32 CUDA cores, plus 64 additional FP32+INT32 CUDA cores that can handle either FP32 or INT32 in a given cycle, but not both. The SMs also include a second-generation RT core and four third-generation Tensor cores, each performing up to twice as fast as the previous generation’s cores, and with sparsity, the Tensor cores may be up to four times as fast as on Turing. Finally, there are six 32-bit memory interfaces, each associated with a single 8Gb or 16Gb GDDR6 module. The latter is currently reserved for desktops, while the 8Gb modules are used on laptops.
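The SM and memory figures above are enough to reconstruct a few headline specs. The sketch below is our own arithmetic, not anything from Nvidia: the helper names are hypothetical, and the two-operations-per-core-per-clock factor is the usual fused multiply-add convention for quoting peak FP32 throughput.

```python
FP32_CORES_PER_SM = 64 + 64   # dedicated FP32 cores plus shared FP32/INT32 cores
OPS_PER_CORE_PER_CLOCK = 2    # assumption: one fused multiply-add counts as two ops


def fp32_cores(sm_count: int) -> int:
    """Total FP32-capable CUDA cores for a given number of enabled SMs."""
    return sm_count * FP32_CORES_PER_SM


def fp32_tflops(sm_count: int, boost_mhz: float) -> float:
    """Theoretical peak FP32 throughput in TFLOPS at the boost clock."""
    ops_per_second = fp32_cores(sm_count) * OPS_PER_CORE_PER_CLOCK * boost_mhz * 1e6
    return ops_per_second / 1e12


def bus_width_bits(interfaces: int, bits_per_interface: int = 32) -> int:
    """Total memory bus width from the number of 32-bit memory interfaces."""
    return interfaces * bits_per_interface


def vram_gb(modules: int, gbit_per_module: int) -> float:
    """Total VRAM in GB, with one GDDR6 module per memory interface."""
    return modules * gbit_per_module / 8


print(fp32_cores(28))                   # desktop RTX 3060 with 28 SMs enabled
print(fp32_cores(30))                   # full GA106 with all 30 SMs
print(round(fp32_tflops(28, 1777), 1))  # peak FP32 TFLOPS at the 1777 MHz boost clock
print(bus_width_bits(6))                # six 32-bit interfaces
print(vram_gb(6, 16), vram_gb(6, 8))    # 16Gb modules (desktop) vs 8Gb modules (laptop)
```

The peak FP32 figure assumes every shared core is doing FP32 work on every cycle, which is why it overstates what games typically see.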
The entire GA106 chip has 12 billion transistors, compared to 17.4 billion in GA104. That reduces the die size from 393 square millimeters to just 276 square millimeters, which not only helps lower the cost of the chip, but also increases the number of chips Nvidia can get from a single wafer. If you’re wondering, GA106 is less than half the size of GA102, which measures 628.4 square millimeters and has 28.3 billion transistors. We estimate that Nvidia can get about 130 dies per wafer with GA104 (some of which are defective, most of which end up as partially disabled chips), while GA106’s smaller size allows about 200 dies per wafer. More dies means better yields and more graphics cards to go around. That’s the hope.
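For a rough sense of where dies-per-wafer figures like these come from, here is a sketch using a common textbook approximation: wafer area divided by die area, minus a correction for partial dies lost around the edge. It assumes a 300 mm wafer (not stated above) and ignores scribe lines, edge exclusion, and defects, so it should come out somewhat higher than the estimates quoted above. The function name is our own.

```python
import math


def gross_dies_per_wafer(die_area_mm2: float, wafer_diameter_mm: float = 300.0) -> int:
    """Approximate gross dies per wafer.

    Wafer area divided by die area, minus an edge-loss term for partial dies
    on the wafer's circumference. Ignores scribe lines, edge exclusion and
    defects. The 300 mm default wafer diameter is an assumption.
    """
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return math.floor(wafer_area / die_area_mm2 - edge_loss)


# Die sizes in square millimeters, as quoted in the text.
for chip, area in {"GA102": 628.4, "GA104": 392.5, "GA106": 276.0}.items():
    print(f"{chip}: ~{gross_dies_per_wafer(area)} candidate dies per wafer")
```

Even this simple model shows the same trend as the estimates above: shrinking the die from GA104 to GA106 size adds roughly half again as many candidate dies per wafer.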