It's 16. The Ampere white paper sets it as a hard design choice: 16 ROPS in 2 partitions per GPC, and T239 is 1 GPC. If there is a non-shader bottleneck, that's probably it. But I'd still bet on shader perf being the limit.
Because I am a nerd, I have a spreadsheet full of DF benchmarks and card specs, so I can rapidly compare stuff without having to spend hours in Google searches. Lemme show ya:
Card | TFLOPS | TMU / 1,000 cores | ROP / 1,000 cores | Cache / bandwidth (KB per GB/s) | Bandwidth / TFLOP (GB/s) | FPS/TFLOP (1080p) | FPS/TFLOP (1440p) | FPS/TFLOP (4K) | RT 1440p FPS/TFLOP |
---|---|---|---|---|---|---|---|---|---|
3050 | 9.1 | 31.25 | 12.5 | 9.14 | 24.6 | 9.0 | 6.3 | 3.7 | 2.4 |
3060 | 12.7 | 31.25 | 13.4 | 8.53 | 28.3 | 9.1 | 6.4 | 3.9 | 2.6 |
3070 | 20.3 | 31.25 | 16.3 | 9.14 | 22.1 | 8.2 | 6.0 | 3.8 | 2.5 |
3080 | 29.7 | 31.25 | 11 | 6.73 | 25.6 | 6.9 | 5.3 | 3.1 | 2.3 |
3090 | 34.1 | 31.25 | 10.9 | 6.53 | 26.3 | 6.1 | 4.9 | 3 | 2.2 |
T239 | 4* (high guess) | 31.25 | 10.4 | 8.53 | 30 | ??? | ??? | ??? | ??? |
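For anyone who wants to rebuild the ratio columns, here is a rough sketch for three of the rows. The core, TMU, ROP, cache and bandwidth figures are public spec-sheet numbers assumed for illustration, not values taken from the table, and the T239 line is a guess picked to line up with its row (only the 16 ROPS and the 4 TFLOPS high guess come from this post). Note that the TMU and ROP columns only reproduce when you divide per 1,000 cores.

```python
from dataclasses import dataclass


@dataclass
class Gpu:
    name: str
    cores: int            # shader (CUDA) cores -- assumed spec-sheet value
    tmus: int             # texture units -- assumed spec-sheet value
    rops: int             # render output units
    cache_kb: int         # L2 cache in KB -- assumed spec-sheet value
    bandwidth_gbs: float  # memory bandwidth in GB/s -- assumed
    tflops: float         # FP32 TFLOPS, as listed in the table


def ratios(gpu: Gpu) -> dict:
    """Recompute the spec-ratio columns for one card."""
    return {
        "tmu_per_1000_cores": round(gpu.tmus / gpu.cores * 1000, 2),
        "rop_per_1000_cores": round(gpu.rops / gpu.cores * 1000, 1),
        "cache_kb_per_gbs": round(gpu.cache_kb / gpu.bandwidth_gbs, 2),
        "bandwidth_per_tflop": round(gpu.bandwidth_gbs / gpu.tflops, 1),
    }


cards = [
    Gpu("3050", 2560, 80, 32, 2048, 224, 9.1),
    Gpu("3060", 3584, 112, 48, 3072, 360, 12.7),
    # T239 figures are unconfirmed guesses, chosen to match the table row.
    Gpu("T239", 1536, 48, 16, 1024, 120, 4.0),
]

table = {card.name: ratios(card) for card in cards}
for name, row in table.items():
    print(name, row)
```

Swap in different T239 guesses (a lower TFLOPS figure, a different bandwidth) to see how far the ratios move.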
Let me offer my analysis of this wall of data.
- The architectural sweet spot seems to be the 3050/3060, which happen to be the cards that T239 most closely resembles.
- The higher end cards seem to underperform in raster load, relative to their shader cores.
- Some of that is probably not real. The 3090 is getting nearly 500fps on Doom Eternal. That card could push more, but the numbers are being brought down because the CPU is being overwhelmed.
- Some of that definitely is. The fact that the performance gap drops as you get to 4k, but doesn't go away, says that it really is a case where the card is overwhelmed.
- The big cards seem under-specced in ROPS. That's because there is more binning in the lower cards, so some of the shader cores are disabled, leading to excess ROPS per core.
- The big cards also seem under-specced in cache. That's because cache goes up linearly with memory controllers, but shader perf goes up faster than that (roughly quadratically), as both core counts and clock speeds increase as you go up the stack.
- It's not clear if ROPS or cache are the limiting factor on the top cards. But I'd bet the ROPS.
- RT is cache loving and doesn't care about the pixel fill rate, and its performance per TFLOP only marginally changes over the stack. That tends to point the finger at ROPS.
So this all tends to point the finger at T239 being somewhat ROP limited, if it's limited at all.
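The per-resolution gap is easy to check straight from the table. A quick sketch, using only the 3060 and 3090 FPS/TFLOP values listed above (the variable names are just for illustration):

```python
# FPS/TFLOP values copied from the table for the 3060 and 3090.
fps_per_tflop = {
    "3060": {"1080p": 9.1, "1440p": 6.4, "4K": 3.9, "RT 1440p": 2.6},
    "3090": {"1080p": 6.1, "1440p": 4.9, "4K": 3.0, "RT 1440p": 2.2},
}

# Share of the 3060's per-TFLOP performance that the 3090 retains in each test.
retained = {
    test: fps_per_tflop["3090"][test] / fps_per_tflop["3060"][test]
    for test in fps_per_tflop["3060"]
}

for test, share in retained.items():
    print(f"{test}: 3090 keeps {share:.0%} of the 3060's FPS/TFLOP")
```

The 3090 keeps roughly two thirds of the 3060's per-TFLOP performance at 1080p, about three quarters at 1440p and 4K, and about 85% with RT on. The gap narrows with resolution but doesn't close, and it is smallest in the RT test.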
But because I can't stop, I actually have more thoughts on that as well. The key point is that PC benchmarks don't necessarily reflect how console ports will work.
None of these tests run DLSS, and RT is kept to a small subset. RT has a shader cost, but doesn't care about the pixel fill rate, as I said. DLSS is similar: it has a shader cost based on output resolution, but the game should be hitting the ROPS at the rate of the input resolution.
I expect hardware Lumen to be common, but even if it's not, DLSS is going to be everywhere. Dedicated ports have the opportunity to tune their performance around the (potentially) limited pixel fill rate of T239, but even if they don't, simply using DLSS will increase the load on the shader cores disproportionately to the load on the ROPS.
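To put rough numbers on the DLSS point, here is a minimal sketch. The 1080p input and 4K output resolutions are assumptions picked for illustration, and it ignores anything drawn after upscaling (UI, some post-processing), which would still hit the ROPS at output resolution.

```python
def pixels(width: int, height: int) -> int:
    """Pixel count for a given resolution."""
    return width * height


# Hypothetical DLSS setup: render at 1080p, output at 4K.
input_res = (1920, 1080)
output_res = (3840, 2160)

rop_pixels = pixels(*input_res)    # raster output goes through the ROPS at input res
dlss_pixels = pixels(*output_res)  # DLSS shader cost scales with output res

ratio = dlss_pixels / rop_pixels
print(f"ROP-side pixels per frame:  {rop_pixels:,}")
print(f"DLSS-side pixels per frame: {dlss_pixels:,}")
print(f"Output/input pixel ratio:   {ratio:.1f}x")
```

With those assumed resolutions, the upscaler is working on 4x the pixels the ROPS have to write, which is the sense in which DLSS shifts load toward the shader side.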
But all of this is trying to project rendering technology out by 7 years. With stuff like frame gen coming to consoles even now, years after the release of both the games and the hardware, rendering engines might evolve away from a design which favors Ampere, in which case, who knows. But to me, T239 looks like a pretty optimized design for its use case.