I remember this post below from Raploz, concluding that ray tracing results are better on PS5-like hardware than on Ampere hardware.
Raploz's analysis was insufficient here, though it's not obvious why.
The first problem is that he's comparing benchmarks from different YouTube channels. Those benchmarks aren't using the same settings, resolutions, CPU, or RAM. All those things have huge effects on the numbers, so you can't really compare them.
The second problem is that Minecraft with RT on isn't "pure RT". Like every other RT game, it's a mix of RT effects and old school raster rendering. No one is saying that PS5 won't beat T239 at old school rendering. Only measuring them combined lets sluggish RT on PS5 "hide" inside the faster rasterization.
Obviously real games are going to use both combined, but the way they combine those two things will vary from game to game. If we wanna speculate about it, we really need to isolate the RT part from the rasterization part. And we want to do it on machines that are identical except for the GPUs, so we have real values. Here is what I came up with:
Step one, benchmark a game with RT off:
For example, our test machine, with an RTX 3060, running Control at high settings, 1440p, gets an average of 55 fps.
Step two, repeat benchmark with RT on:
Same machine, same RTX 3060, Control at high settings, still 1440p, but with RT effects "on", gets an average of 29 fps.
Step three, convert FPS to frame time
55 frames per second means 18.18ms for the RTX 3060 to draw a rasterized Control frame. 29 frames per second means 34.48ms for the RTX 3060 to draw a ray traced Control frame.
Step four, remove the rasterization time from the ray trace time
To be fair here, when we turn on RT we turn off some rasterized lighting effects. So this isn't how much time RT takes, so much as how much extra time RT takes compared to the raster effects it replaces. This will vary, again, depending on how many RT effects the game has, and at what level of fidelity.
34.48ms - 18.18ms = 16.30ms of time to add RT effects to Control on a 3060.
Step five, convert that back to a "pure ray tracing" FPS
This is just to keep "higher is better" in all the numbers. It makes comparison with other metrics easier.
At 16.30ms to ray trace a frame of Control, the RTX 3060 could do that 61.34 times a second.
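Steps three through five boil down to a couple of divisions. Here's a quick sketch (Python is my choice for illustration, not anything from the original analysis; tiny differences from the numbers above are just rounding):

```python
def pure_rt_fps(raster_fps: float, rt_fps: float) -> float:
    """Isolate a 'pure RT' rate: subtract the raster frame time from the
    RT-on frame time, then convert the remainder back to FPS."""
    raster_ms = 1000.0 / raster_fps   # step three: FPS -> frame time
    rt_ms = 1000.0 / rt_fps
    added_ms = rt_ms - raster_ms      # step four: extra time RT costs
    return 1000.0 / added_ms          # step five: back to "higher is better"

# Control at 1440p high: 55 fps raster, 29 fps with RT on
print(round(pure_rt_fps(55, 29), 2))  # ~61.35
```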
Step 6, do that over and over again for a bunch of GPUs.
To be clear, I didn't do this myself. I don't have this much hardware, the necessary analysis tools, or the time. But Digital Foundry does, so I used their numbers.
RTX 3050 | RTX 3060 | RTX 3060 Ti | RTX 3070 | RTX 3070 Ti | RX 6600 | RX 6600 XT | RX 6700 XT |
---|---|---|---|---|---|---|---|
41.56 | 61.34 | 85.09 | 92.4 | 98.37 | 28.56 | 36.20 | 39.53 |
Step 7, adjust for TFLOPS
Not all these cards are in the same performance class, and we want to find out, generally, what the difference is between AMD's hardware and Nvidia's. So we need to get the ray tracing performance per TFLOP. If this analysis makes sense, then we should get pretty similar numbers for all the RTX cards, and a different set of similar numbers for the RX cards. Lo and behold:
RTX 3050 | RTX 3060 | RTX 3060 Ti | RTX 3070 | RTX 3070 Ti | RX 6600 | RX 6600 XT | RX 6700 XT |
---|---|---|---|---|---|---|---|
4.56 | 4.83 | 5.25 | 4.55 | 4.53 | 3.19 | 3.41 | 2.99 |
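If you want to reproduce step 7, it's just a division by each card's FP32 throughput. The TFLOPS figures below are my own spec-sheet assumptions, not numbers from Digital Foundry, so the results land within rounding of the table above:

```python
# "Pure RT" FPS for Control (step 6 table) and nominal FP32 TFLOPS
# (approximate spec-sheet figures - my assumption, good enough for
# back-of-the-envelope work).
cards = {
    # name: (pure_rt_fps, tflops)
    "RTX 3050":    (41.56,  9.1),
    "RTX 3060":    (61.34, 12.7),
    "RTX 3060 Ti": (85.09, 16.2),
    "RTX 3070":    (92.40, 20.3),
    "RTX 3070 Ti": (98.37, 21.7),
    "RX 6600":     (28.56,  8.9),
    "RX 6600 XT":  (36.20, 10.6),
    "RX 6700 XT":  (39.53, 13.2),
}
per_tflop = {name: fps / tf for name, (fps, tf) in cards.items()}
for name, score in per_tflop.items():
    print(f"{name}: {score:.2f}")
```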
Step 8, average for each architecture
At this point we're doing a lot of averages, so the error bars are pretty high, but we're clearly seeing better perf from Nvidia.
Step 9, add more games
Not all games combine RT and raster effects the same way. Fortunately, we also have data for a second game, Metro Exodus. I'll save you the charts, but Metro Exodus has a lighter RT load than Control.
Game | Ampere | RDNA2 | Difference |
---|---|---|---|
Control | 4.74 | 3.20 | 148% |
Metro Exodus | 8.83 | 4.66 | 189% |
Combined score | 6.78 | 3.93 | 172% |
There is a likely reason the gap widens for the lighter game - Ampere doesn't just "do RT faster", it accelerates more parts of the RT pipeline than RDNA2. Depending on which parts of the pipeline you lean hardest on, you're going to get different numbers. But for this simple, back-of-the-envelope analysis, we'll combine the two benchmarks in order to increase the number of data points we have.
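The per-architecture averaging can be sanity-checked against the Control row, using the per-TFLOP scores from the step 7 table:

```python
# Per-TFLOP "pure RT" scores for Control, from the step 7 table.
ampere = [4.56, 4.83, 5.25, 4.55, 4.53]   # RTX 3050 .. RTX 3070 Ti
rdna2  = [3.19, 3.41, 2.99]               # RX 6600 .. RX 6700 XT

ampere_avg = sum(ampere) / len(ampere)    # ~4.74
rdna2_avg  = sum(rdna2) / len(rdna2)      # ~3.20
print(f"Ampere {ampere_avg:.2f} vs RDNA2 {rdna2_avg:.2f}"
      f" = {ampere_avg / rdna2_avg:.0%}")  # ~148%
```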
Step 10, extrapolate for the consoles
We can take our per-TFLOP scores for the two architectures and generate per-machine scores for the consoles. We'll assume T239 is running at a cool 1GHz (roughly 3.07 TFLOPS).
T239 | Series S | PS5 | Series X |
---|---|---|---|
20.81 | 15.72 | 39.3 | 47.16 |
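The extrapolation is just the architecture's combined per-TFLOP score times each console's TFLOPS. The TFLOPS values below are my round-number assumptions (T239's reported 1536 CUDA cores at 1GHz work out to roughly 3.07 TFLOPS); they reproduce the table:

```python
AMPERE_SCORE = 6.78  # combined per-TFLOP scores from the table above
RDNA2_SCORE  = 3.93

consoles = {
    # name: (architecture score, assumed TFLOPS - my round numbers)
    "T239":     (AMPERE_SCORE,  3.07),
    "Series S": (RDNA2_SCORE,   4.0),
    "PS5":      (RDNA2_SCORE,  10.0),
    "Series X": (RDNA2_SCORE,  12.0),
}
for name, (score, tflops) in consoles.items():
    print(f"{name}: {score * tflops:.2f}")
```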
Conclusions, Part I
We've known that Nvidia outperforms AMD at RT, but this helps extract how much, in a way that lets us talk about our unreleased hardware. As we can see, T239 outperforms the Series S by about a third, and the PS5 is a little shy of double the T239.
But we also know that games vary in how much they use RT - look at Metro Exodus vs Control - and we know that rasterization perf still matters. There is no single number that can tell us "how many RT effects will we get compared to Sony."
Conclusions, Part II
In the real world, these consoles will never go head to head like this. Let's imagine three make-believe games:
The Last Gen Port Of Us: Premastered - This is an imaginary PS3/PS4-era game that later added some RT in a "remastered" version. This is a game that both Drake and the other consoles can max out the settings on. After maxing out, Drake is running close to full capacity, yet the others have plenty of horsepower to spare - horsepower that can be spent on turning on every new RT effect and calling it a day. Drake's nice dedicated RT cores aren't enough to overcome that gap.
President Weevil - a cross-gen showcase, built with RT in mind, but also with a strong non-RT fallback. Designed to scale down to last-gen consoles, and up to current gen. Series S gets basically the last-gen version, running at a nice 1080p30 with baked lighting. PS5 has an RT mode that is also 30fps, but at 4K, with RT effects added in. Drake can't even hit Series S resolution, but it's got all this extra RT power hanging around, so it can actually keep the RT effects that Series S lost: 720p+RT.
Mario Court - Camelot finally develops the Mario Jousting game that is their destiny, a Switch 2 exclusive. Unlike even Sony's and Microsoft's exclusives, which still go to PC, Mario Court was built exclusively for a machine with dedicated RT hardware. In terms of the number of RT effects used, it matches or outshines many PS5 games, partly because it very carefully manages the amount of geometry, the number of light sources, and the resolution to maximize the effect and keep costs low. Rendering nerds can pick it apart easily, but everyone has to admit that it just looks good.
Conclusions, Part III
I expect Nintendo to take better advantage of RT than anyone else this gen, because they've only got to handle the one RT-capable platform, and, as has been pointed out, RT is definitely present but not yet ubiquitous in the PC space. Even "better" performance can be dwarfed by having every single first party Nintendo game share a common, RT-hardware-exclusive, highly optimized lighting engine.
The theoretical performance of Switch 2 in RT rendering matters less than how much developers are willing to preserve those RT effects. If third parties aren't really yet taking advantage of RT, then going the extra mile to support RT on Switch 2 might take a back seat, especially if it's not well integrated into their engines.
It'll be interesting to see how Lumen and other hybrid GI solutions play out. Where RT is just a developer toggle inside a generic lighting solution, instead of customizing each individual RT feature for performance/quality, I imagine many developers will simply turn hardware Lumen "on" or "off" based on what offers the best performance/quality. If a game is hurting for perf, going to software Lumen might offer a benefit that's worth it, for 0 hardware RT effects. On the other hand, if a game has some headroom, then turning on all hardware Lumen might be a late-game quality boost in a port that is already hitting its performance target.
Caveats
DLSS Upscaling obviously opens up "similar quality, lower internal resolution" possibilities. Ideally, though, you want to run RT at the output res, not the input. So maybe DLSS upscaling helps, by opening up GPU performance, but it's indirect, and a per-game thing.
DLSS Ray Reconstruction opens up a new possibility - "similar quality, fewer rays." But we don't know how much that's true, because the few games that use RR don't let us independently control number of rays and RR on or off. Right now RR is just a quality booster for high end experiences - kinda like how Upscaling was a performance booster for 4k only in the early days.
Just like upscaling, RR has a cost, and it's not clear to me that the performance advantage of fewer rays will overcome the performance cost of RR, at similar quality. Color me cautiously optimistic, emphasis on the cautious.
And all of this assumes that AMD's software doesn't improve, or that Sony's PS5 Pro doesn't bring new software tools to the base models - Nvidia's software is better and takes advantage of their custom hardware, but it's not magic. PS5 and Series X are just plain bigger, and David beating Goliath is a good story because it's rare that the little guy wins.
Conclusions, Part IV
I think it'll be a better RT machine than the Series S - i.e., better at keeping RT on in cut-down ports of third party games, plus more first party RT experiences. That doesn't mean that every other visual aspect of games will also be better. Just that the math on Drake for which settings to cut and keep will not match the other machines.