
StarTopic Future Nintendo Hardware & Technology Speculation & Discussion |ST| (Read the staff posts before commenting!)

I'm gonna give you the long ass answer to your question, as Dakhil already gave you the concise (correct) one.


The shader situation is well understood, and the Lapsus$ hack heavily suggests that they are both right about the shader situation.


This would be a book. Seriously, this is a get-your-PhD-in-compsci level question. Those systems use multiple BC technologies. But the one you probably care about is how they're compatible with PS4 shaders, and the answer is that they didn't break backwards compatibility when they designed the newer GPUs. They kept the hardware from the PS4 era to execute old shader instructions that modern games don't need anymore.


Rosetta 2 is a binary translator which monkey patches API calls. Here are two key points when talking about PC backwards compat:

PC apps never talk to the GPU directly, console games do. If PCs let random apps touch the GPU, not only would you be infected with viruses that take over your screen, but every random app could cause the entire system to crash. Minimize a calculator window and forget about it? Well, if it crashes, your whole system reboots! Consoles are a different world, so console games get raw hardware access, for extra performance.

This is why Rosetta 2 doesn't have to emulate a GPU at all.

PC games don't have precompiled shaders, console games do: When you change a tiny detail about a graphics card, you have to recompile your shaders. There is no way for PC games to do this for every combination that already exists, and even if they did, they wouldn't support new hardware. So PC games don't ship precompiled shaders, they ship the raw shader code, and your PC compiles it when it needs it.

Consoles are a single target, so it is feasible to ship precompiled shaders. This has big performance benefits. If you've heard of #stutterstruggle, this is what they're talking about, and why consoles don't have it. But it is also why console games don't "just work" without changes on different GPUs.



Rosetta runs on the CPU. It reads the code from an old program, dynamically retranslating as it goes, and then feeds the retranslated instructions back out to the CPU. In essence, the CPU doesn't even see the emulated program, it only sees Rosetta. Meanwhile, Rosetta only has to read tiny chunks of the emulated program at a time, very quickly recompile it, and send it on in tiny, fast bursts. Lots of CPU code is repetitive, going around in loops, so many times, Rosetta doesn't even have to do anything, because it has already recompiled the code.
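
To make that concrete, here's a toy sketch of a dynamic translator with a translation cache. The "instructions", addresses, and translate step are all invented for illustration; this is nothing like Rosetta 2's real internals, it just shows why the cache makes repeated (loopy) code nearly free.

```python
# Toy model of dynamic binary translation with a translation cache.
# Hypothetical "old ISA" made of (op, argument, next address) tuples.

OLD_PROGRAM = {
    0: ("ADD", 5, 1),
    1: ("ADD", 3, 2),
    2: ("HALT", None, None),
}

def translate(op, arg):
    """'Recompile' one old instruction into native (here: Python) code."""
    if op == "ADD":
        return lambda acc: acc + arg
    if op == "HALT":
        return None
    raise ValueError(f"unknown instruction {op}")

cache = {}                               # address -> translated code, filled lazily
acc, pc = 0, 0
while pc is not None:
    if pc not in cache:                  # slow path: translate on first visit only
        op, arg, nxt = OLD_PROGRAM[pc]
        cache[pc] = (translate(op, arg), nxt)
    native, nxt = cache[pc]              # fast path: revisited code is already native
    if native is None:
        break
    acc, pc = native(acc), nxt

print(acc)                               # -> 8
```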

GPUs don't work that way. There is no way to run something like Rosetta on the GPU, and even if it could, it wouldn't matter, because the game runs on the CPU. Shaders are GPU programs that get squirted over to the GPU in individual blobs, and are usually run immediately.

The way a Rosetta for Shaders would work, is it would run on the CPU, load up games into a kind of container where the game couldn't directly access the GPU. It would then intercept shaders, recompile them for the new architecture, and then squirt them over to the GPU itself.

This could work, emulators do it all the time. But it would introduce #stutterstruggle to console games. Emulators try to work around this stutter, but 1) they are not completely successful, and 2) it requires huge computing resources that Drake won't have.



Sorta? But it doesn't matter much, only the shaders are a tricky problem.


No, but again, only the shaders are tricky.

Consoles talk directly to the GPU. That means that each game basically builds in its own driver. Drivers need to know stuff like where physically on the chip parts of the GPU are, exactly how many of component X or Y there are, or the magic value of Q that only works on that chip.

Fortunately, even if every single one of these is different between TX1 and Drake, you don't need a complex emulator, you just need a remapper. If a Switch game wants to send command 0x800DBA14 to interrupt 0xD01E2A3D, then all Drake has to do is catch it and look up those locations.

"Hmm, command 0x800DBA14 on TX1 is 'add' and interrupt 0xD01E2A3D is the first SM. My first SM is at 0xD02E3A0D, and my add command is 0x800DBA11. Let me just substitute one for the other it will Just Work"

Ampere and Maxwell are similar enough for this sort of technique to work. Driver emulation is simple and fast.
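
A toy version of that remapper, reusing the made-up values from the example above. Real command submission doesn't look like this; the point is only that a table lookup is cheap compared to actual emulation.

```python
# Hypothetical TX1 -> Drake remapping tables (values from the example above).
COMMAND_MAP = {0x800DBA14: 0x800DBA11}   # TX1 'add' -> Drake 'add'
TARGET_MAP  = {0xD01E2A3D: 0xD02E3A0D}   # TX1 first SM -> Drake first SM

def remap(command, target):
    # Substitute each TX1 value for its Drake equivalent and pass it along.
    return COMMAND_MAP.get(command, command), TARGET_MAP.get(target, target)

print([hex(v) for v in remap(0x800DBA14, 0xD01E2A3D)])
# -> ['0x800dba11', '0xd02e3a0d']
```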

It's the shaders that are tricky, for the reasons I talked about before. Shaders are complicated, they're basically whole programs. Ampere might add an instruction that does what took 5 instructions before, while deleting 4 of those old useless instructions, which might take 2 new instructions to fake. Doing that translation is complex and slow.


You can't really partially emulate anything the way you mean, but you can essentially "pass through" the parts that are identical. Your emulator always has to sit in the middle of the whole thing, and you pay a performance cost for that. But when two systems are very very similar, your emulator only has to work hard at the differences.

But again, shader recompilation isn't emulation in that way. That is why it is tricky.


SM3DAS could do all these recompilation steps in advance, on a developer's powerful workstation. Drake has to do it in real time, using nothing but a fancy tablet.

SM3DAS's team had access to the raw source code of the original games, and could patch it if they wanted. Drake has to work on every Switch game without ever seeing the source code, and without ever changing it.

SM3DAS only had to work with 3 games, and doesn't have to emulate any features those games didn't use. Drake has to emulate every bit of the Switch hardware that any game uses.

SM3DAS's emulation only needed to reach the speed of the Wii, to run a 16 year old game. Drake's emulation needs to reach the power of the Switch, to run games that haven't released yet.
Aren't pre-compiled shaders well identified sections within the game's data?
Couldn't those sections be decompiled and recompiled for another target?
All Switch games sit on Nintendo's servers. They could patch them and generate Drake-specific releases.
Switch carts wouldn't work out of the box and may require a patch download, or maybe download the whole Drake-specific game and the cart would only be a DRM.
 
Not really? Because their contention isn't merely that a new GPU architecture can't natively/directly run the shaders in current Switch games -- which is true, and not something you need the leak to tell you --
There had been speculation that Nvidia would support it in hardware, which the shader model version seems to imply is not happening

it's that the solutions to this problem are intractable or that Nintendo won't pursue them for some reason. That's the stupidity of the arguments that they make (which the leak has no bearing on).
MVG has listed several BC solutions, and last I heard him talk about it, he mentioned what he thought Nintendo's strategy would be. (Recent vague comments are so contextless I don't think they're worth considering).

But I've said repeatedly about SciresM that they are right about the problem and wrong about the solution space. In retrospect, though, looking at the last time SciresM was brought up in this thread, their statement is basically "Pro hardware running on Ampere isn't viable", and in the absence of hardware support for the older shader instructions, they're probably right. We're just not talking about a 2x upgrade anymore.
 
Aren't pre-compiled shaders well identified sections within the game's data?
Couldn't those sections be decompiled and recompiled for another target?
All Switch games sit on Nintendo's servers. They could patch them and generate Drake-specific releases.
Switch carts wouldn't work out of the box and may require a patch download, or maybe download the whole Drake-specific game and the cart would only be a DRM.
majority of games won't be worth the effort.
 
I think people are too obsessed with 100% backwards compatibility, hence the argument that Nintendo needs to have the Tegra X1 installed on the DLSS model*'s motherboard for backwards compatibility, which I think is a ridiculous argument to begin with.

* → a tentative name that I use
Indeed, a full X1 on the motherboard is a ridiculous argument.
But what about a full X1 GPU within Drake? I don't think the X1's Maxwell GPU, ported to Drake's node, would be that big, certainly a lot smaller than Drake's and may share a lot of subsystems and plumbing.

Not that I believe it would happen.
 
Indeed, a full X1 on the motherboard is a ridiculous argument.
But what about a full X1 GPU within Drake? I don't think the X1's Maxwell GPU, ported to Drake's node, would be that big, certainly a lot smaller than Drake's and may share a lot of subsystems and plumbing.

Not that I believe it would happen.
Still ridiculous. A lot of engineering effort for very little payoff.
 
Aren't pre-compiled shaders well identified sections within the game's data?
No, unfortunately. Shaders could be anywhere. The SDK organizes things by default in certain ways, but there are no constraints on where shaders wind up.

Couldn't those sections be decompiled and recompiled for another target?
All Switch games sit on Nintendo's servers. They could patch them and generate Drake-specific releases.
Switch carts wouldn't work out of the box and may require a patch download, or maybe download the whole Drake-specific game and the cart would only be a DRM.
This is basically the solution MVG proposed. MS does it, apparently, for 360 games on backwards compat. But you'll note that not all games are on the BC list, and MS has huge amounts of cloud resources they can throw at the problem of running a big recompile on the whole damn library.
 
MVG has listed several BC solutions, and last I heard him talk about it, he mentioned what he thought Nintendo's strategy would be. (Recent vague comments are so contextless I don't think they're worth considering).
I think this is giving too much credit and overlooking the underpinning contrarian attitude (as in the most recent instance, which came up after an essentially unrelated question about IP rights in Smash Ultimate). But as a podcast non-enjoyer I'm speaking second-hand here, so I won't go on about it.
 
No, unfortunately. Shaders could be anywhere. The SDK organizes things by default in certain ways, but there are no constraints on where shaders wind up.
What if the location (and size) of each shader in each compiled Switch game were known ahead of time? Like from compiler logs before a game's release?
 
At this point, perhaps our only chance of getting 4 TFLOPS out of T239 is if we get 4nm TSMC for T239. And that's still a maybe. Some people might have made some scenarios in the past with the 1.3 GHz GPU, but I don't recall. Can we get it within a power draw similar to that of the OG Switch (11-15 watts docked), accounting for a 1.3 GHz GPU and 1.5-2 GHz CPU? Who knows...

I think a 3 TFLOPS Drake would be pretty close/equal to PS4 Pro without DLSS, and is realistic on 4nm Samsung or 6nm TSMC. But it's really hard to say for sure on matching resolution (since that's what the PS4 Pro is really all about), when Switch 2/Drake will only have 102 GB/s available (133 GB/s if we're absolutely lucky) vs. the PS4 Pro's 220 GB/s. Switch 2/Drake's Nvidia Ampere hardware is more bandwidth-efficient than AMD's, and the cache might be bigger than the PS4's, which should help.

But man.. the Switch pulled off some crazy ports at 25 GB/s vs the Xbone and PS4. Its checkerboard rendering helped. Would have been interesting to see how much the Switch would have narrowed the gap if it had the TX2's 50 GB/s.. but oh well.. At least Switch 2 will narrow the bandwidth gap vs the PS4 Pro by a lot, to only 45-50% less.
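
For reference, the napkin math behind those figures. The 1536 FP32 cores (12 Ampere SMs) for T239 is my assumption based on the leaked configuration; the bandwidth numbers are the ones quoted above.

```python
# Back-of-the-envelope TFLOPS and bandwidth arithmetic.
CUDA_CORES = 1536                              # assumed T239/Drake FP32 lane count

def tflops(clock_ghz):
    # 2 FLOPs per lane per clock (fused multiply-add)
    return CUDA_CORES * 2 * clock_ghz / 1000

for clock in (1.0, 1.3):
    print(f"{clock} GHz -> {tflops(clock):.2f} TFLOPS")     # 3.07 and 3.99 TFLOPS

for bw in (102, 133):                          # GB/s LPDDR5 options mentioned above
    print(f"{bw} GB/s is {1 - bw / 220:.0%} less than the PS4 Pro's ~220 GB/s")
```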

Can't wait for DF to release a PS4 Pro vs One X vs XSS vs Switch 2 showdown.

I can't find a geekbench 5 result for the PS4 Pro anywhere, only geekbench 4 results, which aren't comparable to 5 values.
@ILikeFeet @ReddDreadtheLead @oldpuck @LiC
correct me if I'm wrong in my line of thought here but:
If I'm not mistaken, drake would perform similarly in multicore compared to the steam deck but lose considerably in single core perf. under geekbench 5.
My point of reference is this benchmark for the AGX Orin, which, I know, is a 12-core Orin device instead of the 8 on Drake, and running at a 2GHz clock even. But afaik, clock speed doesn't scale linearly (far from it) on Geekbench; however, core count on multicore scores kinda does (again, afaik).

So I'm assuming at worst a ~33.3% reduction in multicore from the AGX Orin score (~4000ish points end result), followed by another drop (related to the clock speed Drake would run at), which would put it in around the same ballpark as the Steam Deck's CPU score (again, in multicore).

But considering this is what the current switch performs like on geekbench 5, and the arguments made ITT in the past, I expect at the very least, a 3x increase on that single core score under drake.

But going back to the whole PS4 Pro vs drake conversation (CPU-wise), I still think drake wouldn't lose by much versus the PS4 Pro's CPU.
The GPU talk is kinda pointless when it comes down to TFLOPs considering drake has half the bandwidth to work with compared to the PS4 Pro and series S.
What? I thought it was the other way around.

I thought A78 single core performance is equal to or more efficient per clock than SD's/current gen AMD Zen 2. It's in multi-threading, at least, that AMD should have an advantage over ARM though.

But if we look at the clockspeeds that you linked... Correct me if I'm wrong... Zen 2 having 830 for single core score and 3666 multi core vs AGX Orin's 763 and 7193. Single core is only a 10% difference.
 
I've been reading Thraktor's post from mid December about shader assembly and how that could relate to BC. This is my understanding of it related to this discussion about BC...

EDIT: Refer to LiC's post below / here about Drake sharing the same ISA as Orin, so no changes. Translation remains a viable solution to GPU BC even if there is no indication of it as of yet.

So TX1's GPU compute capability version is 5.3, Drake's GPU is 8.8.

In an example given by Nvidia in their CUDA docs, a binary file generated for a GPU with compute capability 8.6 can run on a GPU with compute capability 8.9, but not vice versa. Also, that 8.6 binary cannot run on a GPU with compute capability 9.x, where the major version number is different.

This tells us what we already know - that shaders compiled for Switch 1 are not immediately compatible with Drake, as indicated by the compute capability major version number.
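
Written out as a check, the rule from the CUDA docs (as described above) is just: same major version, and the GPU's minor version at least as high as the binary's.

```python
# Native GPU binary compatibility rule from the CUDA docs, as a simple check.
def binary_runs_on(binary_cc, gpu_cc):
    b_major, b_minor = binary_cc
    g_major, g_minor = gpu_cc
    return b_major == g_major and g_minor >= b_minor

print(binary_runs_on((8, 6), (8, 9)))   # True  - the 8.6-on-8.9 example
print(binary_runs_on((8, 9), (8, 6)))   # False - not vice versa
print(binary_runs_on((8, 6), (9, 0)))   # False - different major version
print(binary_runs_on((5, 3), (8, 8)))   # False - TX1 shaders on Drake
```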

Maxwell and Ampere share a lot of instructions in their shader assembly instruction set though, around 75%. These instructions from Maxwell may not need translation to run on Ampere. Nintendo and Nvidia need to handle the remaining percentage and any issues with the shared ones, to ensure BC.

Thraktor presents binary translation as an option for GPU compat. Go through the shader binary line by line, skipping the supported instructions between Maxwell and Ampere and translating the rest. This is not a straightforward process so kudos (or - CUD-os ;) ) to those GPU engineers.
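
A rough skeleton of what that line-by-line pass could look like. The instruction names and the shared/rewrite sets here are placeholders, not real SASS; this only shows the shape of the skip-or-translate loop.

```python
# Sketch of a Maxwell -> Ampere shader translation pass (placeholder opcodes).
SHARED_OPS = {"FADD", "FMUL", "MOV"}       # pretend these are in the shared ~75%
REWRITE = {"XMAD": ["IMAD"]}               # pretend this Maxwell op needs replacing

def translate_shader(maxwell_instructions):
    ampere = []
    for op, *args in maxwell_instructions:
        if op in SHARED_OPS:
            ampere.append((op, *args))     # pass through untouched
        elif op in REWRITE:
            for new_op in REWRITE[op]:     # drop in the Ampere-equivalent sequence
                ampere.append((new_op, *args))
        else:
            raise NotImplementedError(op)  # the genuinely hard cases
    return ampere

print(translate_shader([("MOV", "R0", "R1"), ("XMAD", "R2", "R0", "R1")]))
# -> [('MOV', 'R0', 'R1'), ('IMAD', 'R2', 'R0', 'R1')]
```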

He also notes that Orin's compute capability version is 8.7 and Drake's is 8.8. So a binary compiled for Drake will not run on Orin, meaning Nvidia has made some changes to the instruction set for Drake. Some of these changes to Drake's instruction set may indicate specific modifications to allow for translation from Maxwell shaders, i.e. BC with the Tegra X1 GPU.

Still speculation of course, but I thought this was a worthwhile observation to bring up. (Also please correct me if I got any of this inaccurate).
 
I've been reading Thraktor's post from mid December about shader assembly and how that could relate to BC. This is my understanding of it related to this discussion about BC...

So TX1's GPU compute capability version is 5.3, Drake's GPU is 8.8.

In an example given by Nvidia in their CUDA docs, a binary file generated for a GPU with compute capability 8.6 can run on a GPU with compute capability 8.9, but not vice versa. Also, that 8.6 binary cannot run on a GPU with compute capability 9.x, where the major version number is different.

This tells us what we already know - that shaders compiled for Switch 1 are not immediately compatible with Drake, as indicated by the compute capability major version number.

Maxwell and Ampere share a lot of instructions in their shader assembly instruction set though, around 75%. These instructions from Maxwell may not need translation to run on Ampere. Nintendo and Nvidia need to handle the remaining percentage and any issues with the shared ones, to ensure BC.

Thraktor presents binary translation as an option for GPU compat. Go through the binary line by line, skipping the supported instructions between Maxwell and Ampere and translating the rest. This is not a straightforward process so kudos (or - CUD-os ;) ) to those GPU engineers.

He also notes that Orin's compute capability version is 8.7 and Drake's is 8.8. So a binary compiled for Drake will not run on Orin, meaning Nvidia has made some changes to the instruction set for Drake. Some of these changes to Drake's instruction set may indicate specific modifications to allow for translation from Maxwell shaders, i.e. BC with the Tegra X1 GPU.

Still speculation of course, but I thought this was a worthwhile observation to bring up. (Also please correct me if I got any of this inaccurate).
Yeah'd for the overall post but ESPECIALLY yeah'd for the CUD-os pun.
 
What if the location (and size) of each shader in each compiled Switch game were known ahead of time? Like from compiler logs before a game's release?
There are no requirements for the format or location of a game's data prior to actually being loaded/run. There could have been some kind of record-keeping solution to this issue in order to make static recompilation possible, but Nintendo and Nvidia elected not to use one.

But as I've said before, framing this like a technical issue just isn't right. There is no outcome where Nintendo wants to do BC, but can't make it happen for technical reasons, because they screwed up the design of the precompiled shaders or whatever. There are only two outcomes: Nintendo wants BC and does what they need to do to implement it, or Nintendo doesn't want BC so they leave it out. Most people agree that the latter would be an insane decision, but it's the only way BC gets broken.
 
I've been reading Thraktor's post from mid December about shader assembly and how that could relate to BC. This is my understanding of it related to this discussion about BC...

So TX1's GPU compute capability version is 5.3, Drake's GPU is 8.8.

In an example given by Nvidia in their CUDA docs, a binary file generated for a GPU with compute capability 8.6 can run on a GPU with compute capability 8.9, but not vice versa. Also, that 8.6 binary cannot run on a GPU with compute capability 9.x, where the major version number is different.

This tells us what we already know - that shaders compiled for Switch 1 are not immediately compatible with Drake, as indicated by the compute capability major version number.

Maxwell and Ampere share a lot of instructions in their shader assembly instruction set though, around 75%. These instructions from Maxwell may not need translation to run on Ampere. Nintendo and Nvidia need to handle the remaining percentage and any issues with the shared ones, to ensure BC.

Thraktor presents binary translation as an option for GPU compat. Go through the shader binary line by line, skipping the supported instructions between Maxwell and Ampere and translating the rest. This is not a straightforward process so kudos (or - CUD-os ;) ) to those GPU engineers.

He also notes that Orin's compute capability version is 8.7 and Drake's is 8.8. So a binary compiled for Drake will not run on Orin, meaning Nvidia has made some changes to the instruction set for Drake. Some of these changes to Drake's instruction set may indicate specific modifications to allow for translation from Maxwell shaders, i.e. BC with the Tegra X1 GPU.

Still speculation of course, but I thought this was a worthwhile observation to bring up. (Also please correct me if I got any of this inaccurate).
8.8 is Drake's SM version. It turns out the SM version is not always the same as the SPA version (aka ISA version). There's a mapping, and SM versions 8.6, 8.7, and 8.8 all map to SPA version 8.6. This would mean they share the same ISA, so Drake didn't add any new instructions.

So why does the SM version change if the ISA is the same? An Nvidia comment states that the SM version is "the "hardware revision" of the SM block" (interior quotes in source). There are known SM changes in Drake, like reverting from Orin's tensor cores to those found in desktop Ampere, but we can't say exactly what the significance of that number changing is.
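
For reference, that mapping written out as data (the values are the ones described above): the three Ampere SM revisions all map to the same ISA revision, which is why a higher SM number on Drake doesn't by itself mean new instructions.

```python
# SM ("hardware revision") version -> SPA/ISA version, per the mapping above.
SM_TO_SPA = {
    (8, 6): (8, 6),   # desktop Ampere
    (8, 7): (8, 6),   # Orin
    (8, 8): (8, 6),   # Drake
}

def same_isa(sm_a, sm_b):
    return SM_TO_SPA[sm_a] == SM_TO_SPA[sm_b]

print(same_isa((8, 7), (8, 8)))   # True: Orin and Drake share an ISA
```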
 
Yeah, if you're talking about upscaling DLSS output to 4K by other means, I agree the higher target resolution is preferable. I was under the impression we were talking 720p/1080p targets for handheld.
Yes, we are. I just have to take the 720p output and triple its size and take the 1080p output and double its size to make them the same size for side-by-side comparison without introducing other scaling artifacts.
 
I'm gonna give you the long ass answer to your question, as Dakhil already gave you the concise (correct) one.


The shader situation is well understood, and the Lapsus$ hack heavily suggests that they are both right about the shader situation.


This would be a book. Seriously, this is a get-your-PhD-in-compsci level question. Those systems use multiple BC technologies. But the one you probably care about is how they're compatible with PS4 shaders, and the answer is that they didn't break backwards compatibility when they designed the newer GPUs. They kept the hardware from the PS4 era to execute old shader instructions that modern games don't need anymore.


Rosetta 2 is a binary translator which monkey patches API calls. Here are two key points when talking about PC backwards compat:

PC apps never talk to the GPU directly, console games do. If PCs let random apps touch the GPU, not only would you be infected with viruses that take over your screen, but every random app could cause the entire system to crash. Minimize a calculator window and forget about it? Well, if it crashes, your whole system reboots! Consoles are a different world, so console games get raw hardware access, for extra performance.

This is why Rosetta 2 doesn't have to emulate a GPU at all.

PC games don't have precompiled shaders, console games do: When you change a tiny detail about a graphics card, you have to recompile your shaders. There is no way for PC games to do this for every combination that already exists, and even if they did, they wouldn't support new hardware. So PC games don't ship precompiled shaders, they ship the raw shader code, and your PC compiles it when it needs it.

Consoles are a single target, so it is feasible to ship precompiled shaders. This has big performance benefits. If you've heard of #stutterstruggle this is what they're talking about, and why consoles don't have it. But it is also why console games don't "just work" without changes on different GPUs



Rosetta runs on the CPU. It reads the code from an old program, dynamically retranslating as it goes, and then feeds the retranslated instructions back out to the CPU. In essence, the CPU doesn't even see the emulated program, it only sees Rosetta. Meanwhile, Rosetta only has to read tiny chunks of the emulated program at a time, very quickly recompile it, and send it on in tiny, fast bursts. Lots of CPU code is repetitive, going around in loops, so many times, Rosetta doesn't even have to do anything, because it has already recompiled the code.

GPUs don't work that way. There is no way to run something like Rosetta on the GPU, and even if it could, it wouldn't matter, because the game runs on the CPU. Shaders are GPU programs that get squirted over to the GPU in individual blobs, and are usually run immediately.

The way a Rosetta for Shaders would work, is it would run on the CPU, load up games into a kind of container where the game couldn't directly access the GPU. It would then intercept shaders, recompile them for the new architecture, and then squirt them over to the GPU itself.

This could work, emulators do it all the time. But it would introduce #stutterstruggle to console games. Emulators try to work around this stutter, but 1) they are not completely successful, and 2) it requires huge computing resources that Drake won't have.



Sorta? But it doesn't matter much, only the shaders are a tricky problem.


No, but again, only the shaders are tricky.

Consoles talk directly to the GPU. That means that each game basically builds in its own driver. Drivers need to know stuff like where physically on the chip parts of the GPU are, exactly how many of component X or Y there are, or the magic value of Q that only works on that chip.

Fortunately, even if every single one of these is different between TX1 and Drake, you don't need a complex emulator, you just need a remapper. If a Switch game wants to send command 0x800DBA14 to interrupt 0xD01E2A3D, then all Drake has to do is catch it and look up those locations.

"Hmm, command 0x800DBA14 on TX1 is 'add' and interrupt 0xD01E2A3D is the first SM. My first SM is at 0xD02E3A0D, and my add command is 0x800DBA11. Let me just substitute one for the other and it will Just Work."

Ampere and Maxwell are similar enough for this sort of technique to work. Driver emulation is simple and fast.

It's the shaders that are tricky, for the reasons I talked about before. Shaders are complicated, they're basically whole programs. Ampere might add an instruction that does what took 5 instructions before, while deleting 4 of those old useless instructions, which might take 2 new instructions to fake. Doing that translation is complex and slow.


You can't really partially emulate anything the way you mean, but you can essentially "pass through" the parts that are identical. Your emulator always has to sit in the middle of the whole thing, and you pay a performance cost for that. But when two systems are very very similar, your emulator only has to work hard at the differences.

But again, shader recompilation isn't emulation in that way. That is why it is tricky.


SM3DAS could do all these recompilation steps in advance, on a developer's powerful workstation. Drake has to do it in real time, using nothing but a fancy tablet.

SM3DAS's team had access to the raw source code of the original games, and could patch it if they wanted. Drake has to work on every Switch game without ever seeing the source code, and without ever changing it.

SM3DAS only had to work with 3 games, and doesn't have to emulate any features those games didn't use. Drake has to emulate every bit of the Switch hardware that any game uses.

SM3DAS's emulation only needed to reach the speed of the Wii, to run a 16 year old game. Drake's emulation needs to reach the power of the Switch, to run games that haven't released yet.
Thanks for the comprehensive response and answers!

Maybe this is a dumb question, but why can’t they decompile the console shaders and then run them raw on PCs? That seems like it would solve that issue since I believe you said PC GPUs don’t run them precompiled.

And going off that, can Drake do the same? “Just” decompile the NSW PCS (precompiled shaders) and run them raw?

Last dumb questions. The FDE wouldn’t be able to help with decompiling would it? Or maybe because this is a custom chip, they make it Ampere-Lovelace-2 shots of Maxwell and modify it down the chip HW level? I don’t know, I’m just spitballing here.
 
8.8 is Drake's SM version. It turns out the SM version is not always the same as the SPA version (aka ISA version). There's a mapping, and SM versions 8.6, 8.7, and 8.8 all map to SPA version 8.6. This would mean they share the same ISA, so Drake didn't add any new instructions.
Gotcha. At this point in development, could they be adding new / modifying existing instructions and updating Drake's ISA version? Considering the files we're working with date back to early last year at the latest. That is, if changing the ISA is necessary for translation. I'm assuming they'd go with translation over recompilation.

I think this is giving too much credit and overlooking the underpinning contrarian attitude (as in the most recent instance, which came up after an essentially unrelated question about IP rights in Smash Ultimate). But as a podcast non-enjoyer I'm speaking second-hand here, so I won't go on about it.
He also said in a podcast that the T239 Linux commit "doesn't mean anything in the grand scheme of things", which I strongly disagree with.
 
I'm interested in it technically because 1) I just think it's neat and 2) what it implies about the performance of Drake. But I agree with LiC, no reason to believe there won't be BC, and this weird meme that Nintendo screwed up the Switch and BC isn't possible just doesn't hold up. These are the same problems that BC has had to solve since the beginning of consumer computing. That Nintendo is having to deal with it now is mostly a historical issue. The PS6 is, most likely, going to face this same issue.

There is a simple, 3 part solution that solves 99% of the problems pretty easily.

1) Nintendo puts a Maxwell transpiler into Drake.
This works for every game, forever, with no patch from the internet. It does potentially introduce some minor stutter into games. However, because transpiling code is much much faster than recompiling code in the first place, it's not like PC stutter struggle at all.

2) Nintendo updates their SDK. Even if you're developing a base Switch game after Drake launches, Nintendo's updated SDK generates the Drake shaders for you and packages them along. The transpiler sees those shaders and just replaces the TX1 shaders with the ones the SDK generated in flight. This eliminates the stutter for new games, or any game patched with the new SDK.

3) Nintendo forces an intern to play the 100 most popular eShop games. While the intern plays the most popular eShop games on Drake, the transpiler runs and generates converted shaders. Nintendo then takes those out, and packages them up as updates.

This is a simple solution that doesn't require complex engineering or a huge investment of effort, and it would work on basically every game, out of the box, without an internet connection. Most games you care about would work with zero hitches at launch, and any further performance issues discovered after the fact could be quickly resolved.
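
A rough sketch of how those three pieces could fit together at launch time. Every name here is hypothetical, and this is one plausible lookup order, not anything Nintendo has announced.

```python
# Hypothetical shader lookup order for BC: packaged -> downloaded cache -> transpile.
packaged_drake_shaders = {}    # 2) shipped by the updated SDK with new/patched games
downloaded_shader_cache = {}   # 3) Nintendo's pre-generated caches for popular titles

def transpile_maxwell_to_ampere(shader_id):
    # Stand-in for the fast on-device transpiler from step 1).
    return f"ampere::{shader_id}"

def get_drake_shader(shader_id):
    if shader_id in packaged_drake_shaders:        # best case: no work at all
        return packaged_drake_shaders[shader_id]
    if shader_id in downloaded_shader_cache:       # next best: pre-converted
        return downloaded_shader_cache[shader_id]
    converted = transpile_maxwell_to_ampere(shader_id)   # fallback: works offline,
    downloaded_shader_cache[shader_id] = converted       # may hitch once per shader
    return converted

print(get_drake_shader("water_ripple_ps"))   # -> 'ampere::water_ripple_ps'
```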

This is like the lo-fi version of the MS patch solution, but it's much simpler because Drake is much more like TX1 than anything was like the Cell. It's similar to what Valve is likely to do with the steam deck, but again, much simpler because they don't have to support the whole breadth of PC gaming.

This is a very solvable problem.
 
Thanks for the comprehensive response and answers!

Maybe this is a dumb question, but why can’t they decompile the console shaders and then run them raw on PCs? That seems like it would solve that issue since I believe you said PC GPUs don’t run them precompiled.

And going off that, can Drake do the same? “Just” decompile the NSW PCS (precompiled shaders) and run them raw?
You can't run a shader raw. It has to be compiled. PC games just don't ship with them already compiled. When you run a PC game, your machine actually compiles them for your specific hardware.

If the game waits till the last second to do this, then what happens is your game stops entirely while the shader compile happens, then resumes. On modern games, especially built with UE4, there are lots of shaders. It's a serious problem in PC space.

What I suggest (in a post I just wrote) is that they'll do something called "transpiling". Compiling a shader is very processor intensive, partially because compilers do lots of optimizations. But when two arches are very similar, instead of decompiling and recompiling you just go one instruction at a time and go "that's the same, that's the same, that's the same, that's gone." And when you reach something the new arch does differently, you just slap in a little precompiled blob that replaces it and move on.

Transpiling can be very fast, which can eliminate most/all of the stutter that compilation gives you. It produces shaders that are slower than the originals, but in the case of Drake, we are guaranteeing that the target hardware is faster than the original hardware the shader was written for, so that's not generally a problem.


Last dumb questions. The FDE wouldn’t be able to help with decompiling would it?
A dedicated hardware block could help, but there is no indication the FDE was designed to do that

Or maybe because this is a custom chip, they make it Ampere-Lovelace-2 shots of Maxwell and modify it down the chip HW level? I don’t know, I’m just spitballing here.
They also could do that, but as LiC has discovered they have not. Switch is a very size and power constrained machine, adding in a bunch of extra back compat silicon probably didn't make sense, if they had a software based solution.

I have a bad habit of diving into the technical details and then people hear "this problem is very hard" but I don't think BC is like this crazy mountain Nintendo can't climb. It's not a radical arch change, they've got a dedicated emulation team, and they're working with the same hardware partner. They're about as set up for success as they could be
 
Sorry for dredging up the “Is the Zelda OLED real?” discussion but…

What was the source? I’ve seen people say it was the same person who leaked the Splatoon 3 OLED - is this true? And wasn’t that just them leaking that the announcement was imminent, not images of any physical device?
 
At this point, perhaps our only chance of getting 4 TFLOPS out of T239 is if we get 4nm TSMC for T239. And that's still a maybe. Some people might have made some scenarios in the past with the 1.3 GHz GPU, but I don't recall. Can we get it within a power draw similar to that of the OG Switch (11-15 watts docked), accounting for a 1.3 GHz GPU and 1.5-2 GHz CPU? Who knows...

I think a 3 TFLOPS Drake would be pretty close/equal to PS4 Pro without DLSS, and is realistic on 4nm Samsung or 6nm TSMC. But it's really hard to say for sure on matching resolution (since that's what the PS4 Pro is really all about), when Switch 2/Drake will only have 102 GB/s available (133 GB/s if we're absolutely lucky) vs. the PS4 Pro's 220 GB/s. Switch 2/Drake's Nvidia Ampere hardware is more bandwidth-efficient than AMD's, and the cache might be bigger than the PS4's, which should help.

But man.. the Switch pulled off some crazy ports at 25 GB/s vs the Xbone and PS4. Its checkerboard rendering helped. Would have been interesting to see how much the Switch would have narrowed the gap if it had the TX2's 50 GB/s.. but oh well.. At least Switch 2 will narrow the bandwidth gap vs the PS4 Pro by a lot, to only 45-50% less.

Can't wait for DF to release a PS4 Pro vs One X vs XSS vs Switch 2 showdown.


What? I thought it was the other way around.

I thought A78 single core performance is equal to or more efficient per clock than SD's/current gen AMD Zen 2. It's in multi-threading, at least, that AMD should have an advantage over ARM though.

But if we look at the clockspeeds that you linked... Correct me if I'm wrong... Zen 2 having 830 for single core score and 3666 multi core vs AGX Orin's 763 and 7193. Single core is only a 10% difference.
I don't understand why they'd cap Drake at 11~15W docked. It made sense on Switch, because more than that would cause thermal throttling, and the target resolution difference between handheld and docked was 50% per axis (720p ~ 1080p).
With Drake, we know that the target on the dock would most likely be 4K, and the difference per axis can be 2x or 3x (depending on whether the screen is 1080p or 720p). Along with the dock with LAN having the option of delivering more than 15W, that makes room for us to see a GPU that may reach a clock above 1GHz even if it were made on Samsung 8nm.
 
Gotcha. At this point in development, could they be adding new / modifying existing instructions and updating Drake's ISA version? Considering the files we're working with date back to early last year at the latest. That is, if changing the ISA is necessary for translation. I'm assuming they'd go with translation over recompilation.
It seems unlikely to me, since I think that would have been done during hardware design, well before the point where SM version 8.8 started appearing in the driver software tree (which is what got leaked), and certainly before the point where they were testing on physical chips (which might have been last April). But I could be wrong.
 
What's up with all these Switch 2 videos on the same day from the Spawncast?




Tinfoil hat time:

Things are heating up behind the scenes and people have got wind of it.

This is it people, this is how September 2016 was, come on February 2023 reveal! I want to believe!
 
Worst case may not be no BC, but very limited BC, like only for Nintendo games with a small upgrade upcharge, leaving 3rd parties fuming because they have to spend money themselves to recompile their games for BC and having to deal with consumer backlash if they charge for it, and consumers being constantly reminded most of their library no longer works.
 
You can't run a shader raw. It has to be compiled. PC games just don't ship with them already compiled. When you run a PC game, your machine actually compiles them for your specific hardware.

If the game waits till the last second to do this, then what happens is your game stops entirely while the shader compile happens, then resumes. On modern games, especially built with UE4, there are lots of shaders. It's a serious problem in PC space.

What I suggest (in a post I just wrote) is that they'll do something called "transpiling". Compiling a shader is very processor intensive, partially because compilers do lots of optimizations. But when two arches are very similar, instead of decompiling and recompiling you just go one instruction at a time and go "that's the same, that's the same, that's the same, that's gone." And when you reach something the new arch does differently, you just slap in a little precompiled blob that replaces it and move on.

Transpiling can be very fast, which can eliminate most/all of the stutter that compilation gives you. It produces shaders that are slower than the originals, but in the case of Drake, we are guaranteeing that the target hardware is faster than the original hardware the shader was written for, so that's not generally a problem.



A dedicated hardware block could help, but there is no indication the FDE was designed to do that


They also could do that, but as LiC has discovered they have not. Switch is a very size and power constrained machine, adding in a bunch of extra back compat silicon probably didn't make sense, if they had a software based solution.

I have a bad habit of diving into the technical details and then people hear "this problem is very hard" but I don't think BC is like this crazy mountain Nintendo can't climb. It's not a radical arch change, they've got a dedicated emulation team, and they're working with the same hardware partner. They're about as set up for success as they could be
Gotcha. So if I’m understanding correctly, method one could allow for “100%” BC, but not equal performance BC across the board. And then if a game was to have native-like or at least equal to NSW performance, patches would be required. For example, Smash as a fighting game really needs a consistent 60 fps, so a game like that would be more optimized for Drake.
 
My two cents on the BC issue is that it will use, as OldPuck clarified, a kind of emulator. Virtualization. Whatever you want to call it. Shader transpiling to deal with, well, shaders, and as much as can be run on the hardware itself running on the hardware.

I'd describe it as likely being something between an emulator and a compatibility layer. Not a whole emulator, not just a compatibility layer.

The power increase and the similarities in architecture we will see with this new device mean I'm just not worried about BC. Not only is it possible, but it will probably perform fine. Some games might even be allowed to perform better without patches, depending on the solution. Some may need specific updates to run at all. Some may never work. But it's not something I'm worried about.
 
My two cents on the BC issue is that it will use, as OldPuck clarified, a kind of emulator. Virtualization. Whatever you want to call it. Shader transpiling to deal with, well, shaders, and as much as can be run on the hardware itself running on the hardware.

I'd describe it as likely being something between an emulator and a compatibility layer. Not a whole emulator, not just a compatibility layer.

The power increase and the similarities in architecture we will see with this new device mean I'm just not worried about BC. Not only is it possible, but it will probably perform fine. Some games might even be allowed to perform better without patches, depending on the solution. Some may need specific updates to run at all. Some may never work. But it's not something I'm worried about.
The 'some may never work' part concerns me. Reminds me of the 1980s microcomputer scene, where a generational upgrade yielded broad but not perfect BC and the manufacturer had to admit they didn't even know which games weren't compatible. It's a PR nightmare.

If Nintendo's going this route of non 100% BC, they better have a list ready and a generous funding mechanism to get partners to patch up their games where needed.

The only upside here is that Nintendo has much more control over what releases on their platform than Commodore, which ran open platforms that any Tom, Dick, and Harry dev could develop for, so Nintendo has a library of every Switch game they could test. The question is if they will bother with it.
 
If I'm not mistaken, drake would perform similarly in multicore compared to the steam deck but lose considerably in single core perf. under geekbench 5.
My point of reference is this benchmark for the AGX Orin, which, I know, is a 12-core Orin device instead of the 8 on Drake, and running at a 2GHz clock even. But afaik, clock speed doesn't scale linearly (far from it) on Geekbench; however, core count on multicore scores kinda does (again, afaik).
Here is a benchmark putting an actual, internal Valve geekbench test up against Orin: https://browser.geekbench.com/v5/cpu/compare/18647313?baseline=12800007

You'll see that in single core performance they're about the same. Orin is clocked at 2.2 GHz, and Steam Deck at 2.8. You can see that the A78 is very similar, per clock, to the Zen 2 CPU. However, no one expects the clock to be set that high on Drake. My guess would be closer to 1.2 GHz. So yes, single core would fall behind, purely because of clock.

If you look at the multicore performance, you'll see that Orin comes out at about 2x the Steam Deck. Orin has 12 cores, and Steam Deck has 8, so you expect Orin to come out ahead, but not by 2x. The A78's multicore perf holds up a little better than the Zen 2's. Steam Deck will still beat Drake's multi-core perf, just because Drake's clocks are lower, but it won't be as much of a gap as single core performance.

Additionally, Steam Deck has to spend a significant amount of CPU perf on its OS. Proton is pretty efficient but Drake should have more CPU resources available to games.
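
For what it's worth, here's the rough arithmetic, using the Geekbench 5 numbers quoted in the thread. The scaling assumptions (single core ~ clock, multicore ~ clock x core count) are simplifications, and the 1.2 GHz Drake clock is just my guess from above.

```python
# Geekbench 5 numbers from the thread: Orin AGX 763 single / 7193 multi at
# 2.2 GHz, 12 cores; Steam Deck (Zen 2) 830 / 3666 for comparison.
# Drake clock and linear scaling are assumptions; linear multicore scaling is
# pessimistic per the discussion above (A78 multicore holds up better than that).

ORIN_SINGLE, ORIN_MULTI = 763, 7193
ORIN_CLOCK, ORIN_CORES = 2.2, 12
DRAKE_CLOCK, DRAKE_CORES = 1.2, 8        # assumed, not confirmed

clock_ratio = DRAKE_CLOCK / ORIN_CLOCK
core_ratio = DRAKE_CORES / ORIN_CORES

print(f"Drake single-core estimate: ~{ORIN_SINGLE * clock_ratio:.0f}")                  # ~416
print(f"Drake multi-core estimate:  ~{ORIN_MULTI * clock_ratio * core_ratio:.0f}")      # ~2616
```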

So I'm assuming at worst a ~33.3% reduction in multicore from the AGX Orin score (~4000ish points end result), followed by another drop (related to the clock speed Drake would run at), which would put it in around the same ballpark as the Steam Deck's CPU score (again, in multicore).

But considering this is what the current switch performs like on geekbench 5, and the arguments made ITT in the past, I expect at the very least, a 3x increase on that single core score under drake.
That 3x number doesn't hold up, I think. A78 probably has 3x or more increase... iso power. Meaning if you put the same amount of electricity through it, and you made it on the node ARM wants you to build on. That would, of course, increase clock speed.

Here is a TX1 bench up against Orin:


You can see the 3x single core boost, that Orin obviously has higher clock speeds - 2.2 to TX1's 1.7 - but it isn't outrageous. I think a 1.5-2x gain if the clock speeds are the same is about right.

You'll also note that Orin wipes the floor with TX1 in multicore, far more than its 12 threads would suggest. With Drake's 2x cores, an 8x increase in multicore perf is in the cards.

But going back to the whole PS4 Pro vs drake conversation (CPU-wise), I still think drake wouldn't lose by much versus the PS4 Pro's CPU.
It will wipe the floor with it. Here is a late era Jaguar CPU up against Orin (I believe the version of Jaguar in the One X and PS4 Pro)


Very similar clocks, Orin is 2.68x in single threaded perf. While I expect Drake clocks to be well below the 2.1 GHz of the PS4 Pro, the A78C is likely well ahead. Frankly, the TX1 is already in that ballpark against the PS4; it's just the PS4's number of cores that gives it a fighting chance.

The GPU talk is kinda pointless when it comes down to TFLOPs considering drake has half the bandwidth to work with compared to the PS4 Pro and series S.
Yes and no.

PS4 Pro was built on AMD's GCN architecture. GCN - as well as earlier Nvidia arches - rendered the whole scene at once. That required a giant chunk of bandwidth which got used in one big spike, then quiesced while rendering happened. Both Maxwell and Ampere use tile based rendering, which cuts the scene up into chunks and renders each chunk one at a time. This uses memory bandwidth over a longer period of frame time, but in smaller amounts each time.

The PS4 Pro's giant amount of memory bandwidth isn't a win relative to Drake for that reason. If you look at Drake's bandwidth compared to other Ampere video cards it's about on par (thanks to @Look over there for helping me understand all this).

The Series S is built on RDNA2 (1.5 if you're being cheeky). RDNA uses tile based rendering, so in that case the Series S bandwidth wins are more real. But in the case of the 9th gen consoles, their version of RDNA doesn't use the infinity cache, and they likely need that giant amount of bandwidth to make up for it. So while Series S does have huge bandwidth wins, they aren't as massive as they appear to be.

I think you're right that at 4 TFLOPS, Drake is way more bandwidth constrained than Series S, and its bandwidth is better suited for a 3 TFLOPS device. But compared to the PS4 Pro, Drake is in much better shape bandwidth-wise, at almost any clock.

(More accurately, Infinity Cache makes up for the lack of bandwidth. The console makers made the right call here)
 
The 'some may never work' part concerns me. Reminds me of the 1980s microcomputer scene, where a generational upgrade yielded broad but not perfect BC and the manufacturer had to admit they didn't even know which games weren't compatible. It's a PR nightmare.

If Nintendo's going this route of non 100% BC, they better have a list ready and a generous funding mechanism to get partners to patch up their games where needed.

The only upside here is that Nintendo has much more control over what releases on their platform than Commodore, which ran open platforms that any Tom, Dick, and Harry dev could develop for, so Nintendo has a library of every Switch game they could test. The question is if they will bother with it.
Some PS4 games don't run on PS5. It's just part and parcel of changing hardware. Some games, some systems, some configurations will always be difficult to emulate. I don't think missing out on, say, Snake Pass and Chuchyba's Challenge 2 will make or break PR on Drake, and I'm very confident that any titles of any amount of notoriety will be checked and double checked to make sure they run. I'd say many first party titles could receive patches to improve BC, or even push performance above the original, as long as they're evergreen. I know this may seem out of left field, but I'm fairly certain a Drake patch for Mario Kart 8 Deluxe will drop with 4K in tow alongside one of its DLC waves. Call it a hunch.
 
I'm gonna give you the long ass answer to your question, as Dakhil already gave you the concise (correct) one.


The shader situation is well understood, and the Lapsus$ hack heavily suggests that they are both right about the shader situation.


This would be a book. Seriously, this is a get-your-PhD-in-compsci level question. Those systems use multiple BC technologies. But the one you probably care about is how they're compatible with PS4 shaders, and the answer is that they didn't break backwards compatibility when they designed the newer GPUs. They kept the hardware from the PS4 era to execute old shader instructions that modern games don't need anymore.


Rosetta 2 is a binary translator which monkey patches API calls. Here are two key points when talking about PC backwards compat:

PC apps never talk to the GPU directly, console games do. If PCs let random apps touch the GPU, not only would you be infected with viruses that take over your screen, but every random app could cause the entire system to crash. Minimize a calculator window and forget about it? Well, if it crashes, your whole system reboots! Consoles are a different world, so console games get raw hardware access, for extra performance.

This is why Rosetta 2 doesn't have to emulate a GPU at all.

PC games don't have precompiled shaders, console games do: When you change a tiny detail about a graphics card, you have to recompile your shaders. There is no way for PC games to do this for every combination that already exists, and even if they did, they wouldn't support new hardware. So PC games don't ship precompiled shaders, they ship the raw shader code, and your PC compiles it when it needs it.

Consoles are a single target, so it is feasible to ship precompiled shaders. This has big performance benefits. If you've heard of #stutterstruggle this is what they're talking about, and why consoles don't have it. But it is also why console games don't "just work" without changes on different GPUs



Rosetta runs on the CPU. It reads the code from an old program, dynamically retranslating as it goes, and then feeds the retranslated instructions back out to the CPU. In essence, the CPU doesn't even see the emulated program, it only sees Rosetta. Meanwhile, Rosetta only has to read tiny chunks of the emulated program at a time, very quickly recompile it, and send it on in tiny, fast bursts. Lots of CPU code is repetitive, going around in loops, so many times, Rosetta doesn't even have to do anything, because it has already recompiled the code.

GPUs don't work that way. There is no way to run something like Rosetta on the GPU, and even if it could, it wouldn't matter, because the game runs on the CPU. Shaders are GPU programs that get squirted over to the GPU in individual blobs, and are usually run immediately.

The way a Rosetta for Shaders would work, is it would run on the CPU, load up games into a kind of container where the game couldn't directly access the GPU. It would then intercept shaders, recompile them for the new architecture, and then squirt them over to the GPU itself.

This could work, emulators do it all the time. But it would introduce #stutterstruggle to console games. Emulators try to work around this stutter, but 1) they are not completely successful, and 2) it requires huge computing resources that Drake won't have.



Sorta? But it doesn't matter much, only the shaders are a tricky problem.


No, but again, only the shaders are tricky.

Consoles talk directly to the GPU. That means that each game basically builds in its own driver. Drivers need to know stuff like where physically on the chip parts of the GPU are, exactly how many of component X or Y there are, or the magic value of Q that only works on that chip.

Fortunately, even if every single one of these is different between TX1 and Drake, you don't need a complex emulator, you just need a remapper. If a Switch game wants to send command 0x800DBA14 to interrupt 0xD01E2A3D, then all Drake has to do is catch it and look up those locations.

"Hmm, command 0x800DBA14 on TX1 is 'add' and interrupt 0xD01E2A3D is the first SM. My first SM is at 0xD02E3A0D, and my add command is 0x800DBA11. Let me just substitute one for the other and it will Just Work."

Ampere and Maxwell are similar enough for this sort of technique to work. Driver emulation is simple and fast.

It's the shaders that are tricky, for the reasons I talked about before. Shaders are complicated, they're basically whole programs. Ampere might add an instruction that does what took 5 instructions before, while deleting 4 of those old useless instructions, which might take 2 new instructions to fake. Doing that translation is complex and slow.


You can't really partially emulate anything the way you mean, but you can essentially "pass through" the parts that are identical. Your emulator always has to sit in the middle of the whole thing, and you pay a performance cost for that. But when two systems are very very similar, your emulator only has to work hard at the differences.

But again, shader recompilation isn't emulation in that way. That is why it is tricky.


SM3DAS could do all these recompilation steps in advance, on a developer's powerful workstation. Drake has to do it in real time, using nothing but a fancy tablet.

SM3DAS's team had access to the raw source code of the original games, and could patch it if they wanted. Drake has to work on every Switch game without ever seeing the source code, and without ever changing it.

SM3DAS only had to work with 3 games, and doesn't have to emulate any features those games didn't use. Drake has to emulate every bit of the Switch hardware that any game uses.

SM3DAS's emulation only needed to reach the speed of the Wii, to run a 16 year old game. Drake's emulation needs to reach the power of the Switch, to run games that haven't released yet.
I understand how the mistake could be made, because the reporting around this is somewhat ambiguous, but Switch games don't actually talk directly to the GPU. That's the job of NV services, which is roughly analogous to the kernel mode driver you'd find on other operating systems.

When people say Switch games include a GPU driver, I'm pretty sure that's only referring to the user space portions of the driver that don't need special privileges to run. That still is probably not an ideal boundary for BC concerns, but it's considerably different from having to deal with raw hardware interaction. I imagine the interface of the service probably isn't going to change too much (in a breaking way) between the consoles in the first place, and that service seems like the most probable candidate for where the responsibility of runtime translation will fall.
 
Some PS4 games don't run on PS5. It's just part and parcel of changing hardware. Some games, some systems, some configurations will always be difficult to emulate. I don't think missing out on, say, Snake Pass and Chuchyba's Challenge 2 will make or break PR on Drake, and I'm very confident that any titles of any amount of notoriety will be checked and double checked to make sure they run. I'd say many first party titles could receive patches to improve BC, or even push performance above the original, as long as they're evergreen. I know this may seem out of left field, but I'm fairly certain a Drake patch for Mario Kart 8 Deluxe will drop with 4K in tow alongside one of its DLC waves. Call it a hunch.
I agree with you about MK8, my feeling is MK9 is years away and Nintendo will lead with MK8 60fps 4K for their next platform, with a free or very inexpensive patch for existing MK8 owners to give a sense of continuity on the ecosystem.

My bigger concern is how Nintendo will interface with 3rd parties on these. There are a lot of good games, impossible ports, and widely played games that may not rank highly in our bubble (2K's many games come to mind; Civ, Borderlands, and their NBA games likely sold millions). Will they get a cash incentive to patch those games should they not work on the new platform? My hope is there is some program to help 3rd parties patch.
 
There’s a solution for those that want 100% flawless BC:

Keep your Switch.

So keep both the Switch and Switch 2 plugged into my TV? Or continually swap them out every single time I want to play an older game?

And when traveling I'll have to bring along both systems?

And what happens when it inevitably breaks down and Nintendo stops manufacturing and selling them? Am I to just accept that all of the digital titles on my Nintendo account are now off limits for me unless I somehow manage to find a used Switch console off Ebay?

Yeah, none of that is viable. Persistent libraries are the name of the game now. It's unacceptable in this day and age to force consumers to start over and abandon their entire library.
 
Yes, we are. I just have to take the 720p output and triple its size and take the 1080p output and double its size to make them the same size for side-by-side comparison without introducing other scaling artifacts.
Feels like it also introduces more opportunities for artifacting, which would be my main concern. But I guess my original argument was not properly apples to apples.
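For anyone wanting to reproduce that kind of comparison, integer nearest-neighbour scaling is the way to avoid introducing new artifacts: 720p at 3x and 1080p at 2x both land exactly on 2160p. Quick Pillow sketch, with placeholder filenames:

from PIL import Image

# Scale both captures to 2160p with nearest-neighbour so no new pixels are invented:
# 1280x720 * 3 and 1920x1080 * 2 both land on 3840x2160.
img_720 = Image.open("capture_720p.png")
img_1080 = Image.open("capture_1080p.png")

upscaled_720 = img_720.resize((img_720.width * 3, img_720.height * 3), Image.NEAREST)
upscaled_1080 = img_1080.resize((img_1080.width * 2, img_1080.height * 2), Image.NEAREST)

upscaled_720.save("capture_720p_2160p.png")
upscaled_1080.save("capture_1080p_2160p.png")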
 
I understand how the mistake could be made, because the reporting around this is somewhat ambiguous, but Switch games don't actually talk directly to the GPU. That's the job of NV services, which is roughly analogous to the kernel mode driver you'd find on other operating systems.

When people say Switch games include a GPU driver, I'm pretty sure that's only referring to the user space portions of the driver that don't need special privileges to run.
I’m showing my Linux biases here, where GPU ioctls historically were little more than pass-through for X and did need special perms. I believe the graphics servers on Windows and Mac run in kernel mode for perf, but are separate from the driver?

Regardless, you are right, and this does look like a decent chunk of the driver is in the OS.


That still is probably not an ideal boundary for BC concerns, but it's considerably different from having to deal with raw hardware interaction. I imagine the interface of the service probably isn't going to change too much (in a breaking way) between the consoles in the first place, and that service seems like the most probable candidate for where the responsibility of runtime translation will fall.
 
I agree with you about MK8, my feeling is MK9 is years away and Nintendo will lead with MK8 60fps 4K for their next platform, with a free or very inexpensive patch for existing MK8 owners to give a sense of continuity on the ecosystem.

My bigger concern is how Nintendo will interface with 3rd parties on these. There are a lot of good games, impossible ports, and widely played games that may not rank highly in our bubble (2K's many games come to mind; Civ, Borderlands, and their NBA games likely sold millions). Will they get a cash incentive to patch those games should they not work on the new platform? My hope is there is some program to help 3rd parties patch.
If the 4K patches for first party games aren't free you have permission to call me a hack and a buffoon. They're going to be free.

Evergreens sell Switches. Without free patches, they can't sell Drakes nearly as effectively.

There isn't a world where Nintendo says "Plays select Nintendo Switch games in 4K" and slaps "After $20 graphical improvement DLC" in small print under it.

If they want a healthy ecosystem across the generations and lots of Switch players to upgrade fast, those patches will not cost a dime. And they want those things. So these updates will not cost a dime.

Sorry to be so dogmatic about it, it's just something I have near 100% certainty in. Used to work PR for a... Let's call them a controversial political organization, and I like to think I know a potential PR disaster when I see one. Charging for 4K patches for Drake would be one, and that's something they will want to avoid.
 
Yeah, none of that is viable. Persistent libraries are the name of the game now. It's unacceptable in this day and age to force consumers to start over and abandon their entire library.
It’s kind of unfortunate that this expectation is setting in at the exact moment the technology to make it possible is dying off.
 
So keep both the Switch and Switch 2 plugged into my TV? Or continually swap them out every single time I want to play an older game?

And when traveling I'll have to bring along both systems?

And what happens when it inevitably breaks down and Nintendo stops manufacturing and selling them? Am I to just accept that all of the digital titles on my Nintendo account are now off limits for me unless I somehow manage to find a used Switch console off Ebay?

Yeah, none of that is viable. Persistent libraries are the name of the game now. It's unacceptable in this day and age to force consumers to start over and abandon their entire library.

I kind of thought the point being made was 99%+ for Drake. If you need true 100% you need to just keep your old machine
 
I kind of thought the point being made was 99%+ for Drake. If you need true 100% you need to just keep your old machine
That has almost always been true for backwards compatibility of any kind, even hardware. This isn't a massively different situation, really, just the way it's achieved is.
 
Even with the 'perfect' hardware back compat of past consoles, there were still games that slipped through the cracks.
Certain GB/C titles that exploit hardware bugs on the original Game Boy to run, or that use the GBC's IR sensor, refuse to boot on a GBA. Games that needed the GBA slot on the DS/Lite (like Guitar Hero) didn't work on DSi or 3DS.
There were some games on Wii that needed the GameCube port (like, very few fitness games), that didn't work on Wii U.
Nintendo's official wording on their sites even says 'Nearly all Wii games can be enjoyed on our newest home console'.

Of course, in most of these instances the issue wasn't really that new hardware couldn't 'run' the game, but rather a missing input. The point is - it was already 99[.999....]%, and Nintendo's just going to deal with the remaining percent with careful wording and testing after the fact.
 
I’m showing my Linux biases here, where GPU ioctls historically were little more than pass-through for X and did need special perms. I believe the graphics servers on Windows and Mac run in kernel mode for perf, but are separate from the driver?

Regardless, you are right, and this does look like a decent chunk of the driver is in the OS.
I don't know a whole lot about GPU drivers, but I think roughly this separation tends to exist on all platforms. Even on Linux, I'm pretty sure the kernel module is what responds to those ioctls (which the Switch driver amusingly seems to imitate) and not the hardware directly.
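For the curious, this is roughly what that boundary looks like from user space on Linux: open a device node, issue an opaque ioctl, and let the kernel module do the actual hardware poking. The request code below is made up for illustration, not a real nvgpu ioctl, and the device path is just the Tegra-style name:

import fcntl, os

FAKE_GET_GPU_INFO = 0xC0104701  # made-up request code, purely illustrative

fd = os.open("/dev/nvhost-ctrl-gpu", os.O_RDWR)  # only the kernel driver behind this node touches the GPU
buf = bytearray(16)                              # the driver would fill this in
fcntl.ioctl(fd, FAKE_GET_GPU_INFO, buf)
os.close(fd)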
 