
StarTopic Future Nintendo Hardware & Technology Speculation & Discussion |ST| (Read the staff posts before commenting!)

Hopefully a very big one. The bigger the better for more bandwidth.

Hehehehehe
 
Someone quoted it at Nikki on twitter, who seemed to confirm. But I assumed that was from someone with access to a clone of the repo and could see file modification dates (all I've seen is a repo export, which has dates of exfiltration, all on the 21st, 2 days before Nvidia found out about the leak, and a week before the rest of us).

2019 modification dates would actually match with a 2020 ship date for dev tools (matching with leaks of dev kits) if NVN2 was branched from NVN about a year beforehand.
Wasn’t it that there are dates from 2019? Not that it started in 2019?
The repo export is all there is. There are files with copyright dates from 2012 (at least) through 2022. NVN2 files aren't easily placed because some started as NVN1 files, so they go as far back as 2014/2015. 2019 may or may not be the origin date for some of the files, but it's not a universal constant. Throwing it around just causes more confusion than it solves.
 
Well, we know the GPU config now outside of clock speeds, and the CPU is at least known down to the cores used, so I'd say:

Portable: Series S performance at 720p after DLSS
Docked: Rivaling the PS5 after DLSS?
Getting a little ahead of yourself there. DLSS is not a magic bullet, I'd really caution treating it like a simple multiplier. Especially for handheld mode.

All we can really know at the moment is that the API (in its current form) sees 12SMs which gives us minimum and maximum bounds to work with.
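To put rough numbers on those bounds, the standard FLOPs arithmetic is simple enough to sketch (assuming Ampere's 128 FP32 cores per SM and counting an FMA as 2 FLOPs; the clocks below are illustrative placeholders, not leaks):

```python
def ampere_fp32_tflops(sms: int, clock_ghz: float, cores_per_sm: int = 128) -> float:
    """Peak FP32 TFLOPs = SMs * cores per SM * 2 FLOPs (FMA) * clock in GHz / 1000."""
    return sms * cores_per_sm * 2 * clock_ghz / 1000

# 12 SMs across a plausible clock window (placeholder clocks, purely illustrative):
low = ampere_fp32_tflops(12, 0.46)   # Switch-like 460 MHz -> ~1.4 TFLOPs
high = ampere_fp32_tflops(12, 1.0)   # Orin's 1 GHz ceiling -> ~3.1 TFLOPs
```

Keep in mind this is peak-rate math only; real performance depends on bandwidth, sustained clocks, and thermals.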
 
Getting a little ahead of yourself there. DLSS is not a magic bullet, I'd really caution treating it like a simple multiplier. Especially for handheld mode.

All we can really know at the moment is that the API (in its current form) sees 12SMs which gives us minimum and maximum bounds to work with.

But the info we do have now…regardless of how/if the tensor cores and rt cores are utilized…definitely implies native power that would be similar to ps4 power portable and ps4 pro power docked.

No?
 
But the info we do have now…regardless of how/if the tensor cores and rt cores are utilized…definitely implies native power that would be similar to ps4 power portable and ps4 pro power docked.

No?
Yeah when native.

So even if DLSS isn't a 2x multiplier in portable mode (it should be at least a 2x boost when docked), it should still boost portable performance by at least 1.5x, putting it a tad behind the PS4 Pro/Series S GPUs.
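Written out, the flat-multiplier logic above is just multiplication; a tiny sketch (the native TFLOPs figures are placeholder guesses, and treating DLSS as a flat multiplier is itself the simplification being debated here):

```python
def effective_perf(native_tflops: float, dlss_multiplier: float) -> float:
    """'Effective' performance if DLSS is (naively) treated as a flat multiplier."""
    return native_tflops * dlss_multiplier

# Placeholder native figures, purely for illustration:
portable = effective_perf(1.5, 1.5)   # 2.25 "effective" TFLOPs portable
docked = effective_perf(3.0, 2.0)     # 6.0 "effective" TFLOPs docked
```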
 
Yeah when native.

So even if DLSS isn't a 2x multiplier in portable mode (it should be at least a 2x boost when docked), it should still boost portable performance by at least 1.5x, putting it a tad behind the PS4 Pro/Series S GPUs.

Ah so you are saying he’s over selling the DLSS comparisons, got it.
 
But the info we do have now…regardless of how/if the tensor cores and rt cores are utilized…definitely implies native power that would be similar to ps4 power portable and ps4 pro power docked.

No?
Assuming it uses the same or higher clocks than the original Switch I think that's a fair comparison. Flops wise it won't match up quite the same but efficiency gains from the newer architecture should make up for that.
 
Getting a little ahead of yourself there. DLSS is not a magic bullet, I'd really caution treating it like a simple multiplier. Especially for handheld mode.

All we can really know at the moment is that the API (in its current form) sees 12SMs which gives us minimum and maximum bounds to work with.

Discussions about overall image quality are probably the better topic to touch on, more so than just raw specs.
I know early on in this thread many of us were placing this new Switch device much closer to Microsoft's Series S, and these latest leaks definitely corroborate some of that theory a bit more. So I fully expect to see many future game comparisons by the likes of Digital Foundry between the two devices...

DLSS could allow this Drake "Knight Wing Bling" Switch to run many 3rd party cross platform games with better performance, superior image quality and improved RT capabilities over the Series S.
 
Ah so you are saying he’s over selling the DLSS comparisons, got it.
Misintended reply?


I am saying that, assuming a 2x boost in both, it would reach that level.
But even if portable-mode DLSS can't boost effective performance as high, it still would not be weak in any way in portable mode.
 
But the info we do have now…regardless of how/if the tensor cores and rt cores are utilized…definitely implies native power that would be similar to ps4 power portable and ps4 pro power docked.

No?

This seems like a jump over the ~XBO handheld, ~PS4 docked estimates I'd been reading earlier (before members here were digging into the source files). This is based on the 12 SMs in the source code, right? And I'm assuming that's a pretty safe bet, since it's an actual reference in the NVN2 API.
 
How will this perform?

It’s hard to say. We don’t have all the details and even if we did there are software decisions that matter

But how will it perform though?
We still need to see software running to be sure - if devs don't make exclusives that really take advantage of the new machine's power, it could still be held back by the old Switch

GRAPHICS GRAPHICS GRAPHICS HOW WILL IT DO??????

Okay okay. How is this?

While docked and when running games that have been optimized for its power it will feel like maybe the least performant member of the current generation - instead of feeling like an older generation machine that happens to play Mario and fit in your pocket.

This is Nintendo closing - but not eliminating - the gap between it and its competitors that has existed since the Wii.
 
Really? That sounds insane! The early rumors were that it's around Xbone in terms of power. Now it's above PS4!?!
Probably more accurate to say that there were no rumors, it was informed speculation from folks in these threads based on what was going on with the Tegra line, and tidbits of chip leaks from Nvidia leakers. That speculation always came from the perspective that Nintendo would be somewhat conservative, and would stick to a similar power profile and size as the original Switch. What's come out from this Nvidia hack has changed that mindset since the chip appears to have significantly more cores than what people were thinking it would have.
 
How will this perform?

It’s hard to say. We don’t have all the details and even if we did there are software decisions that matter

But how will it perform though?
We still need to see software running to be sure - if devs don't make exclusives that really take advantage of the new machine's power, it could still be held back by the old Switch

GRAPHICS GRAPHICS GRAPHICS HOW WILL IT DO??????

Okay okay. How is this?

While docked and when running games that have been optimized for its power it will feel like maybe the least performant member of the current generation - instead of feeling like an older generation machine that happens to play Mario and fit in your pocket.

This is Nintendo closing - but not eliminating - the gap between it and its competitors that has existed since the Wii.
Yeah, even though DLSS when docked will shoot past the Series S, it will still fall behind the PS5.

At best, assuming a developer maximizes the benefits of the Tensor Cores, DLSS, and all that, and drew every ounce of power out of Drake, it could likely match, or even shoot a little bit above, the PS5 (assuming the PS5 is running at native resolution here).

But it would require a heck of a lot of effort, and it sort of exposes the big thing that makes Drake hard to predict beyond an average 2x multiplier when docked: it's all software optimization.

What a dev does to take advantage of DLSS, when given the environment to actually optimize around it, is an unknown quantity to us; all we have are the average unoptimized DLSS numbers from PC, which is where the 2x average comes from.

But something else to consider is the steep (roughly quadratic) cost of increasing resolution, which DLSS dodges but which the PS5/Series S|X have to worry about for the most part, unless you are using UE5's TSR.
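On that last point: the cost DLSS dodges grows with pixel count, which grows quadratically with linear resolution. A quick back-of-the-envelope helper (pure arithmetic, no hardware assumptions):

```python
def relative_pixel_cost(out_w: int, out_h: int, base_w: int = 1920, base_h: int = 1080) -> float:
    """Relative shading cost vs a base resolution, assuming cost scales ~linearly with pixel count."""
    return (out_w * out_h) / (base_w * base_h)

# Native 4K shades ~4x the pixels of 1080p; 1440p shades ~1.78x.
four_k = relative_pixel_cost(3840, 2160)   # 4.0
qhd = relative_pixel_cost(2560, 1440)      # ~1.78
```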
 
Yeah, even though DLSS when docked will shoot past the Series S, it will still fall behind the PS5.

At best, assuming a developer maximizes the benefits of the Tensor Cores, DLSS, and all that, and drew every ounce of power out of Drake, it could likely match, or even shoot a little bit above, the PS5 (assuming the PS5 is running at native resolution here).

But it would require a heck of a lot of effort, and it sort of exposes the big thing that makes Drake hard to predict beyond an average 2x multiplier when docked: it's all software optimization.

What a dev does to take advantage of DLSS, when given the environment to actually optimize around it, is an unknown quantity to us; all we have are the average unoptimized DLSS numbers from PC, which is where the 2x average comes from.

But something else to consider is the steep (roughly quadratic) cost of increasing resolution, which DLSS dodges but which the PS5/Series S|X have to worry about for the most part, unless you are using UE5's TSR.

Would there be any reason NOT to use DLSS if the system has it built-in? I get that for PC games, not everyone with a PC has an Nvidia RTX card, but with the Super Switch, we know it has the functionality built-in. Digital Foundry has shown that even upscaling from 1080p, DLSS 4K can at times look better than vanilla 4K. That was from DLSS 2.0. Dare I say 60fps with 1080p upscaled to 4K.
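For reference on the "1080p upscaled to 4K" case: that's DLSS Performance mode, which renders at half the output resolution per axis. A small sketch using the commonly documented DLSS 2.x per-axis scale factors:

```python
# Per-axis render-scale factors commonly cited for DLSS 2.x presets
DLSS_SCALE = {
    "quality": 2 / 3,
    "balanced": 0.58,
    "performance": 0.5,
    "ultra_performance": 1 / 3,
}

def internal_res(out_w: int, out_h: int, mode: str) -> tuple:
    """Internal render resolution that DLSS upscales from, for a given output."""
    s = DLSS_SCALE[mode]
    return round(out_w * s), round(out_h * s)

# 4K output in Performance mode renders internally at 1080p:
print(internal_res(3840, 2160, "performance"))  # (1920, 1080)
```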
 
Misintended reply?


I am saying that, assuming a 2x boost in both, it would reach that level.
But even if portable-mode DLSS can't boost effective performance as high, it still would not be weak in any way in portable mode.

Hey I agree with you! I was just trying to interpret what Skittzo was saying

[edit: oh I see, yea I replied to the wrong poster, my bad :/]
 
Theoretically the maximum is about 3TF, if we assume the hard upper limit of Orin GPU clocks (1GHz) applies to Drake.
Watch Nintendo match 76.8% of that: 2.3 TFLOPs. "Please understand!"

5-6 TFLOPs for 39 watts for the whole system sounds way too good to be true, especially considering the Steam Deck runs up to 30 watts for a 1.6 TFLOPs GPU max and maybe 3GHz for its 4-core CPU 🤔
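Inverting the usual FLOPs formula shows what clock a 2.3 TFLOPs figure would imply on 12 Ampere SMs (assuming 128 FP32 cores per SM with an FMA counted as 2 FLOPs; this is arithmetic, not a leak):

```python
def clock_for_tflops(target_tflops: float, sms: int = 12, cores_per_sm: int = 128) -> float:
    """Clock in GHz needed to reach a target peak FP32 TFLOPs."""
    return target_tflops * 1000 / (sms * cores_per_sm * 2)

# 2.3 TFLOPs on 12 SMs implies roughly a 750 MHz clock;
# the ~3 TFLOPs ceiling corresponds to the 1 GHz Orin limit.
print(round(clock_for_tflops(2.3), 3))  # 0.749
```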

Ah, I didn't know that, thanks. I doubt Nintendo would have quite as much flexibility with a smaller GPU and tighter power limits, but still very interesting nonetheless.



It's hard to say. I don't think compression has much effect at this point, Nvidia already had framebuffer compression technology back with Maxwell, and there's only so far you can go with compression. The bigger question is probably how much of a difference the bigger cache makes. All of Nvidia's recent GPU architectures (since Maxwell, I believe) use tile-based renderers, where the idea is that the tile being rendered is stored in cache, and therefore the most intensive memory accesses are kept to the cache, without hitting actual memory. However, I'm not really in a position to speculate on how much of an impact the larger cache would have. Some, certainly, but it's impossible to say how much without careful profiling of Ampere's memory access patterns, which we don't have.



Thanks for this. I wasn't really sure on that part, so it's very interesting to read more on it.



If it is the case that the new model shares a dock with the OLED model, and if the ability of the OLED dock to deliver 39W is based on supporting the new model (both reasonably big ifs), then I would assume that the 39W is to cover the maximal use-case of both operating at full power and fully charging the battery at the same time. So probably something like 25W of actual power draw plus around 14W for charging. Still, 25W isn't a small amount for a device like the Switch. Steam Deck has a 25W maximum power draw in a slightly thicker case and by all accounts the fan on it is pretty loud, so if Nintendo are hitting that kind of power draw I hope they've got a quiet fan solution sorted out.
25 watts or lower makes sense, since Orin NX's highest profile is that much. Of course that's for the SoC, and Nintendo and Nvidia can disable the camera blocks and other irrelevant automotive hardware. Would be interesting if Nintendo goes with 10 watts for handheld mode 🤔

Anyway, portable PS4 power using 10 watts for the whole system sounds too good to be true. There is absolutely no way we can get that on Samsung 8nm. If it isn't a more efficient node like 5nm, it's impossible.
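The 39W split discussed above is just subtraction, but making it explicit helps; note the 14W charging figure and any USB reserve are this thread's assumptions, not known specs:

```python
def docked_system_budget(dock_supply_w: float, charging_w: float = 14.0,
                         usb_reserve_w: float = 0.0) -> float:
    """Rough power left for the console itself: dock supply minus charging minus USB reserve."""
    return dock_supply_w - charging_w - usb_reserve_w

print(docked_system_budget(39.0))  # 25.0 -> ~25W of actual system draw while docked
```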
 
Watch Nintendo match 76.8% of that: 2.3 TFLOPs. "Please understand!"

5-6 TFLOPs for 39 watts for the whole system sounds way too good to be true, especially considering the Steam Deck runs up to 30 watts for a 1.6 TFLOPs GPU max and maybe 3GHz for its 4-core CPU 🤔


25 watts or lower makes sense, since Orin NX's highest profile is that much. Of course that's for the SoC, and Nintendo and Nvidia can disable the camera blocks and other irrelevant automotive hardware. Would be interesting if Nintendo goes with 10 watts for handheld mode 🤔

Anyway, portable PS4 power using 10 watts for the whole system sounds too good to be true. There is absolutely no way we can get that on Samsung 8nm. If it isn't a more efficient node like 5nm, it's impossible.
I will note that the 8nm process that Orin (and therefore likely Drake) uses is different from the 8nm that the consumer Ampere cards use.
 
Man, 12 SMs sounds too good to be true for Switch 2, but when even Thraktor is hyped about this and thinks it's plausible, plus a 192-bit bus for bandwidth, and when people are more optimistic about a node newer and more efficient than Samsung 8nm, it's pretty crazy. Usually we've been on WUST-style hype since the Wii era and ended up with a "Please understand", like half the clockspeed we hoped for from leakers (I remember when Emily said the Switch TX1 would be close to the Xbone, but that turned out to be the FP16 1 TFLOP count).

We'll see how it plays out, and if hackers really do release clock speeds tomorrow. I'm going to expect Steam Deck-like specs with 8 SMs for docked, before DLSS is counted, just so I don't get disappointed. I hope I'm wrong though.

Yeah, this is all very exciting, but people should keep in mind that with the last 3 Nintendo consoles we always got weaker hardware (or lower clocks) than was generally expected based on rumors.
I'm not saying this time will be the same, but it's always better to stay cool and keep expectations low.
 
Talking about comparisons with PS4, PS4 Pro and XSS hardware: people here mostly mention only the CPU and GPU, while not comparing memory bandwidth, which is also very important when we talk about getting the most out of hardware, or about potential bottlenecks.

New Switch hardware, based on current info, should have around 100GB/s of memory bandwidth; the PS4 has 176GB/s, the PS4 Pro has 218GB/s and the XSS has 224GB/s, so that should be taken into the comparison as well.
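For what it's worth, the ~100GB/s estimate falls straight out of the usual LPDDR5 arithmetic, assuming a 128-bit interface at 6400 MT/s (both assumptions, not confirmed specs); the 192-bit case mentioned elsewhere in the thread is included for comparison:

```python
def peak_bandwidth_gbs(bus_width_bits: int, data_rate_mtps: int) -> float:
    """Peak bandwidth in GB/s = bus width in bytes * mega-transfers per second / 1000."""
    return (bus_width_bits / 8) * data_rate_mtps / 1000

print(peak_bandwidth_gbs(128, 6400))  # 102.4 -> the ~100GB/s estimate
print(peak_bandwidth_gbs(192, 6400))  # 153.6 -> if a 192-bit bus pans out
```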
 
Yeah, this is all very exciting, but people should keep in mind that with the last 3 Nintendo consoles we always got weaker hardware (or lower clocks) than was generally expected.
I'm not saying this time will be similar, but it's always better to stay cool and keep expectations low.
I look at it from a vague view of possible support, which I'm guessing will be Switch-like, but better. Seems like PS4/XB1-level stuff should be fine, while PS5/XBS ports might actually be reasonably plausible (rather than, say, the "impossible" Switch ports that took some herculean dev efforts).

A bunch of stuff besides the hardware itself can help here, from scalable modern toolchains, to the long-ass tail of the PS4/XB1 hardware and the existence of the XBSS kinda forcing most devs to be flexible with their output. Whatever the actual raw numbers end up being, the real-world effective capability, taking everything into account, should basically be the closest thing they've had to a "next gen" console since the GameCube.
 
Yeah, this is all very exciting, but people should keep in mind that with the last 3 Nintendo consoles we always got weaker hardware (or lower clocks) than was generally expected based on rumors.
I'm not saying this time will be the same, but it's always better to stay cool and keep expectations low.

I understand where you're coming from, but with the source being actual code that's going to be used on the new HW, it's very unlikely that it still changes drastically. The only thing we don't know (and Nintendo/Nvidia are currently figuring out) is the final clockspeeds for docked and handheld play.
 
Yeah, this is all very exciting, but people should keep in mind that with the last 3 Nintendo consoles we always got weaker hardware (or lower clocks) than was generally expected based on rumors.
I'm not saying this time will be the same, but it's always better to stay cool and keep expectations low.

Yep
They tend to always be verrrrry conservative with clockspeeds.
But it should be a good machine it seems.


What about BC ?

Edit : damn phone !
 
Talking about comparisons with PS4, PS4 Pro and XSS hardware: people here mostly mention only the CPU and GPU, while not comparing memory bandwidth, which is also very important when we talk about getting the most out of hardware, or about potential bottlenecks.

New Switch hardware, based on current info, should have around 100GB/s of memory bandwidth; the PS4 has 176GB/s, the PS4 Pro has 218GB/s and the XSS has 224GB/s, so that should be taken into the comparison as well.
I think the reason it's not discussed so much when compared to those systems is that, in general, Nvidia's architectures have been more memory-efficient than their AMD counterparts, so they can perform about the same with less bandwidth. AMD made an effort to mitigate this bandwidth inefficiency of their architecture (and the lack of GDDR6X) by including a very large L3 cache pool they call "Infinity Cache", which increases the effective bandwidth throughput of their cards while also helping with energy efficiency.

Due to the large GPU, there is also 4MB of L2 cache, which really helps this device versus the other three aforementioned systems. The PS5, for example, only has 4MB for itself and the Series X has 5MB, yet Drake, with its lower shader count, has a very close amount, so it's possible the cache can help keep the GPU fed.

There's also a possibility that there's more L1 cache available on the device, such as 50% more, but that's just a guess. It could follow the same 128KB that regular desktop Ampere has, or it could have 192KB of L1 cache per SM like Orin; either way, this will be a helpful aspect here.


All in all, the raw memory bandwidth is an issue yes, but perhaps not so pronounced. It should be able to trade blows decently well enough.


This does not factor the CPU of course, I think most of us agree that it won’t be anywhere near the other three consoles.

Yep
They tend to always be verrrrry conservative with clockspeeds.
But it should be a good machine it seems.


What about BC ?
Nintendo has talked about BC implicitly with their investors in financial briefings, so it's likely to have BC here. Nintendo hasn't broken compatibility with a previous system unless they absolutely had no choice; they've aimed to keep BC with at least the direct predecessor.

and even with Nintendo being rather conservative, it is still a really performant portable device in the end here.
 
Yeah, this is all very exciting, but people should keep in mind that with the last 3 Nintendo consoles we always got weaker hardware (or lower clocks) than was generally expected based on rumors.
I'm not saying this time will be the same, but it's always better to stay cool and keep expectations low.
The Switch has way better hardware than what was expected for a Nintendo handheld. People were expecting a Vita+. It's only "weak" in concern-trolling discourse, and for people refusing to accept what it is who go "but I have never taken my Switch out of the dock!" (which I would say is also concern trolling).
 
The Switch has way better hardware than what was expected for a Nintendo handheld. People were expecting a Vita+. It's only "weak" in concern-trolling discourse, and for people refusing to accept what it is who go "but I have never taken my Switch out of the dock!" (which I would say is also concern trolling).
A lot of people in threads similar to this did expect more, but not a lot more.

It came as a surprise to many (not all) that Nintendo went 100% off-the-shelf X1 at 20nm. We did consider the possibility of Nintendo making customizations to the TX1 a likely one.

Adding 1 more GB and running their own clocks doesn't count as customization.
 
I think we need to keep in mind while DLSS is nice and all, 99% of games would probably not be optimized for it nor the new chip at launch to take the full potential of this rumored hardware.

I already see the "lazy devs" takes on the horizon.
 
A lot of people in threads similar to this did expect more, but not a lot more.

It came as a surprise to many (not all) that Nintendo went 100% off-the-shelf X1 at 20nm. We did consider the possibility of Nintendo making customizations to the TX1 a likely one.

Adding 1 more GB and running their own clocks doesn't count as customization.
I was curious to know if the Switch had customizations such as on-chip memory (bigger caches or VRAM), but I remember discussing with dark10x that even the off-the-shelf TX1 was a full ~8-10x ahead of the Vita, a generational jump. And we got close to the best possible hardware later with Mariko, IMHO.

I believe that some people, like dark10x, were worried because they knew the performance the TX1 was getting on the Pixel C and Shield TV, which wasn't that impressive, and they were worried that the device was going to be weaker than a 360. But we later learned that was because Android is a poor OS for gaming.

There were many legit concerns (we knew that 16nm would have been better than 20nm, for the most relevant one), but I think most of the arguments came from ignorance and many from bad faith. The best example was the screen resolution: it is not hard to find opinions that go "720p only?!". Yet, 6 years later, the Steam Deck is barely above that resolution and we don't get the same argument from the same people. I said at the time that 720p was too high and 540p would have been better.
 
I think we need to keep in mind while DLSS is nice and all, 99% of games would probably not be optimized for it nor the new chip at launch to take the full potential of this rumored hardware.

I already see the "lazy devs" takes on the horizon.
Lazy devs, paired with "HW isn't as good as we thought, LOL, they cheaped out again". However, I think Nintendo will have at least one showcase title ready (may or may not be exclusive to the new device, idk)
 
Yep
They tend to always be verrrrry conservative with clockspeeds.
But it should be a good machine it seems.


What about BC ?

Edit : damn phone !

I agree it should be a very good machine in any case; I just think people should lower their expectations compared to the more optimistic ones, especially when it comes to clocks.

I would say that BC is certain in any case.


The Switch has way better hardware than what was expected for a Nintendo handheld. People were expecting a Vita+. It's only "weak" in concern-trolling discourse, and for people refusing to accept what it is who go "but I have never taken my Switch out of the dock!" (which I would say is also concern trolling).

I'm talking about general expectations based on credible rumors and leakers, not about troll comments ("it will be a little stronger than the Vita", "it will sell like the Wii U"...).
Before the Switch reveal, based on rumors (on NeoGAF, and plenty of people from here were also there back then), most people expected performance similar to the XB1.
For instance, I also remember that only a few months before the Switch launched it was generally expected that the Switch's ARM CPU would be clocked at 1.5GHz,
and then the disappointment a few months later when it actually proved to be 1GHz.


I think the reason it's not discussed so much when compared to those systems is that, in general, Nvidia's architectures have been more memory-efficient than their AMD counterparts, so they can perform about the same with less bandwidth. AMD made an effort to mitigate this bandwidth inefficiency of their architecture (and the lack of GDDR6X) by including a very large L3 cache pool they call "Infinity Cache", which increases the effective bandwidth throughput of their cards while also helping with energy efficiency.

All in all, the raw memory bandwidth is an issue yes, but perhaps not so pronounced. It should be able to trade blows decently well enough.

This does not factor the CPU of course, I think most of us agree that it won’t be anywhere near the other three consoles.

I don't think that's the reason; I mean, you could also say that Nvidia's GPU architecture is more efficient.
People simply forget about memory bandwidth, or don't like to compare the next Switch's memory bandwidth with the PS4/PS4 Pro/XSS.

The new Switch will almost certainly have a stronger CPU, and probably a stronger GPU even without DLSS, compared to the PS4,
but it will most likely have weaker memory bandwidth than the PS4, not to mention the PS4 Pro or XSS.
 
I'm talking about general expectations based on credible rumors and leakers, not about troll comments ("it will be a little stronger than the Vita", "it will sell like the Wii U"...).
Months before the Switch reveal, based on rumors (on NeoGAF, and plenty of people from here were also there back then), most people expected performance similar to the XB1.
For instance, I also remember when it was generally expected that the Switch's ARM CPU would be clocked at 1.5GHz,
and then the disappointment a few months later when it actually proved to be 1GHz.
This is an unfair criticism, since we didn't know that the NX was going to be a handheld with a docked mode. The "at least XB1 performance" prediction was made in the context of Nintendo releasing a traditional console. If some people legitimately still thought that once we knew it was a Tegra-based handheld device, then they were indeed deluded.
 
This is an unfair criticism, since we didn't know that the NX was going to be a handheld with a docked mode. The "at least XB1 performance" prediction was made in the context of Nintendo releasing a traditional console. If some people legitimately still thought that once we knew it was a Tegra-based handheld device, then they were indeed deluded.

That's not a criticism; it's just a point supporting the fact that Nintendo hardware (including the Switch) is generally weaker than people expected.
Maybe you can say that at the time we didn't know exactly what the Switch was, but credible rumor sources and leakers had heard about the potential hardware we were talking about, and, like I wrote, 1.5GHz for the CPU (higher clocks were also expected for the GPU) was expected only a few months before the Switch launched, when we already knew what the NX was and that it would have a Tegra X1.
 
I think we need to keep in mind while DLSS is nice and all, 99% of games would probably not be optimized for it nor the new chip at launch to take the full potential of this rumored hardware.

I already see the "lazy devs" takes on the horizon.
That depends entirely on implementation. You could say the same thing about, say, RT cores, but in reality, engines like UE5 compile with hardware like that in mind to achieve the desired output, so I don't see DLSS implementations being much different in that respect.
 
Now my theory has no actual basis in fact but as a huge resident evil fan the recent announcement of a next gen patch for RE2 Remake, RE3 Remake and RE7 coming at the end of this year has me super excited.

I am hoping we hear more about what the patches entail soon. For me, if they add in both DLSS and RTX support I will be wondering if part of the reason is to launch those games on this new switch console.

100% the primary reason will be to reinvigorate interest in those titles on the new platforms and get a surge of new sales. However we also know Capcom is one of the biggest third party supporters of the switch and that they even had some say in the final specs of the original device, feeding back to Nintendo that they needed more memory.

I can see them having planned this with Nintendo for a while being a close third party partner and launching these three titles as showcase pieces for the hardware launch whilst offering cloud versions to OG switch owners.

My one reservation is that Capcom built a custom version of its internal RE Engine for Monster Hunter Rise, because the primary branch didn't support the kind of resolution switching needed to dock and undock the Switch. But the next-gen patch has been made possible by a recent update to the RE Engine, so that problem may have been addressed in the main build rather than by supporting two separate versions.

Thoughts?
 
Well, we know the GPU config now outside of clock speeds, and the CPU is at least known down to the cores used, so I'd say:

Portable: Series S performance at 720p after DLSS
Docked: Rivaling the PS5 after DLSS?
This would be amazing. But in terms of teraflops, does that mean we would be above the PS4?
 
That's not a criticism; it's just a point supporting the fact that Nintendo hardware (including the Switch) is generally weaker than people expected.
Maybe you can say that at the time we didn't know exactly what the Switch was, but credible rumor sources and leakers had heard about the potential hardware we were talking about, and, like I wrote, 1.5GHz for the CPU (higher clocks were also expected for the GPU) was expected only a few months before the Switch launched, when we already knew what the NX was and that it would have a Tegra X1.
The first time we saw the clocks (for docked mode) was when a participant in those discussions (blu?) who had a rooted Shield TV ran benchmarks and reported the sustained clocks, which were exactly the docked clocks that DF reported a few days later from their insider contacts. So this is an argument based on ignorance: we didn't know the maximum sustainable clocks for the chip, so we took Nvidia at its word on the advertised maximum and concluded that Nintendo was going to run those clocks in docked mode, because why wouldn't they?

Now, do people say the same about Sony and the Vita? Did Sony under-deliver? To this day, there are people who believe the Vita has a 2GHz CPU, when in reality it caps at 400MHz when Wi-Fi is off, 300MHz otherwise. When those specs were revealed, people were incredulous: "No way KZM is running on a machine with a 14Gflops GPU and a 333MHz CPU!". But it is what it is. If I went back to 2016 GAF and told them that the next Nintendo handheld was going to run Crysis 2-3 at a nearly locked 30fps at 720p with better graphics than the PS3, no one would have believed me.

I believe we are having a much more mature conversation this time around, considering possible power consumption, chip size and configuration, and possible fab nodes. And the truth is that the leaked specs are much better than the most optimistic projections this time around. Pretty much no one is saying Drake is going to run 8 A72 cores at 2GHz, which is the specified maximum by Nvidia, and many are pondering whether 4 of the 12 SMs are going to be binned off, since the leaked chip is too strong.
 
Hello. This is my first post on Famiboards, but I've been lurking in these threads (Wii U and Switch editions too) since way back.
I'm not very tech savvy, so I have a few questions regarding recent findings.
Would these power levels be possible in the current form factor?
PS4-level portable mode is Steam Deck territory. Does that mean increase in size to match it?
39W in docked mode would require a good cooling solution in the dock, but wouldn't it affect the ability to instantly pull the device out and play portably? It would have to be constantly cooled to human-tolerable levels while docked.
 
Hello. This is my first post on Famiboards, but I've been lurking in these threads (Wii U and Switch editions too) since way back.
I'm not very tech savvy, so I have a few questions regarding recent findings.
Would these power levels be possible in the current form factor?
Theoretically it should be. None of us here know if the exact cooling requirements will change, but the fact that the API sees 12SMs means that's what the chip has. It's essentially confirmed.
PS4-level portable mode is Steam Deck territory. Does that mean increase in size to match it?
It's highly likely this will reuse the OLED model's dock, so any increase in size will have to be small enough that it still fits there. It can get wider by a few mm and thicker by a few mm, but not much beyond that.
39W in docked mode would require a good cooling solution in the dock, but wouldn't it affect the ability to instantly pull the device out and play portably? It would have to be constantly cooled to human-tolerable levels while docked.
39W is the theoretical maximum for playing while also charging the battery and joycons. And possibly some reserved for the USB ports but I'm not clear on that. The wattage supplied to the unit should be spread out enough not to make it absurdly hot.
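To make that concrete, here's a purely hypothetical split of a 39W docked budget. Every number below is an assumption for illustration, not from any leak; the point is just that the SoC itself would only see a fraction of the total.

```python
# Hypothetical breakdown of a 39W docked power budget.
# All figures are illustrative guesses, not leaked or measured values.
budget_watts = {
    "SoC + memory": 15.0,
    "battery charging": 12.0,
    "Joy-Con charging": 2.0,
    "USB ports / misc": 5.0,
    "conversion losses": 4.0,
}
total = sum(budget_watts.values())
assert total <= 39.0  # must fit within the theoretical maximum
print(f"{total:.1f}W of a 39W budget")
```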

Also IIRC Digital Foundry did some testing on a hacked Switch to see what kind of temperatures you'd get at high enough wattage and it was never really too hot to the touch.


EDIT: Also, welcome to Fami!
 
Watch Nintendo match 76.8% of that. 2.3 TFLOPs "Please understand!"

5-6 TFLOPs at 39 watts for the whole system sounds way too good to be true, especially considering the Steam Deck runs up to 30 watts with a 1.6 TFLOPs GPU max and maybe 3GHz for its 4-core CPU 🤔
1.4 TFLOPs portable (same 460MHz clock as the current Switch portable) and 2.3 TFLOPs docked (same 768MHz clock as the current Switch docked) would still be amazing, bro. And it won't use 39 watts for the whole system docked; it will be way less. 39W is probably the worst-case scenario, where the Switch is playing while recharging.
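Those figures follow from the standard Ampere throughput formula (SMs × 128 CUDA cores × 2 FMA ops per clock), applied to the leaked 12 SM count at the current Switch's GPU clocks:

```python
# FP32 throughput for an Ampere GPU: SMs * 128 cores/SM * 2 ops (FMA) per clock.
# 12 SMs is from the leak; 460/768 MHz are the current Switch's GPU clocks.
def ampere_tflops(sms, clock_mhz, cores_per_sm=128, ops_per_clock=2):
    return sms * cores_per_sm * ops_per_clock * clock_mhz * 1e6 / 1e12

portable = ampere_tflops(12, 460)  # ~1.41 TFLOPs
docked = ampere_tflops(12, 768)    # ~2.36 TFLOPs
print(f"{portable:.2f} TFLOPs portable, {docked:.2f} TFLOPs docked")
```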
 
I thought I'd do a quick round-up of what we know, and give some general idea of how big our margin of error is on the known and unknown variables on the new chip.

Chip

Codenamed Drake/T239. Related to Orin/T234. We don't have confirmation on manufacturing process. The base assumption is 8nm (same as Orin), however kopite7kimi, who previously leaked info about the chip and said 8nm, is now unsure on the manufacturing process. The fact that the GPU is much larger than expected may also indicate a different manufacturing process, but we don't have any hard evidence. We also don't know the power consumption limits Nintendo have chosen for the chip in either handheld or docked mode, which will impact clock expectations.

GPU
This is what the leaks have been about so far, so we have much more detailed info here. In particular, on the die we have:

12 SMs
Ampere architecture with 128 "cores" per SM, and tensor performance comparable to desktop Ampere per SM. Some lower-level changes compared to desktop Ampere, but difficult to gauge the impact of those.
12 RT cores
No specific info on these, in theory they could have changes compared to desktop Ampere, but personally I'm not going to assume any changes until we have evidence.
4MB L2 cache
This is higher than would be expected for a GPU of this size (the most comparable is the RTX 3050 laptop GPU, with 2MB of L2). It's the same as the PS5 GPU's L2 and only a bit smaller than the XBSX GPU's 5MB of L2. This should help reduce memory bandwidth requirements, but it's impossible to say exactly by how much. Note this isn't really an "infinity cache", which ranges from 16MB to 128MB on AMD's 6000-series GPUs; it's just a larger-than-normal cache.

Things we don't know: how many SMs are actually enabled in either docked or handheld mode, clocks, ROPs.

Performance range in docked mode: It's possible that we could have a couple of SMs binned for yields, as this is a bigger GPU than expected. This would probably come in the form of disabling one TPC (two SMs), bringing it down to 10. Clocks depend heavily on the manufacturing process and on whether Nintendo have significantly increased their docked power consumption over previous models. I'd expect clocks between 800MHz and 1GHz are most likely, but at the high end of expectations (better manufacturing process and higher docked power consumption) it could push as high as 1.2GHz. I doubt it will be clocked lower than the 768MHz docked clock of the original Switch, but that's not strictly impossible.

Low-end: 10 SMs @ 768MHz - 1.97 Tflops FP32
High-end: 12 SMs @ 1.2GHz - 3.68 Tflops FP32

Obviously there's a very big range here, as we don't know power consumption or manufacturing process. It's also important to note that you can't simply compare Tflops figures between different architectures.

Performance range in handheld mode: This gets even trickier, as Drake is reportedly the only Ampere GPU which supports a particular clock-gating mode, which could potentially be used to disable SMs in handheld mode. This makes sense, though, as peak performance per watt will probably be somewhere in the 400-600MHz range, so it's more efficient to, say, have 6 SMs running at 500MHz than all 12 running at 250MHz. Handheld power consumption limits are also going to be very tight, so performance will be very much limited by manufacturing process. I'd expect handheld clocks to range from 400MHz to 600MHz, but this is very dependent on manufacturing process and the number of enabled SMs.
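The 6 SMs @ 500MHz vs 12 SMs @ 250MHz comparison works because theoretical throughput is identical in both configurations; the win is purely in running fewer SMs inside their efficiency sweet spot. A quick check:

```python
# Two configurations with identical theoretical FP32 throughput.
# Ampere: 128 cores per SM, 2 ops (FMA) per clock.
def gflops(sms, clock_mhz):
    return sms * 128 * 2 * clock_mhz / 1000

assert gflops(6, 500) == gflops(12, 250)  # both 768 GFLOPs
print(gflops(6, 500))
```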

One other comment to make here is that we shouldn't necessarily expect the <=2x performance difference between docked and handheld that we saw on the original Switch. That was for a system designed around 720p output in portable mode and 1080p output docked, however here we're looking at a 4K docked output, and either 720p or 1080p portable, so there's a much bigger differential in resolution, and therefore a bigger differential in performance required. It's possible that we could get as much as a 4x differential between portable and docked GPU performance.
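For reference, here are the raw pixel-count ratios driving that argument. A 4x performance differential roughly matches the pixel ratio between 1080p portable and 4K docked output:

```python
# Pixel-count ratios between the output resolutions discussed above.
def pixels(w, h):
    return w * h

ratio_4k_vs_720p = pixels(3840, 2160) / pixels(1280, 720)    # 9.0x
ratio_4k_vs_1080p = pixels(3840, 2160) / pixels(1920, 1080)  # 4.0x
print(ratio_4k_vs_720p, ratio_4k_vs_1080p)
```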

Low-end: 6 SMs @ 400 MHz - 614 Gflops FP32
High-end: 8 SMs @ 600 MHz - 1.2 Tflops FP32

There is of course DLSS on top of this, but it's not magic, and shouldn't be taken as a simple multiplier of performance. Many other aspects like memory bandwidth can still be a bottleneck.

CPU

The assumption here is that they'll use A78 cores. That isn't strictly confirmed, but given Orin uses A78 cores, it would be a surprise if Drake used anything else. We don't know either core count or clocks, and again they will depend on the manufacturing process. The number of active cores and clocks will almost certainly remain the same between handheld and docked mode, so the power consumption in handheld mode will be the limiting factor.

For core count, 4 is the minimum for compatibility, and 8 is probably the realistic maximum. The clocks could probably range from 1GHz to 2GHz, and this will depend both on the manufacturing process and number of cores (fewer cores means they can run at higher clocks).

The performance should be a significant improvement above Switch in any case. In the lower end of the spectrum, it should be roughly in line with XBO/PS4 CPU performance, and at the high-end it would sit somewhere between PS4 and PS5 CPU performance.

RAM

Again, the assumption is that they'll use LPDDR5, based on Orin using it, and there not being any realistic alternatives (aside from maybe LPDDR5X depending on timing). The main question mark here is the bus width, which will determine the bandwidth. The lowest possible bus width is 64-bit, which would give us 51.2GB/s of bandwidth, and the highest possible would be 256-bit, which would provide 204.8GB/s bandwidth. Bandwidth in handheld mode would likely be a lot lower to reduce power consumption.
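The bandwidth figures above follow directly from bus width times transfer rate, assuming standard 6400MT/s LPDDR5:

```python
# LPDDR5 bandwidth = (bus width in bytes) * transfer rate.
# 6400 MT/s is the standard LPDDR5 data rate assumed here.
def lpddr5_gb_per_s(bus_bits, mt_per_s=6400):
    return bus_bits / 8 * mt_per_s / 1000

for bits in (64, 128, 256):
    print(f"{bits}-bit: {lpddr5_gb_per_s(bits):.1f} GB/s")
```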

Quantity of RAM is also unknown. On the low end they could conceivably go with just 6GB, but realistically 8GB is more likely. On the high end, in theory they could fit much more than that, but cost is the limiting factor.

Storage

There are no hard facts here, only speculation. Most people expect 128GB of built-in storage, but in theory it could be more or less than that.

In terms of speeds, the worst case scenario is that Nintendo retain the UHS-I SD card slot, and all games have to support ~100MB/s as a baseline. The best case scenario is that they use embedded UFS for built-in storage, and support either UFS cards or SD Express cards, which means games could be built around a 800-900MB/s baseline. The potential for game card read speeds is unknown, and it's possible that some games may require mandatory installs to benefit from higher storage speeds.
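To make those baselines concrete, here's rough load-time arithmetic for a hypothetical 4GB read (the 4GB figure is illustrative, not from any leak):

```python
# Time to read a hypothetical 4GB of assets at each storage baseline above.
GB = 4
for label, mb_per_s in [("UHS-I SD baseline", 100), ("UFS/SD Express baseline", 850)]:
    seconds = GB * 1024 / mb_per_s
    print(f"{label}: ~{seconds:.0f}s for {GB}GB")
```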
 
Should the new information arrive today?
The hackers said they would dump the Nvidia data they stole today if Nvidia didn't meet their demands. Nvidia didn't, and we're waiting. Haven't seen anything so far. It's possible they won't release anything at all and it was just a bluff. In that case, our best bet for new info will be this year's GDC.
 
Very good summary, but a couple of things are off. For the portable range you list 8 SMs as the maximum, but that assumes they disable SMs; the actual maximum would be 12 SMs at a ~500MHz clock. For all we know they went with TSMC 5nm for Drake, as the original chip was codenamed Dane, and while T239 is shared between both versions of the chip, we don't know what was changed.

Storage could also be a mix of standard microSD cards and UFS storage, which has much higher speeds.

RAM: there are cell phones today with 12GB-16GB, and given that the XBSS has 10GB, I do think 8GB would be enough and is the most likely result. But it shouldn't be taken for granted; just last week I think we would both have agreed that 12 SMs for the GPU seemed outlandish and above the maximum we should expect.

I think Drake has exceeded our expectations so far, and we will have to wait and see what other surprises Nintendo has in store for us. But it doesn't seem like they would realistically pair a 12 SM GPU with a 4-core CPU; while I agree that's the minimum, I think 6 cores is much more likely given the cache of A78C cores and whatever they might be doing to mitigate memory bandwidth.
 
Do you personally believe it's impossible that all 12 SMs are active in handheld mode, assuming this is still 8nm? Would the power draw be simply unrealistically high?

There was some speculation about the clock gating being specific to this device in order to allow one TPC to be active during standby/sleep mode occasionally.
 
Do you personally believe it's impossible that all 12 SMs are active in handheld mode, assuming this is still 8nm? Would the power draw be simply unrealistically high?

There was some speculation about the clock gating being specific to this device in order to allow one TPC to be active during standby/sleep mode occasionally.
It would also make sense to use it for BC.
 
If we have enough info to extrapolate I'd love to see a graph with a power curve for a theoretical 12SM 8nm Ampere GPU versus a 2SM 20nm Maxwell GPU.
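We don't have measured frequency/voltage tables for either chip, so a real graph isn't possible, but the shape of such a curve follows from the usual CMOS dynamic-power relation P ∝ f·V², with voltage rising roughly linearly with frequency above a floor. A toy sketch, where every constant is an illustrative guess rather than data for either GPU:

```python
# Toy dynamic-power model: P proportional to f * V^2, with a simple
# linear voltage/frequency relationship. Constants are illustrative only.
def relative_power(freq_mhz, v_floor=0.6, v_slope=0.0004):
    voltage = v_floor + v_slope * freq_mhz
    return freq_mhz * voltage * voltage

# Doubling the clock from 500MHz to 1000MHz more than doubles power,
# which is why wide-and-slow configurations tend to win on efficiency.
print(relative_power(1000) / relative_power(500))
```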
 