
StarTopic Future Nintendo Hardware & Technology Speculation & Discussion |ST| (Read the staff posts before commenting!)

Apparently some of the T239 commits on Linux have been reverted. Don't quote me on this, though, since I only saw it on a banned source.

Edit: Looks like this has already been talked about.
Yep. They were likely reverted because someone realized T239 would never be on Android.

I guess it makes the T239-powered Shield that's been speculated about less likely.
 
Have you maybe heard whether TSMC 4N is indeed the node?
You would never, ever, E V E R hear this from a video game developer. Yes, game development bleeds into other areas of CS like software engineering, and some developers are in contact with semiconductor folks here and there, but you will never hear what node a chip is on from the people who make games.

It's not their job and it's not what they care about. Only the company that makes the chip, or the company that sells it, will disclose that information.
 
You would never, ever, E V E R hear this from a video game developer. Yes, game development bleeds into other areas of CS like software engineering, and some developers are in contact with semiconductor folks here and there, but you will never hear what node a chip is on from the people who make games.

It's not their job and it's not what they care about. Only the company that makes the chip, or the company that sells it, will disclose that information.
But if Necro is right and it's around 600MHz, then either 8nm is as efficient as we expected 4nm to be, or Drake is on 4nm. Either way, we're getting the upper bound of what we expected.
 
But if Necro is right and it's around 600MHz, then either 8nm is as efficient as we expected 4nm to be, or Drake is on 4nm. Either way, we're getting the upper bound of what we expected.
I’m not denying that, but I have my own reservations regarding that report.
 
Really looking forward to Nintendo making an official announcement but man, how I'm going to miss coming into this forum, reading 5 pages full of numbers and specs and nodding my head pretending I understand it all when I really don't understand anything.
 
Really looking forward to Nintendo making an official announcement but man, how I'm going to miss coming into this forum, reading 5 pages full of numbers and specs and nodding my head pretending I understand it all when I really don't understand anything.

Rejoice, then: you'll be able to do that for quite some time even after a reveal by Nintendo, because they usually give jack-shit for in-depth tech details, and you usually have to wait for someone to get their hands on a unit and do a teardown. ;D
 
But if Necro is right and it's around 600MHz, then either 8nm is as efficient as we expected 4nm to be, or Drake is on 4nm. Either way, we're getting the upper bound of what we expected.
It may not be either node. Quite apart from the fact that this is a dedicated SoC and the customer has a say in whether they want Samsung, Nvidia itself may also have a strategic interest in not opting for TSMC here. 8nm is not the only possible option at Samsung, according to what's been said in this thread whenever the question comes up, so personally I wouldn't reduce the speculation to a choice between Samsung 8nm and TSMC 4nm.
 
It may not be either node. Quite apart from the fact that this is a dedicated SoC and the customer has a say in whether they want Samsung, Nvidia itself may also have a strategic interest in not opting for TSMC here. 8nm is not the only possible option at Samsung, according to what's been said in this thread whenever the question comes up, so personally I wouldn't reduce the speculation to a choice between Samsung 8nm and TSMC 4nm.
There's a small chance of a wildcard node, but it doesn't change what I said about 600MHz being the upper bound of expectations on TSMC 4N.
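For anyone wanting to sanity-check these numbers, the FLOPS arithmetic is simple enough to do in a few lines. A minimal sketch, assuming the rumored 12-SM Ampere configuration (128 CUDA cores per SM, 2 FLOPs per core per clock via FMA; none of this is confirmed):

```python
# FP32 throughput for an assumed 12-SM Ampere GPU (rumored T239 config).
SMS = 12
CORES_PER_SM = 128          # standard for Ampere
FLOPS_PER_CORE_CLOCK = 2    # one fused multiply-add = 2 FLOPs

def tflops(clock_ghz: float) -> float:
    """FP32 TFLOPS at a given GPU clock."""
    return SMS * CORES_PER_SM * FLOPS_PER_CORE_CLOCK * clock_ghz / 1000

for mhz in (550, 600, 1100, 1300):
    print(f"{mhz:>4} MHz -> {tflops(mhz / 1000):.2f} TFLOPS")
# 600 MHz -> 1.84 TFLOPS; 1300 MHz -> 3.99 TFLOPS (the oft-cited "4 TFLOPS")
```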
 
Well, as a layman, I feel like the more recent "tech-based" rumors/speculation favor TSMC 4nm, while the case for Samsung 8nm (unless I'm forgetting some rumors) is mostly based on "because Nintendo".
 
There's a significant downside to Nintendo acknowledging its next console: it would severely impact sales of the Nintendo Switch, and Nintendo surely wants to avoid that, even if it means keeping us in the dark for two or three more years if necessary.
Their software pipeline this year can't be saved by not acknowledging new hardware is coming. The Switch decline will be steep because they have no major new releases this year.
 
There's a small chance of a wildcard node, but it doesn't change what I said about 600MHz being the upper bound of expectations on TSMC 4N.

What do you think are the chances of them using N4P rather than that NVIDIA custom 4N? After the announcement of N4C I hope for the former, but it seems unlikely.
 
BotW on Wii U or Switch?
There's no difference between the versions.

And BotW isn't Lumen-like. The game uses a single probe situated over Link.

If it's running on ARM, then it's not really a "PC handheld" anymore. The whole draw of something like the ROG Ally to a PC gamer is that it's the exact same as an equivalent PC under the hood with no architectural weirdness, and having to deal with x86 to ARM translation would add quite a bit of jank. So long as Nvidia doesn't have an x86 license and Intel continues to prioritize productivity in their CPU architectures, AMD will continue to dominate the handheld PC market.
PCs aren't defined by x86 or ARM. Needing a translation layer won't matter, especially when you'll be able to play Steam games on an ARM laptop in the next two months.
 
There's no difference between the versions.

And BotW isn't Lumen-like. The game uses a single probe situated over Link.


PCs aren't defined by x86 or ARM. Needing a translation layer won't matter, especially when you'll be able to play Steam games on an ARM laptop in the next two months.
The BotW probe effect is similar to Lumen, as oldpuck said some time ago. The global illumination this probe achieves is genuinely advanced technology for an 8th-generation game.
 
What do you think are the chances of them using N4P rather than that NVIDIA custom 4N? After the announcement of N4C I hope for the former, but it seems unlikely.
Doubt Nintendo would pay extra for it. Maybe as a battery-life revision in a couple of years. Wouldn't be anywhere near as significant as the Mariko revision, though.
 
Anyone claiming 8nm, especially those who claim to have sources, who then don't address the obvious power-draw problems of a chip that big, is either very ignorant or intentionally avoiding the elephant in the room. I'm not saying 8nm is impossible, but when talking about T239 on 8nm, if power draw isn't part of the conversation, then I have to question your knowledge of the topic at hand. I want to slam my head on the table when the DF guys talk about T239 being on 8nm and nobody brings up power consumption. It's a big-ass concern for a portable console. Erista Switch units were already on the low end of battery life at three hours; it's hard to imagine T239 managing that on 8nm.
I think their mindset is that, regardless of power-draw concerns, Nintendo will be so focused on cost that they'll choose 8nm over 4nm, because they assume Nintendo will inherently choose withered tech over newer tech. And it's true in some ways: if Nintendo follows their traditional withered-technology model, it would make no sense for them to use a 4nm node for the Switch 2. But maybe the withered-technology era of Nintendo is over?
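For what it's worth, the power-draw argument falls straight out of the standard dynamic-power relation, P ≈ C·V²·f. A minimal sketch with purely illustrative numbers (the voltages and capacitance ratio below are made up, not measured values for either node):

```python
# Relative dynamic power: P ~ C * V^2 * f. Inputs are illustrative only.
def dynamic_power(cap_rel: float, volts: float, freq_ghz: float) -> float:
    """Relative dynamic power in arbitrary units."""
    return cap_rel * volts**2 * freq_ghz

# Hypothetical: a denser node hits the same clock at lower voltage,
# with lower switched capacitance from the smaller transistors.
p_8nm = dynamic_power(cap_rel=1.00, volts=0.80, freq_ghz=0.6)
p_4nm = dynamic_power(cap_rel=0.70, volts=0.65, freq_ghz=0.6)
print(f"4nm vs 8nm power at the same clock: {p_4nm / p_8nm:.2f}x")
# ~0.46x with these made-up inputs. The V^2 term dominates, which is
# why the same 12-SM GPU could fit a handheld power budget on one node
# and blow past it on the other.
```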
 
I think their mindset is that, regardless of power-draw concerns, Nintendo will be so focused on cost that they'll choose 8nm over 4nm, because they assume Nintendo will inherently choose withered tech over newer tech. And it's true in some ways: if Nintendo follows their traditional withered-technology model, it would make no sense for them to use a 4nm node for the Switch 2. But maybe the withered-technology era of Nintendo is over?
Simply put, for today's Nintendo, Gunpei Yokoi's "withered technology" philosophy is just a way of guaranteeing that every console sold turns a profit. If the console can turn a profit anyway, then withered technology is of little use to them.
 
I think their mindset is that, regardless of power-draw concerns, Nintendo will be so focused on cost that they'll choose 8nm over 4nm, because they assume Nintendo will inherently choose withered tech over newer tech. And it's true in some ways: if Nintendo follows their traditional withered-technology model, it would make no sense for them to use a 4nm node for the Switch 2. But maybe the withered-technology era of Nintendo is over?

I mean, in some ways Ampere is withered tech; they didn't go with the absolute latest Nvidia could offer.

That being said, on an equal node Ampere is an arm's length behind Lovelace.
 
I mean, in some ways Ampere is withered tech; they didn't go with the absolute latest Nvidia could offer.

That being said, on an equal node Ampere is an arm's length behind Lovelace.
But that is probably why some speculate that 8nm is more likely: 4nm is far from withered technology today, while 8nm is pure withered tech in classic Nintendo style.
 
But that is probably why some speculate that 8nm is more likely: 4nm is far from withered technology today, while 8nm is pure withered tech in classic Nintendo style.
Yep, Orin is 8nm and every other gaming Ampere chip is 8nm. So from that POV, 8nm makes sense.

But when you start looking at power consumption figures for how Ampere performs on 8nm, it stops making sense for a 12SM handheld.
 
Yep, Orin is 8nm and every other gaming Ampere chip is 8nm. So from that POV, 8nm makes sense.

But when you start looking at power consumption figures for how Ampere performs on 8nm, it stops making sense for a 12SM handheld.

Again, layman's thoughts here, but IMO going strictly by "Orin/Ampere are on 8nm, so this one must be too" ignores the fact that Nvidia has been doing a lot of custom work here, going by the rumors and leaks so far.

(Not saying you do; speaking generally.)
 
The BotW probe effect is similar to Lumen, as oldpuck said some time ago. The global illumination this probe achieves is genuinely advanced technology for an 8th-generation game.
Similar in result and in the general idea of storing lighting information, but calling what Zelda does similar to Lumen is kinda setting up lofty expectations. Maybe if you ballooned Zelda's one probe out into a myriad of probes (and traced them), we'd start to come closer to DDGI.

Light probes have been around for a while, though. The big difference now is that they can sample the world in real time at pretty high fidelity. Hence why Zelda only has one, to save rendering time. It has a pretty short radius at that.
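To make the probe distinction concrete, here's a toy sketch of the two approaches. The structure and names are hypothetical illustrations, not how BotW or any actual DDGI implementation does it:

```python
# Toy comparison: one player-centered probe vs a DDGI-style probe grid.
import numpy as np

def sample_single_probe(probe_rgb, point, probe_pos, radius):
    """Roughly the BotW situation described above: one probe over the
    player, so every surface in range gets the same irradiance, fading
    out toward the probe's (short) radius."""
    dist = np.linalg.norm(point - probe_pos)
    weight = max(0.0, 1.0 - dist / radius)
    return probe_rgb * weight

def sample_probe_grid(grid, point, cell_size):
    """DDGI-style: trilinearly blend the 8 probes surrounding the point,
    so lighting varies across the scene. `grid` is (nx, ny, nz, 3) RGB."""
    p = point / cell_size
    i0 = np.floor(p).astype(int)
    frac = p - i0
    result = np.zeros(3)
    for corner in range(8):
        offs = np.array([(corner >> k) & 1 for k in range(3)])
        w = np.prod(np.where(offs == 1, frac, 1.0 - frac))
        result += w * grid[tuple(i0 + offs)]
    return result
```

(Real DDGI also re-traces rays into those probes every frame, which is the expensive part that RT cores would accelerate.)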
 
Yep, Orin is 8nm and every other gaming Ampere chip is 8nm. So from that POV, 8nm makes sense.

But when you start looking at power consumption figures for how Ampere performs on 8nm, it stops making sense for a 12SM handheld.

The Tegra X1 is based on the Maxwell GPU architecture, which was fabbed on a 28nm node, but the X1 uses 20nm and was later shrunk to 16nm. So I'd say there's no fixed connection between architecture and process, especially when we're talking about custom SoCs.
 
Their software pipeline this year can't be saved by not acknowledging new hardware is coming. The Switch decline will be steep because they have no major new releases this year.
Until we know Nintendo's full lineup for this year, we can't simply declare, "oh no, Nintendo only has these games." It's the same song and dance every year: everyone thought the Switch successor would launch the following year because Nintendo only had The Legend of Zelda: Tears of the Kingdom, but then the June Direct revealed Super Mario Bros. Wonder, a new WarioWare game, and more. The same can apply this year: Nintendo could still release a new 2D/3D Donkey Kong game, Metroid Prime 4, and more.
 
Until we know Nintendo's full lineup for this year, we can't simply declare, "oh no, Nintendo only has these games." It's the same song and dance every year: everyone thought the Switch successor would launch the following year because Nintendo only had The Legend of Zelda: Tears of the Kingdom, but then the June Direct revealed Super Mario Bros. Wonder, a new WarioWare game, and more. The same can apply this year: Nintendo could still release a new 2D/3D Donkey Kong game, Metroid Prime 4, and more.
Nintendo desperately needs to have a general Direct soon, either this month or next month.
 
I think there's less optimism about Nintendo's second-half release schedule because the year has been underwhelming so far, and because Pokemon ZA is skipping this year, lol.
 
Doubt Nintendo would pay extra for it. Maybe as a battery-life revision in a couple of years. Wouldn't be anywhere near as significant as the Mariko revision, though.

Do we actually know if it's more expensive? As far as I know N4P was only confirmed to be smaller and more efficient, and N4C was outright advertised as cheaper (though possibly only because it is, again, smaller).
 
Similar in result and in the general idea of storing lighting information, but calling what Zelda does similar to Lumen is kinda setting up lofty expectations. Maybe if you ballooned Zelda's one probe out into a myriad of probes (and traced them), we'd start to come closer to DDGI.

Light probes have been around for a while, though. The big difference now is that they can sample the world in real time at pretty high fidelity. Hence why Zelda only has one, to save rendering time. It has a pretty short radius at that.
Light-probe-based GI is advanced for the 8th generation, and the fact that the Switch is really only at 7th-gen-plus performance levels is a good rebuttal to the claim that Nintendo won't use advanced rendering technology on it.
 
'Withered' carries an implication of 'dead/obsolete', which is more descriptive of the tech in the Game & Watch and Game Boy that Gunpei Yokoi worked with, and later on the Wii.

The Switch is a whole bunch of 'lateral thinking', but the tech at the time of release was not 'withered'. Releasing a tablet in 2017 with one of the best mobile SoCs available, even if that mobile chip was 1.5-2 years old, indicates they did not go with an obsolete option.

The Switch 2's use of DLSS can be 'lateral thinking with a seasoned/familiar technology' (although at the time they'd have started work on the console, DLSS was fairly new). It would be the first handheld console in history with AI cores for upscaling, and developers already know how to program for the feature.

On PC, DLSS is a "toggle" to sustain very high resolutions or framerates while enabling taxing features like RT; developers can't guarantee the system configuration of end users, so multiple presets exist. On Switch 2, every console has access to DLSS and every developer has the option to use it, whether through NVN2 or popular game engines. Games can be built around it, and it can be used as a power-saving measure, which is significant for an energy-constrained device.

If there's at least one throughline that can be drawn all the way from the Game Boy to now, it's power efficiency. It allowed the GB to triumph over its competitors and led to the development of the Switch, one of the most power-efficient consoles ever made. And from what we've seen of T239, it's been customized exactly for this purpose.

(This doesn't answer the question of whether they'd go with 8 or 4 for the process node, but you can guess which way I'm inclined...)
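On the DLSS-as-power-saving point, the shading-work arithmetic is straightforward. A quick sketch (the resolutions are examples; Switch 2 render targets are not confirmed):

```python
# Shaded-pixel arithmetic behind "DLSS as a power-saving measure".
def pixels(w: int, h: int) -> int:
    return w * h

native_4k = pixels(3840, 2160)
internal_1080p = pixels(1920, 1080)  # hypothetical DLSS input resolution
print(f"Internal render = {internal_1080p / native_4k:.0%} of native-4K shading work")
# -> 25%. Shade a quarter of the pixels, let the tensor cores
# reconstruct the rest, and the same output frame costs far less
# GPU time (and therefore power).
```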
 
Question: let's say someone has access to the Switch 2's PCB (T239 SoC included). Could they accurately guess the node based on the dimensions of the die?
 
Just out of curiosity, when was the first Switch Pro/next-gen hardware speculation thread created on ResetEra? 2019?
 
But that is probably why some speculate that 8nm is more likely: 4nm is far from withered technology today, while 8nm is pure withered tech in classic Nintendo style.

Nintendo doesn't always have to use withered technology; the haptics in the Joy-Cons weren't considered old by any standard.
Also, TSMC's 5nm-class 4N isn't actually any newer, relative to today, than 20nm was when the Switch came out. The only difference is that foundries haven't been advancing node shrinks as fast as they were back when 20nm was new.
 
Are 3 nm nodes and beyond “only” considered bad due to SRAM scaling practically dying and nodes slowing down, or was there something specifically affecting those on top of that?
 
Are 3 nm nodes and beyond “only” considered bad due to SRAM scaling practically dying and nodes slowing down, or was there something specifically affecting those on top of that?

I believe I/O scaling is also dead, but on top of that, the cost of these nodes is extremely high due to:

1. TSMC having no realistic competition
2. TSMC having to expense a ton of very expensive EUV machines

They do seem to do well at reducing power consumption, so that's useful... but the cost per chip is getting very high.
 
Question: let's say someone has access to the Switch 2's PCB (T239 SoC included). Could they accurately guess the node based on the dimensions of the die?
No, the package is much larger than the silicon die that sits on it. Erista's and Mariko's silicon areas differ, but with the interposer, the packages are about the same size.

Maybe if you removed the heat spreader, you could guess it.
 
Of course RT is affected by horsepower. It's affected three ways.

1) RT cores run at the same clocks as the rest of the GPU. Faster GPU, more RT power; slower GPU, less RT power.
2) RT is costly. The faster the non-RT code runs, the more time there is to allocate to RT, and vice versa.
3) RT cores speed up the process of light bouncing through a scene. You still need to draw that light, which comes down to regular shaders.

RT is not a switch that is either on or off. There are multiple effects that you might layer into a scene with RT. You can do ray-traced shadows, or ray-traced reflections, or ray-traced ambient occlusion, or any combination. You could do ray-traced global illumination. Or you could fully path-trace a game and do 100% of rendering via ray tracing.

And you can do each of these effects at higher or lower resolution. You can use more rays or fewer rays. Rays can have more bounces or fewer bounces. You can include all of the scene in ray tracing, or just part of it. Or you can do the whole scene, but with lower-poly geometry.

There are hundreds if not thousands of combinations. But to give you two examples: Unreal Engine actually kinda does have an RT switch; on low-end Nvidia hardware, turning it on costs something like 4% of performance. Control has a "medium" RT setting which is basically just reflections; turning it on eats ~50% of performance on the same hardware.

So you can see that there are wildly different results here.
So, 550MHz is the peak power-efficiency point for the GPU, but not for the RT cores, right? Maybe putting it at 650MHz is meant to maximize RT core and DLSS capacity.
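As a side note on the examples quoted above: converting "costs X% of performance" into frame-time terms makes the spread obvious. A minimal sketch assuming a 60 fps baseline (the percentages are the ones quoted):

```python
# Frame-time impact of "this RT effect eats X% of performance".
def fps_with_rt(base_fps: float, perf_cost: float) -> float:
    """FPS after an effect that eats `perf_cost` (0..1) of performance."""
    return base_fps * (1.0 - perf_cost)

for name, cost in [("UE RT toggle", 0.04), ("Control medium RT", 0.50)]:
    fps = fps_with_rt(60.0, cost)
    print(f"{name}: 60 fps -> {fps:.1f} fps ({1000 / fps:.1f} ms/frame)")
# UE RT toggle: 57.6 fps (17.4 ms); Control medium RT: 30.0 fps (33.3 ms).
# Same "RT on" checkbox, wildly different budgets.
```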
 
No, the package is much larger than the silicon die that sits on it. Erista's and Mariko's silicon areas differ, but with the interposer, the packages are about the same size.

Maybe if you removed the heat spreader, you could guess it.

There's usually no heat spreader on these SoCs; the die sits directly under the heatsink. Based on the available information and die shots of related chips, it might be possible.
 
No, the package is much larger than the silicon die that sits on it. Erista's and Mariko's silicon areas differ, but with the interposer, the packages are about the same size.

Maybe if you removed the heat spreader, you could guess it.

[Image: iFixit teardown photo of the Switch OLED (SwitchOLED_116-1536x864.jpg)]


If I'm not mistaken, the way the Switch is assembled there isn't really a "heat spreader" like the lid on a desktop CPU (iFixit teardown of the OLED Model attached), so the bare PCB would show you the die unless the heatsink is attached.
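For what it's worth, the die-size question can be sketched with published density reference points. Treat all numbers here as rough assumptions: Orin on Samsung 8nm is reported at roughly 17B transistors in ~455 mm² (~37 MTr/mm²), AD102 on TSMC 4N at ~76.3B in ~608 mm² (~125 MTr/mm²), and T239's transistor count is not public, so the figure below is a pure placeholder to show the method:

```python
# Guessing the node from bare-die area, given an assumed transistor count.
DENSITY_TR_PER_MM2 = {
    "Samsung 8nm": 37e6,   # rough average, from Orin's published figures
    "TSMC 4N": 125e6,      # rough average, from AD102's published figures
}

assumed_t239_transistors = 10e9  # hypothetical placeholder!
for node, density in DENSITY_TR_PER_MM2.items():
    print(f"{node}: ~{assumed_t239_transistors / density:.0f} mm^2")
# ~270 mm^2 vs ~80 mm^2: the candidate nodes imply die areas over 3x
# apart, so measuring the bare die would narrow things down a lot.
```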
 
If it's really that high a clock, I wouldn't be surprised to see 5X, but we have to be careful not to think about it backwards.

Remember, Nintendo is developing games in concert with the hardware. Their hardware team has been pretty clear about how that process goes. Game Devs ask for more power and more features, which drives up costs, makes the hardware bigger, and more fragile. Hardware Devs aggressively cut things that developers aren't using.

This results in the lowest cost hardware that can support games that Nintendo thinks can compete in the market. If, during development, EPD really really wanted that extra compute power, then they could push clocks. And if it turned out they were really really hampered by memory bandwidth, depending on the timing, they could rework the memory controller.

Nvidia isn't going to push LPDDR5X (which is more expensive, at least initially) onto their customer to support 4 TFLOPS. Game devs are going to ask Nvidia for more throughput, if they need it.

That's why I think we get that 2x difference between handheld and docked. Not because some designer on day one sets that as the target, but because that is naturally the sweet spot that they're going to find in development for getting games to work on both.

That's also why I find 4 TFLOPS dubious. The difference between 3.5 TFLOPS and 4 TFLOPS seems pretty massive in terms of power and heat, but pretty small in terms of performance. I just don't believe that EPD will take 3.5 TFLOPS and say "we have 9x the power of the Switch, but it's not enough for Mario! Mario needs 11x the power of the Switch, no matter the cost!"

It just doesn't seem plausible to me.

I place very little faith in MLID's "sources", but I wouldn't be too surprised if 1.3GHz (and therefore 4 TFLOPS) was an internal target for Nvidia, even if they didn't expect Nintendo to clock it that high. Even if Nintendo had settled on, say, 1.1GHz as a target clock in docked mode during the design process, the actual clock speed could end up lower or higher than that based on the performance of the silicon, other hardware design changes, etc., so there would want to be some wiggle room in achievable clocks to accommodate that. A 1.3GHz clock is probably a reasonable upper limit of what Nintendo may choose, and also a round number in both clock speed and GFLOPS, so it's a plausible target for Nvidia. Besides, every other Ampere GPU can clock well past 1.3GHz, so it's really quite a modest target for the architecture.

On the LPDDR5X side, I could see it happening for two reasons outside of performance requirements. The first is simply that Nvidia was already working on an in-house LPDDR5X controller (for Grace) with a similar tape-out timescale, and it may have been the case that using the updated controller would have a near-zero impact on cost and timelines, so it was basically the default option.

The second is availability of LPDDR5/X RAM down the line. Back when Nintendo and Nvidia started supporting LPDDR4X with Mariko, I assumed it was purely for the sake of the power efficiency benefit. This may have been partly the case, but in the years after release, LPDDR4 has all but disappeared from the market, with LPDDR4X almost completely replacing it. Being able to use a widely available (and therefore cheap) form of RAM for the rest of the console's life was likely a major reason behind switching from LPDDR4 to 4X.

With Switch 2, it seems pretty unlikely that there will be a Mariko-style updated SoC at any point in the console's lifespan, which means they'll need to design a system which they can still buy parts for perhaps as much as 8 years later. Nintendo (and, I'm sure Nvidia) have surely spent a lot of time talking with RAM manufacturers about the production lifespan of the parts they're interested in, and if it's expected that LPDDR5X will almost completely replace 5 in the same way 4X replaced 4, then there would be a very strong incentive for Nintendo to have an SoC with LPDDR5X support even if they had no intention of using the higher speeds.
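For reference, the bandwidth gap between LPDDR5 and 5X is easy to put numbers on. A sketch assuming the 128-bit bus that's been widely speculated for T239 (not confirmed):

```python
# Peak memory bandwidth: transfer rate (MT/s) x bus width in bytes.
def bandwidth_gb_s(mt_per_s: int, bus_bits: int) -> float:
    return mt_per_s * (bus_bits / 8) / 1000

for name, rate in [("LPDDR5-6400", 6400), ("LPDDR5X-8533", 8533)]:
    print(f"{name}: {bandwidth_gb_s(rate, 128):.1f} GB/s")
# LPDDR5-6400: 102.4 GB/s; LPDDR5X-8533: 136.5 GB/s. A 5X-capable
# controller buys headroom and long-term part availability even if
# launch units run at LPDDR5 speeds.
```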
 
So, 550MHz is the peak power-efficiency point for the GPU, but not for the RT cores, right? Maybe putting it at 650MHz is meant to maximize RT core and DLSS capacity.

I actually don't know if this is possible or not with NVIDIA's cards, but it would be very interesting if it is.

The RT cores in particular are basically the only cores active during the BVH traversal step, and they only make up something like 2-3% of the entire chip... Maybe you could double the clock speed there?

I would assume this is not possible, since if it were, it would already have been used on NVIDIA's PC GPUs, but maybe.
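An Amdahl's-law back-of-envelope shows why a decoupled RT clock might not be worth the silicon. The traversal shares below are made-up illustrations, not measured numbers:

```python
# Overall speedup if only the BVH-traversal slice of the frame scales
# with a (hypothetical) doubled RT-core clock.
def overall_speedup(traversal_share: float, rt_clock_mult: float) -> float:
    return 1.0 / ((1.0 - traversal_share) + traversal_share / rt_clock_mult)

for share in (0.1, 0.3, 0.5):
    print(f"traversal = {share:.0%} of frame -> {overall_speedup(share, 2.0):.2f}x")
# 10% -> 1.05x, 30% -> 1.18x, 50% -> 1.33x. Doubling the RT clock only
# pays off when traversal dominates the frame, which may be one reason
# desktop GPUs don't expose a separate RT clock domain.
```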
 
Rejoice, then: you'll be able to do that for quite some time even after a reveal by Nintendo, because they usually give jack-shit for in-depth tech details, and you usually have to wait for someone to get their hands on a unit and do a teardown. ;D
Man, I'd love a "Road to PS5"-style Mark Cerny spec video for the Switch 2. Get some popcorn (or Flamin' Hot Cheetos) and enjoy the show. If only...
 
I actually don't know if this is possible or not with NVIDIA's cards, but it would be very interesting if it is.

The RT cores in particular are basically the only cores active during the BVH traversal step, and they only make up something like 2-3% of the entire chip... Maybe you could double the clock speed there?

I would assume this is not possible, since if it were, it would already have been used on NVIDIA's PC GPUs, but maybe.
I wondered the same, but also for the tensor cores. If 4K DLSS would take too long, like DF's tests indicate, could they somehow decouple the tensor/RT clocks and run them faster for better performance?
 
I don’t know who’s desperate, but I don’t think it’s actually Nintendo.
I don't think Nintendo's desperate either, but I think what Jake meant is: wouldn't Nintendo have to anyway? There are no Nintendo games announced beyond June.

Or are we going to expect Twitter drops from Nintendo for the games coming in July and afterwards, with no Direct?
 
I wondered the same, but also for the tensor cores. If 4K DLSS would take too long, like DF's tests indicate, could they somehow decouple the tensor/RT clocks and run them faster for better performance?

This seems harder, as I think you could realistically have the CUDA and tensor cores working at the same time (with the tensor cores doing DLSS and the CUDA cores doing post-processing, for example), and I think the tensor cores make up more of the chip, but maybe.
 
So, 550MHz is the peak power-efficiency point for the GPU, but not for the RT cores, right? Maybe putting it at 650MHz is meant to maximize RT core and DLSS capacity.
I don't think RT cores should be graded on the same spectrum, so to speak. There isn't any kind of measure of RT core efficiency on its own, because they're an intrinsic part of the design and run at the same clock as the shader cores. If you want more efficient RT, you have to make it so through your asset design, rather than by clocking your RT cores a certain way.
 