• Hey everyone, staff have documented a list of banned content and subject matter that we feel are not consistent with site values, and don't make sense to host discussion of on Famiboards. This list (and the relevant reasoning per item) is viewable here.

StarTopic Future Nintendo Hardware & Technology Speculation & Discussion |ST| (Read the staff posts before commenting!)

Not gonna lie, a 3050 mobile GPU even before DLSS is kind of a beast. I've played a bit on one, and that thing runs anything from the PS4 at twice the framerate and higher graphics settings. I'm guessing thermals might make it a bit "worse", but then DLSS and focused development and optimization for it will bring it back to what I expect.
 
So there are several paths here:

1. The video is incorrect and something is going wrong with these various games because these DLSS frametimes don't line up at all with NVIDIA documentation.
2. DLSS is not very usable on the Switch 2 as it breaks down at lower tensor core/teraflop counts
3. The Switch 2 will receive a custom/mobile version of DLSS that is pruned heavily by NVIDIA and thus looks a good bit worse (maybe FSR2 bad?) but runs at like 10-20% the cost.
 
Rich saying that DLSS takes 18ms to get to 4K is what this entire discussion is about...

Like, Bubsy 3D DLSS 4K/60 would be impossible going by this.
That's one data point, and a drastically different allocation of render resources. Bubsy 3D would render at 100+ FPS before DLSS. 18ms added on is literally nothing

EDIT: wasn't thinking about the frame time aspect, whoops
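For anyone else who tripped on the same thing, here's the frame time math spelled out (a rough sketch, using the 100+ FPS and 18ms figures from above):

```python
# Rough sketch: a 100 FPS base render costs 10 ms/frame; a flat 18 ms DLSS
# pass on top more than doubles it, so 4K/60 would indeed be off the table
# if the 18 ms figure held.
base_fps = 100
dlss_cost_ms = 18.0

frame_time_ms = 1000 / base_fps + dlss_cost_ms  # 10 + 18 = 28 ms
print(1000 / frame_time_ms)                     # ~35.7 FPS, well short of 60
```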
 
Yep, it's important to remember that even in terms of raw performance, Drake is significantly more powerful than the Tegra X1. Take a game like Super Mario Bros. Wonder, which renders at 1080p 60fps on Switch: Drake is roughly 8x as powerful as the Tegra X1 and would be able to render that game at 4K 60fps natively. The same goes for a game like Mario Kart 8 Deluxe. So even if 4K 60fps DLSS turns out to be too expensive, you will still see 4K games on SNG because it has the grunt to do it.
Dynamic 4K isn't an unrealistic expectation for games like Super Mario Bros. Wonder. In fact, I might say it's likely to get an NG patch.
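Quick back-of-envelope on that (taking the ~8x estimate above at face value; it's an estimate, not a confirmed spec):

```python
# 1080p -> 4K is exactly 4x the pixels, so an ~8x raw-performance jump over
# the Tegra X1 would leave ~2x headroom even without DLSS. Illustrative only.
pixel_ratio = (3840 * 2160) / (1920 * 1080)  # 4.0
power_ratio = 8                              # estimated Drake vs. TX1
print(power_ratio / pixel_ratio)             # 2.0x headroom left over
```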

A 2D Mario and a Pokémon game optimised for your system at launch is killer, even if they're from months ago.
 
looks like t239 has a DL accelerator :)

(i hope)
LiC and others already looked at the Nvidia data and Linux commits. T239 does not feature a DLA, and I'm puzzled as to why he added that to his video (iirc, Oldpuck asked the same question here ages ago based on his conversation with Rich, and we were able to see that T239 lacked a DLA). Guess it's just something he couldn't cut from his video.
A lot of the information provided during the DF session confuses me. Granted, DF's findings aren't the word of god, but it just seems rather odd to me considering BotW was originally not an easy game to run, and 4K gaming is taxing as hell.

What sort of fuckery is Nintendo employing to get 4K BotW working?
I mean, BotW is a Wii U game. It shouldn't be difficult for Nintendo to run it at 4K60 DLSS on the new hardware. Although the reports of BotW 4K60 go against Rich's finding of an 18ms 4K DLSS frametime cost, so that's something to keep in mind.
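The tension is easy to quantify, taking the 18ms figure at face value:

```python
# At 60 FPS the whole frame budget is 1000/60 ms. If 4K DLSS alone cost the
# reported 18 ms, nothing would be left for rendering the game, which is
# why BotW 4K60 reports clash with that number.
frame_budget_ms = 1000 / 60            # ~16.7 ms
dlss_cost_ms = 18.0
print(dlss_cost_ms > frame_budget_ms)  # True: DLSS alone blows the budget
```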
 
Honestly, setting aside granular specifications: seeing A Plague Tale: Requiem run as it did in DF's video, and then realizing that's without any dedicated config for the platform, just a baseline on similar specs, was enough for me to be excited. Games with dedicated effort behind their development for Switch 2 are gonna look good.
 
A lot of the information provided during the DF session confuses me. Granted, DF's findings aren't the word of god, but it just seems rather odd to me considering BotW was originally not an easy game to run, and 4K gaming is taxing as hell.

What sort of fuckery is Nintendo employing to get 4K BotW working?
BotW runs at 900p on Switch, so it's not a crazy thought that it could just be native 4K60, no DLSS involved.
 
Anyway, I would not be shocked if the difference between the frametimes given in the video and the frametimes given in the NVIDIA documentation is that these games offload DLSS entirely to the tensor cores, leaving the CUDA cores idle during the DLSS step, instead of having both the tensor cores and CUDA cores work on it; they can get away with that because DLSS is usually <1ms of frametime.

This is my guess.
 
Inference based on T234 being 8 nm (@1:45)

then @29:39 he says
Easy findings to disregard when things don't, in fact, point to 8nm.
 
See this shit is what's so exciting to me. 2.2TF with a target resolution of 360p. 4.35TF

It seems the more time passes without Nintendo announcing the Switch 2, the higher the clocks go 🤣
Handheld is already at ~720MHz, and docked [to reach 4.35TF] has even passed the ultimate best (and very unlikely) scenario, that "stress test clock" at 1.38GHz.

If the announcement is planned for 2024, then we'll get to PS5 clocks before the year ends 🤣
 
A lot of the information provided during the DF session confuses me. Granted, DF's findings aren't the word of god, but it just seems rather odd to me considering BotW was originally not an easy game to run, and 4K gaming is taxing as hell.

What sort of fuckery is Nintendo employing to get 4K BotW working?

Well, BotW is a Wii U game at its core, so I would expect it to render nicely at 4K on new hardware with DLSS, whereas some of the other titles in the video are far more demanding. Probably anything that ran at 900p or 1080p on Switch docked could hit 4K on Switch 2.
 
NVIDIA making a pruned mobile DLSS is an interesting idea because it offers them a significant advantage (getting their tech into way more games) and a disadvantage (it makes it harder to promote their unique architecture, even though that architecture executes neural networks much faster).
 
It seems the more time passes without Nintendo announcing the Switch 2, the higher the clocks go 🤣
Handheld is already at ~720MHz, and docked [to reach 4.35TF] has even passed the ultimate best (and very unlikely) scenario, that "stress test clock" at 1.38GHz.

If the announcement is planned for 2024, then we'll get to PS5 clocks before the year ends 🤣
I mistyped; I meant to put 3.45TF. Which, no, isn't expecting clock speeds to exceed what we saw in the leak. In fact, speeds of up to 4TF were explicitly present; I just doubt that's the target for production. Mockery is not, I think, productive.
 
That DF video was about as impressive as I expected it to be. I have to say, for a handheld, and in tests without specific optimization, this thing definitely rocks. Nintendo's got a monster on their hands, a little one even.
 
Honestly, setting aside granular specifications: seeing A Plague Tale: Requiem run as it did in DF's video, and then realizing that's without any dedicated config for the platform, just a baseline on similar specs, was enough for me to be excited. Games with dedicated effort behind their development for Switch 2 are gonna look good.
This mirrors my takeaway. I basically believe his results show the ground floor.
  • It will have at least twice the RAM.
  • It may be clocked higher than this example config.
  • It will probably have access to more modern hardware features.
  • It will not be running on a Windows machine! (How much did this affect his results?)
  • Games will be better optimized for it.
Since we don't know what bottlenecks Rich was running into, we don't really know whether they will be solved in the Switch 2. It seemed like the paltry 4GB of RAM that mobile chip has could have had a big effect. It may be that 4K/60 isn't an issue, or not. I'm not super concerned with that either way, because 1440p looks fantastic.
 
It does not, from the looks of it on Linux commits
LiC and others already looked at the Nvidia data and Linux commits. T239 does not feature a DLA, and I'm puzzled as to why he added that to his video (iirc, Oldpuck asked the same question here ages ago based on his conversation with Rich, and we were able to see that T239 lacked a DLA). Guess it's just something he couldn't cut from his video.
The DLA mention is a technical misstep of mine. I couldn't locate the DLA stuff, and told Rich that it wasn't eliminated
 
I mistyped; I meant to put 3.45TF. Which, no, isn't expecting clock speeds to exceed what we saw in the leak. In fact, speeds of up to 4TF were explicitly present; I just doubt that's the target for production. Mockery is not, I think, productive.

No need to feel hurt over a simple joke.

You mistyped the "handheld clock" too? Because the leak (which, btw, both LiC and Thraktor have already explained multiple times we shouldn't use as expected clocks for Drake) showed 660MHz as the lowest clock, which gives us 2.02TF (not 2.2TF). To get to 2.2TF we're talking about ~720MHz already.

I can't imagine you mistyped both clocks.

Anyway, like I said, it was just a joke.
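For anyone following the TF math: these figures all fall out of the standard FP32 formula (2 ops per CUDA core per clock), using T239's 1536 CUDA cores from the leak discussed in-thread:

```python
# FP32 TFLOPS = 2 ops/cycle * CUDA cores * clock (MHz) / 1e6.
CUDA_CORES = 1536  # T239, per the Nvidia leak discussed in-thread

def tflops(clock_mhz: float) -> float:
    return 2 * CUDA_CORES * clock_mhz / 1e6

for mhz in (660, 720, 1125, 1380):
    print(f"{mhz} MHz -> {tflops(mhz):.2f} TF")
# 660 -> 2.03, 720 -> 2.21, 1125 -> 3.46, 1380 -> 4.24
```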
 
You didn't read my post.


Reduce the OUTPUT resolution slightly to get below 16ms.


Then just use older upscaling to get you the rest of the way.
This is something that could be done, yes. Either a spatial pass after the render pipeline or some simple integer upscaling.
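A rough sketch of how much that buys you, assuming DLSS cost scales roughly with output pixel count (which is approximately what NVIDIA's tables show):

```python
# If 2160p output costs 18 ms, a slightly lower DLSS output target plus a
# cheap spatial/integer upscale to 4K gets back under a 16.7 ms budget.
cost_2160p_ms = 18.0
pixels_2160p = 3840 * 2160

for w, h in ((3200, 1800), (2880, 1620), (2560, 1440)):
    est = cost_2160p_ms * (w * h) / pixels_2160p
    print(f"{h}p output -> ~{est:.1f} ms, then upscale to 4K")
# 1800p -> ~12.5 ms, 1620p -> ~10.1 ms, 1440p -> ~8.0 ms
```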
The DLA mention is a technical misstep of mine. I couldn't locate the DLA stuff, and told Rich that it wasn't eliminated
Ah, that explains it. Thank you for the info.
 
If that was the case, I guess every developer except Falcom and Marvelous ignored them.
If the guideline exists, it's evidently relatively new, and we will see more games following it as time passes.

In any case, I don't know why we are doubting necrolipe so much on this particular topic, which has so little relevance. He is a reliable person who has given a lot of accurate reports and doesn't have any interest in making things up. But everyone is within their rights not to trust him.
 
Frankly, that DF video has me hyped even more.

It's just a ballpark estimation but they're running modern PC games with zero porting/optimization effort, and it runs pretty well at 1080p.
I don't care about 4K and never did. I want it to look good handheld primarily.

I have no doubt good port studios will do wonders with a little effort on any current-gen title, especially with Steam handhelds setting the baseline.

And regarding first-parties, when I see Metroid Prime or TotK running on a potato X1, I can barely imagine what they could do on the t239.
 
Wheeeeeeeeeeeee, the CUDA cores are idle while the tensor cores are active.


"
To accelerate the execution of Machine Learning
applications, recent GPUs use Tensor cores to speed up the
general matrix multiplication (GEMM), which is the heart of
deep learning. The Streaming Processors in such GPUs also
contain CUDA cores to implement general computations. While
the Tensor cores can significantly improve the performance of
GEMM, the CUDA cores remain idle when Tensor cores are
running. This leads to inefficient resource utilization. In this
work, we propose to offload part of the GEMM operations from
Tensor cores to CUDA cores to fully utilize GPU resources.
We investigated the performance bottleneck in such offloading
schemes and proposed architectural optimization to maximize the
GPU throughput. Our technique is purely hardware-based and
does not require a new compiler or other software support. Our
evaluation results show that the proposed scheme can improve
performance by 19% at the maximum."

Utilizing both doesn't help as much as you would think, as it's hard to split the work between them.

This is a pretty big issue for NVIDIA to solve, we'll see.
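To make the offloading idea concrete, here's a toy sketch of splitting one GEMM across two compute paths (NumPy standing in for the hardware; purely illustrative, not the paper's actual hardware scheme):

```python
import numpy as np

# Split a GEMM by rows: most goes to the "tensor core" path, a slice to the
# "CUDA core" path, then stitch the results. The catch in practice is that
# both partial products contend for the same operands and memory bandwidth.
A = np.random.rand(1024, 512).astype(np.float32)
B = np.random.rand(512, 256).astype(np.float32)

split = int(1024 * 0.8)          # ~80/20 split between the two paths
C_tensor = A[:split] @ B         # would run on tensor cores
C_cuda = A[split:] @ B           # would run concurrently on CUDA cores
C = np.vstack([C_tensor, C_cuda])

assert np.allclose(C, A @ B)
```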
 
The developers who are good with animation and art direction are going to make incredible-looking stuff with the Switch 2 specs, and the others will still make stuff that looks a generation ahead of the current Switch, which should be enough for most consumers to see the improvement from Switch 1 to 2. The DF video shows that, at the bare minimum, DLSS to 1080p is viable in docked mode for demanding games, which is all I was hoping for from the video, since it is not a perfect representation of what Switch 2 will do in its own tailored environment.
 
Frankly, that DF video has me hyped even more.

It's just a ballpark estimation but they're running modern PC games with zero porting/optimization effort, and it runs pretty well at 1080p.
I don't care about 4K and never did. I want it to look good handheld primarily.

I have no doubt good port studios will do wonders with a little effort on any current-gen title, especially with Steam handhelds setting the baseline.

And regarding first-parties, when I see Metroid Prime or TotK running on a potato X1, I can barely imagine what they could do on the t239.
Yeah, this is actually an incredible sign for what we're about to get. This little chip handles these games just fine without any real optimization on the table, and of course... the comments on the video are full of people in denial, how charming.
 
i realize that everyone was hyped up about the 4k/60 rumors, but seeing in the DF vid that this could still produce 1440p picture quality with a locked 30 FPS in some of these very taxing games is still impressive, imo. especially taking into consideration the possibility that nintendo might actually do a "pro" revision this time around, getting up to that 4k/60fps this generation doesn't look out of the question to me.

plague tale looked honestly... legit! that was the most impressive one in the whole video.

question though, and i know this has been talked about heavily in this thread so apologies if it's been confirmed as not possible, but why wouldn't it be possible to DLSS from 480p? i noticed DF didn't bother testing that at all in the video, unless i missed it.
 
After having read/listened to many opinions on the latest WarioWare and its motion control implementation, I was once again reminded why I have such high expectations for a better solution on Switch 2. Having played it myself with others, Nintendo really must step up on something they created themselves but that now lags far behind what the technology can offer. I just don't know if people who have already given up on motion controls would be enticed to try them again with better tech, IF that happens...
 
If DLSS takes like 5 ms on the tensor cores and the CUDA cores can't help, I wonder if Nintendo devs could do post-processing effect calculations simultaneously on the CUDA cores and then apply those effects immediately after to the DLSS'd image.

As someone who has worked with neural networks... it is hard to imagine how you could split up a neural network if the tensor cores and CUDA cores act semi-independently of each other instead of in tandem.

Now I'm just confused by Rich's numbers, though, as the NVIDIA documentation was clearly using just the tensor cores.
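A hedged sketch of what that overlap could look like, using PyTorch streams as a stand-in (a real engine would do this with raw CUDA/NVN, and whether DLSS leaves enough SM resources free is exactly the open question):

```python
import torch

# Toy overlap: a tensor-core matmul (standing in for DLSS) on one stream,
# element-wise post-processing on another. Illustrative only.
assert torch.cuda.is_available()
upscale_in = torch.randn(4096, 4096, device="cuda", dtype=torch.half)
weights = torch.randn(4096, 4096, device="cuda", dtype=torch.half)
post_buf = torch.randn(1 << 22, device="cuda")

s1, s2 = torch.cuda.Stream(), torch.cuda.Stream()
with torch.cuda.stream(s1):
    upscaled = upscale_in @ weights            # tensor-core path
with torch.cuda.stream(s2):
    post = post_buf.mul(1.1).clamp_(0.0, 1.0)  # CUDA-core post-processing
torch.cuda.synchronize()                       # both done; composite next
```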
 
0
i realize that everyone was hyped up about the 4k/60 rumors, but seeing in the DF vid that this could still produce 1440p picture quality with a locked 30 FPS in some of these very taxing games is still impressive, imo. especially taking into consideration the possibility that nintendo might actually do a "pro" revision this time around, getting up to that 4k/60fps this generation doesn't look out of the question to me.

plague tale looked honestly... legit! that was the most impressive one in the whole video.

question though, and i know this has been talked about heavily in this thread so apologies if it's been confirmed as not possible, but why wouldn't it be possible to DLSS from 480p? i noticed DF didn't bother testing that at all in the video, unless i missed it.
The 720p and above results were already good enough for them, I suppose. I really hope they make another video with Alan Wake 2 soon, perhaps upscaling from there...
 
i realize that everyone was hyped up about the 4k/60 rumors, but seeing in the DF vid that this could still produce 1440p picture quality with a locked 30 FPS in some of these very taxing games is still impressive, imo. especially taking into consideration the possibility that nintendo might actually do a "pro" revision this time around, getting up to that 4k/60fps this generation doesn't look out of the question to me.

plague tale looked honestly... legit! that was the most impressive one in the whole video.

question though, and i know this has been talked about heavily in this thread so apologies if it's been confirmed as not possible, but why wouldn't it be possible to DLSS from 480p? i noticed DF didn't bother testing that at all in the video, unless i missed it.
I've never seen anyone say that 4K 60 fps would be anything close to a standard for current-gen games/ports, so I don't think the DF video really disproved any expectations here (tenuous PC comparison aside). 1440p 30 fps versions of games will still be very impressive (and some heavier titles will likely dip lower). Instead I'm seeing people go too far in the other direction and conclude that there won't be 4K games at all, which is just wrong.

As for 480p to 4K DLSS, I don't know if it's actually viable quality-wise, and whether the native render savings are enough to be worth it, but you can certainly do it. You can use DLSS to go from 60p to 4K if you want to.
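For scale: NVIDIA's standard presets top out at a 3x per-axis upscale (Ultra Performance, 720p→4K), and 480p→4K would be 4.5x:

```python
# Per-axis scale factors for the standard DLSS presets vs. the 480p idea.
presets = {
    "Quality (1440p -> 4K)": 2160 / 1440,          # 1.5x
    "Performance (1080p -> 4K)": 2160 / 1080,      # 2.0x
    "Ultra Performance (720p -> 4K)": 2160 / 720,  # 3.0x
    "Hypothetical 480p -> 4K": 2160 / 480,         # 4.5x
}
for name, s in presets.items():
    print(f"{name}: {s:.1f}x per axis, {s * s:.2f}x pixels")
```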
 
The CUDA cores being idle during the DLSS step actually raises some interesting electrical optimization questions.

Like, if the CUDA cores are idle in that step, you're saving a lot of power for those milliseconds (potentially; I suck at electrical engineering and may be wrong here).

So could you just clock the tensor cores extremely high for the DLSS step?
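Back-of-envelope on why "just clock it higher" is expensive (the classic dynamic power relation; the numbers are made up for illustration, and on real chips the tensor cores share a clock domain with the rest of the SM anyway):

```python
# Dynamic power scales roughly as P ~ C * V^2 * f, and a higher clock
# usually needs a higher voltage, so power grows much faster than frequency.
def relative_power(freq_ratio: float, volt_ratio: float) -> float:
    return volt_ratio ** 2 * freq_ratio

print(relative_power(1.5, 1.2))  # 1.5x clock at +20% voltage -> ~2.16x power
```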
 
DLSS 720p>4K is like twice as expensive in frametime as DLSS 720p>1440p going by NVIDIA's documentation so I would expect 720p>1440p to be the standard.

But 18ms sounds incredibly high compared to NVIDIA's documentation.

[image: DLSS frametime table from NVIDIA's documentation]


Obviously these chips are all much stronger than the Switch 2, but the gap shouldn't be this large unless DLSS starts scaling really badly at low spec.

If it does, hopefully NVIDIA can develop a pruned DLSS neural network for the Switch 2... We'll see if they're interested, as pruned neural networks are really expensive (in terms of dollars and manpower) to create and the results are worse (obviously).
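That "twice as expensive" lines up with output pixel counts, which DLSS cost roughly tracks:

```python
# 720p -> 4K writes 2.25x the output pixels of 720p -> 1440p, matching the
# roughly-2x frametime gap in NVIDIA's documentation.
print((3840 * 2160) / (2560 * 1440))  # 2.25
```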
So there are several paths here:

1. The video is incorrect and something is going wrong with these various games because these DLSS frametimes don't line up at all with NVIDIA documentation.
2. DLSS is not very usable on the Switch 2 as it breaks down at lower tensor core/teraflop counts
3. The Switch 2 will receive a custom/mobile version of DLSS that is pruned heavily by NVIDIA and thus looks a good bit worse (maybe FSR2 bad?) but runs at like 10-20% the cost.
It’s not so surprising from a raw computational point of view. Just to pick a GPU from that table, say the 4090: it has 128 SMs and is clocked, in the Lovelace white paper, at 2520 MHz. By comparison, the underclocked 2050 in the video has 16 SMs and is clocked at 750 MHz. The 0.51 ms in the table for the 4090 would scale to 13.7 ms on this card, and that’s not the only consideration: memory bandwidth, post-processing at output resolution, unknown clock speeds in the DLSS programming guide table - pick your poison. 18 ms seems perfectly reasonable to me.
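Spelled out (a naive throughput scaling that ignores bandwidth and architecture differences):

```python
# Scale the cited 4090 4K DLSS cost down to the underclocked RTX 2050 from
# the DF video, assuming cost scales inversely with SM count * clock.
cost_4090_ms = 0.51            # 4K DLSS cost cited for the 4090
sm_4090, mhz_4090 = 128, 2520  # Lovelace white paper boost clock
sm_2050, mhz_2050 = 16, 750    # DF's underclocked 2050

scale = (sm_4090 * mhz_4090) / (sm_2050 * mhz_2050)
print(cost_4090_ms * scale)    # ~13.7 ms, before memory bandwidth penalties
```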

As far as pruning goes: you can’t use pruning in neural networks without sparse data structures. The only sparsity supported in the tensor cores is structured sparsity. If they’re using it in the Switch 2, they’re probably already using it in desktop DLSS. To me, it’s not a consideration.
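For the curious, Ampere's structured sparsity is the 2:4 pattern (two of every four consecutive weights zeroed); a minimal NumPy sketch of enforcing it, not actual tensor core code:

```python
import numpy as np

# 2:4 structured sparsity: in every group of 4 consecutive weights, keep the
# 2 largest by magnitude and zero the rest; sparse tensor cores can then
# skip the zeros for up to 2x math throughput.
def prune_2_4(w: np.ndarray) -> np.ndarray:
    groups = w.reshape(-1, 4)
    keep = np.argsort(np.abs(groups), axis=1)[:, 2:]  # top-2 per group
    mask = np.zeros_like(groups)
    np.put_along_axis(mask, keep, 1.0, axis=1)
    return (groups * mask).reshape(w.shape)

w = np.random.randn(8, 8).astype(np.float32)
assert (prune_2_4(w).reshape(-1, 4) != 0).sum(axis=1).max() <= 2
```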
 
I've never seen anyone say that 4K 60 fps would be anything close to a standard for current-gen games/ports, so I don't think the DF video really disproved any expectations here. 1440p 30 fps versions of games will still be very impressive (and some heavier titles will likely dip lower). Instead I'm seeing people go too far in the other direction and conclude that there won't be 4K games at all, which is just wrong.

As for 480p to 4K DLSS, I don't know if it's actually viable quality-wise, and whether the native render savings are enough to be worth it, but you can certainly do it. You can use DLSS to go from 60p to 4K if you want to.
yeah, after reading through more of the replies in this thread, it does seem like the consensus after DF's analysis is more positive than i thought. maybe i read too many YT comments lol.

ah okay, good to know. i guess my question now is, would rendering from 480p to 4k DLSS allow for better FPS performance (i.e. closer to 60), or is that basically a fool's errand? i suppose you wouldn't 100% know without actually testing it out, which is why i kind of wish DF did, just to see.

edit: @Lancelot yeah alan wake 2 would be really interesting to see!
 
I recommend taking what Tech_Reve posts with a healthy grain of salt, since Tech_Reve has posted rumours containing factually incorrect and likely incorrect information. I've put a couple of examples below.

One of the slides from the Microsoft leaks explicitly mentioned co-designing the GPU with AMD or licencing the Navi 5x GPU IP from AMD.
[image: leaked "Xbox Series next-gen spec" slide]

And considering AMD has mentioned being able to design Arm-based SoCs for its customers, and is rumoured to be designing Arm-based SoCs for PCs, the rumours are probably not accurate with respect to Microsoft.

Are ARM CPUs generally better than x86 from a performance standpoint? I hear a lot about ARM being more power efficient, but I thought that was down to its architecture.
 
It’s not so surprising from a raw computational point of view. Just to pick a GPU from that table, say the 4090: it has 128 SMs and is clocked, in the Lovelace white paper, at 2520 MHz. By comparison, the underclocked 2050 in the video has 16 SMs and is clocked at 750 MHz. The 0.51 ms in the table for the 4090 would scale to 13.7 ms on this card, and that’s not the only consideration: memory bandwidth, post-processing at output resolution, unknown clock speeds in the DLSS programming guide table - pick your poison. 18 ms seems perfectly reasonable to me.

As far as pruning goes: you can’t use pruning in neural networks without sparse data structures. The only sparsity supported in the tensor cores is structured sparsity. If they’re using it in the Switch 2, they’re probably already using it in desktop DLSS. To me, it’s not a consideration.

No, that's not the only way; that's just the easy way, lol.

You can cut whole branches instead of setting parameters to 0.

Lossy pruning that involves cutting neurons sucks and takes a ton of time, to be clear, but it's very possible.
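A minimal sketch of the two approaches being contrasted, using PyTorch's pruning utilities (`l1_unstructured` zeroes individual weights; `ln_structured` removes whole output neurons):

```python
import torch
import torch.nn.utils.prune as prune

# "Setting parameters to 0": zero the 30% smallest weights individually.
# The layer shape is unchanged, so there's no speedup without sparse math.
layer_a = torch.nn.Linear(256, 128)
prune.l1_unstructured(layer_a, name="weight", amount=0.3)

# "Cutting whole branches": zero 30% of entire output neurons (rows) by L2
# norm; those rows can then genuinely be removed from the network.
layer_b = torch.nn.Linear(256, 128)
prune.ln_structured(layer_b, name="weight", amount=0.3, n=2, dim=0)

print(layer_a.weight.count_nonzero(), layer_b.weight.count_nonzero())
```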
 
I've never seen anyone say that 4K 60 fps would be anything close to a standard for current-gen games/ports, so I don't think the DF video really disproved any expectations here (tenuous PC comparison aside). 1440p 30 fps versions of games will still be very impressive (and some heavier titles will likely dip lower). Instead I'm seeing people go too far in the other direction and conclude that there won't be 4K games at all, which is just wrong.

As for 480p to 4K DLSS, I don't know if it's actually viable quality-wise, and whether the native render savings are enough to be worth it, but you can certainly do it. You can use DLSS to go from 60p to 4K if you want to.
Are there any good examples of how the end result would look if you go from very low resolution to 4K? Like tests on YouTube or such.
 
Reddit is such a bubble; you can't take anything seriously there without checking the sources in at least one other place.

Anyway, in 1999 I turned 4, and I think that's when my older cousin gave me his old Game Boy. It stopped working two weeks after my birthday. It was my personal Y2K problem.
Oh, we're the same age.
 
Rich went straight for the big boys to make that poor chip struggle, but I would have loved to see the results with some earlier, less taxing PS4 games like RE7 or Doom 2016.
 
Please read this staff post before posting.

Furthermore, according to this follow-up post, all off-topic chat will be moderated.