
StarTopic Future Nintendo Hardware & Technology Speculation & Discussion |ST| (Read the staff posts before commenting!)

Right now, none of this changes my thoughts on what Ampere can do in the Switch, but it makes some of the decisions click for me in a new way. If nothing else, the emphasis on parallel operation shows how much dedicated ports might be able to squeeze out extra performance by overlapping RT/upscaling/rendering tasks in ways that PC games don't.
First off, thank you for the lovely write-up. Very much enjoyed reading it. I have to say I was surprised by how much emphasis there was on parallel operation. I know the smart peeps like you and @Thraktor had kinda touched on that as a "what if" for clawing performance back, but with how much it's stressed here, it seems like a matter of "when" and not "if" parallel operation gets used.

Maybe that's just my optimism speaking, though. Regardless, it definitely adds an extra wrinkle to the whole porting situation. Thanks again!
 
Yeah - that's why something like 15ms of DLSS cost actually doesn't make 4K60 DLSS impossible on Drake. Upscaling of one frame can happen while the next frame is being rendered, without the two interfering with each other.
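To put rough numbers on that, here's a minimal sketch of the idea using two CUDA streams. This is purely my own illustration on generic CUDA, not anything from Nvidia's or Nintendo's actual pipeline, and renderFrame/upscaleFrame are hypothetical stand-ins for the raster work and the DLSS-style pass.

```cuda
#include <cuda_runtime.h>

// Hypothetical stand-ins: one kernel plays the role of rasterising frame N+1,
// the other plays the role of the DLSS-style upscale of frame N.
__global__ void renderFrame(float* target, int frame) { /* raster work would go here */ }
__global__ void upscaleFrame(const float* history, float* output, int frame) { /* tensor-style work */ }

int main() {
    cudaStream_t renderStream, upscaleStream;
    cudaStreamCreate(&renderStream);
    cudaStreamCreate(&upscaleStream);

    float *target, *history, *output;
    cudaMalloc(&target, 4u << 20);
    cudaMalloc(&history, 4u << 20);
    cudaMalloc(&output, 4u << 20);

    for (int frame = 0; frame < 600; ++frame) {
        // Frame N+1 rasterises on one stream...
        renderFrame<<<256, 256, 0, renderStream>>>(target, frame + 1);
        // ...while frame N is upscaled on the other. Each stage only needs to
        // finish inside its own ~16.7 ms window; a ~15 ms upscale then costs a
        // frame of latency, not throughput.
        upscaleFrame<<<256, 256, 0, upscaleStream>>>(history, output, frame);
        cudaDeviceSynchronize();  // stand-in for "present" at the end of the frame
    }

    cudaFree(target); cudaFree(history); cudaFree(output);
    return 0;
}
```

The point is just that when the stages overlap, each one only has to fit its own ~16.7ms budget, so a ~15ms upscale costs you a frame of latency rather than dropping you below 60fps.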

I've been wondering about this - while I'm really excited to see if it gets used, is there any possibility that it could have issues in a power-constrained device like the Switch 2? As in, using all 3 types of cores at once could draw more power than using tensor cores separately from shader and RT cores, so the whole chip needs to run slower? Or is that not how it works?
 
My question about DLSS concurrency is: Wouldn't this eat up a shit ton of cache if you're running a neural network on various buffers from frame N while also trying to rasterize frame N+1?

And it's not like mobile hardware likes cache misses at all; every miss means a trip out to mobile RAM, which is very costly.

Maybe deferred frames would work better if you threw in a shit ton of 3D cache, but NVIDIA isn't doing 3D cache yet, and 3D cache is currently too power-hungry for mobile hardware.

DLSS concurrency would also probably stop any other uses of the tensor cores (neural radiance caching, for example) from being realistically possible from a timing perspective. While the listed examples probably aren't viable on the Switch 2 in general, other smaller neural networks could be developed by NVIDIA for other purposes. I expect VRAM compression and decompression to eventually be handled by the tensor cores, which would be very useful for mobile hardware and not really possible alongside DLSS concurrency.
 
I've been wondering about this - while I'm really excited to see if it gets used, is there any possibility that it could have issues in a power-constrained device like the Switch 2? As in, using all 3 types of cores at once could draw more power than using tensor cores separately from shader and RT cores, so the whole chip needs to run slower? Or is that not how it works?

I think the tensor cores plus RT cores only make up ~8-9% of the GPU on NVIDIA's current architecture, so running them alongside everything else would probably only add something like 10% to the GPU's power consumption, maybe 2-3% extra for the whole system.
 
Nintendo would reuse the same power brick that the Switch has if they can. Making something as unorthodox as a round power brick is dumb for a myriad of reasons.

I see no point in humoring cold readings. Low-tier engagement bait.
 
Nintendo would reuse the same power brick that the Switch has if they can. Making something as unorthodox as a round power brick is dumb for a myriad of reasons.

I see no point in humoring cold readings. Low-tier engagement bait.
I know I'm NOT the silicon person, but from what I understand of T239, assuming it's on a reasonable node, the power brick wouldn't NEED to be more powerful either. Maybe it would need to be changed to comply with EU regulations, but I'm under the impression it already complies well enough.

This is a power-limited device because it needs to be held comfortably and have a battery life measured in hours rather than minutes, and that translates to running cool enough and being small enough. Even when it's in TV mode, it doesn't change shape.

It's also a device where price optimisation is important, and we have reason to believe Nintendo is very concerned with that, given the likely screen technology. If the device doesn't need more juice, and I very much doubt it would, and the charger is compliant in the relevant markets, keeping it the same is a pretty clear cost-cutting measure. Especially with the Nintendo Switch being sold alongside the device, you benefit from economies of scale and production maturity. Frankly, electricity hasn't changed since 2017, and it's unlikely they could find enough savings in the brick itself to justify a redesign.

Furthermore, the existing brick has a rounded design. Heheh.
 



Hang in there, friends. I know we've taken a bad beat this year, but we've gotta hold out hope that we're in the timeline that sees the Switch 2 become the best and most successful it can be. Now is a great time to explore franchises you haven't tried before, destroy your backlog, or even revisit some old classics that gave you joy in the past.
Hey, welcome back. Now that March and probably April are truly a bust, any whispers from your shareholder buddies?
 
* Hidden text: cannot be quoted. *
I am glad no one is falling for this.
I'd probably go as far as saying that if we don't hear anything in time for the meeting, or at the meeting, then this thing isn't launching by March 2025.
Can you imagine the news if it doesn't come out until the fall or holiday season?

Well, truth be told, no one knows how long the delay is. Just that it's in 2025.
 
I've been wondering about this - while I'm really excited to see if it gets used, is there any possibility that it could have issues in a power-constrained device like the Switch 2? As in, using all 3 types of cores at once could draw more power than using tensor cores separately from shader and RT cores, so the whole chip needs to run slower? Or is that not how it works?
Running all the cores at once eats up additional power, absolutely. But that doesn't mean running the cores faster-but-in-serial is better for power management. Having cores sit idle also eats a small amount of power just to keep them on.

I would bet that running them on top of each other is the best option for frames per watt, especially if it keeps you at a slightly lower clock speed.
 
Can we just not entertain this SamusHunter wannabe? That was already vivid enough, thank you. Is it really necessary to keep the thread alive?
 



Hang in there, friends. I know we've taken a bad beat this year, but we've gotta hold out hope that we're in the timeline that sees the Switch 2 become the best and most successful it can be. Now is a great time to explore franchises you haven't tried before, destroy your backlog, or even revisit some old classics that gave you joy in the past.
Hey, welcome back.

We've got 2 weeks until the big day, so let's just take our time. Also, play Shadows of Doubt; that game is shockingly great for how janky it is.
 
The Switch 2 will have RTX and DLSS, and there will be Pokémon, Mario, Zelda, Kirby, Metroid, Sonic, and 3rd party support from Ubisoft.
 
First off, thank you for the lovely write-up. Very much enjoyed reading it. I have to say I was surprised by how much emphasis there was on parallel operation. I know the smart peeps like you and @Thraktor had kinda touched on that as a "what if" for clawing performance back, but with how much it's stressed here, it seems like a matter of "when" and not "if" parallel operation gets used.

Maybe that's just my optimism speaking, though. Regardless, it definitely adds an extra wrinkle to the whole porting situation. Thanks again!

One thing worth keeping in mind is that concurrency is something which developers only have indirect control over. Whether an FP32 op and a tensor core op execute concurrently within an SM partition is decided by the dispatch unit, and is going to be determined by a lot of low-level architectural details, like the number of cycles each instruction takes to run. The positive side of this is that, even if a developer ignored (or wasn't even aware of) concurrency and just threw a bunch of shader, tensor core and RT code at the GPU, they would likely get some benefit of concurrency automatically.

From a console developer's point of view, you'd be looking to maximise opportunities for concurrency. This is one of those cases where optimising around a single hardware target really helps. At a simple, high level, you'd want to make sure that your (say) lighting pass and your DLSS pass can actually be issued to the same SM at the same time. Each SM has a limited amount of register space and a limited number of total warps/threads it can support at one time. If your lighting pass thread block and the DLSS thread block have a combined register usage that exceeds what's available in an SM, or if they have more combined threads than an SM can support, then they'll never be issued to the same SM at the same time, so you'll get no concurrency.
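For anyone who wants to poke at that register/thread budgeting on ordinary hardware, the CUDA runtime will tell you how many blocks of a given kernel can be resident on one SM. This is just a generic occupancy query with hypothetical kernels (lightingPass, dlssLikePass), not console tooling, but the constraint it reports is the one described above: if each pass alone already fills the SM's registers or thread slots, the two can never be resident together, and there's nothing for the dispatch unit to interleave.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernels standing in for a lighting pass and a DLSS-like pass.
__global__ void lightingPass(float* rt) { /* heavy FP32 work would live here */ }
__global__ void dlssLikePass(const float* in, float* out) { /* tensor-style work */ }

int main() {
    const int blockSize = 256;  // threads per block, an assumed tuning choice
    int lightingBlocksPerSM = 0, upscaleBlocksPerSM = 0;

    // Ask the runtime how many resident blocks of each kernel a single SM can
    // hold, given their register and shared-memory usage at this block size.
    cudaOccupancyMaxActiveBlocksPerMultiprocessor(&lightingBlocksPerSM, lightingPass, blockSize, 0);
    cudaOccupancyMaxActiveBlocksPerMultiprocessor(&upscaleBlocksPerSM, dlssLikePass, blockSize, 0);

    printf("lighting blocks/SM: %d, upscale blocks/SM: %d\n",
           lightingBlocksPerSM, upscaleBlocksPerSM);

    // If either kernel alone already exhausts the SM's registers or thread
    // slots, the two can never be resident on the same SM at the same time,
    // so there is nothing for the dispatch unit to interleave.
    return 0;
}
```

On a console you'd be doing the equivalent budgeting with the vendor's profiler rather than a query like this, but the register file and thread limits per SM are the same kind of constraint.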

Of course managing thread block sizes/register use to ensure you're getting good utilisation of the GPU isn't new for game developers, but these limits can change from architecture to architecture, even from a single manufacturer, so having just one architecture to optimise for simplifies things a lot. I'm sure there's much more low-level optimisation to be done from that point, driven by profiling, but Nvidia doesn't publicly document most of the lower-level architectural details which come into play here, so we can only guess at how easy it would be to achieve high levels of concurrency across the different execution units. I would assume that tensor core operations require many cycles to complete, and therefore are well-suited to concurrent execution, but that's impossible to say without access to the documentation.

Another factor is memory bandwidth, cache usage, etc., as your regular graphics code and DLSS will be competing against each other for these resources. You could easily have a situation where the tensor core is working on an instruction for DLSS, and the dispatch unit has the opportunity to issue an FP32 instruction which would execute in parallel to it, but can't, because the warp in question is stalled waiting for data from RAM, because previous DLSS memory accesses pushed that data out of the cache. This is another case where optimising around a single architecture helps, but it's still going to be a limiting factor in terms of how much performance benefit you're going to get out of concurrency.
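On the cache contention point, desktop Ampere does at least expose a knob for this through CUDA: a stream can mark a buffer range as "persisting" in L2 so that other traffic is less likely to evict it. Whether Drake's actual graphics API exposes anything similar is pure speculation on my part; this sketch just shows the mechanism as it exists on PC, with a hypothetical gbuffer standing in for data the lighting pass keeps re-reading.

```cuda
#include <cuda_runtime.h>

int main() {
    int device = 0;
    cudaSetDevice(device);

    // How much L2 this device allows to be set aside for "persisting" accesses
    // (an Ampere-and-newer feature on desktop parts).
    int maxPersistingL2 = 0;
    cudaDeviceGetAttribute(&maxPersistingL2, cudaDevAttrMaxPersistingL2CacheSize, device);
    cudaDeviceSetLimit(cudaLimitPersistingL2CacheSize, maxPersistingL2);

    // Hypothetical buffer the lighting pass keeps re-reading.
    float* gbuffer = nullptr;
    size_t gbufferBytes = 8u << 20;
    cudaMalloc(&gbuffer, gbufferBytes);

    cudaStream_t renderStream;
    cudaStreamCreate(&renderStream);

    // Mark that range as persisting for work issued on renderStream, so that
    // streaming traffic (e.g. a DLSS-like pass) is less likely to evict it.
    cudaStreamAttrValue attr = {};
    attr.accessPolicyWindow.base_ptr = gbuffer;
    attr.accessPolicyWindow.num_bytes = gbufferBytes;
    attr.accessPolicyWindow.hitRatio = 0.6f;  // fraction of the window treated as persisting
    attr.accessPolicyWindow.hitProp = cudaAccessPropertyPersisting;
    attr.accessPolicyWindow.missProp = cudaAccessPropertyStreaming;
    cudaStreamSetAttribute(renderStream, cudaStreamAttributeAccessPolicyWindow, &attr);

    // ...launch the lighting-pass kernels on renderStream here...

    cudaStreamDestroy(renderStream);
    cudaFree(gbuffer);
    return 0;
}
```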
 
No, they'll prioritize making Animal Crossing 6 run at 60fps.
I get that. But don't you think a next-gen Animal Crossing would have enough resources to do 4K60? I mean, we aren't in Hyrule.
Animal Crossing is not a franchise that demands a lot from a console; it will be easy to do 4K60 on Animal Crossing 6.
It wasn't 1080p60 on Switch. So I lack confidence both that they'll choose to max out the display next time and that they'll prioritize frame rate over resolution. It's a rare Switch game with a worse frame rate than its Wii predecessor.
Remember, Animal Crossing for the GameCube was literally a port of the N64 game with, as far as I know, no graphical changes whatsoever!
Well, higher resolution and frame rate.
 
Animal Crossing on Switch has loads of visual effects and is overall pretty high fidelity; I really don't understand assessments to the contrary.
 
It wasn't 1080p60 on Switch. So I lack confidence both that they'll choose to max out the display next time and that they'll prioritize frame rate over resolution. It's a rare Switch game with a worse frame rate than its Wii predecessor.
Someone here said it could be because of the object permanence.
You're talking about the Switch, right?
No, I am talking about the Switch 2. I am using the same tactic ninspider is doing: repeating known info to get whatever bait he wants out of this place.
 
You could have said pretty much anything about a Nintendo Switch Pro/2 online in the past four years except a release date.
 

