ItWasMeantToBe19
Manakete
I mean, at least this is a newish gimmick?
Either they could make a custom chip with 32-bit support, build in a 32-bit chip solely for BC, or create a translation layer.
> Either they could make a custom chip with 32-bit support, build in a 32-bit chip solely for BC, or create a translation layer.
Assuming Nintendo wants to support BC that far back, I would imagine a translation layer would be enough. They'd already have it for the GPU side of things, so why not the CPU? This is Switch 3 trying to run Switch games, so there would likely be plenty of power to do so.
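To illustrate what "a translation layer" means in the simplest possible terms: a host-side loop that executes guest instructions with native code, masking results to 32 bits. This is a toy sketch with an invented mini-ISA; real dynamic binary translators (and whatever Nintendo would actually ship) are vastly more sophisticated.

```python
# Toy sketch of a translation layer: a dispatch loop that executes "guest"
# 32-bit-style instructions using host (64-bit) arithmetic. The mini-ISA
# here is invented purely for illustration.

def run_guest(program, regs):
    """Interpret a list of (op, dst, a, b) tuples over 32-bit registers."""
    MASK32 = 0xFFFFFFFF  # guest registers wrap at 32 bits even on a 64-bit host
    for op, dst, a, b in program:
        if op == "add":
            regs[dst] = (regs[a] + regs[b]) & MASK32
        elif op == "sub":
            regs[dst] = (regs[a] - regs[b]) & MASK32
        elif op == "mov":
            regs[dst] = b & MASK32  # b is an immediate here
        else:
            raise ValueError(f"unhandled guest op: {op}")
    return regs

regs = run_guest(
    [("mov", "r0", None, 0xFFFFFFFF), ("mov", "r1", None, 1),
     ("add", "r0", "r0", "r1")],  # 0xFFFFFFFF + 1 wraps to 0
    {"r0": 0, "r1": 0},
)
print(regs["r0"])  # -> 0
```

The key point is the masking: 32-bit overflow semantics have to be preserved on the 64-bit host, which is exactly the kind of detail a translation layer handles so old binaries behave identically.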
> This is fantastic, thank you! Nvidia seems to be pushing the FP4 support specifically for generative LLMs, so perhaps there is some super narrow use case there? Regardless, they're only talking about it in relation to Blackwell, and right now all their public Blackwell info is about the datacenter chips.
> It'll be interesting to see if Nvidia continues to segment out some of these features for enterprise customers, where the margins are still high, or lets them flow down into consumer products.
Happy to oblige!
Magnet buttons
> Fucking magnets, how do they work?
Ask Ninspider: he knows.
> Although I don't really see LPDDR5X-10700 being adopted by Nintendo any time soon, especially since I doubt JEDEC is going to formally approve LPDDR5X-10700, I still find this fascinating.
> Samsung Develops Industry’s Fastest 10.7Gbps LPDDR5X DRAM, Optimized for AI Applications - Industry-leading features come with 25% higher performance, 30% more capacity and 25% higher power efficiency. The new LPDDR5X is the optimal solution for future on-device applications and is expected to expand adoption into PCs, accelerators, servers and automobiles. (news.samsung.com)
Micron got 9.6 at least, but that would be too late for Switch 2.
> Although I don't really see LPDDR5X-10700 being adopted by Nintendo any time soon, especially since I doubt JEDEC is going to formally approve LPDDR5X-10700, I still find this fascinating.
> Samsung Develops Industry’s Fastest 10.7Gbps LPDDR5X DRAM, Optimized for AI Applications (news.samsung.com)
That's kind of nuts, almost twice the speed the Steam Deck had.
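The "almost twice" figure checks out with rough peak-bandwidth math. This assumes a 128-bit (4 x 32-bit) memory bus for both the Steam Deck and a hypothetical LPDDR5X-10700 device, which is my assumption for the comparison; real sustained bandwidth is below these theoretical peaks.

```python
# Rough peak-bandwidth math behind the "almost twice the Steam Deck" comparison.
# Assumes a 128-bit bus on both sides; the Steam Deck uses LPDDR5-5500.

def peak_bw_gbs(mtps, bus_bits):
    """Peak bandwidth in GB/s from per-pin transfer rate (MT/s) and bus width."""
    return mtps * bus_bits / 8 / 1000

deck = peak_bw_gbs(5500, 128)    # Steam Deck: LPDDR5-5500 -> 88.0 GB/s
sam  = peak_bw_gbs(10700, 128)   # Samsung's LPDDR5X-10700 -> 171.2 GB/s
print(deck, sam, sam / deck)     # ratio is about 1.95x
```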
> micron got 9.6 at least. But that would be too late for Switch 2.
Anyway, Samsung's version will be mass produced in the latter half of 2024 and is said to give 25% power savings. Perhaps it's possible for a Switch 2 revision, especially if launch Switches have LPDDR5(X) from Samsung.
> Happy to oblige!
The main thing is that we are still learning how NNs (especially the large ones) really work. I have always been in the camp that continuity (aka high-precision floats) was an artifact of the methods we were using for training neural networks, and that there was no reason purely binary neurons would not work. This has been a bit of a fringe position for a long time; after all, the term "differentiable programming" has been used for deep learning. But it seems that we will be proven right in the end. I mean, 4-bit is already much closer to discrete maths than continuous maths.
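On the binary-neurons point: recent work has pushed weights down to ternary values {-1, 0, 1}, i.e. about log2(3) ≈ 1.58 bits per weight. Here is a minimal absmean-style ternary quantizer in the spirit of the BitNet b1.58 approach; the details are simplified from my reading of that work, not an exact reproduction.

```python
import numpy as np

# Minimal ternary ("1.58-bit") weight quantizer: scale by the tensor's mean
# absolute value, round, and clip to {-1, 0, 1}. Simplified from the BitNet
# b1.58 idea; the real method also handles activations and training.

def ternary_quantize(w):
    scale = np.mean(np.abs(w))                       # absmean scale
    return np.clip(np.round(w / scale), -1, 1), scale

w = np.array([0.9, -0.05, 0.4, -1.2])
q, s = ternary_quantize(w)
print(q)  # every weight snapped to -1, 0, or +1
```

Matrix multiplies with such weights reduce to additions and sign flips, which is part of why discrete representations are attractive for inference hardware.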
Low precision is definitely valid for some applications; I know someone posted that 1.58-bit LLM paper in here at some point. I’m just skeptical of FP4 specifically because, on the surface level, it doesn’t seem like it offers any range advantage over INT4, which is the primary reason to use floating points. It will come down to how effective the scaling is; they call it “microtensor” scaling in the brief they’ve released. If I had to guess what that is, I suppose they are probably breaking up larger tensors into a bunch of smaller ones and individually scaling each of those tensors to fit in the FP4 dynamic range. But as far as I know, there’s no further public information yet.
Until Nvidia actually proves with independently verifiable data that the quality of standard architectures trained on or operating in FP4 exceeds INT4 or is comparable to FP8/FP16, I’m treating it all as marketing speak. And either way, I don’t think low precision will work well for DLSS, unfortunately.
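The "no range advantage" point is easy to see by just enumerating the formats. This assumes the common E2M1 layout (1 sign, 2 exponent, 1 mantissa bit, bias 1) used in the OCP microscaling spec; other FP4 layouts would change the exact values.

```python
# Enumerate FP4 (E2M1 layout, assumed) to compare its range against INT4.

def fp4_e2m1_values():
    vals = set()
    for sign in (1.0, -1.0):
        for exp in range(4):              # 2 exponent bits
            for man in (0, 1):            # 1 mantissa bit
                if exp == 0:              # subnormals (bias 1): 0 and 0.5
                    v = man * 0.5
                else:
                    v = (1 + man / 2) * 2.0 ** (exp - 1)
                vals.add(sign * v)
    return sorted(vals)

print(fp4_e2m1_values())
# Positive values: 0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0 -- max magnitude 6,
# versus INT4's -8..7. So FP4 alone buys no real range; any win has to come
# from the non-uniform spacing combined with the block scaling.
```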
EDIT: I actually found what I believe is the specification, with a lot of info! I’ll write it up sometime this week. Short version: FP4 does have some major caveats and often has significantly reduced quality. “Microtensor scaling,” it turns out, indeed involves taking one of the dimensions of your tensor (for example, the column of a matrix) and normalizing all the elements along that axis to the maximum value, instead of normalizing all the elements in the tensor to the global maximum. Anyway, more to come!
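The per-axis normalization described above can be contrasted with a single global scale in a few lines. This is my sketch of the idea, not Nvidia's implementation; actual FP4 rounding is omitted for clarity, and the block shapes Nvidia uses may differ.

```python
import numpy as np

# Contrast one global scale for a whole tensor against per-column scales,
# as in the "microtensor scaling" idea described above (sketch only).

FP4_MAX = 6.0  # max magnitude of FP4 E2M1 (assumed layout)

def rescale(x, scale):
    # scale and clamp into the FP4 range (actual FP4 rounding omitted)
    return np.clip(x / scale, -FP4_MAX, FP4_MAX)

W = np.array([[100.0, 0.01],
              [ 80.0, 0.02]])

global_scaled = rescale(W, np.abs(W).max() / FP4_MAX)
percol_scaled = rescale(W, np.abs(W).max(axis=0, keepdims=True) / FP4_MAX)

# With one global scale, the tiny second column collapses toward zero and
# would round to 0 in FP4; per-column scales spread each column across the
# representable range.
print(global_scaled[:, 1])  # ~[0.0006, 0.0012]
print(percol_scaled[:, 1])  # ~[3.0, 6.0]
```

The cost of the per-axis scheme is storing one scale factor per column (or per block) instead of one per tensor, which is presumably why it is marketed as a distinct hardware feature.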
> Well Nvidia just released the RTX A400, and that thing has 6 SMs (?!), and is rated for 2.7 TFLOPS of FP32 perf. Would be nice for someone to test gaming performance of that thing as a proxy for Switch 2 performance.
Not sure if it adds anything beyond the RTX 2050M comparison, especially considering the VRAM is also limited to 4GB, as in the RTX 2050M. A downclocked A1000, on the other hand, might give some additional insight into docked performance with its 8GB of VRAM (with the necessary caveats), but then again the faster RAM bandwidth would still obfuscate things. No such thing as a perfect point of comparison, unfortunately.
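The 2.7 TFLOPS rating for 6 SMs is easy to sanity-check. This assumes the usual Ampere-style counting of 128 FP32 lanes per SM and 2 ops per clock (one fused multiply-add); the lane count is my assumption, not a spec quote.

```python
# Back-of-envelope check on the RTX A400's rated 2.7 TFLOPS of FP32.

def fp32_tflops(sms, clock_ghz, lanes_per_sm=128, ops_per_clock=2):
    return sms * lanes_per_sm * ops_per_clock * clock_ghz / 1000

# Boost clock implied by 6 SMs at 2.7 TFLOPS:
implied_clock = 2.7 * 1000 / (6 * 128 * 2)
print(round(implied_clock, 2))        # -> 1.76 (GHz), a plausible boost clock
print(fp32_tflops(6, implied_clock))  # recovers ~2.7 TFLOPS
```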
> Anyway, Samsung's version will be mass produced in the latter half of 2024 and is said to give 25% power savings. Perhaps it's possible for a Switch 2 revision, especially if launch Switches have LPDDR5(X) from Samsung.
Samsung mentioned that LPDDR6 is coming in 2026 at the earliest.
> Outside of RAM and architecture, is there anything coming in the near (<5 years) future that would really justify a Switch 2 Pro?
Not much I can see that would justify a Switch 2 Pro, other than maybe miniaturization, increased internal storage, a better display type, and increased clock speeds (which you already mentioned).
The same cores/clocks/node but with faster RAM and a more advanced architecture would be kind of weird for a revision. Not sure how easy it would be to program for both versions.
> We currently have neural networks for
> 1. Upscaling and anti-aliasing
> 2. Frame generation
> 3. Ray tracing denoising
> 4. Ray tracing caching
> And will probably have neural networks in the future for
> 5. VRAM compression/decompression
> 6. Temporal ghosting cleanup
> I believe there are also some theoretical papers on using neural networks to generate simulated dynamic global illumination, but those don't seem likely to continue, as we'll just have full RTGI for every game that wants it within 5-10 years.
> But we'll see how many gaming functions we can move to neural networks that can be sped up with tensor cores, and thus justify flooding a chip with them... We already have a decent number.
Given that Nvidia is finally dipping its toes into multi-die with Blackwell, maybe we could see separate dies for the shaders and tensor cores. I'm not sure if the latency hit would be too much for gaming, but it could be a good way to greatly increase neural performance without sacrificing shaders in the limited die space.
> Does this mean that Switch 1 BC for Switch 3 is out of the question? Or would they be able to get around this issue somehow?
Just to slightly expand on the other answers: yeah, translation/emulation is potentially the answer. With generational leaps getting smaller, the Switch 1 -> Switch 2 jump probably doesn't leave quite enough performance headroom to get there consistently.
I imagine they could be used to provide a smoother and more elegant-feeling latch. Just enough that it sort of pulls itself into place before clicking.
> Nintendo Switch already contains many magnets; the meaninglessness of all this is impressive.
Where exactly, btw? The coils? Or the Joy-Con metal clips?
> Where exactly?
Speakers, HD rumble motors, and the fan motor, at least. The problem with more magnets is also that you could interfere with these components and the motion-control elements, especially if they're strong enough to hold the controller to the console.
> wtf did I just read in this thread. Feel absolutely matters, yes... to a device you'll actually be handling. A controller feeling cheap/light would be bad. A console that sits under your TV, never to be touched? As long as the build quality isn't complete shit, no one's gonna care. Nintendo is a known quantity; ain't nobody out there dismissing them because their console is lighter. The size of the Switch's contemporaries is one of the sore sticking points that people dunk on them for. Go back even just a few generations and consoles were a fraction of the weight.
I don't see how the weight of a portable has anything to do with its assumed cost.