Alovon11
Hehehehehehhehehehehehehehehhehehehehehehehehehehehehehehehehehehehehehehehhehehehehehehehehehehehehe
Hopefully a very big one. The bigger the better for more bandwidth.
Someone quoted it at Nikki on twitter, who seemed to confirm. But I assumed that was from someone with access to a clone of the repo and could see file modification dates (all I've seen is a repo export, which has dates of exfiltration, all on the 21st, 2 days before Nvidia found out about the leak, and a week before the rest of us).
2019 modification dates would actually match with a 2020 ship date for dev tools (matching with leaks of dev kits) if NVN2 was branched from NVN about a year beforehand.
The repo export is all there is. There are files with copyright dates of 2012 (at least) through 2022. NVN2 files aren't easily placed because some started as NVN1 files, so they go as far back as 2014/2015. 2019 may or may not be the origin date for some of the files, but it's not a universal constant. Throwing it around just causes more confusion than it solves.
Wasn't it that there are dates from 2019? Not that it started in 2019?
Getting a little ahead of yourself there. DLSS is not a magic bullet, I'd really caution against treating it like a simple multiplier. Especially for handheld mode.
Well, we know the GPU config now outside of clock speeds, and we at least know which CPU cores are used, so I say:
Portable: Series S performance at 720p after DLSS
Docked: Rivaling the PS5 after DLSS?
Hopefully Big Dick Switch can still fit in the original dock.
Hopefully a very big one. The bigger the better for more bandwidth.
Getting a little ahead of yourself there. DLSS is not a magic bullet, I'd really caution treating it like a simple multiplier. Especially for handheld mode.
All we can really know at the moment is that the API (in its current form) sees 12SMs which gives us minimum and maximum bounds to work with.
But the info we do have now… regardless of how/if the tensor cores and RT cores are utilized… definitely implies native power that would be similar to PS4 power portable and PS4 Pro power docked.
No?
Yeah when native.
So even if you say DLSS isn't a 2x multiplier in portable mode (it should be at least a 2x boost when docked), it should boost portable mode performance by at least 1.5x, putting it a tad behind the PS4 Pro/Series S GPUs.
Assuming it uses the same or higher clocks than the original Switch I think that's a fair comparison. Flops-wise it won't match up quite the same, but efficiency gains from the newer architecture should make up for that.
But the info we do have now… regardless of how/if the tensor cores and RT cores are utilized… definitely implies native power that would be similar to PS4 power portable and PS4 Pro power docked.
No?
Getting a little ahead of yourself there. DLSS is not a magic bullet, I'd really caution treating it like a simple multiplier. Especially for handheld mode.
All we can really know at the moment is that the API (in its current form) sees 12SMs which gives us minimum and maximum bounds to work with.
Misintended reply?
Ah so you are saying he's overselling the DLSS comparisons, got it.
But the info we do have now…regardless of how/if the tensor cores and rt cores are utilized…definitely implies native power that would be similar to ps4 power portable and ps4 pro power docked.
No?
Probably more accurate to say that there were no rumors, it was informed speculation from folks in these threads based on what was going on with the Tegra line, and tidbits of chip leaks from Nvidia leakers. That speculation always came from the perspective that Nintendo would be somewhat conservative, and would stick to a similar power profile and size as the original Switch. What's come out from this Nvidia hack has changed that mindset since the chip appears to have significantly more cores than what people were thinking it would have.
Really? That sounds insane! The early rumors were that it's around Xbone in terms of power. Now it's above PS4!?!
How will this perform?
It’s hard to say. We don’t have all the details and even if we did there are software decisions that matter
But how will it perform though?
We still need to see software running to be sure - if devs don't make exclusives that can really take advantage of the power of the new machine, it could still be held back by the old Switch
GRAPHICS GRAPHICS GRAPHICS HOW WILL IT DO??????
Okay okay. How is this?
While docked and when running games that have been optimized for its power it will feel like maybe the least performant member of the current generation - instead of feeling like an older generation machine that happens to play Mario and fit in your pocket.
This is Nintendo closing - but not eliminating - the gap between it and its competitors that has existed since the Wii.
Yeah, even though DLSS when docked will shoot past the Series S, it will fall behind the PS5 still.
At best, assuming a developer maximizes the benefits of the Tensor Cores, DLSS, and all that, and draws every ounce of power out of Drake, it could likely match, or even shoot a little bit above, the PS5 (assuming the PS5 is running at native performance levels here),
but it would require a hecklot of effort, and it sort of exposes the big thing that makes Drake hard to predict after DLSS, beyond a 2x multiplier on average when docked: it's all software optimization.
What a dev does to take advantage of DLSS when given the environment to actually optimize around it is an unknown quantity to us, as all we have are the average unoptimized DLSS numbers from PC, which is where the 2x average comes from.
But something else to consider is the steep performance cost of increasing resolution, which DLSS dodges but PS5/Series S|X have to worry about for the most part, unless you are using UE5's TSR.
Misintended reply?
I am saying that assuming a 2x boost in both it would reach that level.
But even if portable mode DLSS can't boost as high in effective performance it still would not be weak in any way for Portable mode.
Watch Nintendo match 76.8 % of that. 2.3 TFLOPs "Please understand!"
Theoretically the maximum is about 3TF, if we assume the hard upper limit of Orin GPU clocks (1GHz) applies to Drake.
25 watts or lower makes sense since Orin NX's highest profile is that much. Of course that's for the SoC, and Nintendo and Nvidia can disable the camera stuff and other irrelevant machine stuff. Would be interesting if Nintendo goes with 10 watts for handheld mode.
Ah, I didn't know that, thanks. I doubt Nintendo would have quite as much flexibility with a smaller GPU and tighter power limits, but still very interesting nonetheless.
It's hard to say. I don't think compression has much effect at this point, Nvidia already had framebuffer compression technology back with Maxwell, and there's only so far you can go with compression. The bigger question is probably how much of a difference the bigger cache makes. All of Nvidia's recent GPU architectures (since Maxwell, I believe) use tile-based renderers, where the idea is that the tile being rendered is stored in cache, and therefore the most intensive memory accesses are kept to the cache, without hitting actual memory. However, I'm not really in a position to speculate on how much of an impact the larger cache would have. Some, certainly, but it's impossible to say how much without careful profiling of Ampere's memory access patterns, which we don't have.
Thanks for this. I wasn't really sure on that part, so it's very interesting to read more on it.
If it is the case that the new model shares a dock with the OLED model, and if the ability of the OLED dock to deliver 39W is based on supporting the new model (both reasonably big ifs), then I would assume that the 39W is to cover the maximal use-case of both operating at full power and fully charging the battery at the same time. So probably something like 25W of actual power draw plus around 14W for charging. Still, 25W isn't a small amount for a device like the Switch. Steam Deck has a 25W maximum power draw in a slightly thicker case and by all accounts the fan on it is pretty loud, so if Nintendo are hitting that kind of power draw I hope they've got a quiet fan solution sorted out.
I will note that the 8nm that Orin (and therefore likely Drake) uses is different from the 8nm that the consumer Ampere cards use.
Watch Nintendo match 76.8 % of that. 2.3 TFLOPs "Please understand!"
5-6 TFLOPs for 39 watts for the whole system sounds way too good to be true, especially considering Steam Deck runs up to 30 watts for a 1.6 TFLOPs GPU max and maybe 3GHz for its 4-core CPU.
25 watts or lower makes sense since Orin NX's highest profile is that much. Of course that's for the SoC, and Nintendo and Nvidia can disable the camera stuff and other irrelevant machine stuff. Would be interesting if Nintendo goes with 10 watts for handheld mode.
Anyway, portable PS4 power using 10 watts for the whole system sounds too good to be true. There is absolutely no way we can get that on 8nm Samsung. Unless it's on something more efficient like 5nm, it's impossible.
Man, 12 SMs sounds too good to be true for Switch 2, but when even Thraktor is hyped about this and thinks it's plausible, plus a 192-bit bus bandwidth, and when people are more optimistic about a node more efficient and newer than 8nm Samsung, it's pretty crazy. Usually we have been on WUST hype since the Wii era days and end up with a "Please understand", like half the clock speed we hoped for from leakers (I remember when Emily said the Switch's TX1 would be close to Xbone, but we got the FP16 1 TFLOP count).
We'll see how it plays out and if hackers really do release clock speeds tomorrow. Gonna expect Steam Deck-like specs with 8 SMs for docked before DLSS is counted, just to not get myself disappointed. I hope I'm wrong though.
I look at it from a vague view of possible support, which I'm guessing will be Switch-like, but better. Seems like PS4/XB1 level stuff should be fine, while PS5/XBS ports might actually be reasonably plausible (rather than, say, the "impossible" Switch ports that took some herculean dev efforts).
Yeah, this is all very exciting, but people should keep in mind that with the last 3 Nintendo consoles we always got weaker hardware (or lower clocks) than was generally expected.
I'm not saying this time will be similar, but it's always better to stay cool and keep expectations low.
Yeah, this is all very exciting, but people should keep in mind that with the last 3 Nintendo consoles we always got weaker hardware (or lower clocks) than was generally expected based on rumors.
I'm not saying this time will be the same, but it's always better to stay cool and keep expectations low.
I think the reason it’s not discussed so much when compared to those systems is that in general Nvidia’s architectures have been more memory efficient than their AMD counterparts, thus they can perform about the same with less bandwidth. AMD made an effort to mitigate this bandwidth inefficiency of their architecture (and the lack of GDDR6X) by including a very large L3 cache pool that they call “Infinity Cache”, which increases the effective bandwidth throughput for their cards while also helping with energy efficiency.
When comparing with PS4, PS4 Pro and XSS hardware, people here mostly mention only the CPU and GPU,
without comparing memory bandwidth, which is also very important when we talk about getting the most out of hardware or about potential bottlenecks.
Based on current info, the new Switch hardware should have around 100GB/s of memory bandwidth; PS4 has 176GB/s, PS4 Pro has 218GB/s and XSS has 224GB/s,
so that should be taken into account as well.
Nintendo has talked about BC in an implicit way with their investors in financial briefings, so it’s likely to have BC here. Nintendo hasn’t broken compatibility with a previous system unless they absolutely had no choice. They’ve aimed to keep BC with at least the direct predecessor.
Yep
They tend to always be verrrrry conservatrice with clockspeeds.
But it should be a good machine it seems.
What about BC ?
The Switch has way better hardware than what was expected for a Nintendo handheld. People were expecting a Vita+. It's only weak in concern-trolling discourse, and for people refusing to accept what it is who go "but I have never taken my Switch out of the dock!" (which I would say is also concern trolling).
Yeah, this is all very exciting, but people should keep in mind that with the last 3 Nintendo consoles we always got weaker hardware (or lower clocks) than was generally expected based on rumors.
I'm not saying this time will be the same, but it's always better to stay cool and keep expectations low.
I'm now imagining a cockatrice that regularly watches Tucker Carlson.
They tend to always be verrrrry conservatrice with clockspeeds.
A lot of people in threads similar to this did expect more, but not a lot more.
The Switch has way better hardware than what was expected for a Nintendo handheld. People were expecting a Vita+. It's only weak in concern-trolling discourse, and for people refusing to accept what it is who go "but I have never taken my Switch out of the dock!" (which I would say is also concern trolling).
A 'clockatrice'
I'm now imagining a cockatrice that regularly watches Tucker Carlson.
I was curious to know if the Switch had customizations such as on-chip memory (bigger caches or VRAM), but I remember discussing with dark10x that even the off-the-shelf TX1 was a full ~8-10x ahead of Vita, a generation jump. And we got close to the best possible hardware later with Mariko, IMHO.
A lot of people in threads similar to this did expect more, but not a lot more.
It came as a surprise to many (not all) that Nintendo went 100% off-the-shelf X1 at 20nm. We did consider the possibility of Nintendo making customizations to the TX1 as a likely one.
Adding 1 more GB and running their own clocks doesn't count as customization.
Lazy devs, paired with "HW isn't as good as we thought LOL they cheaped out again". However, I think Nintendo will have at least one showcase title ready (may or may not be exclusive to the new device, idk).
I think we need to keep in mind that while DLSS is nice and all, 99% of games would probably not be optimized for it, nor for the new chip, at launch to take full advantage of this rumored hardware.
I already see the "lazy devs" takes on the horizon.
Yep
They tend to always be verrrrry conservative with clockspeeds.
But it should be a good machine it seems.
What about BC ?
Edit : damn phone !
The Switch has way better hardware than what was expected for a Nintendo handheld. People were expecting a Vita+. It's only weak in concern-trolling discourse, and for people refusing to accept what it is who go "but I have never taken my Switch out of the dock!" (which I would say is also concern trolling).
I think the reason it’s not discussed so much when compared to those systems, is that in general NVidia’s architectures have been more memory efficient than their AMD counterpart thus they can perform about the same with less bandwidth. AMD made an effort to mitigate this bandwidth inefficiency of their architecture (and lack of going to GDDR6X) by including a very large L3 cache pool that they call “Infinity Cache” that increases the effective bandwidth throughput for their cards while also helping with energy efficiency.
All in all, the raw memory bandwidth is an issue yes, but perhaps not so pronounced. It should be able to trade blows decently well enough.
This does not factor the CPU of course, I think most of us agree that it won’t be anywhere near the other three consoles.
I'm talking about general expectations based on credible rumors and leakers, not about troll comments (it will be a little stronger than Vita, will sell like Wii U...).
Months before the Switch reveal, based on rumors (on NeoGAF, and plenty of people from here were also there back then), most people expected performance similar to XB1.
For instance, I also remember when it was generally expected that the Switch ARM CPU would be clocked at 1.5GHz,
and then people were disappointed a few months later when it turned out to be 1GHz.
This is an unfair criticism, since we didn't know that the NX was going to be a handheld with a docked mode. The "at least XB1 performance" prediction was made in the context of Nintendo releasing a traditional console. If some people legitimately still thought that once we knew it was a Tegra-based handheld device, then they were indeed deluded.
That depends entirely on implementation. You could say the same thing about, say, RT cores, but in reality, engines like UE5 compile with hardware like that in mind to achieve the desired output, so I don't see DLSS implementations being much different in that respect.
I think we need to keep in mind that while DLSS is nice and all, 99% of games would probably not be optimized for it, nor for the new chip, at launch to take full advantage of this rumored hardware.
I already see the "lazy devs" takes on the horizon.
This would be amazing. But does that mean we would have more teraflops than the PS4?
Well, we know the GPU config now outside of clock speeds, and we at least know which CPU cores are used, so I say:
Portable: Series S performance at 720p after DLSS
Docked: Rivaling the PS5 after DLSS?
The first time we saw the clocks (for docked mode) was when a participant in those discussions (blu?) who had a rooted Shield TV ran benchmarks and reported the sustained clocks, which were exactly the docked clocks that DF reported a few days later from their insider contacts. So this is an argument from ignorance: we didn't know what the maximum sustainable clocks for the chip were, so we took Nvidia at its word on the advertised maximum and concluded that Nintendo was going to run those clocks in docked mode, because why wouldn't they?
That's not criticism, it's just a point that proves the fact that Nintendo hardware (including Switch) is generally weaker than people expected.
Maybe you can say that at that time we didn't know exactly what Switch was, but credible rumour sources and leakers had heard what potential hardware we were talking about, and like I wrote, 1.5GHz for the CPU (higher clocks were also expected for the GPU) was expected only a few months before Switch launched, when we already knew what NX was and that it would have a Tegra X1.
Theoretically it should be. None of us here know if the exact cooling requirements will change, but the fact that the API sees 12 SMs means that's what the chip has. It's essentially confirmed.
Hello. This is my first post on Famiboards, but I've been lurking in these threads (Wii U and Switch editions too) since way back.
I'm not very tech savvy, so I have a few questions regarding recent findings.
Would these power levels be possible in the current form factor?
It's highly likely this will reuse the OLED model's dock, so any increase in size will have to be small enough to allow it to fit there still. It can get wider by a few mm and thicker by a few mm but not much beyond that.
PS4-level portable mode is Steam Deck territory. Does that mean an increase in size to match it?
39W is the theoretical maximum for playing while also charging the battery and joycons. And possibly some reserved for the USB ports but I'm not clear on that. The wattage supplied to the unit should be spread out enough not to make it absurdly hot.
39W in docked mode would require a good cooling solution in the dock, but wouldn't it affect the ability to instantly pull the device out and play portably? It would have to be constantly cooled to human-tolerable levels while docked.
1.4 TFlops portable (same 460MHz clock as the current Switch in portable mode) and 2.3 TFlops docked (same 768MHz clock as the current Switch docked) would still be amazing bro. And it won't use 39 watts for the whole system docked. It will be way less. 39W is probably the worst-case scenario, where the Switch is being played while recharging.
Watch Nintendo match 76.8 % of that. 2.3 TFLOPs "Please understand!"
5-6 TFLOPs for 39 watts for the whole system sounds way too good to be true, especially considering Steam Deck runs up to 30 watts for a 1.6 TFLOPs GPU max and maybe 3GHz for its 4-core CPU.
The hackers said they would dump the Nvidia data they stole today if Nvidia didn't meet their demands. Nvidia didn't and we're waiting. Haven't seen anything so far. It's possible that they won't release anything and it was just a bluff. In case nothing is released, our best bet for new info will be this year's GDC.
Should the new information arrive today?
Very good summary, but a couple of things are wrong. For the portable range, you list 8 SMs as the maximum, but that assumes they disable SMs; the actual maximum would be 12 SMs with a ~500MHz clock. For all we know they went with 5nm TSMC for Drake, as the original chip was codenamed Dane, and while T239 is shared between both versions of the chip, we do not know what was changed.
I thought I'd do a quick round-up of what we know, and give some general idea of how big our margin of error is on the known and unknown variables on the new chip.
Chip
Codenamed Drake/T239. Related to Orin/T234. We don't have confirmation on manufacturing process. The base assumption is 8nm (same as Orin), however kopite7kimi, who previously leaked info about the chip and said 8nm, is now unsure on the manufacturing process. The fact that the GPU is much larger than expected may also indicate a different manufacturing process, but we don't have any hard evidence. We also don't know the power consumption limits Nintendo have chosen for the chip in either handheld or docked mode, which will impact clock expectations.
GPU
This is what the leaks have been about so far, so we have much more detailed info here. In particular, on the die we have:
12 SMs
Ampere architecture with 128 "cores" per SM, and tensor performance comparable to desktop Ampere per SM. Some lower-level changes compared to desktop Ampere, but difficult to gauge the impact of those.
12 RT cores
No specific info on these, in theory they could have changes compared to desktop Ampere, but personally I'm not going to assume any changes until we have evidence.
4MB L2 cache
This is higher than would be expected for a GPU of this size (most comparable would be RTX 3050 laptop, with 2MB L2). Same as PS5 GPU L2 and only a bit smaller than XBSX GPU L2 of 5MB. This should help reduce memory bandwidth requirements, but it's impossible to say exactly by how much. Note this isn't really an "infinity cache", which range from 16MB to 128MB on AMD's 6000-series GPUs, it's just a larger than normal cache.
Things we don't know: how many SMs are actually enabled in either docked or handheld mode, clocks, ROPs.
Performance range in docked mode: It's possible that we could have a couple of SMs binned for yields, as this is a bigger GPU than expected. This would probably come in the form of disabling one TPC (two SMs), bringing it down to 10. Clocks depend heavily on the manufacturing process and whether Nintendo have significantly increased their docked power consumption over previous models. I'd expect clocks between 800MHz and 1GHz are most likely, but on the high end of expectations (better manufacturing process and higher docked power consumption) it could push as high as 1.2GHz. I doubt it will be clocked lower than the 768MHz docked clock of the original Switch, but that's not strictly impossible.
Low-end: 10 SMs @ 768MHz - 1.97 Tflops FP32
High-end: 12 SMs @ 1.2GHz - 3.69 Tflops FP32
Obviously there's a very big range here, as we don't know power consumption or manufacturing process. It's also important to note that you can't simply compare Tflops figures between different architectures.
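For anyone who wants to plug in their own SM counts and clocks, here is a minimal sketch of the usual peak FP32 arithmetic: SMs x 128 cores per SM x 2 FLOPs per clock (one fused multiply-add) x clock speed. The function name `fp32_tflops` is purely illustrative, and the factor of 2 is the standard convention for quoting peak throughput, not something taken from the leak.

```python
def fp32_tflops(sms: int, clock_mhz: float, cores_per_sm: int = 128) -> float:
    """Peak FP32 throughput in TFLOPS, counting one FMA (2 FLOPs) per core per clock."""
    flops_per_second = sms * cores_per_sm * 2 * clock_mhz * 1e6
    return flops_per_second / 1e12


# Docked scenarios from the round-up, plus the 1GHz Orin-style ceiling.
for label, sms, mhz in [
    ("low-end, 10 SMs @ 768MHz", 10, 768),
    ("high-end, 12 SMs @ 1.2GHz", 12, 1200),
    ("12 SMs @ 1GHz", 12, 1000),
]:
    print(f"{label}: {fp32_tflops(sms, mhz):.2f} TFLOPS")
# low-end, 10 SMs @ 768MHz: 1.97 TFLOPS
# high-end, 12 SMs @ 1.2GHz: 3.69 TFLOPS
# 12 SMs @ 1GHz: 3.07 TFLOPS
```

These are theoretical peaks only; as noted above, they don't translate directly into real-world performance across architectures.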
Performance range in handheld mode: This gets even trickier, as Drake is reportedly the only Ampere GPU which supports a particular clock-gating mode, which could potentially be used to disable SMs in handheld mode. This makes sense, though, as peak performance per watt will probably be somewhere in the 400-600MHz range, so it's more efficient to, say, have 6 SMs running at 500MHz than all 12 running at 250MHz. Handheld power consumption limits are also going to be very tight, so performance will be very much limited by manufacturing process. I'd expect handheld clocks to range from 400MHz to 600MHz, but this is very dependent on manufacturing process and the number of enabled SMs.
One other comment to make here is that we shouldn't necessarily expect the <=2x performance difference between docked and handheld that we saw on the original Switch. That was for a system designed around 720p output in portable mode and 1080p output docked, however here we're looking at a 4K docked output, and either 720p or 1080p portable, so there's a much bigger differential in resolution, and therefore a bigger differential in performance required. It's possible that we could get as much as a 4x differential between portable and docked GPU performance.
Low-end: 6 SMs @ 400 MHz - 614 Gflops FP32
High-end: 8 SMs @ 600 MHz - 1.2 Tflops FP32
There is of course DLSS on top of this, but it's not magic, and shouldn't be taken as a simple multiplier of performance. Many other aspects like memory bandwidth can still be a bottleneck.
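To put rough numbers on the resolution differential mentioned above, here is a small sketch comparing raw pixel counts between the output resolutions. The names `RESOLUTIONS` and `pixel_ratio` are just illustrative, and pixel count is only a crude proxy for GPU cost, especially since DLSS would render internally at a lower resolution than the output.

```python
# Common output resolutions as (width, height) in pixels.
RESOLUTIONS = {
    "720p": (1280, 720),
    "1080p": (1920, 1080),
    "4K": (3840, 2160),
}


def pixel_ratio(target: str, base: str) -> float:
    """How many times more pixels `target` has than `base`."""
    tw, th = RESOLUTIONS[target]
    bw, bh = RESOLUTIONS[base]
    return (tw * th) / (bw * bh)


print(pixel_ratio("1080p", "720p"))  # 2.25 (original Switch: docked vs portable)
print(pixel_ratio("4K", "1080p"))    # 4.0
print(pixel_ratio("4K", "720p"))     # 9.0
```

So a 4K docked target against a 1080p portable screen is a 4x jump in pixels, versus 2.25x for the original Switch's 1080p/720p split.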
CPU
The assumption here is that they'll use A78 cores. That isn't strictly confirmed, but given Orin uses A78 cores, it would be a surprise if Drake used anything else. We don't know either core count or clocks, and again they will depend on the manufacturing process. The number of active cores and clocks will almost certainly remain the same between handheld and docked mode, so the power consumption in handheld mode will be the limiting factor.
For core count, 4 is the minimum for compatibility, and 8 is probably the realistic maximum. The clocks could probably range from 1GHz to 2GHz, and this will depend both on the manufacturing process and number of cores (fewer cores means they can run at higher clocks).
The performance should be a significant improvement above Switch in any case. In the lower end of the spectrum, it should be roughly in line with XBO/PS4 CPU performance, and at the high-end it would sit somewhere between PS4 and PS5 CPU performance.
RAM
Again, the assumption is that they'll use LPDDR5, based on Orin using it, and there not being any realistic alternatives (aside from maybe LPDDR5X depending on timing). The main question mark here is the bus width, which will determine the bandwidth. The lowest possible bus width is 64-bit, which would give us 51.2GB/s of bandwidth, and the highest possible would be 256-bit, which would provide 204.8GB/s bandwidth. Bandwidth in handheld mode would likely be a lot lower to reduce power consumption.
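The bandwidth figures above follow from bus width times data rate. A minimal sketch, assuming LPDDR5 running at its top standard speed of 6400MT/s (which is what the 51.2GB/s figure for a 64-bit bus implies); the function name `peak_bandwidth_gbs` is illustrative only:

```python
def peak_bandwidth_gbs(bus_width_bits: int, data_rate_mtps: float = 6400) -> float:
    """Peak memory bandwidth in GB/s: bytes per transfer times transfers per second."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * data_rate_mtps * 1e6 / 1e9


for width in (64, 128, 192, 256):
    print(f"{width}-bit: {peak_bandwidth_gbs(width):.1f} GB/s")
# 64-bit: 51.2 GB/s
# 128-bit: 102.4 GB/s
# 192-bit: 153.6 GB/s
# 256-bit: 204.8 GB/s
```

Lower clocks in handheld mode would scale these numbers down proportionally via the `data_rate_mtps` argument.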
Quantity of RAM is also unknown. On the low end they could conceivably go with just 6GB, but realistically 8GB is more likely. On the high end, in theory they could fit much more than that, but cost is the limiting factor.
Storage
There are no hard facts here, only speculation. Most people expect 128GB of built-in storage, but in theory it could be more or less than that.
In terms of speeds, the worst case scenario is that Nintendo retain the UHS-I SD card slot, and all games have to support ~100MB/s as a baseline. The best case scenario is that they use embedded UFS for built-in storage, and support either UFS cards or SD Express cards, which means games could be built around a 800-900MB/s baseline. The potential for game card read speeds is unknown, and it's possible that some games may require mandatory installs to benefit from higher storage speeds.
Do you personally believe it's impossible that all 12 SMs are active in handheld mode, assuming this is still 8nm? Would the power draw be simply unrealistically high?
It would also make sense to use it for BC.
Do you personally believe it's impossible that all 12 SMs are active in handheld mode, assuming this is still 8nm? Would the power draw be simply unrealistically high?
There was some speculation about the clock gating being specific to this device in order to allow one TPC to be active during standby/sleep mode occasionally.