
StarTopic Future Nintendo Hardware & Technology Speculation & Discussion |ST| (Read the staff posts before commenting!)

One more nail in the coffin for 8nm?

One would hope so, seeing how “bad” Samsung's 8nm node is compared to TSMC 7nm in just about everything.

Hopefully Nvidia swayed Nintendo to go with a more expensive node (5nm or 7nm), which would make Drake smaller, consume less wattage and run cooler, as opposed to going with Samsung because they need a cheaper manufacturing node to compete.
 
Keep in mind, Orin has double the memory controllers, a lot of automotive-focused parts, CPU cores that take extra space due to redundant logic for lockstep (redundant for a console), large amounts of cache for the GPU, and more GPU cores.

Still would be huge though so… 🤷🏾‍♂️
 
One would hope so, seeing how “bad” Samsung's 8nm node is compared to TSMC 7nm in just about everything.

Hopefully Nvidia swayed Nintendo to go with a more expensive node (5nm or 7nm), which would make Drake smaller, consume less wattage and run cooler, as opposed to going with Samsung because they need a cheaper manufacturing node to compete.
I don’t care what company convinced the other company, and that’s something we’ll never prove anyway.

I was just referring to the fact that we know from the Nvidia theft this will have 75% of the CUDA cores of big Orin.
 
This is relevant to the thread:



It uses Apple as the example case here, and it is an estimate, but it’s using the size of the die, the number that can be made per wafer, and the price per wafer from TSMC.

Does anyone have that post by thraktor on the die size estimates? I’ll probably find it and edit this post.

M1: 119mm^2 ($44)
M2: 155mm^2 ($61)
M1P: 245mm^2 ($109)
M1M: 419mm^2 ($224)

These are estimated prices and do not include packaging or DRAM chips.

This chain of tweets (below) is also of interest:


If I’m not mistaken, it seems as though wafers have become more expensive now, and jumping in now (2022) for a chip releasing in, say, 2025 or 2026 could be even more expensive than jumping in during 2019/2020 for a 2022/23 release with contracts locked in. 8nm is still what should be expected, but you never know 🤷🏾‍♂️

Found it:

I've been doing a bit of research into this, and I'm actually not sure that the bolded is correct. Samsung 8N is surely the cheapest plausible node per wafer, but once you take into account density and yields, it's entirely possible that a more advanced node like TSMC N5 is actually cheaper per chip. In fact, my back-of-a-paper-envelope maths suggests that a Samsung 8N Drake could cost 70% more than a TSMC N5 Drake.

I should emphasise that I have no expertise in this field, my analysis contains a lot of assumptions and estimations which may deviate significantly from reality, and you shouldn't take what I'm about to write any more seriously than any other random person on the internet. That said, I can run through the maths of it.

A few pages back, I posted an estimate of Drake's die size on various manufacturing processes. I've revised my estimates on these figures in two ways since then. The first is that I'm now estimating Drake's transistor count to be around 8 billion transistors. This is based partly on Nvidia's Orin die photo actually being for an older 17 billion transistor configuration of the chip, but also on the fact that Xbox Series S's "Lockhart" SoC reportedly comes in at 8 billion transistors itself. That's the same number of CPU cores (8) and GPU shader "cores" (1536) on the silicon as Drake, but we know that the Zen2 CPU is larger and uses more transistors than A78, and RDNA2 similarly is larger and uses more transistors per "core" than Ampere. There are some differences between Drake and base Ampere, though: the 4MB of L2 cache will add considerably to the total (based on the GA102 die, it looks like it could be around 1.3 billion transistors for that alone), and there might be some additional components on there, courtesy of Nvidia, that Nintendo don't really need but which might be useful for Nvidia's other customers (e.g. an 8K codec block). I'm just going with 8 billion as a round figure, but again there's a large margin of error.

The second change is that I'm changing my estimate for TSMC N7->N6 density improvement from 18% (TSMC's claim) to 8.1% (actual measured improvement from Navi 23 to Navi 24). That being the case, my new estimates are as follows:

Process        Density (mT/mm^2)   Drake size (mm^2)
Samsung 8nm    45.6                175.4
Samsung 7nm    59.2                135.1
Samsung 5nm    83.4                95.9
Samsung 4nm    109.7               72.9
TSMC N7        65.6                122.0
TSMC N6        70.9                112.8
TSMC N5        106.1               75.4
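
To make these figures easy to check: die size is just the assumed transistor count divided by each node's density estimate. A minimal Python sketch (the 8 billion transistor count and all densities are my estimates above, not official figures):

```python
# Die size = transistor count / density. Both inputs are the estimates
# above; 8 billion transistors is an assumption, not a confirmed spec.
TRANSISTORS_M = 8000  # assumed Drake transistor count, in millions

densities_mt_mm2 = {
    "Samsung 8nm": 45.6,
    "Samsung 7nm": 59.2,
    "Samsung 5nm": 83.4,
    "Samsung 4nm": 109.7,
    "TSMC N7": 65.6,
    "TSMC N6": 70.9,
    "TSMC N5": 106.1,
}

for node, density in densities_mt_mm2.items():
    print(f"{node}: {TRANSISTORS_M / density:.1f} mm^2")
```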

In terms of cost per wafer, my starting point was the figures shown in Ian Cutress's video on wafer prices (which incidentally is very informative if you're curious about how this kind of stuff works). This contains wafer cost figures for many of TSMC's nodes. It's important to note here that these numbers are a few years old at this point, and that the exact prices per wafer have surely changed (in fact they've probably gone down and come back up again since then), however I'm not really that interested in the absolute numbers, but rather the relative costs across different processes. The cost Nintendo pay for a Drake chip has a lot of other factors involved (packaging, testing, and obviously Nvidia's margins), which are difficult to estimate, so it's simpler to think about costs in relative terms.

The costs per wafer (in USD) quoted in that video for more recent nodes are:

Node   Cost per wafer ($)
28nm   2,361.84
20nm   2,981.75
16nm   4,081.22
10nm   5,126.35
7nm    5,859.28

These are just TSMC nodes, and this predated their 5nm processes. To estimate the 5nm wafer costs, I'm relying on this chart which TSMC released in mid-2021, showing the relative wafer manufacturing capacity of 16nm, 7nm and 5nm process families. This shows that the capacity of 7nm relative to 5nm in 2020 was 3.87:1, and the estimated capacity ratio in 2021 is shown as 1.76:1. We also know from TSMC's 2021 Q4 financials that 5nm accounted for 19% of revenue in 2021, compared to 31% for 7nm. The capacity figure from the chart doesn't reflect actual output, and it seems to reflect installed capacity at year-end, which obviously wouldn't be in operation over the entire year they're reporting revenue for. Therefore, if we assume that capacity was added uniformly over the year, the actual 5nm output should sit halfway between the 2020 and 2021 year-end capacity numbers (averaging 5nm's share of output at the two year-end points, rather than averaging the ratios themselves). That is, we would expect that over the course of 2021, TSMC produced about 2.4x as many 7nm wafers as 5nm wafers. With a 1.63x ratio of revenue between the two nodes, we can estimate that the revenue per wafer was approximately 47% higher for 5nm than 7nm. This would put a 5nm wafer at $8,622.76. Again, this may not be the correct absolute figure, but I'm mostly interested in whether the relative prices are accurate.
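
For anyone who wants to reproduce that estimate, here's the same calculation as a short Python sketch (all inputs are my readings of the chart and financials above; the output is an estimate, not an official price):

```python
# Estimating TSMC's 5nm wafer price from the capacity chart and the
# 2021 revenue split. All inputs are the readings above.
cap_ratio_2020 = 3.87      # 7nm:5nm installed capacity, end of 2020
cap_ratio_2021 = 1.76      # 7nm:5nm installed capacity, end of 2021
rev_7nm, rev_5nm = 0.31, 0.19  # share of TSMC's 2021 revenue
wafer_7nm = 5859.28        # $ per 7nm wafer (older figure from the video)

# Assume capacity was added uniformly over the year, so average 5nm's
# output relative to 7nm at the two year-end points.
avg_5nm_share = (1 / cap_ratio_2020 + 1 / cap_ratio_2021) / 2
wafer_ratio = 1 / avg_5nm_share          # ~2.4x as many 7nm wafers
revenue_ratio = rev_7nm / rev_5nm        # ~1.63x as much 7nm revenue
premium = wafer_ratio / revenue_ratio    # ~1.47x revenue per 5nm wafer

# Prints roughly $8,690; the post's $8,622.76 uses rounded intermediates,
# so the small difference is just rounding.
print(f"Estimated 5nm wafer price: ${wafer_7nm * premium:,.2f}")
```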

So, onto the cost per die. To do this we have to estimate the number of dies per wafer, for which I use this yield calculator. I take the die sizes above and assume all dies are square. For the defect density, I'm using a figure of 0.1 defect/cm2, which is based on this Anandtech article. It's likely yields are actually a bit better than this by now, but it won't make a huge difference to the analysis.

Process   Die area (mm^2)   Dies per wafer   Cost per wafer ($)   Cost per die ($)   Cost per die ratio
TSMC N7   122.0             427              5,859.28             13.72              1.15
TSMC N6   112.8             462              5,859.28             12.68              1.06
TSMC N5   75.4              723              8,622.76             11.93              1.00

For N6 TSMC are probably charging a bit more per wafer than N7, but as I have no way of estimating this, I'm just leaving the price per wafer the same. The actual cost per die here won't be even close to what Nintendo will have to pay, both with the old numbers being used for wafer prices, and with packaging, testing and Nvidia's margins being added on top. However, the cost per die ratio in the last column is independent of those things. I've chosen TSMC N5 here as the baseline, and you can see that N7 and N6 are actually calculated as being more expensive per die than N5. The dies per wafer gives you a clue as to why, with the substantial increase in density of N5 (plus the smaller die resulting in a better yield ratio) meaning that even a significantly more expensive wafer cost doesn't necessarily mean more expensive chips themselves.

For the Samsung manufacturing processes, I haven't been able to find any information (even rough estimates) on wafer costs, or wafer output and revenue splits that might be used to estimate revenue per wafer. However, we can look at the cost per wafer required to hit a cost per die ratio of 1.0 (ie the same cost per die as TSMC N5) and evaluate whether that's feasible. For defect density on 5nm I'm going to use 0.5, as it was rumoured to be resulting in 50% yields for mobile SoCs that should be roughly 100mm2 in size. For 8nm defect density it's a bit trickier, but I'm estimating 0.3 defects per square cm, based on product distribution of Nvidia's desktop GPUs (if it were lower, then they wouldn't have to bin quite so heavily, if higher they wouldn't be able to sell full-die chips like the 3090Ti at all). These are only very rough estimates, so I'll also look at a range of estimates for both of these.

Process       Defect density (per cm^2)   Dies per wafer   Cost per wafer ($) at 1.00 ratio
Samsung 5nm   0.5                         383              4,569.19
Samsung 5nm   0.3                         459              5,475.87
Samsung 8nm   0.5                         148              1,765.64
Samsung 8nm   0.3                         201              2,397.93
Samsung 8nm   0.1                         280              3,340.40
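
The break-even prices in the table are just the N5 cost per die multiplied by the good dies per wafer. A quick sketch, taking the dies-per-wafer counts straight from the table (they came from the same yield calculator as before):

```python
# Break-even wafer price = N5 cost per die x good dies per wafer.
N5_COST_PER_DIE = 11.93  # $ baseline from the TSMC table above

cases = [  # (process, defect density per cm^2, good dies per wafer)
    ("Samsung 5nm", 0.5, 383),
    ("Samsung 5nm", 0.3, 459),
    ("Samsung 8nm", 0.5, 148),
    ("Samsung 8nm", 0.3, 201),
    ("Samsung 8nm", 0.1, 280),
]
for process, dd, dies in cases:
    print(f"{process} @ {dd}/cm^2: break-even wafer price "
          f"${dies * N5_COST_PER_DIE:,.2f}")
```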

Samsung's 5nm processes are a bit more realistically priced here. They're most comparable to TSMC's 7nm family in terms of performance and efficiency, and if they've got the defect density down to 0.3 then they could charge a similar amount per wafer to TSMC N7 and be competitive on a per-chip cost. If the defect density is actually 0.5, then they'd have to be much more aggressive on price per wafer, coming in below TSMC 10nm, and not that far off TSMC's 16nm family. Note that the manufacturing costs on Samsung's side are likely quite a bit higher for their 5nm processes than even TSMC's N7, as Samsung are using EUV extensively in their 5nm process, so there's only a limited extent to which they can be aggressive on price.

On the 8nm side, wafer costs get a lot more unrealistic if we're trying to assume that they can be competitive on a cost per die basis with N5. If we use the 0.3 defect density estimate, then they'd have to charge about $2,400 per wafer for N8, which is basically the same as TSMC's 28nm process. Keep in mind that Samsung have their own 28nm and 14nm processes that are pretty competitive with TSMC's 28nm and 16nm families, which means Samsung would either have to be charging a similar amount for an 8nm wafer as they charge for a 28nm wafer, or they are massively undercharging for their 28nm and 14nm processes if they're proportionally cheaper than 8nm. Both of these seem very unlikely. Even with only a 0.1 defect density (similar to TSMC's processes), they would have to charge $3,340 per wafer, which is quite a bit less than TSMC 16nm.

If we assume the cheapest Samsung could charge for an 8nm wafer is the same as a TSMC 16nm wafer (which would make it very aggressively priced), and the defect density is 0.3, the cost per die would be $20.30, which gives a cost per die ratio of 1.70, or 70% more expensive than the same die on TSMC N5. This is even ignoring the significant performance and efficiency benefits of going with TSMC's N5 process over Samsung's 8nm process.

We can also plug Mariko into these to figure out a relative cost. For the Mariko die size, I measured some photos I found online in comparison to the original TX1, and it looks to be approximately 10.1mm by 10.2mm. With an assumed 0.1 defect ratio on 16nm, this would put it at 507 dies per wafer, and therefore $8.05 per die. Again this doesn't represent the actual price Nintendo pay, but this means a TSMC N5 Drake (with about 4x the transistor count) would cost about 50% more than Mariko does.
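
And the same cost-per-die arithmetic covers these last two worked examples (dies-per-wafer counts as quoted above; note the unrounded Mariko ratio comes out nearer 48% than 50%):

```python
# Checking the 8nm Drake and Mariko examples with the same division.
WAFER_16NM = 4081.22   # $, TSMC 16nm wafer price from the earlier table
N5_DRAKE = 11.93       # $ per die, TSMC N5 Drake baseline

drake_8nm = WAFER_16NM / 201   # 8nm Drake at 0.3 defects/cm^2
print(f"8nm Drake: ${drake_8nm:.2f}/die, ratio {drake_8nm / N5_DRAKE:.2f}")

mariko = WAFER_16NM / 507      # ~103mm^2 Mariko die on 16nm
print(f"Mariko: ${mariko:.2f}/die; N5 Drake ~{N5_DRAKE / mariko - 1:.0%} more")
```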

This might explain why Nvidia is moving so aggressively onto TSMC's 5nm process. I had assumed that they would keep lower-end Ada chips on Samsung 8nm, or maybe Samsung 5nm, but this would suggest that it's actually cheaper per chip to use TSMC 5nm, even before the clock speed/efficiency benefits of the better node. It also, from my perspective, makes Drake's 12 SM GPU a lot more reasonable. For an 8nm chip in a Switch form-factor, 12 SMs is much more than any of us expected, but if you were to design a TSMC N5 chip for a Switch like device, 12 SMs is actually not excessive at all. It's a small ~75mm2 die, and there shouldn't be any issue running all 12 SMs at reasonable clocks in both handheld and docked modes. Yields would be extremely high, and as TSMC N5 will be a very long-lived node, there would be no pressure to do a node shrink any time soon.

Now, to caveat all of this again, I'm just a random person on the internet with no relevant expertise or insight, so it's entirely possible (probable?) that there are inaccurate assumptions and estimates above, or just straightforward misunderstanding of how these things work. So take it all with a huge grain of salt. Personally I still think 8nm is very likely, possibly even moreso than TSMC N5, but I think it's nonetheless interesting to run through the numbers to try to actually verify my assumptions.
 
I don’t care what company convinced the other company, and that’s something we’ll never prove anyway.

I was just referring to the fact that we know from the Nvidia theft this will have 75% of the CUDA cores of big Orin.

I know, I’m just stating that Samsung’s 8nm node is not good, with all the problems it has. The only good thing is that it’s cheaper than the competition’s node; otherwise it’s mostly a bad choice to go with them.
 
I know, I’m just stating that Samsung’s 8nm node is not good, with all the problems it has. The only good thing is that it’s cheaper than the competition’s node; otherwise it’s mostly a bad choice to go with them.
That's where Nintendo being Nintendo could fit in. They may choose it for the price, and possibly the lack of demand.
That said, the hacked design specs for Drake even seem to refute this. It's really hard to tell. I think no one can tell either way and we should just keep an open mind for new information.
 
This is relevant to the thread:



It uses Apple as the example case here, and it is an estimate, but it’s using the size of the die, the number that can be made per wafer, and the price per wafer from TSMC.

Does anyone have that post by thraktor on the die size estimates? I’ll probably find it and edit this post.

M1: 119mm^2 ($44)
M2: 155mm^2 ($61)
M1P: 245mm^2 ($109)
M1M: 419mm^2 ($224)

These are estimated prices and do not include packaging or DRAM chips.

This chain of tweets (below) is also of interest:


If I’m not mistaken, it seems as though wafers have become more expensive now, and jumping in now (2022) for a chip releasing in, say, 2025 or 2026 could be even more expensive than jumping in during 2019/2020 for a 2022/23 release with contracts locked in. 8nm is still what should be expected, but you never know 🤷🏾‍♂️

Found it:

These figures per mm^2 seem pretty reasonable/affordable.

The difference, however, is that Apple also does the R&D for the design. So these prices reflect what Nvidia would pay TSMC (before differences in purchasing scale), and then Nvidia needs to mark it up and sell to Nintendo to recapture R&D costs.

This likely means that N5 is not affordable for Nintendo today. But I think that was rather obvious to most.
 
That's where Nintendo being Nintendo could fit in. They may choose it for the price, and possibly the lack of demand.
That said, the hacked design specs for Drake even seem to refute this. It's really hard to tell. I think no one can tell either way and we should just keep an open mind for new information.

I’m afraid that we will get a Switch 1 situation, where the Switch 2 hardware ends up like the Switch was on 20nm. Samsung’s 8nm node has the same (if not worse) problems with heat, size and more because it’s a cheap node.
 
I’m afraid that we will get a Switch 1 situation, where the Switch 2 hardware ends up like the Switch was on 20nm. Samsung’s 8nm node has the same (if not worse) problems with heat, size and more because it’s a cheap node.
The strengths and weaknesses of 8nm were a known quantity before they started designing this chip. 12 SMs on 8nm would be a completely baffling decision. Either you get a smaller chip, or you get a better node. You don’t try to do both.
 
I’m afraid that we will get a Switch 1 situation, where the Switch 2 hardware ends up like the Switch was on 20nm. Samsung’s 8nm node has the same (if not worse) problems with heat, size and more because it’s a cheap node.
I don't think that will be an analog. Tegra X1 was always 20nm; X2 was never in consideration and arrived too late, since X1 would have been locked down in 2015 when the Switch was being designed.
 
Well, looks like we've gotten some fairly solid answers.

We know it's a 12 SM variant of the GPU.
We know it's Samsung's 8nm.

We know the Drive variant shown, with 16 SMs and 12 CPU cores and all the automotive fixings, is absolutely massive. I mean hell, that SoC itself is practically big enough to put buttons and a screen on. Pretty sure it's bigger than a GBA Micro.

Nintendo/Nvidia is intent on making this work. Otherwise the stolen info from Nvidia would have been different.

I feel like we have a starting point and direction for a good new round of speculation.

Do we have any decent information on that single-camera 15 watt Orin?
 
These figures per mm^2 seem pretty reasonable/affordable.

The difference, however, is that Apple also does the R&D for the design. So these prices reflect what Nvidia would pay TSMC (before differences in purchasing scale), and then Nvidia needs to mark it up and sell to Nintendo to recapture R&D costs.

This likely means that N5 is not affordable for Nintendo today. But I think that was rather obvious to most.
If I remember right, in semiconductors the R&D is the upfront cost of developing the silicon and all the IP attached to it on a certain node, and it gets more expensive the better the node. Since the next Nintendo system will have an Nvidia chip, with little chance of that changing, it means it would be using Nvidia IP.

Based on the breach, we can infer that it's using a version of the same IP as the desktop parts, so it isn't starting from scratch. And A78 is a likely choice, which is old by today's standards now (2 years). It would be using aged IP for both (still pretty modern, but not the very edge), unlike Apple, which uses all bleeding-edge IP and does its own R&D for this.

And considering it's a derivative of an existing chip, rather than one started from scratch like Orin, the R&D for the specific chip could be split down the middle, though ultimately Nvidia is the one paying, since they are the ones who directly deal with this, not Nintendo.
 
Well, looks like we've gotten some fairly solid answers.

We know it's a 12 SM variant of the GPU.
We know it's Samsung's 8nm.

We know the Drive variant shown, with 16 SMs and 12 CPU cores and all the automotive fixings, is absolutely massive. I mean hell, that SoC itself is practically big enough to put buttons and a screen on. Pretty sure it's bigger than a GBA Micro.

Nintendo/Nvidia is intent on making this work. Otherwise the stolen info from Nvidia would have been different.

I feel like we have a starting point and direction for a good new round of speculation.

Do we have any decent information on that single-camera 15 watt Orin?
No, the 8nm Orin is nothing new and has been discussed here before.

Some of us think 6nm TSMC or 7nm TSMC/Samsung is possible. Not sure if 12 SMs are really feasible on 8nm Samsung at a reasonable size similar to the TX1/Switch, even without the automotive parts, but who knows. We'll see.
 
Well, looks like we've gotten some fairly solid answers.

We know it's a 12 SM variant of the GPU.
We know it's Samsung's 8nm.

We know the Drive variant shown, with 16 SMs and 12 CPU cores and all the automotive fixings, is absolutely massive. I mean hell, that SoC itself is practically big enough to put buttons and a screen on. Pretty sure it's bigger than a GBA Micro.

Nintendo/Nvidia is intent on making this work. Otherwise the stolen info from Nvidia would have been different.

I feel like we have a starting point and direction for a good new round of speculation.

Do we have any decent information on that single-camera 15 watt Orin?
We know Orin is 8nm, but we don't really know if that also applies to Drake.
 
Because they won't, we know it's not Orin.
Removing automotive features and scaling it back a bit is not gonna make it fit in a Switch, both size- and power-wise, at 8nm.
It would need a massive cutback, to an extent where they would be better off just using a different chip made for their power budget.

It's not 8nm. I'm very confident about this.
 
Removing automotive features and scaling it back a bit is not gonna make it fit in a Switch, both size- and power-wise, at 8nm.
It would need a massive cutback, to an extent where they would be better off just using a different chip made for their power budget.

It's not 8nm. I'm very confident about this.
Oh yeah, I agree it's pretty likely not 8nm. Just wanted to clear up any confusion about it possibly being Orin itself.
 
At 15W it has multiple parts of the silicon disabled/turned off to meet that target.

Entertaining the thought of 8nm Samsung again...


Can it fit in a Switch shell, though? That's the biggest thing. It would really have to be customized. Ideally we want an Orin without the automotive parts and non-A78AE CPU cores (smaller A78s, I mean).

If we really get the 12 SM model, then it would likely be between the 32GB AGX model (1792 CUDA cores, 8 A78 CPUs, 15-40 watts) and the 16GB Orin NX model (1024 cores, 8 A78 CPUs, 10-25 watts). Which could fall under the 10-30 watts range. Funnily enough, the power profile of the 16GB NX fits it perfectly (would 1024 CUDA cores give us 1.86 TFLOPs?)
https://developer.nvidia.com/embedded/jetson-modules

Hard to say what CPU and GPU clock speeds we can really get at the same time. I'd feel more confident saying we'd get like 90-100% of the GPU clock speeds and a 1.5-1.7GHz CPU running at the same time if it was a smaller node like 6 or 5nm, especially TSMC.
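
On the TFLOPs question above: Ampere FP32 throughput is commonly figured as CUDA cores x 2 ops per clock (FMA) x clock speed. A quick sketch, where the 768MHz Drake clock is purely hypothetical (borrowed from the original Switch's docked GPU clock, not a leaked figure):

```python
# FP32 throughput for an Ampere-style GPU: cores x 2 (FMA) x clock.
def tflops(cuda_cores: int, clock_ghz: float) -> float:
    return cuda_cores * 2 * clock_ghz / 1000

# Clock needed for 1024 cores to hit 1.86 TFLOPs:
print(f"{1.86e12 / (1024 * 2) / 1e6:.0f} MHz")  # ~908 MHz

# Hypothetical: Drake's 1536 cores at the original Switch's 768MHz
# docked clock (an assumption, not a leaked Drake figure):
print(f"{tflops(1536, 0.768):.2f} TFLOPs")  # ~2.36 TFLOPs
```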
 
The GPU the Drake is based on uses 50 watts of power. There's no scaling back that is gonna get this in the Switch.
It has to use a newer process to make this viable. I'll be surprised if the Drake isn't 5nm, honestly.
 
We know it's not Orin specifically.

But you think Orin being confirmed to be 8nm doesn't make derivative products more likely to be 8nm?
We've discussed for months now how unlikely it is to have a 12SM part on 8nm work in the Switch's form factor. Personally I don't think it's possible for this to exist if it's on 8nm.

Either it doesn't have 12SMs, it's not on 8nm, or it's not a hybrid. And I find the latter to be by far the least likely.
 
We know it's not Orin specifically.

But you think Orin being confirmed to be 8nm doesn't make derivative products more likely to be 8nm?
Not him, obviously, and I'm sure he disagrees with me on this, but I personally am not expecting Drake to launch until late 2023 or early 2024, which would make a die shrink common sense. It would also mean Nintendo is using year-old hardware, which makes more sense to me than them using cutting-edge stuff.
 
We've discussed for months now how unlikely it is to have a 12SM part on 8nm work in the Switch's form factor. Personally I don't think it's possible for this to exist if it's on 8nm.

Either it doesn't have 12SMs, it's not on 8nm, or it's not a hybrid. And I find the latter to be by far the least likely.

I am aware of that. I know why 8nm is a bad fit.

We know it HAS to have 12 SMs. That's hard info.

Do we have anything yet that's hard, like the Orin 8nm final confirmation, for Drake, or anything tangentially related to Drake? Like hard info on the other mobile chipsets?
 
The GPU the Drake is based on uses 50 watts of power. There's no scaling back that is gonna get this in the Switch.
It has to use a newer process to make this viable. I'll be surprised if the Drake isn't 5nm, honestly.
Where did you get 50 watts? The highest goes to 60, and that's for the 2048 CUDA core 64GB model at 1.3 GHz; next in line is the 1792 CUDA core model at 930MHz using around 40 watts max. Considering Drake is a step below this, 30 watts max is a good theoretical step below, if it had all the automotive parts like the other models. Now, reducing clock speeds and dialing some stuff back could get it down to 20-25 watts, or lower. That's why I said 10-30 watts is the guesstimate.

Keep in mind that both AGX models support a 256-bit bus width (the NX is only 128), and the highest AGX module uses 12 A78AE ARM cores instead of 8.
 
Where did you get 50 watts? The highest goes to 60, and that's for the 2048 CUDA core 64GB model at 1.3 GHz; next in line is the 1792 CUDA core model at 930MHz using around 40 watts max. Considering Drake is a step below this, 30 watts max is a good theoretical step below, if it had all the automotive parts like the other models. Now, reducing clock speeds and dialing some stuff back could get it down to 20-25 watts, or lower. That's why I said 10-30 watts is the guesstimate.

Keep in mind that both AGX models support a 256-bit bus width (the NX is only 128), and the highest AGX module uses 12 A78AE ARM cores instead of 8.
I looked into it further and the device the Switch 2 is based on is 3.3 TFLOPs at 40 watts. Unless they're going to massively cut this thing back, which we know from leaked hardware details that they are not, then it needs a newer process.
 
I looked into it further and the device the Switch 2 is based on is 3.3 TFLOPs at 40 watts. Unless they're going to massively cut this thing back, which we know from leaked hardware details that they are not, then it needs a newer process.

I know they are not cutting back on SMs. 12 SMs is hard locked.

I am not aware of anything else being hard locked in the same fashion.
 
I know they are not cutting back on SMs. 12 SMs is hard locked.

I am not aware of anything else being hard locked in the same fashion.
I am just extremely skeptical of being able to cut back 40 watts to the level they need for handheld mode. The Switch uses about 10 watts total. I don't know the specific breakdown for the GPU, but I'd guess it's obviously not using all of that power. Even if Nintendo is willing to jack up handheld power usage, it's probably not going above 15 watts. So I'd imagine the GPU has to use around 8 watts or less in handheld mode. It's a tall ask to get 40 watts down to 8. Now, with a die shrink to 5nm, it gets a lot more viable.
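
As a very crude illustration of why that's such a tall ask: dynamic power scales roughly with frequency x voltage^2, and voltage has to fall along with frequency, so within the usable range power falls something like the cube of the clock. A back-of-envelope sketch with illustrative numbers only (this is a rule of thumb, not a measurement):

```python
# Crude model: P ~ f * V^2 with V falling roughly with f, so P ~ f^3
# within the usable voltage range. Illustrative numbers, not specs.
P_FULL = 40.0    # W, the module power being scaled down
P_TARGET = 8.0   # W, the guessed handheld GPU budget above

clock_fraction = (P_TARGET / P_FULL) ** (1 / 3)
print(f"Clock would have to drop to ~{clock_fraction:.0%} of full speed")
# ~58%, and real chips hit a voltage floor below which power stops
# falling cubically, which is why a 5x power reduction is so hard.
```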
 
I am just extremely skeptical of being able to cut back 40 watts to the level they need for handheld mode. The Switch uses about 10 watts total. I don't know the specific breakdown for the GPU, but I'd guess it's obviously not using all of that power. Even if Nintendo is willing to jack up handheld power usage, it's probably not going above 15 watts. So I'd imagine the GPU has to use around 8 watts or less in handheld mode. It's a tall ask to get 40 watts down to 8. Now, with a die shrink to 5nm, it gets a lot more viable.
Yeah, it is. But whatever they plan on doing, they are planning on doing it with 12 SMs. A die shrink or a different node are clean, logical answers. But it's still based on logic, in a stupid messy world that isn't logical nearly as much as it should be.

Has there been any relevant improvement in milliamp-hours and battery size, from the Switch to now?
 
I looked into it further and the device the Switch 2 is based on is 3.3 TFLOPs at 40 watts. Unless they're going to massively cut this thing back, which we know from leaked hardware details that they are not, then it needs a newer process.
Where did you get 40 watts? I don't recall the power draw ever being leaked for Drake, dude.

If the 32GB AGX Orin with a 256-bit bus (Drake is confirmed for 128-bit via the breach) is using 40 watts with even more CUDA cores than Drake (1792 vs ~1536), there is no reason why Drake would match that unless it had even more hardware. Like I said, 30 watts seems more in the ballpark, if it was 8nm Samsung. We'd also have the automotive parts removed or disabled.


If we do somehow get 8nm Samsung with 12 SMs, I wouldn't be surprised if we get 80% of the GPU speeds.
Anyway, the upper range could be if all clocks were at 90-100%. I don't think we would see the CPU running at 2GHz. More like 1.5 at most on 8nm.
 
Everyone mentions this 12 SM device as a sure thing; I understand why, as it comes directly from a hack. But there's nothing which ensures that it will be the final product either.

Nvidia and Nintendo probably have investigated several pathways for their new machine, but we cannot be sure that something is going to materialize until it is announced.

While there are reasons to be optimistic, and I am, even information from the hack shouldn't be treated as gospel.
 
Everyone mentions this 12 SM device as a sure thing; I understand why, as it comes directly from a hack. But there's nothing which ensures that it will be the final product either.

Nvidia and Nintendo probably have investigated several pathways for their new machine, but we cannot be sure that something is going to materialize until it is announced.

While there are reasons to be optimistic, and I am, even information from the hack shouldn't be treated as gospel.

The thing is, the 12 SMs are hardcoded into the very-far-along new NVN2 graphics library for the next Switch, with absolutely no other options available.
 
The thing is, the 12 SMs are hardcoded into the very-far-along new NVN2 graphics library for the next Switch, with absolutely no other options available.
To be completely fair, the hack did not produce a fully exhaustive leak. There's a possibility that other options are indeed available; they just weren't present in the leaked files.

But I do agree that seems exceedingly unlikely; 12 SMs seem to be nearly certain.
 
Everyone mentions this 12 SM device as a sure thing; I understand why, as it comes directly from a hack. But there's nothing which ensures that it will be the final product either.

Nvidia and Nintendo probably have investigated several pathways for their new machine, but we cannot be sure that something is going to materialize until it is announced.

While there are reasons to be optimistic, and I am, even information from the hack shouldn't be treated as gospel.
I understand where you're coming from, but it's likely too late to be changing these major details. Any adjustments at this point will likely be more minor things like adjusting the clock speed or adding a little extra RAM. That kind of stuff can easily be changed without messing up a bunch of plans, but if they completely rework the device it's going to screw up everything they planned. Like, they're probably designing the next 3D Mario game around this hardware; imagine if they randomly decided to go with half the power.
 
Yeah, is it an estimated extrapolation, or do you have anything you can share that gives figures?
I explained it several times in my last couple of posts. In post 14,326, I posted a link.

https://developer.nvidia.com/embedded/jetson-modules

In CUDA cores, Drake is between the 32GB AGX Orin and the 16GB NX modules. The 32GB goes up to 40 watts and the 16GB goes up to 25. Keep in mind as well that the AGX modules have a 256-bit bus width, and the NX ones have 128-bit (102GB/s bandwidth at most).
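
For reference, the 102GB/s figure is just peak LPDDR5 bandwidth from the bus width: bytes per transfer times transfer rate. A tiny sketch (the 6400MT/s rate is assumed from the Orin NX spec, not a confirmed Drake figure):

```python
# Peak LPDDR5 bandwidth = bus width in bytes x transfer rate.
# 6400MT/s is assumed from the Orin NX spec, not a confirmed Drake figure.
def bandwidth_gb_s(bus_bits: int, mt_per_s: int) -> float:
    return bus_bits / 8 * mt_per_s / 1000

print(bandwidth_gb_s(128, 6400))  # 102.4 GB/s, NX-style 128-bit bus
print(bandwidth_gb_s(256, 6400))  # 204.8 GB/s, AGX-style 256-bit bus
```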

But to be fair, I don't know how modules would translate to the actual thing once they are in a console, power draw wise.

In the breach, Drake is also confirmed not to be using a 256-bit bus width. Correct me if I'm wrong, but Drake is confirmed to have a 2x NVDLA v2.0 DL accelerator as well, which the AGX modules have.

This is the closest thing I can come up with. Unannounced, of course, and a custom Orin. A stripped-down model of the AGX. I'd be interested to know whether the CUDA cores are actually physically removed or just disabled. But it has NX and Orin qualities. An intermediate, really.
 
I explained it several times in my last couple of posts. In post 14,326, I posted a link.

https://developer.nvidia.com/embedded/jetson-modules

In CUDA cores, Drake is between the 32GB AGX Orin and the 16GB NX modules. The 32GB goes up to 40 watts and the 16GB goes up to 25. Keep in mind as well that the AGX modules have a 256-bit bus width, and the NX have 128-bit.

In the breach, Drake is also confirmed not to be using a 256-bit bus width. Correct me if I'm wrong, but Drake is confirmed to have a 2x NVDLA v2.0 DL accelerator as well, which the AGX modules have.

Thanks, but I was just tagging along with your post asking about the 40 watts for whatever device Drake was based off of.
 
Sometimes I wonder what hardware would result if Apple bought Nintendo. :dodges stones:

I mean I don’t WANT it to happen, but can you imagine a Switch powered by M2? Where budget for hardware R&D is (virtually) no object?
I wonder how a hypothetical fork of Metal customised for Nintendo's needs compares to NVN and/or NVN2.

Also this, for you techies:
To be quite frank, Samsung's rumoured cultural problems don't exactly inspire confidence. (I hope Samsung can prove me wrong, since more, not fewer, leading process node competitors would be a good thing.)

IIRC we still don't have explicit confirmation that Orin is even on 8nm, right?
We do. Read the paragraph marked update:

When I think of explicit confirmation, I think of the process node being written down directly in a slide (e.g. pp. 3-4 on the Hot Chips 30 Xavier slides). In that case, there's no explicit confirmation of Orin being fabricated using Samsung's 8N process node.

However, as mentioned, Orin being fabricated using Samsung's 8N process node, has been mentioned by Nvidia's PR (here and here), which I consider indirect confirmation rather than explicit confirmation.

At 15W it has multiple parts of the silicon disabled/turned off to meet that target.
Are we sure it's disabled? The yields gained by cutting all that silicon that's never intended to be used for a design that goes to a small camera seems pretty tantalizing to me.
Yes. (Pay attention to "Power budget", "Online CPU", "CPU maximal frequency (MHz)", "GPU TPC", and "GPU maximal frequency (MHz)".)

I'll be surprised if the Drake isn't 5nm honestly.
I think TSMC's N6 process node is also a possibility, and I think more likely than any process node in TSMC's N5 process node family (e.g. TSMC's N5, N5P, 4N, N4, and N4P process nodes). TSMC's N6 process node is probably more performant and power efficient than Samsung's 5LPE process node at high frequencies, and probably on par at low frequencies.
 
I wonder how a hypothetical fork of Metal customised for Nintendo's needs compares to NVN and/or NVN2.


To be quite frank, Samsung's rumoured cultural problems don't exactly inspire confidence. (I hope Samsung can prove me wrong, since more, not fewer, leading process node competitors would be a good thing.)



When I think of explicit confirmation, I think of the process node being written down directly in a slide (e.g. pp. 3-4 on the Hot Chips 30 Xavier slides). In that case, there's no explicit confirmation of Orin being fabricated using Samsung's 8N process node.

However, as mentioned, Orin being fabricated using Samsung's 8N process node, has been mentioned by Nvidia's PR (here and here), which I consider indirect confirmation rather than explicit confirmation.



Yes. (Pay attention to "Power budget", "Online CPU", "CPU maximal frequency (MHz)", "GPU TPC", and "GPU maximal frequency (MHz)".)


I think TSMC's N6 process node is also a possibility, and I think more likely than any process node in TSMC's N5 process node family (e.g. TSMC's N5, N5P, 4N, N4, and N4P process nodes). TSMC's N6 process node is probably more performant and power efficient than Samsung's 5LPE process node at high frequencies, and probably on par at low frequencies.

Oh shit, data tables, thanks.
 
Okay. Here are some specialized visual processing blocks that won't be needed for non-automotive Orin derivatives.

PVA
VIC
NVENC (pretty sure? It won't need a dedicated video encoding processor, right?)
OFA

I have no idea how much impact these processors have on footprint or power draw.
 
Based on the breach, we can infer that it isn’t using a different version of the same IP as desktop parts, so it isn’t starting from scratch. And A78 is a likely choice, which is old by today standards now (2 years). It would be using an aged IP for both (still pretty modern but not the very edge), different from Apple that uses all bleeding edge and does their own R&D for this.
Is there an ARM variant newer/better than A78 that would be a good fit for a Nintendo device? X1 doesn't sound like a good fit.
 