
StarTopic Future Nintendo Hardware & Technology Speculation & Discussion |ST| (Read the staff posts before commenting!)

I was looking at some phone reviews, and I'm increasingly convinced that UFS 2.1/2.2 at around 850MB/s is a reasonable baseline to expect for internal storage for [redacted]. My reasoning for this is pretty simple; almost every phone released in the past couple of years with slower storage than that has been limited by the SoC, not the storage itself.

The data for this all comes from Notebookcheck's phone benchmark database, which includes storage benchmark results going back quite a few years, with SoC and storage type data. They may not have every phone, but with around 100 phone reviews a year covering entry-level to high-end, it's easily the largest and most representative dataset. I've noticed a couple of labelling errors (UFS labelled as eMMC and vice versa), but these are typically pretty obvious.


First observation: not a single phone released in the last year with eMMC has used an SoC with UFS support. In every case eMMC was the only option for internal storage given the SoC the manufacturer chose.

There have been a few phones released with eMMC over the past year, all of which used SoCs which only support eMMC. A few months ago, the Moto G13, Moto G23 and Xiaomi Redmi 12C all launched with 128GB of eMMC, and all hit around 280MB/s read speeds, which is pretty typical for eMMC. They also all use Mediatek's Helio G85 SoC, a 12nm mid-range SoC from 2020 which doesn't support UFS. There are a handful of others with either old Mediatek SoCs or UNISOC SoCs, which again only support eMMC.

Even looking back two or three years, there are very few cases of phone manufacturers using eMMC when not limited by the SoC. The Nokia G21, released a year and a half ago, used 64GB of eMMC with the UNISOC T606, which supports UFS, however the G22 which replaced it a couple of months ago switched to UFS with the same T606 SoC. Similarly the Samsung Galaxy A13 5G used 64GB of eMMC with the Dimensity 700 SoC, whereas the newer Galaxy A14 5G uses the same Dimensity 700 and switched to UFS 2.2.

This tells me two things. Firstly, that eMMC isn't significantly cheaper than UFS, as if it were, at least some manufacturers would use it when given the option. Secondly, and arguably more importantly for Nintendo, that when those old SoCs which don't support UFS stop being manufactured, eMMC will effectively cease to be used in smartphones. If Nintendo want to keep selling [redacted] for 6 or 7 years they'll need to be able to keep buying parts for it, and if eMMC stops being used in smartphones then availability of parts will drop, with a significant risk of it simply not being available for Nintendo, or of them having to pay excessive amounts to keep it in production.

This isn't necessarily some far-off eventuality, either; it's entirely possible that the smartphone industry will have stopped using eMMC by the time [redacted] launches. As for why the Moto G13, G23 and Redmi 12C all appeared with a three-year-old 12nm SoC in such a similar time frame, it's not hard to guess when you consider that they would all have started development while the chip shortage was in full flow. With the chip shortage well and truly over, there's very little reason for any phone currently in development to use such an old SoC, with newer alternatives manufactured on more efficient processes (and featuring UFS support).


Second observation: almost all phones with UFS 2.1/2.2 and speeds less than around 850MB/s are limited by the SoC operating in single-lane mode, not the UFS chip. Single-lane UFS chips don't seem to exist.

If you look through Notebookcheck's benchmarks for UFS 2.1/2.2 results, you'll see two broad groups of phones, ones which hit 850MB/s to 1GB/s and ones which hit around 500MB/s. Furthermore, you'll notice that all the phones which hit around 500MB/s use the same SoCs, like the Snapdragon 695 5G, Snapdragon 480 Plus, Helio G95, etc. Although Qualcomm and Mediatek don't publish the number of UFS lanes supported by their SoCs, it's a pretty safe guess that these chips only support single-lane UFS operation, given speeds of around 500MB/s are the limit of what you can get from single-lane UFS 2.1/2.2.
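For anyone who wants to sanity-check those two groupings, here's a quick back-of-the-envelope sketch (my own numbers, assuming M-PHY HS-Gear3 lanes at 5.8Gbps with 8b/10b encoding, which is the top gear UFS 2.1/2.2 runs at):

```python
# Rough ceiling for UFS 2.x sequential reads, assuming HS-Gear3 lanes
# (5.8 Gbps line rate, 8b/10b encoding). Real-world results sit below
# these ceilings once protocol and flash overheads are accounted for.
GEAR3_LINE_RATE_GBPS = 5.8    # per lane
ENCODING_EFFICIENCY = 8 / 10  # 8b/10b line coding

def ufs2_ceiling_mb_s(lanes: int) -> float:
    bits_per_second = lanes * GEAR3_LINE_RATE_GBPS * 1e9 * ENCODING_EFFICIENCY
    return bits_per_second / 8 / 1e6

print(f"1 lane:  {ufs2_ceiling_mb_s(1):.0f} MB/s")  # ~580 MB/s -> the ~500 MB/s group
print(f"2 lanes: {ufs2_ceiling_mb_s(2):.0f} MB/s")  # ~1160 MB/s -> the 850 MB/s to 1 GB/s group
```

The dual-lane ceiling also lines up with the 1160MB/s figure Kioxia quotes for its parts, mentioned below.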

Furthermore, I don't think they're using single-lane UFS chips, because I don't think they exist. Samsung, who I believe are the largest UFS supplier, list only dual-lane parts in their UFS 2.x parts catalog. Kioxia list both of their current UFS 2.1 parts as supporting data rates of 1160MB/s, meaning dual-lane. SK Hynix doesn't seem to explicitly list the number of lanes supported by their UFS modules, but their product PDF for their newer 176-layer parts (here) lists only one part code for each capacity for UFS 2.2, and advertises 900MB/s or higher read speeds for them all, indicating they're all dual-lane parts. Micron doesn't list the number of lanes either, but they do list details like IOPS, so if there were both single-lane and dual-lane parts in there I would be very surprised if they didn't consider it important enough to distinguish between the two. They also only list one part per capacity for UFS 2.2.

This indicates that almost all UFS 2.1/2.2 parts are capable of 850MB/s+ sequential reads. If we exclude phones where the SoC is limited to single-lane mode, almost all UFS 2.1/2.2 results are between 850MB/s and 1GB/s, with only a handful of outliers below that. Smartphone SoCs which only support single-lane mode do so because Qualcomm and Mediatek want to up-sell to higher-end chips, and limiting storage performance on entry-level and mid-range chips is an easy way to do that. Nintendo and Nvidia have designed T239 specifically for [redacted], so there's no reason for them to intentionally hobble storage performance by only supporting single-lane UFS, when the chips they're buying would be capable of much higher speeds.


So, eMMC appears to be both not significantly cheaper than UFS, and would risk availability issues almost immediately. Meanwhile, unless Nintendo and Nvidia intentionally shoot themselves in the foot by only supporting single-lane mode, the baseline UFS sequential read speeds they can expect are around 850MB/s.
 
How much electricity would UFS 2.1 use to actually hit those speeds though.

UFS 4.0 hitting similar speeds at a small fraction of the electricity costs seems to make asset streaming directly from the SSD massively more viable.

Without asset streaming, I'm not sure how well the Switch 2 will be able to handle UE5.

With UFS 2.1, they may only be able to hit >800 MB/s during designated loading sections. This would still be a lot better than the Switch 1, but does introduce some problems for modern game engines.

(Even with UFS 4.0, you would need mandatory installs though which some are bizarrely against).
 
It's not "bizarre" to be against an inconvenience. 😅
 
I do wonder if Nintendo and Nvidia added support for two UFS 3.x (2x) controllers for Drake, with one dedicated to the internal flash storage, and one dedicated to UFS Cards. I know Orin has support for one UFS 3.0 (2x) controller.

Assuming Drake's taped out at 1H 2022 (here and here), which I do think is likely, I don't expect UFS 4.0 support.

If the AGX Orin block diagram (here and here) is any indication, the UFS controller is directly integrated into the SoC, which means Nintendo and Nvidia need to decide which UFS controller to use before taping out Drake.

The UFS 4.0 specs were published by JEDEC on 17 August 2022. And the first smartphone equipped with the Snapdragon 8 Gen 2, which supports UFS 4.0, was released on 6 December 2022. So I doubt the UFS 4.0 controller was available to be used during 1H 2022.
 
Cool, so

1): You have given very little evidence about whether or not companies can go back and edit "taped out" hardware.
2): I'm just stating hitting max speeds is costly electricity-wise so it would be much better for load times, but not good for engine support.

(The Switch 2 isn't launching until Fall 2024 at the earliest; why would they "tape out" and refuse to edit their hardware 30 months before release? Who would do this? The Xbox Series X has features AMD introduced in September 2020.)
 
Without asset streaming, I'm not sure how well the Switch 2 will be able to handle UE5.
data streaming isn't a bottleneck. in one of the UE5 demos, it was around 300MB/s

1): You have given very little evidence about whether or not companies can go back and edit "taped out" hardware.
we know for a fact they can. it's just expensive.

(The Switch 2 isn't launching until Fall 2024 at the earliest, why would they "tape out" and refuse to edit their hardware 30 months before release, who would do this. The Xbox Series X has features AMD introduced September 2020)
completion of the chip is independent of the manufacturing
 
It's documented that Drake has dedicated decompression hardware, which should help alleviate storage bottlenecks.

Also, there was a dev in this thread who's working on a game with Nanite. At least for his game, he said 500MB/s could work with some optimization and 1000MB/s would be plenty to work with. 850MB/s sounds pretty good, from that tidbit.
 
Yes, completing a chip 30-42 months before it releases just seems very stupid and weird.
You just arrived at one fundamental reason why many people in this thread still think the system might launch this year.

If it does release late next year, then it implies that Nintendo either held onto a taped out, semi-custom made SoC for that long, or they somehow paid for changes after the tape out. A third option would be that they're going to use some other SoC, but that would also imply that a significant amount of Nintendo's money was spent on an SoC they didn't use.
 
I mean, that or the tape out was never actually completed or the guy just had a typo on his resume.
 
1): You have given very little evidence about whether or not companies can go back and edit "taped out" hardware.
Taping out a chip means that a chip's design has been finished and is ready for foundry companies (e.g. TSMC, Samsung, Intel, etc.) to fabricate. Once a chip's been taped out, no more physical changes can be made to that chip.

If a company wants to make any physical changes to the taped out chip, the chip has to be re-designed and taped out again. One example is with the Tegra X1 and the Tegra X1+. The Tegra X1 supports up to LPDDR4. But the Tegra X1+ supports up to LPDDR4X. Basically, Nvidia changed the RAM controller (from a LPDDR4 controller to a LPDDR4X controller) and perhaps physically removed the Cortex-A53 cores (since the Cortex-A53 cores were physically present in the Tegra X1, but were actually disabled) when re-designing the Tegra X1 as the Tegra X1+.

But as ILikeFeet said, that probably won't be inexpensive for Nintendo and Nvidia.

Also there was a dev in this thread, who's working on a game with Nanite. At least for his game, he said 500Mb/s could work with some optimization and 1000Mb/s would be plenty to work with. 850Mb/s sounds pretty good, from that tidbit,
Mark Cerny mentioned third party developers requested an NVMe SSD with at least a 1 GB/s sequential read speed for the PlayStation 5. Third party developers could have requested a higher sequential read speed from Sony, but ultimately didn't. And UFS 2.1/2.2 can definitely achieve those sequential read speeds. So as long as UFS 2.1/2.2 is used, I think Nintendo should be okay.

I mean, that or the tape out was never actually completed or the guy just had a typo on his resume.
(Before I comment, I realised I put the wrong links to the LinkedIn profiles that strongly implied Drake was taped out in 1H 2022. So I've fixed the links.)

I don't think electrically characterising and validating the I/O interfaces for Drake is possible until after Drake's taped out since I imagine actual silicon is needed to electrically characterise and validate the I/O interfaces.
 
I believe LiC mentioned that there's no support for DLSS 3 on NVN2. Of course, nobody knows how the OFA on Drake's GPU compares to the OFA on Ada Lovelace GPUs, although Drake's GPU does inherit the same OFA as Orin's GPU. But anyway, unless the performance of the OFA on Drake's GPU is comparable to the performance of the OFA on Ada Lovelace GPUs, and Nintendo and Nvidia have the option to add DLSS 3 support to NVN2 later on, I don't think DLSS 3 support on Nintendo's new hardware is likely.

Anyway, I think RedGamingTech is dubious when Nintendo's concerned.
RedGamingTech claimed that Nintendo's development teams expected the Nintendo Switch to use a SoC based on the Drive PX2 (probably AutoCruise), and Nintendo's development teams didn't expect the Nintendo Switch to use the Tegra X1. Outside of the larger memory bus width from the Tegra X2 vs Tegra X1, I don't think RedGamingTech's assertion that the Tegra X2 is vastly superior to the Tegra X1 is correct.
Although Nvidia does mention that the Tegra X2's GPU is a Pascal based GPU, and Nvidia did introduce DP4a instructions support with Pascal GPUs, the Tegra X2's GPU doesn't have DP4a instructions support.
That leads me to believe that there's very little difference, if any, between the Tegra X1 and the Tegra X2, GPU wise.
And there probably wouldn't have been any difference between the Tegra X1 and the Tegra X2 CPU-wise for Nintendo's use case, since Nintendo could ask Nvidia to disable the Denver 2 cores if they couldn't find any use for them for game development purposes. That would leave four Cortex-A57 cores regardless of whether Nintendo used the Tegra X1 or the Tegra X2.
I think the only practical difference between the Tegra X1 and the Tegra X2 is the process node being used for fabrication, which became non-existent with the Tegra X1+.

RedGamingTech also mentioned he thinks the "Switch Pro" was using a SoC based on Xavier. And I think LiC mentioned that Xavier wasn't mentioned anywhere on the NVN2 files, so that's probably not true.
Thanks as usual Dakhil for helping everyone frame expectations each time a leaker comes forward with such claims.
 
My serious issue with this discussion on "taping out" is that we have no estimates on how expensive it is to edit completed hardware.

It could just cost like less than $50m or something and Nintendo would be completely fine doing that.

I have literally no idea how much it costs and I don't think any good estimates have been given.
 
The main thing with taping out is that it takes a good amount of time and testing effort to do. And the main cost is the process node allocation: you have to book that allocation well ahead of time, or else pay excessively upfront to get extra allocation on the node.

So taping out Drake a while ago would be consistent with them not wanting to overpay for allocation on whatever process node they are using (likely TSMC 4N, as Thrak stated).

The reason MS and Sony could do a quick tapeout/process node allocation/production turnaround was that it happened pre-Scalpocalypse/demand overshoot, and also that NVIDIA was on Samsung 8N, Qualcomm was on a Samsung node at the time too, and Intel was on its own nodes. So TSMC was mainly being used by AMD and Mediatek at that point.

Now, though, TSMC pretty much has a monopoly on high-end products/architecture designs, so you have to order far in advance to get economical allocation costs or pay extremely upfront.

That's the reason NV was told off by TSMC when it wanted to sell off its over-allocation of TSMC 4N back when Lovelace launched.
 
I believe LiC mentioned that there's no support for DLSS 3 on NVN2. Of course, nobody knows how capable the OFA on Drake's GPU is comparable to the OFA on Ada Lovelace GPUs. But Drake's GPU does inherit from Orin's GPU the same OFA. But anyways, unless the performance of the OFA on Drake's GPU is comparable to the performance of the OFA on Ada Lovelace GPUs
I haven't seen good data in die shots, but as small as Drake is, I wonder how much die area the Ada OFA would take. Considering it can "only" double performance, it would need to be less than half a GPC to be a good use of space.
I think it's less likely just because Nintendo's year is so empty
It would be crazy if it were a partner direct, if for no other reason than the number of partner announcements through Twitter. Not that you'd hold them, necessarily, for a partner direct; just that if Nintendo could fill a partner mini above and beyond what they've already mentioned, that would be robust 3rd party support.

no it's not

are we seriously acting like the switch in handheld is comparable to the xbox one? wait, is it comparable to an xbox one? if so, why do the games look like that?
No, you're totally right. As some folks have pointed out, though, the 360/PS3 were sort of insane outliers in terms of their design. Going by performance, yeah, it's more like those machines. But design, it looks a lot more like modern systems. That's why miracle ports were possible. Same thing with redacted. It might perform like last gen machines, but in terms of design it looks more like current gen machines.

I think we start to get into the weeds about what "portable PS4" looks like, and it's not so much about wrong or right but about different people meaning different things by that.


Just a detail...
You're obviously correct, and one of the reasons I said graphs like that destroy nuance. I think the counter to your on-paper argument, is that last gen games were more likely to be single thread bound than modern games, and the lack of cores in the Switch was less of an issue. Even if the numbers don't perfectly reflect the gap, I think practically speaking it communicates the core idea that CPU is more likely to be a big deal for ports this gen than GPU, and seeing a game struggling graphically on Series S doesn't mean that it's a hard port - and conversely, looking great on Series S doesn't mean you can just cut resolution to get onto [redacted]. It's just a different ballgame this gen
 

Can Qualcomm provide the complete hardware and software package comparable to what's currently offered by Nvidia? And does Nintendo really care about Android gaming?

The only benefit I could see by going with Qualcomm is having access to Nuvia's CPUs.

One of the biggest lessons we learned from Project Lime is that, even after 8 years since the launch of the NVIDIA Tegra X1 (the SoC that powers the Nintendo Switch), Android SoC vendors still don't know how to make GPU drivers.

They all started working on Vulkan drivers around the same time in 2016. And yet, none of them have managed to deliver a compliant and stable Vulkan Android driver, except for the rare few NVIDIA devices.

It is obvious that only 4 vendors have the expertise and the commitment to make Vulkan drivers work: NVIDIA, AMD, and Mesa, with a special mention for Intel, who recently stepped up their game.

Although Qualcomm is not one of these 4, we decided that limiting support to Qualcomm SoCs was our only option for now if we didn't want to spend several months further modifying our GPU code to accommodate all the quirks and broken extensions of Android phones and tablets. Not because their driver is decent (it's bad), but because it was just good enough to get some games to render, albeit incorrectly most of the time.

Qualcomm is the best option (and for now, the only one) because bylaws created AdrenoTools, which lets users load the vastly superior Mesa Turnip drivers on their Adreno 600 series GPUs, providing more accurate rendering, comparable to the quality expected of PC products. Any Qualcomm SoC with a name like "Snapdragon ###" from the 460 to the 888+ equipped with an Adreno 600 series GPU can choose to use either the proprietary Qualcomm driver or Mesa Turnip.

The performance gain you can expect from a device with a Snapdragon Gen 1 or Gen 2 is quite significant. But the problem is that, while the Adreno 700 series GPU that comes with it is very powerful hardware-wise, the proprietary Qualcomm driver for it is subpar at best, and Mesa has only just begun to work on adding support for Turnip. There's an early driver release to test, but results are not great for now. It will take weeks, if not months, before we see proper support emerge. In the meantime, we intend to work on improving the rendering on the official Adreno drivers.

The Adreno 500 series is too outdated for yuzu. Its proprietary Vulkan driver is missing many of the essential features required, and Turnip has no plans to support it either.
 
You're obviously correct, and one of the reasons I said graphs like that destroy nuance. I think the counter to your on-paper argument, is that last gen games were more likely to be single thread bound than modern games, and the lack of cores in the Switch was less of an issue. Even if the numbers don't perfectly reflect the gap, I think practically speaking it communicates the core idea that CPU is more likely to be a big deal for ports this gen than GPU, and seeing a game struggling graphically on Series S doesn't mean that it's a hard port - and conversely, looking great on Series S doesn't mean you can just cut resolution to get onto [redacted]. It's just a different ballgame this gen
Poor thread utilization has been a frequent talking point in the recent DF coverage of disappointing PC ports, so I'm not sure the situation has truly improved all that much, at least when it comes to UE4.
I see Yuzu is joining in on the tradition of emulator devs shaming mobile GPU vendors.
 
Poor thread utilization has been a frequent talking point in the recent DF coverage of disappointing PC ports, so I'm not sure the situation has truly improved all that much, at least when it comes to UE4.
unfortunately, it won't improve in UE4. not to mention all of these releases are on older versions of UE4 anyway, so it's not like they can just update. we just have to hope UE5.0 and 5.1 have enough improvements built in already
 
I do wonder if Nintendo and Nvidia added support for two UFS 3.x (2x) controllers for Drake, with one dedicated to the internal flash storage, and one dedicated to UFS Cards.
Regarding the UFS Card, I agreed with @Dakhil and many others here that it could be a great replacement for the microSD in theory, and previously made several posts in support of that. However, that ship has probably been sunk for some time now, and the foundation upon which an ecosystem can be built is no longer in place. My survey of the UFS Association member websites:

UFS Card
  • Samsung
    • AFAIK, Samsung was the only manufacturer who actually commercialized the UFS Card v1.0.
    • However, I can’t find any evidence suggesting that it was still being produced post 2019.
    • Nothing beyond v1.0 was ever released.
    • Although Samsung US still has a placeholder page up, the product has been scrapped from many other regional sites, such as UK and Canada.
  • Phison
    • Phison was the only other manufacturer (that I know) who released UFS Card v1.0 samples.
    • No one ever contracted Phison to manufacture UFS Card, unfortunately.
    • Nothing beyond v1.0 exists.
UFS Card socket
  • Amphenol seems to be the only manufacturer of UFS Card sockets.
    • Revision X1, base part number 10101704, was released in 2017.
    • Although revision X2, base part number 10101870, was introduced in 2019, I found no indication of it ever being mass produced. No distributor carries it either.
    • It seems that the revised socket did not find any customers.
  • Molex was supposed to produce UFS Card sockets too, but they never did.
UFS Card test fixtures
  • Astek released test fixtures for both the card (A9-UFS-02) and the host (A9-UFS-01).
  • They were introduced in 2018, supporting UFS Card 1.0 and M-PHY Gear 3.
  • I can’t find any updates for UFS Card 3.0 or M-PHY Gear 4.
USB to UFS bridge controller
  • A hypothetical Switch 2 with UFS Card socket would not need a USB to UFS bridge controller. I’m including it here as further proof that the UFS Card ecosystem is likely dead.
  • Silicon Motion SM3350 and JMicron JMS901
    • Both chips were released in 2018, supporting UFS v2.1 (UFS Card 1.0).
    • No newer versions have been introduced since then.
So despite all the advantages UFS Card has over the competing removable media, it couldn’t sustain an ecosystem. There doesn’t seem to be any further development aside from the 3.0 standard, which exists only on paper. I doubt that Nintendo would want to singlehandedly revive a dead medium.
 
I haven't seen good data in die shots, but as small as Drake is, I wonder how much die area the Ada OFA would take. Considering it can "only" double performance, it would need to be less than half a GPC to be a good use of space.
Although there haven't really been any good public die shots of AD102 so far, Locuza managed to find a good enough AD102 die shot from techanalye1 to do some general annotations of AD102. Although not explicitly labelled, Locuza did manage to label the area where the OFA, encoders, decoders, etc., are located in AD102.
 
how many phones with UFS storage also come with SD card slots? because that's the simplest solution. if the SD card (with minimal bottleneck) is slower than internal memory, then so be it. hopefully Nintendo is looking at 256GB options for internal storage, but unfettered SD card speeds would still provide a better experience
 
how many phones with UFS storage also come with SD card slots? because that's the simplest solution. if the SD card (with minimal bottleneck) is slower than internal memory, then so be it. hopefully Nintendo is looking at 256GB options for internal storage, but unfettered SD card speeds would still provide a better experience
They could also yoink whatever magic Valve gave their SD Card slot as that slot seemingly has no real performance degradation versus internal outside of extreme scenarios
 
Already posted?



Switch 2 with a microphone and camera. That could be a good thing. Plus a new health game.

This would presumably be related to Pokémon Sleep. A docking station for a cell phone is a lot different from the Pokémon GO Plus + that they've already announced, by which I mean it seems like it's redundant and/or worse, though I don't know if that would fall under the same patent.

Yurie Hattori is listed as an inventor, and it would make some sense for her to be involved with Pokémon Sleep. While she hasn't done anything with Pokémon before, most of her other work has been under Hitoshi Yamagami's production group, frequently as a Nintendo-side director. I would imagine that she's credited as an inventor here for the same reason Yuya Sato was credited on TotK's patents.
 
how many phones with UFS storage also come with SD card slots? because that's the simplest solution. if the SD card (with minimal bottleneck) is slower than internal memory, then so be it. hopefully Nintendo is looking at 256GB options for internal storage, but unfettered SD card speeds would still provide a better experience
Lots of Chinese phones; pretty much every Redmi Note series, iirc.
 


Nice to see DF's quality of coverage being extended to the entry level cards. Most interesting to this thread, because the consoles tend to be most comparable to the bottom of the stack.

One thing of note - on desktop, you can get stuck between a rock and a hard place with the memory controller. Nvidia has gotten itself stuck before where its only viable options were 8GB of RAM or 16, with 16 being excessive and wildly overpricing the card, and 8 being insufficient and leaving the card underperforming. While the source of the [redacted] 12GB rumor is, ahem, dubious at best, it would be an excellent choice, and place it in a very comfortable position where the performance of the GPU would be unconstrained, while still offering plenty of room for non-VRAM tasks, without saddling the hardware with all the cost that a larger RAM pool would create.
 
Handheld ps4 for next switch will be a total disapointment for us the switch brew community, because we already can make the switch play games on handheld with that capacity.
This is not true. Games on the level of Horizon Zero Dawn, Uncharted 4 or God of War have never been seen running smoothly at 1080p on Switch.
 
Handheld ps4 for next switch will be a total disapointment for us the switch brew community, because we already can make the switch play games on handheld with that capacity.
Ehhhhhhhhh. Correct me if I'm wrong, but I'm not 100% sure that's quite accurate. Getting the Switch version of Doom to 720p60 in handheld is different than running the PS4-quality version on the Switch in handheld mode.

Plus, saying it's a handheld PS4 is almost downplaying it a bit. The architectural improvements and the better CPU should put it wayyyy ahead of the PS4, albeit not quite at Series S level. It's a welcome improvement in my book, good third party support or not. I'm just excited to see what Nintendo can do with such a device, with how first party games look right now on such a weak system.
 
The VRAM situation can be so wild in some respects. The GDDR6 spec allows for densities of 1 GB, 1.5 GB, 2 GB, 3 GB, and 4 GB. Yet, memory manufacturers only make 1 GB and 2 GB chips.
So what I wonder is, did the manufacturers decide on their own? Or was it feedback from prospective customers that only 1 and 2 GB were worth it?
(The impact that's most relevant here is restricting the amount of RAM that the PS5 and Series consoles have. In turn, that relatively small increase in ram from one generation to the next is what drives the concept of using fast storage to quickly replace the contents of memory. And that in turn becomes a useful marketing point, which incidentally gets us in a tizzy over 'OMG, will the next system's storage be fast enough?')
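As a toy illustration of how much the missing densities matter (my own example, using a PS5-style 256-bit bus, i.e. eight 32-bit chips):

```python
# Capacities reachable on a 256-bit GDDR6 bus (8 x 32-bit chips),
# comparing the densities actually shipping against what the spec allows.
CHIPS = 256 // 32  # one chip per 32-bit channel

shipping_densities_gb = [1, 2]          # what manufacturers actually make
spec_densities_gb = [1, 1.5, 2, 3, 4]   # what the GDDR6 spec permits

print("shipping:", [d * CHIPS for d in shipping_densities_gb])  # [8, 16] GB
print("per spec:", [d * CHIPS for d in spec_densities_gb])      # [8, 12.0, 16, 24, 32] GB
```

With only 1 GB and 2 GB parts on the shelf, getting past 16 GB means adding memory channels rather than denser chips, which is exactly the constraint described above.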

...and if my mind wanders far enough along this track, I start wondering whether GDDR is something that's also shifted away from being consumer driven*. While I guess that we're lucky that LPDDR still seems to be driven by consumer devices and thus has plenty of density options actually offered.

*for example, like say... PCI-Express. Draft version 0.3 of PCI-Express 7.0 was finished this past week wasn't it? And consumer usage is still mostly fine with 3.0...
 
Could [redacted] run Starfield?
The last time I did some truly in depth prediction on [redacted] performance it was in the context of PS4 and cross-gen. Since then, the launch of truly "next-gen" games has come along, and my own understanding has grown, so I thought it might be worth returning to.

Rather than do some abstract "Redacted is 73% of Series 5, assuming Nintendo picks Zeta Megahertz on the Right Frombulator" I thought it would be nice to look in depth at Starfield, a game I'm curious about, and think about what it might look like on a theoretical [redacted]. Which, I guess, is kinda abstract since we're talking about unreleased software on unannounced hardware, but let me have this.

TL;DR: The Takeaway
If there is one thing I want folks to come away with from this exercise it's "the problems of last gen are not the problems of this gen. Same for the solutions."

I know that's not satisfying, but the PS5/Xbox Series consoles are not just bigger PS4/Xbox One, and [redacted] is not just a bigger Switch. Switch had big advantages and big disadvantages when it came to ports - [redacted] is the same but they are different advantages and disadvantages.

For the most part, the Series S doesn't "help" [redacted] ports as much as some folks think. And obviously, Starfield is going to remain console exclusive to Microsoft's machines. But yes, I believe a port of Starfield would be possible. It would also be a lot of work, and not in the ways that, say, The Witcher III was a lot of work.

Zen and the ARM of Gigacycle Maintenance
Behold, the ballgame:



Graphs like this kill a lot of nuance, but they're also easy to understand. Last gen TV consoles went with bad laptop CPUs. Switch went with a good mobile CPU. That put them in spitting distance of each other.

[redacted] is set to make a generational leap over Switch, but PS5/Xbox Series have made an even bigger leap, simply because of how behind they were before. And, most importantly - the daylight between Series S and Series X is minimal. The existence of a Series S version doesn't help at all here.

This is especially rough with Starfield, a game that is CPU limited. With GPU limited games, you can cut the resolution, but that won't help here. Cutting the frame rate would - except it's already 30fps. There are no easy solutions here.

That doesn't mean no solutions. But this puts it solidly in "holy shit how did they fit it onto that tiny machine" territory.

I Like It When You Call Me Big FLOPa
Good news: DLSS + The Series S graphics settings, done. Go back to worrying about the CPU, because that's the hard problem.

The tech pessimism - Ampere FLOPS and RDNA 2 FLOPS aren't the same, and it favors RDNA 2. Whatever the on-paper gap between [redacted] and Series S, the practical gap will be somewhat larger. If you want the numbers, open the spoiler. Otherwise, just trust me.

GPUs are not FLOPS alone. There are also ROPS/TMUs/memory subsystems/feature set. There are also tradeoffs for going for a wider/slower vs narrower/faster design. If we want to game out how Series S and [redacted] might perform against each other we would, ideally, want two GPUs that we could test that roughly parallel all those things.

The Series S GPU is 1280 cores, 80 TMUs, 32 ROPs, with 224 GB/s of memory bandwidth, at 4TFLOPS
[redacted]'s GPU is 1536 cores, ?? TMUs, 16 ROPs, with 102 GB/s of memory bandwidth, at a theoretical 3 TFLOPS.

The RX 6600 XT is 2048 cores, 128 TMUs, 64 ROPS, with 256 GB/s of memory bandwidth + 444.9 GB/s infinity cache, at 10.6 TFLOPS
The RTX 3050 is 2560 cores, 80 TMUs, 32 ROPs, with 224 GB/s of memory bandwidth, at 9 TFLOPS.

No comparison is perfect, but from a high level, this is pretty close. The Ampere card is slightly fewer FLOPS built on 25% more cores, the RDNA 2 card supports that compute power with twice as much rasterization hardware. And the performance is within the same realm as the existing consoles, so we're not trying to fudge from something insane like a 4090.

The downside of this comparison is the memory bandwidth. The consoles and the RX 6000 series have very different memory subsystems. We're going to act like "big bandwidth" on consoles and "medium bandwidth plus infinity cache" are different paths to the same result, but it's the biggest asterisk over the whole thing.

Digital Foundry has kindly provided us with dozens of data points of these two cards running the same game in the same machine at matched settings. Here are the 1080p, rasterization-only numbers:

Game | Ampere FPS | RDNA 2 FPS | Percentage
Doom Eternal | 156 | 231 | 67%
Borderlands 3 | 53 | 94 | 56%
Control | 54 | 83 | 65%
Shadow of the Tomb Raider | 90 | 132 | 68%
Death Stranding | 83 | 135 | 61%
Far Cry 5 | 95 | 139 | 68%
Hitman 2 | 96 | 146 | 65%
Assassin's Creed: Odyssey | 51 | 81 | 62%
Metro Exodus | 48 | 80 | 60%
Dirt Rally 2.0 | 62 | 104 | 59%
Assassin's Creed: Unity | 100 | 157 | 63%
As we can see pretty clearly, the Ampere card underperforms the RDNA 2 card by a significant margin, with only a 3.9% standard deviation. If we grade on a curve - adjusting for the difference in TFLOPS - that improves slightly. Going as the FLOPS fly, Ampere is performing at about 74% of RDNA 2.
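If anyone wants to reproduce that, the arithmetic is roughly this (my own sketch, feeding in the table above and the 9 vs 10.6 TFLOPS figures; small rounding differences from the figure quoted above are expected):

```python
# Average the per-game FPS ratios from the table, then "grade on the curve"
# by dividing out the raw TFLOPS ratio of the two cards.
ampere_fps = [156, 53, 54, 90, 83, 95, 96, 51, 48, 62, 100]       # RTX 3050
rdna2_fps  = [231, 94, 83, 132, 135, 139, 146, 81, 80, 104, 157]  # RX 6600 XT

ratios = [a / r for a, r in zip(ampere_fps, rdna2_fps)]
raw = sum(ratios) / len(ratios)   # ~0.64: raw performance ratio
per_flop = raw / (9 / 10.6)       # ~0.75: per-TFLOP, in line with the ~74% above

print(f"raw: {raw:.0%}, per TFLOP: {per_flop:.0%}")
```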

We could compare other cards, and I have, but the gap gets bigger, not smaller as you look elsewhere. Likely because where Nvidia spent silicon on tensor cores and RT units, AMD spent them on TMUs and ROPs.

If you take those numbers, an imaginary 3TFLOP [redacted] isn't 75% the performance of the Series S, but closer to 55%. We will obviously not be able to run the Series S version of the game without graphical changes. So what about DLSS? Again, technical analysis below, but the short answer is "DLSS Performance Mode should be fine".

Let's do some quick math. At 55% of the performance of Series S, if Series S can generate an image natively in 1ms, [redacted] can do it in 1.78ms. According to the DLSS programming guide, our theoretical [redacted] can get a 1440p image (the Series S target for Starfield) from a 720p source in 2.4ms.

Looking at those numbers it is clear that there is a point where DLSS breaks down - where the native image rendering is so fast, that the overhead of DLSS actually makes it slower. That should only happen in CPU limited games, but it just so happens, Starfield is a CPU limited game. So where is that line?

Series S GPU Time * 1.78 (the redacted performance ratio) * 0.25 (DLSS performance mode starts at 1/4 res) + 2.4ms (redacted's DLSS overhead) = Series S GPU Time

Don't worry, I've already solved it for you - it's 3.8ms. That would be truly an extremely CPU limited game. So DLSS seems extremely viable in most cases.

Starfield is a specific case, however, as is the Series S generally. Starfield uses some form of reconstruction, with a 2x upscale. If Series S is struggling to get there natively, will DLSS even be enough? Or to put it another way, does FSR "kill" DLSS?

Handily, AMD also provides a programming guide with performance numbers for FSR 2, and they're much easier to interpret than the DLSS ones. We can comfortably predict that FSR 2 Balanced Mode on Series S takes 2.9ms. You'll note that DLSS on [redacted] is still faster than FSR 2 on the bigger machine. That's the win of dedicated hardware.

And because of that, we're right back where we started. For GPU limited games, if the Series S can do it natively, we can go to half resolution, and DLSS back up in the same amount of time, or less. If the Series S is doing FSR at 2x, we can do 4x. If Series S is doing 4x, by god, we go full bore Ultra Performance mode. And should someone release a FSR Ultra Performance game on Series S, well, you know what, Xbox can keep it.

Worth noting that, even then, the options don't end for [redacted]. Series S tends to target 1440p because it scales nicely on a 4k display. But 1080p also scales nicely on a 4k display, giving us more options to tune.

Whether you are willing to put up with DLSS here is a subjective question, but this is a pretty straightforward DLSS upscale, nothing unusual at all. Where it might become dicey is if Imaginary Porting Studio decided to do something wild like go to Ultra Performance mode, not because of the graphics, but to free up time for the CPU to run. In CPU limited games, that rarely gives you the performance you need, but it's worth noting that [redacted] and DLSS do give us some "all hands on deck" options.

In Space, No One Can Hear You Stream
It's not just CPUs and GPUs obviously. The ninth gen machines all advertise super fast NVMe drives. Meanwhile, we have no idea what [redacted]'s storage solution will look like. But I don't want to talk too much about abstract performance, I want to talk about Starfield.

Starfield's PC requirements are informative. It requires an SSD, but doesn't specify type, nor does it recommend an NVMe. It only requires 16GB of RAM, which is pretty standard for console ports, which suggests that Starfield isn't doing anything crazy like using storage as an extra RAM pool on consoles. It's pretty classic open world asset streaming.

Let's make a little table:

Storage | Sequential read
Switch eMMC | 300 MB/s
Old SATA SSD | 300 MB/s
Modern eMMC | 400 MB/s
SATA III SSD | 500 MB/s
iPhone NVMe | 1600 MB/s
Series S NVMe | 2400 MB/s
Android UFS 4 | 3100 MB/s
UFS 4, on paper | 5800 MB/s

Nintendo has a lot of options, and pretty much all of them cross the Starfield line - if mandatory installs are allowed by Nintendo. There is a big long conversation about expansion and GameCard speed that I think is well beyond the scope here, and starts to get very speculative about what Nintendo's goals are. But at heart, there is no question of the onboard storage of [redacted] being fast enough for this game.

Don't Jump on the Waterbed
When you push down on the corner of a waterbed, you don't make the waterbed smaller, you just shift the water around.

You can do that with software, too. Work can be moved from one system (like the CPU) to another (RAM) if you're very clever about it (caching, in this case). Sometimes it's faster. Sometimes it's slower. But that doesn't matter so much as whether or not you've got room to move. This is likely one of the reasons that Nintendo has historically been so generous with RAM - it's cheap and flexible.

The danger with these next-gen ports isn't any one aspect being beyond what [redacted] can do. It's about multiple aspects combining to leave no room to breathe. NVMe speed you can work around, the GPU can cut resolution, the CPU can be hyper optimized. But all three at once makes for a tricky situation.

At this point I don't see evidence of that in Starfield - I suspect only the CPU is a serious bottleneck. But some minor things worth bringing up:

RAM - reasonable expectations are that Nintendo will go closer to 12 GB than 8 GB, so I don't see RAM as a serious issue.

Storage space - PC requirements call for a whopping 128GB of free space. That's much larger than Game Cards, and most if not all of the likely on board storage in [redacted]. There are likely a bunch of easy wins here, but it will need more than just easy wins to cross that gap.

Ray Tracing - Starfield uses no RT features on consoles, so despite the fact that [redacted] likely does pretty decent RT for its size, it's irrelevant here.

Appendix: The Name is Trace. Ray Trace
But someone will ask, so here is the quick version: [redacted]'s RT performance is likely to be right up to Series S. But it's not like Series S games often have RT, and RT does have a decent CPU cost, where [redacted] is already weakest. So expect RT to be a first party thing, and to be mostly ignored in ports.

Let's look at some benchmarks again. The 3050 vs the 6600 XT once more. This time we're using 1440p resolution, For Reasons.

Game | 3050 FPS | 3050 FPS w/RT | RT Cost | 6600 XT FPS | 6600 XT FPS w/RT | RT Cost
Control | 35 | 19 | 24.1ms | 49 | 20 | 29.6ms
Metro Exodus | 37 | 24 | 14.6ms | 60 | 30 | 16.7ms
The method here is less obvious than before. We've taken the games at max settings with RT off, then turned RT on, and captured their frame rates. Then we've turned the frame rate into frame time - how long it took to draw each frame on screen. We've then subtracted the time of the pure raster frame from the RT frame.
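In code form, the method is just a frame-time subtraction (a sketch using the Control numbers from the table above):

```python
# Convert FPS to frame time (ms) and subtract the raster-only frame time
# from the RT-enabled frame time to get the per-frame cost of RT.
def rt_cost_ms(fps_raster: float, fps_rt: float) -> float:
    return 1000 / fps_rt - 1000 / fps_raster

print(f"3050, Control:    {rt_cost_ms(35, 19):.1f} ms")  # ~24.1 ms
print(f"6600 XT, Control: {rt_cost_ms(49, 20):.1f} ms")  # ~29.6 ms
```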

This gives us the rough cost of RT in each game, for each card, lower is better. And as you can see, despite the fact that the 3050 is slower than the 6600 XT by a significant margin, in pure RT performance, it's faster. About 38% faster when you grade on the curve for the difference in TFLOPS.

There aren't a lot of games with good available data like this to explore, but there are plenty of cards, and you can see that this ratio tends to hold.

Game | 3060 FPS | 3060 FPS w/RT | RT Cost | 6700 XT FPS | 6700 XT FPS w/RT | RT Cost
Control | 55 | 28 | 17.5ms | 67 | 25 | 25.1ms
Metro Exodus | 54 | 35 | 10.1ms | 74 | 37 | 13.5ms
This gives us 43% improvement for Ampere, adjusted for FLOPS.

Applying this adjustment, our theoretical 3TF [redacted] outperforms the 4TF Series S by 3.5%.

It's worth noting that RDNA 2 doesn't have dedicated RT cores. Instead, BVH traversal runs on the shader cores, and triangle intersections are tested by hardware added to the existing TMUs, while Ampere performs both operations on dedicated hardware. This should reduce the load on the rest of the GPU, and also opens up the possibility of further wins when using async compute.

Really appreciate the in-depth writeup on this! I usually lurk and read through all these, but the topic of Starfield running on ReDraketed really interested me, so I'm glad to see the same thought, care and professionalism as your usual posts put into this as well!
 
They could also yoink whatever magic Valve gave their SD Card slot as that slot seemingly has no real performance degradation versus internal outside of extreme scenarios
That's not magic. That's the CPU bottlenecking the loading of assets, because they're compressed and need to be decompressed.
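To put some illustrative numbers on that (a toy model with made-up figures, not measurements): if the CPU can only decompress so much data per second, the card's raw read speed stops mattering, whereas a hardware decompression block pushes the limit back to the raw read speed.

```python
# Toy model: effective (uncompressed) asset throughput is the compressed
# throughput, capped by whichever is slower (storage or decompressor),
# multiplied by the compression ratio. All numbers here are illustrative.
def effective_mb_s(read_mb_s: float, decompress_mb_s: float, ratio: float = 2.0) -> float:
    return min(read_mb_s, decompress_mb_s) * ratio

print(effective_mb_s(400, 150))     # CPU-bound decompression: 300 MB/s of assets delivered
print(effective_mb_s(400, 10_000))  # dedicated decompression hardware: 800 MB/s delivered
```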
 
Handheld ps4 for next switch will be a total disapointment for us the switch brew community, because we already can make the switch play games on handheld with that capacity.
No you can’t :p

There’s no way possible for even overclocked switch to match the PS4.


And the only thing people refer to with PS4 is just the GPU and its paper numbers.

the rest is different.
 
I noticed that too when I did my research, though I never dug as deep as you. Thank you for this overview.
I think UFS Cards died because there's basically no top-tier phone that supports microSD any more, so no one ever really put a UFS card reader into their phone either. Meanwhile, microSD cards are still way cheaper. So UFS cards were a good idea in theory, but they never reached a market where they brought enough benefit. The only market I could have seen make UFS Cards happen is high-end cameras, but the absolute top-tier ones already support NVMe directly and the more affordable ones never picked UFS up. Chicken-and-egg problem.
So I doubt anything will happen here; cold storage on microSD seems much more logical to me anyway. They could theoretically even offer some high-speed 50GB cache and load any game into it; people have already shown on PC that they're OK with low performance at startup while shaders get compiled.
 
A large percentage of games will be completely fine with microSD speeds; we're only talking about games that push fast asset streaming. That will be more and more of them as we move past cross-gen, but still far from every game.

MS and Sony enforce SSD installs for all games; we'll see what Nintendo does.
 


So if your game is well equipped to use NVMe speeds, there will be a good-sized difference. But slow games will be similar.

Since UFS is a good deal slower than NVMe, the difference wouldn't be nearly as large. But since I don't think the Switch even hits 100MB/s from its SD card slot, we might still see a big increase in speeds.
 


The Switch, and to a lesser degree the SD card, are definitely bottlenecked by the CPU (hence the Switch's 1.8GHz CPU boost mode for loading screens). Drake's FDE will mitigate this.
 
I do wonder if Nintendo and Nvidia added support for two UFS 3.x (2x) controllers for Drake, with one dedicated to the internal flash storage, and one dedicated to UFS Cards. I know Orin has support for one UFS 3.0 (2x) controller.


Assuming Drake taped out in 1H 2022 (here and here), which I do think is likely, I don't expect UFS 4.0 support.

If the AGX Orin block diagram (here and here) is any indication, the UFS controller is directly integrated into the SoC, which means Nintendo and Nvidia need to decide which UFS controller to use before taping out Drake.

The UFS 4.0 specs were published by JEDEC on 17 August 2022. And the first smartphone equipped with the Snapdragon 8 Gen 2, which supports UFS 4.0, was released on 6 December 2022. So I doubt the UFS 4.0 controller was available to be used during 1H 2022.

I agree that UFS 4.0 is very unlikely, but I don't think the timing has much to do with it. These standards are typically available in draft form long before they're officially published, to allow chip makers to get products ready in a timely fashion. The Snapdragon 8 Gen 2, for example, likely taped out around a year before it first appeared in a shipping phone, so Qualcomm would have had to have access to the standard long before publication (there's a good chance a Qualcomm employee was on the committee writing the standard). Ditto with the actual UFS 4.0 chips themselves, which require custom flash controllers which would have had to be taped out on a similar timescale.


I don't disagree with anything you've written here, but the unfortunate reality is that any alternative they have for faster storage speeds than UHS-I MicroSD also can't sustain an ecosystem. SD Express barely exists in the full-size format, with microSD cards nowhere in sight, and although CFexpress Type A is in use, it's only currently used in high-end Sony cameras, which is a niche of a niche of a shrinking industry. Even UHS-II SD cards, which exist in full size format as a niche for mid-range cameras, are very rarely used in microSD format.

It used to be the case that you could depend on the camera industry to push new memory card formats, but with the Switch outselling the entire dedicated camera industry 2 to 1 in an off year, Nintendo's choices for fast expandable storage are to singlehandedly revive a dead medium, or to singlehandedly revive a barely-living medium. Nintendo would account for 90%+ of the market for any card format faster than 100MB/s, so the ecosystem will be built around them in any case. The one argument for UFS Cards is that the hardware is basically identical to the already mass-market embedded UFS, so building that ecosystem would probably be relatively quick and cheap.
 
You're obviously correct, and it's one of the reasons I said graphs like that destroy nuance. I think the counter to your on-paper argument is that last-gen games were more likely to be single-thread bound than modern games, so the lack of cores in the Switch was less of an issue. Even if the numbers don't perfectly reflect the gap, I think practically speaking it communicates the core idea that the CPU is more likely to be a big deal for ports this gen than the GPU: seeing a game struggle graphically on Series S doesn't mean it's a hard port, and conversely, looking great on Series S doesn't mean you can just cut resolution to get onto [redacted]. It's just a different ballgame this gen.
Yes, you're right too, although I have my doubts about how much of an advantage the difficulty of multithreaded programming really was for the Switch. It's a fact that even today in the PC space many devs still allocate CPU resources badly, but distributing work well across threads was basically a matter of survival in the PS4/XOne console generation; as weak as the Jaguar CPUs were, if you didn't optimize the game well it wouldn't even run.
I think we should consider context too: when the Switch was released, even a Pentium G4560 delivered more performance than the PS4/XOne CPU, but even if Redacted is released in 2025, the PS5 and Series CPUs will still be quite competent at that time.
 
All this talk about whether Redacted could run Starfield has me thinking, and hoping, that it can run ES6 when the time comes. Assuming MS even makes it multiplatform. 😖

Let's just say Nintendo has a non-zero chance of getting Microsoft titles compared to Sony. They've been very selective with what's been made available, but they are not against releasing stuff on Nintendo's system. Just unclear if this would be one of them and their wording about Switch ports and releases has been kinda odd as a whole (falls in the camp of stuff that "makes sense", if I remember what they said correctly). That said, a theoretical Switch 2/Drake/etc., port of Starfield (and ES6) would be fascinating, to say the least.
 
Could [redacted] run Starfield?
The last time I did some truly in depth prediction on [redacted] performance it was in the context of PS4 and cross-gen. Since then, the launch of truly "next-gen" games has come along, and my own understanding has grown, so I thought it might be worth returning to.

Rather than do some abstract "Redacted is 73% of Series 5, assuming Nintendo picks Zeta Megahertz on the Right Frombulator" I thought it would be nice to look in depth at Starfield, a game I'm curious about, and think about what it might look like on a theoretical [redacted]. Which, I guess, is kinda abstract since we're talking about unreleased software on unannounced hardware, but let me have this.

TL;DR: The Takeaway
If there is one thing I want folks to come away with from this exercise it's "the problems of last gen are not the problems of this gen. Same for the solutions."

I know that's not satisfying, but the PS5/Xbox Series consoles are not just bigger PS4/Xbox One, and [redacted] is not just a bigger Switch. Switch had big advantages and big disadvantages when it came to ports - [redacted] is the same but they are different advantages and disadvantages.

For the most part, the Series S doesn't "help" [redacted] ports as much as some folks think. And obviously, Starfield is going to remain console exclusive to Microsoft's machines. But yes, I believe a port of Starfield would be possible. It would also be a lot of work, and not in the ways that, say, The Witcher III was a lot of work.

Zen and the ARM of Gigacycle Maintenance
Behold, the ballgame:

[CPU comparison graph]
Graphs like this kill a lot of nuance, but they're also easy to understand. Last gen TV consoles went with bad laptop CPUs. Switch went with a good mobile CPU. That put them in spitting distance of each other.

[redacted] is set to make a generational leap over Switch, but PS5/Xbox Series have made an even bigger leap, simply because of how behind they were before. And, most importantly - the daylight between Series S and Series X is minimal. The existence of a Series S version doesn't help at all here.

This is especially rough with Starfield, a game that is CPU limited. With GPU limited games, you can cut the resolution, but that won't help here. Cutting the frame rate would - except it's already 30fps. There are no easy solutions here.

That doesn't mean no solutions. But this puts in solidly "holy shit how did they fit it onto that tiny machine" territory.

I Like It When You Call Me Big FLOPa
Good news: DLSS + The Series S graphics settings, done. Go back to worrying about the CPU, because that's the hard problem.

The tech pessimism - Ampere FLOPS and RDNA 2 FLOPS aren't the same, and it favors RDNA 2. Whatever the on-paper gap between [redacted] and Series S, the practical gap will be somewhat larger. If you want the numbers, open the spoiler. Otherwise, just trust me.

GPUs are not FLOPS alone. There are also ROPS/TMUs/memory subsystems/feature set. There are also tradeoffs for going for a wider/slower vs narrower/faster design. If we want to game out how Series S and [redacted] might perform against each other we would, ideally, want two GPUs that we could test that roughly parallel all those things.

The Series S GPU is 1280 cores, 80 TMUs, 32 ROPs, with 224 GB/s of memory bandwidth, at 4TFLOPS
[redacted]'s GPU is 1536 cores, ?? TMUs, 16 ROPs, with 102 GB/s of memory bandwidth, at a theoretical 3 TFLOPS.

The RX 6600 XT is 2048 cores, 128 TMUs, 64 ROPS, with 256 GB/s of memory bandwidth + 444.9 GB/s infinity cache, at 10.6 TFLOPS
The RTX 3050 is 2560 cores, 80 TMUs, 32 ROPs, with 224 GB/s of memory bandwidth, at 9 TFLOPS.

No comparison is perfect, but from a high level, this is pretty close. The Ampere card has slightly fewer FLOPS built on 25% more cores, while the RDNA 2 card supports that compute power with twice as much rasterization hardware. And the performance is within the same realm as the existing consoles, so we're not trying to fudge from something insane like a 4090.

The downside of this comparison is the memory bandwidth. The consoles and the RX 6000 series have very different memory subsystems. We're going to act like "big bandwidth" on consoles and "medium bandwidth plus infinity cache" are different paths to the same result, but it's the biggest asterisk over the whole thing.

Digital Foundry has kindly provided us with dozens of data points of these two cards running the same games in the same machine at matched settings. Here are the 1080p, rasterization-only numbers:

Game                       | Ampere FPS | RDNA 2 FPS | Percentage
Doom Eternal               | 156        | 231        | 67
Borderlands 3              | 53         | 94         | 56
Control                    | 54         | 83         | 65
Shadow of the Tomb Raider  | 90         | 132        | 68
Death Stranding            | 83         | 135        | 61
Far Cry 5                  | 95         | 139        | 68
Hitman 2                   | 96         | 146        | 65
Assassin's Creed: Odyssey  | 51         | 81         | 62
Metro Exodus               | 48         | 80         | 60
Dirt Rally 2.0             | 62         | 104        | 59
Assassin's Creed: Unity    | 100        | 157        | 63

As we can see pretty clearly, the Ampere card underperforms the RDNA 2 card by a significant margin, with only a 3.9% standard deviation. If we grade on a curve - adjusting for the difference in TFLOPS - that improves slightly. Going as the FLOPS fly, Ampere is performing at about 74% of RDNA 2.
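For anyone who wants to check the arithmetic, here's a minimal Python sketch that reproduces the ratio, spread, and FLOPS-adjusted figure from the table above (the 9 and 10.6 TFLOPS values are just the two cards' listed compute numbers):

```python
# Per-game Ampere (RTX 3050) vs RDNA 2 (RX 6600 XT) ratio from the 1080p table above.
fps = {  # game: (Ampere FPS, RDNA 2 FPS)
    "Doom Eternal": (156, 231),
    "Borderlands 3": (53, 94),
    "Control": (54, 83),
    "Shadow of the Tomb Raider": (90, 132),
    "Death Stranding": (83, 135),
    "Far Cry 5": (95, 139),
    "Hitman 2": (96, 146),
    "Assassin's Creed: Odyssey": (51, 81),
    "Metro Exodus": (48, 80),
    "Dirt Rally 2.0": (62, 104),
    "Assassin's Creed: Unity": (100, 157),
}

ratios = [ampere / rdna2 * 100 for ampere, rdna2 in fps.values()]
mean = sum(ratios) / len(ratios)
stdev = (sum((r - mean) ** 2 for r in ratios) / (len(ratios) - 1)) ** 0.5
print(f"Ampere at ~{mean:.0f}% of RDNA 2 (std dev {stdev:.1f}%)")  # ~63%, ~3.9%

# Grade on the curve for the TFLOPS gap: 9 TFLOPS (3050) vs 10.6 TFLOPS (6600 XT).
print(f"Per-FLOP, Ampere at ~{mean * 10.6 / 9:.0f}% of RDNA 2")    # ~74%
```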

We could compare other cards, and I have, but the gap gets bigger, not smaller as you look elsewhere. Likely because where Nvidia spent silicon on tensor cores and RT units, AMD spent them on TMUs and ROPs.

If you take those numbers, an imaginary 3TFLOP [redacted] isn't 75% of the performance of the Series S, but closer to 55%. We will obviously not be able to run the Series S version of the game without graphical changes. So what about DLSS? Again, technical analysis below, but the short answer is "DLSS Performance Mode should be fine".

Let's do some quick math. At 55% of the performance of Series S, if Series S can generate an image natively in 1ms, [redacted] can do it in 1.78ms. And according to the DLSS programming guide, on our theoretical [redacted] we can get a 1440p image (the Series S target for Starfield) from a 720p source in 2.4ms.

Looking at those numbers it is clear that there is a point where DLSS breaks down - where the native image rendering is so fast, that the overhead of DLSS actually makes it slower. That should only happen in CPU limited games, but it just so happens, Starfield is a CPU limited game. So where is that line?

Series S GPU Time * 1.78 (the redacted performance ratio) * 0.25 (DLSS performance mode starts at 1/4 res) + 2.4ms (redacted's DLSS overhead) = Series S GPU Time

Don't worry, I've already solved it for you - it's about 4.3ms. That would truly be an extremely CPU limited game. So DLSS seems extremely viable in most cases.
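If you want to sanity-check that break-even yourself, here's a tiny sketch using the same assumed numbers from above (the 1.78x GPU-time ratio, the 1/4-resolution internal render of Performance mode, and the 2.4ms DLSS cost):

```python
# Break-even for DLSS Performance mode on [redacted], using the assumptions above.
perf_ratio = 1.78    # [redacted] GPU time relative to Series S for the same work
dlss_scale = 0.25    # Performance mode renders 1/4 of the output pixels
dlss_cost_ms = 2.4   # fixed DLSS pass cost at 1440p output (DLSS guide figure quoted above)

def redacted_gpu_ms(series_s_gpu_ms):
    """GPU time on [redacted] with DLSS, given the Series S native GPU time."""
    return series_s_gpu_ms * perf_ratio * dlss_scale + dlss_cost_ms

# The Series S frame time below which DLSS stops being a win:
break_even = dlss_cost_ms / (1 - perf_ratio * dlss_scale)
print(f"Break-even: {break_even:.1f} ms")                    # ~4.3 ms

# Sanity check at a typical 33.3 ms (30fps) GPU budget, well above the break-even:
print(f"{redacted_gpu_ms(33.3):.1f} ms vs 33.3 ms native")   # ~17.2 ms
```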

Starfield is a specific case, however, as is the Series S generally. Starfield uses some form of reconstruction, with a 2x upscale. If Series S is struggling to get there natively, will DLSS even be enough? Or to put it another way, does FSR "kill" DLSS?

Handily, AMD also provides a programming guide with performance numbers for FSR 2, and they're much easier to interpret than the DLSS ones. We can comfortably predict that FSR 2 Balanced Mode on Series S takes 2.9ms. You'll note that DLSS on [redacted] is still faster than FSR 2 on the bigger machine. That's the win of dedicated hardware.

And because of that, we're right back where we started. For GPU limited games, if the Series S can do it natively, we can go to half resolution, and DLSS back up in the same amount of time, or less. If the Series S is doing FSR at 2x, we can do 4x. If Series S is doing 4x, by god, we go full bore Ultra Performance mode. And should someone release a FSR Ultra Performance game on Series S, well, you know what, Xbox can keep it.

Worth noting that even then, the options don't end for [redacted]. Series S tends to target 1440p because it scales nicely on a 4K display. But 1080p also scales nicely on a 4K display, giving us more options to tune.

Whether you are willing to put up with DLSS here is a subjective question, but this is a pretty straightforward DLSS upscale, nothing unusual at all. Where it might become dicey is if Imaginary Porting Studio decided to do something wild like go to Ultra Performance mode, not because of the graphics, but to free up time for the CPU to run. In CPU limited games, that rarely gives you the performance you need, but it's worth noting that [redacted] and DLSS do give us some "all hands on deck" options.

In Space, No One Can Hear You Stream
It's not just CPUs and GPUs obviously. The ninth gen machines all advertise super fast NVMe drives. Meanwhile, we have no idea what [redacted]'s storage solution will look like. But I don't want to talk too much about abstract performance, I want to talk about Starfield.

Starfield's PC requirements are informative. It requires an SSD, but doesn't specify the type, nor does it recommend an NVMe drive. It only requires 16GB of RAM, which is pretty standard for console ports and suggests that Starfield isn't doing anything crazy like using storage as an extra RAM pool on consoles. It's pretty classic open world asset streaming.

Let's make a little table:

Storage option   | Speed
Switch eMMC      | 300 MB/s
Old SATA SSD     | 300 MB/s
Modern eMMC      | 400 MB/s
SATA III SSD     | 500 MB/s
iPhone NVMe      | 1600 MB/s
Series S NVMe    | 2400 MB/s
Android UFS 4    | 3100 MB/s
UFS 4, on paper  | 5800 MB/s

Nintendo has a lot of options, and pretty much all of them cross the Starfield line - if mandatory installs are allowed by Nintendo. There is a big long conversation about expansion and GameCard speed that I think is well beyond the scope here, and starts to get very speculative about what Nintendo's goals are. But at heart, there is no question of the onboard storage of [redacted] being fast enough for this game.
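As a rough illustration, here's a sketch of that comparison; the 300MB/s "line" is my assumption, treating an old SATA SSD from the table above as the slowest thing that still satisfies the PC spec's "an SSD" requirement:

```python
# Which storage options clear a rough ~300 MB/s "Starfield line" (assumed threshold).
options_mb_s = {
    "Switch eMMC": 300, "Old SATA SSD": 300, "Modern eMMC": 400, "SATA III SSD": 500,
    "iPhone NVMe": 1600, "Series S NVMe": 2400, "Android UFS 4": 3100, "UFS 4, on paper": 5800,
}
STARFIELD_LINE_MB_S = 300  # assumption: an old SATA SSD is the slowest drive that still counts

for name, speed in options_mb_s.items():
    verdict = "crosses the line" if speed >= STARFIELD_LINE_MB_S else "falls short"
    print(f"{name:16s} {speed:5d} MB/s  {verdict}")
```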

Don't Jump on the Waterbed
When you push down on the corner of a waterbed, you don't make the waterbed smaller, you just shift the water around.

You can do that with software, too. Work can be moved from one system (like the CPU) to another (RAM) if you're very clever about it (caching, in this case). Sometimes it's faster. Sometimes it's slower. But that doesn't matter so much as whether or not you've got room to move. This is likely one of the reasons that Nintendo has historically been so generous with RAM - it's cheap and flexible.

The danger with these next-gen ports isn't any one aspect being beyond what [redacted] can do. It's multiple aspects combining to leave no room to breathe. NVMe speed you can work around, the GPU can cut resolution, the CPU can be hyper-optimized. But all three at once makes for a tricky situation.

At this point I don't see evidence of that in Starfield - I suspect only the CPU is a serious bottleneck. But some minor things are worth bringing up:

RAM - reasonable expectations are that Nintendo will go closer to 12 GB than 8 GB, so I don't see RAM as a serious issue.

Storage space - PC requirements call for a whopping 128GB of free space. That's much larger than Game Cards, and most if not all of the likely on board storage in [redacted]. There are likely a bunch of easy wins here, but it will need more than just easy wins to cross that gap.

Ray Tracing - Starfield uses no RT features on consoles, so despite the fact that [redacted] likely does pretty decent RT for its size, it's irrelevant here.

Appendix: The Name is Trace. Ray Trace
But someone will ask, so here is the quick version: [redacted]'s RT performance is likely to be right up there with the Series S. But it's not like Series S games often have RT, and RT does have a decent CPU cost, where [redacted] is already weakest. So expect RT to be a first party thing, and to be mostly ignored in ports.

Let's look at some benchmarks again. The 3050 vs the 6600 XT once more. This time we're using 1440p resolution, For Reasons.

Game          | 3050 FPS | 3050 FPS w/RT | RT Cost | 6600 XT FPS | 6600 XT FPS w/RT | RT Cost
Control       | 35       | 19            | 24.1ms  | 49          | 20               | 29.6ms
Metro Exodus  | 37       | 24            | 14.6ms  | 60          | 30               | 16.7ms
The method here is less obvious than before. We've taken the games at max settings with RT off, then turned RT on, and captured their frame rates. Then we've turned the frame rate into frame time - how long it took to draw each frame on screen. We've then subtracted the time of the pure raster frame from the RT frame.
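In code form, the arithmetic is just this (a minimal sketch reproducing the RT Cost column from the table above):

```python
# RT cost = frame time with RT on minus frame time with RT off (lower is better).
def rt_cost_ms(fps_raster: float, fps_rt: float) -> float:
    return 1000 / fps_rt - 1000 / fps_raster

print(f"Control, 3050:    {rt_cost_ms(35, 19):.1f} ms")   # 24.1 ms
print(f"Control, 6600 XT: {rt_cost_ms(49, 20):.1f} ms")   # 29.6 ms
print(f"Metro,   3050:    {rt_cost_ms(37, 24):.1f} ms")   # 14.6 ms
print(f"Metro,   6600 XT: {rt_cost_ms(60, 30):.1f} ms")   # 16.7 ms
```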

This gives us the rough cost of RT in each game, for each card, lower is better. And as you can see, despite the fact that the 3050 is slower than the 6600 XT by a significant margin, in pure RT performance, it's faster. About 38% faster when you grade on the curve for the difference in TFLOPS.

There aren't a lot of games with good available data like this to explore, but there are plenty of cards, and you can see that this ratio tends to hold.

Game          | 3060 FPS | 3060 FPS w/RT | RT Cost | 6700 XT FPS | 6700 XT FPS w/RT | RT Cost
Control       | 55       | 28            | 17.5ms  | 67          | 25               | 25.1ms
Metro Exodus  | 54       | 35            | 10.1ms  | 74          | 37               | 13.5ms
This gives us 43% improvement for Ampere, adjusted for FLOPS.

Applying the ~38% adjustment from the first comparison, our theoretical 3TF [redacted] outperforms the 4TF Series S at ray tracing by about 3.5%.
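For the curious, that figure falls straight out of the per-FLOP RT advantage estimated above; a quick sketch (the 3TF and 4TF values are the same theoretical figures used throughout this post):

```python
# FLOPS-adjusted RT comparison, using the ~38% per-FLOP Ampere advantage from the
# 3050 vs 6600 XT data above (an estimate, not a measured [redacted] number).
redacted_tflops = 3.0
series_s_tflops = 4.0
ampere_rt_advantage = 1.38

effective_rdna2_equiv = redacted_tflops * ampere_rt_advantage          # ~4.14 "RDNA 2 TFLOPS"
print(f"{(effective_rdna2_equiv / series_s_tflops - 1) * 100:.1f}%")   # ~+3.5%
```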

It's worth noting that RDNA 2 doesn't have fully dedicated RT hardware. BVH traversal runs on the shader cores, and triangle intersections are tested by ray accelerators built into the TMUs, whereas Ampere performs both operations on dedicated hardware. This should reduce the shader and CPU load, and also opens up the possibility of further wins when using async compute.

Nice analysis! Though I think if the REDACTED GPU only has 16 ROPs, games will likely be limited by fillrate before being compute-bound or memory-bandwidth starved. Having a GPU with such a huge amount of shading power while having low pixel fillrate seems pretty unbalanced too.
Given that the PS4 GPU has 32 ROPs @ 800MHz and REDACTED has 16 ROPs @ ~600MHz in handheld mode, REDACTED will have less than half of the PS4's fillrate in portable mode. Even docked, REDACTED will still have a fillrate deficit compared to the PS4 unless they clock the GPU really high. That being said, there are architectural improvements in the REDACTED GPU that might help with the lower pixel throughput, though I'm not sure how much of an improvement they might make.
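For a quick sense of scale, here's that fillrate math spelled out (the ~600MHz handheld clock comes from the post above; the docked clock is purely a hypothetical for illustration):

```python
# Pixel fillrate = ROPs x clock (one pixel per ROP per clock, ignoring architectural factors).
def fillrate_gpix_s(rops: int, clock_ghz: float) -> float:
    return rops * clock_ghz

ps4      = fillrate_gpix_s(32, 0.8)   # 25.6 Gpixel/s
handheld = fillrate_gpix_s(16, 0.6)   #  9.6 Gpixel/s, ~38% of PS4
docked   = fillrate_gpix_s(16, 1.1)   # 17.6 Gpixel/s at a hypothetical 1.1 GHz, still below PS4

# Clock a 16-ROP GPU would need just to match the PS4's raw fillrate:
print(f"{ps4 / 16:.1f} GHz")   # 1.6 GHz
```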
 