
StarTopic Future Nintendo Hardware & Technology Speculation & Discussion |ST| (Read the staff posts before commenting!)

If I were forced to choose between a downclocked Orin at 8nm and a top-of-the-line Snapdragon 888 at 5nm, I would go for the Orin.
That's the problem. A downclocked Orin NX running at TX1 clocks for the GPU and CPU would theoretically be less powerful than a stock S888 running with the Switch's cooling solution. It would maybe be more on par on the CPU side with 8 real big cores vs 4+4 cores, but I suspect that the higher-clocked S888's A78s would make up the 25% difference in MT perf and would beat it in ST perf.

That said, I would also expect the S888 to be too expensive for a $350 device in 2021, but I would expect a refined 2022 S890 to make the cut in a Q4 2022 device at $300.
 
The expectations for Dane's performance haven't changed; "PS4 with DLSS on top" is still the expected level. Given that they are going from 16M transistors per mm² to 50M transistors per mm², with four-generations-newer GPU and CPU architectures and 4x the memory bandwidth, this is a pretty great outcome: roughly a 4x GPU performance increase before DLSS, with ray tracing a real possibility.

Who cares if it's 7nm or 8nm? That means nothing to the end user; it's entirely about the performance we can expect, and it's within expectations.

I think before we start with the whole "sky is falling", we should think about the jump in performance here, because this is in that PS4 Pro / XBSS / XB1X performance tier. Imagine what that will do for your gaming on the go; what phones are offering much more than this at $400?
 
We will have to wait for Orin's die size before concluding on a 50 MTr/mm² density. It would be 60 MTr/mm² with the 21 BTr count and Xavier's die size, but 37 MTr/mm² with the old and new 17 BTr count and the visually bigger 400-450 mm² die size. T239 is probably derivative of T234, but I hope it doesn't share its transistor density, especially if we are closer to the 37 MTr/mm² scenario.
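The two density scenarios above are simple ratios. A quick sketch (the transistor counts and die sizes are the unconfirmed figures from this thread, not official Nvidia numbers):

```python
# Transistor density as a simple ratio. The 21/17 BTr counts and the
# 350-450 mm^2 die sizes are speculation from the discussion above.

def density_mtr_per_mm2(transistors_btr: float, die_mm2: float) -> float:
    """Density in millions of transistors per mm^2."""
    return transistors_btr * 1e3 / die_mm2

print(round(density_mtr_per_mm2(21, 350)))  # optimistic: ~60 MTr/mm^2
print(round(density_mtr_per_mm2(17, 450)))  # pessimistic: ~38 MTr/mm^2
```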
 

I think this is a fair point, and in comparison to both the Xbox One and PS4, there have been a number of architectural advancements over the years that level the playing field. The best thing Nvidia and Nintendo can do with what's available is design the most balanced piece of hardware possible for, say, $350-$400, one that lets them fully realize the functionality of a hybrid gaming device without much compromise.
 
It's crazy how significant this update would be. I mean, a PS4 power level with DLSS, a better CPU, and a better GPU? That is pretty significant. Looking at things, I personally believe this is going to be another Switch iteration: a "Switch 4K", which would be the most expensive of the Switch family. I just don't see this as Switch 2, if only because, if we really believe Nintendo saying the Switch is at its mid-life cycle, they should release the Switch 4K early or mid next year. They can ride it out for many years before they feel pressure to release a brand-new console. I personally don't believe there will be a "Switch 2" or anything along those lines. That just isn't Nintendo. They will release a brand-new console with a new concept when it's time.
 
Dane won't be able to run the bench with more battery life left if those SoCs were running at the same CPU clocks.


Here are tests made on the 8cx Gen 3. Not sure if they are legit, but they show what we could expect from 8 big cores. Compared to the video posted earlier, an 8×A78C configuration (with two different sets of clocks and memory configurations) would be 30% faster than the S870 in MT and 25% faster than the S888. That would probably make a higher-clocked 7 nm or 5 nm 4+4 CPU faster than a 1.2 GHz 8-core CPU on 8 nm. Moreover, having 8 big CPU cores would limit the space left on the die for the GPU.

I would be pleased to see Orin NX performance somehow transferred to Dane in a smaller package, but the numbers show that it won't be possible with Orin's density and power consumption. They will have to either use uHD libraries to cram that many transistors into a smaller area on 8 nm, move to a newer and denser node to reduce the die footprint and power consumption, or reduce the number of CPU/GPU cores.

Are you comparing CPU or GPU? Of course Dane won't be on par in CPU; no one here expects that. But games won't be as CPU-bound on a modestly clocked 6-8 core A78. GPU-wise, Dane will blow all of these out of the water.
 
But that's only the theoretical performance. Only with RDNA 2 did AMD approach Nvidia in performance per (stated) flop. Although we have a lot less information about Nvidia vs phone GPUs, the Switch was trading blows with the iPhone X in Fortnite, which was stated to be close to 500 GFLOPs. By specs alone, the Switch would be closer to the iPhone 7 at ~200 GFLOPs, but it was no competition in any game available on both devices. And that was Apple; Adreno and Mali are significantly weaker.

I think what holds the Switch back now is the CPU, which was on the weaker side even in 2017, and the memory bandwidth.

Edit: Found the specs: the Apple A9 was stated at 249 GFLOPs and the Apple A10 Fusion at 349 GFLOPs. So the iPhone 6S should be more powerful than the undocked Switch by theoretical specs...

Source: https://gadgetversus.com/processor/apple-a9-vs-apple-a10-fusion/
 
The last piece of the Apple ecosystem would be a gaming console, finally pushing devs to make AAA games with the Metal API and using its node advantage to kill off the other console competitors.

Nintendo may not be the sole handheld supplier in two or three years. Cloud gaming may also be more prevalent.
The S888 has 85% of the raw GPU power of Orin NX (comparing a smaller SoC on a high-end node). The 8cx Gen 1 has 92%, and the 8cx Gen 2/SQ1/SQ2 have more raw GPU power than Orin NX (comparing Dane-sized chips on the closer N7 node). Dane will not blow these out of the water. DLSS may only be there to push the resolution to 4K in docked mode, and this is first a handheld device that happens to have a TV mode, not a home console that happens to have a really low-powered handheld mode.
 
What are you basing these numbers on, FLOPs or benchmarks?
 
Theoretical FLOPs, which is the best available way to compare these chips, as there is no Orin NX bench and the Android GPU drivers are what they are.
Even if we do take the drivers as being on par for example's sake, the Ampere/RDNA2 GPUs are built to be as performant as possible in every aspect. Mali and Adreno GPUs are built around efficiency, not super-high performance.


It's not really a 1:1 comparison, at least for a couple of years; probably by 2026 mobile will match what those GPUs offer in actual raster (not in the year 2026 specifically, just by that year).
 
It's questionable enough to use FLOPs across different Nvidia generations; it's highly questionable across different companies. This is why I always use benches rather than FLOPs: it's fairer.
There is no single best way to compare different gaming platforms and GPU power. For real-world performance, Genshin Impact has been used to compare the gaming experience and the 'raw gaming performance' across platforms (PS4/iOS/Android) and across SoC manufacturers on the same platform (QC Samsung / Exynos Samsung / QC Android manufacturers like Xiaomi). Maybe we should wait for Switch 2, and for GI on Switch to release, before making any comparison.

Before that, raw FP32 perf is the only way to compare these chips.
 
That multicore performance is weird. It sounds like it's still a 4+4-type setup (even if it seems to be 4p+4p) rather than a true 8-core equivalent, which probably has something to do with it, if not just basic thermals.
I think there's also the possibility Qualcomm skimped on the amount of L3 cache, assuming it used the octa-core (8) configuration of the Cortex-A78C for the successor to the Snapdragon 8cx Gen 2, similar to how it skimped on the L3 cache on the Snapdragon 888, considering the Cortex-A78C and the Cortex-X1 can use up to 8 MB of L3 cache.
 
Mobile games are flawed benchmarks because games have been shown to use vendor-specific graphical modes, different resolutions, etc. 3DMark's suite and GFXBench are the best cases IMO; GFXBench offers an offscreen mode that unifies resolution between all devices, PC and mobile.

 

But then, some people think that GFXBench favours the Metal API over OpenGL and Vulkan.
 
I actually use FLOPs and Benches/Effective Performance in tandem.

I have FLOP Conversions derived from Benches/Relative Performance.

(EX: the RTX 3070 vs the 2080 Ti: as of late, the 3070 is 4% faster than the 2080 Ti, but it has 20.31 TFLOPs vs the 2080 Ti's 13.45 TFLOPs, making Turing 44.65% better per FLOP than Ampere)

We can use something similar to figure out Ampere vs RDNA2

The 3060 Ti and the 6700 XT are within 2% of each other averaged out.
The 3060 Ti has 16.2 TFLOPs, the 6700 XT 13.21 TFLOPs, meaning RDNA2 is 25.5% better than Ampere per FLOP.

So, examples.

2 Ampere TFLOPs = 1.39 Turing TFLOPs or 1.6 RDNA2 (w Infinity Cache) TFLOPs or 2.1 RDNA2 (w/o Infinity Cache) TFLOPs.

(Assuming Infinity Cache is in play, as that is a big factor here: there are two types of RDNA2 TFLOPs, with Infinity Cache and without, since AMD themselves said Infinity Cache is 33-40% of the RDNA1-to-RDNA2 boost)

Now, that is TFLOP comparisons, but that can help us get a "baseline" to compare for DLSS Multiplication, which for me is where Effective/Benchmark multiplication comes in

(Multiplying TFLOPs is the worst case, multiplying effective performance is the best case after DLSS at different settings)

EDIT:
And so you know, Turing is 24% more effective than Pascal, so

2 Ampere TFLOPs = 1.39 Turing TFLOPs = 1.72 Pascal TFLOPs
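The per-FLOP conversion method above can be sketched as a small helper. The card TFLOP ratings and benchmark deltas are the ones quoted in this post, not independently verified, and the results land near the post's figures modulo rounding:

```python
# Derive an architecture-to-architecture per-FLOP efficiency factor from a
# benchmark ratio and the cards' rated TFLOPs.

def per_flop_advantage(tflops_a: float, tflops_b: float,
                       perf_a_over_b: float = 1.0) -> float:
    """Per-FLOP efficiency of architecture B relative to architecture A.

    perf_a_over_b is card A's measured performance relative to card B
    (1.04 means A benches 4% faster than B).
    """
    return tflops_a / (tflops_b * perf_a_over_b)

# RTX 3070 (Ampere, 20.31 TF) benches ~4% above the RTX 2080 Ti (Turing, 13.45 TF):
print(round(per_flop_advantage(20.31, 13.45, 1.04), 2))  # ~1.45x for Turing

# RTX 3060 Ti (Ampere, 16.2 TF) and RX 6700 XT (RDNA2, 13.21 TF), roughly tied:
print(round(per_flop_advantage(16.2, 13.21), 2))         # ~1.23x for RDNA2
```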
 
1) A tape-out is when a chip design has been finalised and sent off to the foundry, so that mass manufacturing can begin.

2) Yes. But a more advanced process node also means that the size of the transistors has decreased, which means more transistors can be crammed into smaller sized chips.
So if I understand correctly, the chip going into the new Switch isn't finalized yet (i.e. isn't taped out)? Does this mean it can still improve (or vice versa) based on the information we have now?
 
Die size is the link between perf/W and perf/$, especially in late 2019-early 2020, when TSMC and Samsung were both unable to make massive GPU chips because their tooling lacked EUV pellicles for manufacturing those big >200 mm² dies. That has since been addressed by ASML and TSMC in 2020, with the M1 Pro/Max being the first 'big' chips on an EUV process.

The theory that newer 7 nm or 5 nm nodes are more expensive than 8 nm, while true, has never stopped mobile companies from putting 7 nm SoCs in €279 phones in 2020 and 2021. Smaller mobile chips have suffered less from the price increase induced by EUV nodes than their GPU counterparts, which are yet to be seen on an EUV process; that should change in 2022 with Lovelace and RDNA3.

What's wrong with Orin is that it only pushes 2 FP32 TFLOPs within a 25W power budget on 8N, which is low compared to 1.3 for the S870 on N7P (83 mm²) and 1.8 for the S888 on 5LPE, both probably with small <100 mm² dies and consuming less than 10W.
It has DLSS and RT cores though, plus the A.I. stuff for autos that won't be needed on Switch.

I think a 1.6 TFLOPs GPU and maybe up to a 1.5 GHz CPU is possible at 15 watts. And then we'll get a 5nm-or-smaller revision in 2025 with no upgrades outside of battery life, because why not??
 
Downclock the CPU to 1GHz.

And bingo?

(I assumed you are referring to a CPU config of 8 A78C)
 
Last we heard from a few posters, it wasn't finalized around this time last year, but dev kits have been out for as long. I would think it's much further along now, if not finalized.
Nintendo may have made a decision on the CPU core count and GPU features that the dev kits (maybe a Xavier SoC +/- a discrete RTX 2060/3060) would have to simulate.

Yes, just like Xavier before it. We have been extrapolating what an automotive-less Xavier chip could be as a more mobile-gaming-oriented SoC, and we are doing the same thing with the bigger and wider Orin chip. Orin S, which should be T239/Dane, will probably be less powerful than Orin NX, and it may have to be downclocked to accommodate Mariko's TDP (which is smaller than the OG Erista TDP).

That said, a 6 SM GPU running at 921 MHz would still give PS4-class performance in docked mode [1.4 vs 1.8 (PS4) TFLOPs] and 700 GFLOPs in handheld mode. 4 SM would probably not be able to reach PS4 perf in docked mode even at Xavier's max clocks [1.1 vs 1.8] and would only reach Mariko's max perf in handheld mode [471 GFLOPs on the Shield TV]. The handheld performance will suffer the most in comparison with mobile SoCs that already offer 1.4 TFLOPs within a 10W power budget.
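The docked figure above follows from Ampere's peak-FP32 formula. A sketch, where the 6 SM count and 921 MHz docked clock are the speculation from the post, and the ~460 MHz handheld clock is my own assumption chosen to land on the quoted ~700 GFLOPs:

```python
# Peak FP32 for an Ampere GPU: SMs * 128 FP32 lanes * 2 FLOPs per FMA * clock.

def ampere_fp32_tflops(sms: int, clock_mhz: float) -> float:
    return sms * 128 * 2 * clock_mhz * 1e6 / 1e12

print(round(ampere_fp32_tflops(6, 921), 2))  # docked:   ~1.41 TFLOPs
print(round(ampere_fp32_tflops(6, 460), 2))  # handheld: ~0.71 TFLOPs (assumed clock)
```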
 
That would be Orin S's max perf. You would need another downclock on top to reach Switch docked perf, as it has been with every Switch chipset (probably to improve yields by reducing the power binning).

My best-case scenario for Dane is 1.4 TFLOPs in docked mode and 0.5~0.7 TFLOPs in handheld mode, with 8× A78, all under 12W.
 
I don't think the CPU needs to be downclocked by 50% to get to 15 watts... Clock speed isn't linear with power draw. ILikeFeet mentioned 1.25 GHz, which I think could be a good sweet spot too. How do 8 cores at 1.25 GHz compare to the Series S CPU, I wonder? I remember asking how 8 A78s at 2 GHz compared to the Series X/PS5, and I think Dakhil said it was within 60% in single-thread performance... As long as we are within that Switch-to-PS4/Xbone window (3x), I think we'll be okay for ports. The more the better, but I'm not expecting it.

Forgot to mention that some RT cores would be disabled, along with any automotive stuff. Or Nvidia can make a custom board out of it, and it should still be pretty good and match it.
 
1.25 GHz would be good, and necessary in handheld mode if they are planning to maintain the CPU clocks across all power profiles. However, A78 cores may be able to change individual core clocks, allowing them to be fine-tuned à la SmartShift. Games with heavy ST loads would have threads spread across 2 cores running at 1.6-2 GHz, with the other 4/6 cores running below 1 GHz.
 
DynamIQ, baby.

That sort of is another thing: the Switch was pre-DynamIQ, and now we are firmly in that age of tech.

So I could actually see Nintendo setting "Brackets" for docked and portable clock profiles.

In each setting, developers would have a certain "GHz budget" across the 8 cores to spend, and they could decide how to spend it depending on whether they want their game more single-threaded or multithreaded.

If they have a single-threaded game, they can spend most of it to get 1 or 2 cores to 1.8+ GHz, with the rest sub-1 GHz managing other tasks (outside the OS core).

If they have a multithreaded game, they can set the cores corresponding to their threads to 1.25-1.5 GHz, with any excess cores (aside from the OS core) at sub-1 GHz.

(These would be docked numbers)

Then maybe add a "Turbo" option for loading (May not be necessary anymore I feel though).
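The budget idea above can be made concrete as a toy allocator. Everything here is invented for illustration (the 7.2 GHz docked budget, the 2.0 GHz cap, and the 0.8 GHz floor are not rumored figures):

```python
# Toy "GHz budget" allocator: a fixed per-profile budget split across cores,
# with one core reserved for the OS. All numbers are hypothetical.

def allocate(budget_ghz: float, hot_cores: int, total_cores: int = 8,
             os_cores: int = 1, max_clock: float = 2.0, min_clock: float = 0.8):
    """Give `hot_cores` as much clock as the budget allows; the rest idle low."""
    game_cores = total_cores - os_cores
    cool_cores = game_cores - hot_cores
    leftover = budget_ghz - cool_cores * min_clock
    hot_clock = min(max_clock, leftover / hot_cores)
    return [round(hot_clock, 2)] * hot_cores + [min_clock] * cool_cores

print(allocate(7.2, hot_cores=2))  # ST-heavy: two fast cores, five slow ones
print(allocate(7.2, hot_cores=7))  # MT-heavy: spread the budget evenly
```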
 
The "brand new concept" part of Nintendo's history is just one part of it. For every N64, Wii, Wii U and DS, there's a Super NES, GBA, GameCube and 3DS. Contrary to common belief, they do just iterate on GPU and CPU hardware for new generations, with mild changes to button inputs and a new casing. The big difference now is that we don't expect massive differences in casing. Maybe some iteration on the Joy-Cons (if only because they'll want their control input devices updated to BT5 anyway).
 
It's no surprise that Metal is faster than OpenGL. What is surprising is that OpenGL is often faster than Vulkan on Android on the same device, which shows how bad the drivers for those devices are.
 
Does anyone know how much power the tensor cores consume? Orin's main selling point is its AI capabilities, so max power consumption would be with those at 100%. I don't know if that differs from compute loads.
 
The tensor cores are bundled into the uArch, so consumption shouldn't be too different, IIRC.
 
Going by those Geekbench 5 scores Dakhil posted earlier...
The two single-core scores for the presumed 8cx Gen 3 average out to 994.5. Divide that by 2.69 GHz to get ~369.7 points per GHz.
For the 4700S, I see a high of 1077 and a low of 823. That 823 looks like an outlier, so I'll go with the 1077 to stack things in favor of the 4700S. 1077 divided by 3.6 GHz gets ~299.2 points per GHz.
Going back to ~369.7 points per GHz for our presumed A78, multiply that by 1.25 to get ~462.1 pts. So that's ~43% of the 4700S's high score of 1077.

Hmm... actual desktop Zen 2 scores at a rate (points/GHz) in the same area as, or a bit above, the presumed A78 in single-core. So the 4700S is really hampered by the quartering of its L3 cache, like the other Zen 2 APUs. (Normal desktop Zen 2 should be 16 MB of L3 cache per CCX, while the 4700S is listed at 4×2, so 4 MB per CCX.)
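That normalization is easy to reproduce. The inputs are the same forum-sourced Geekbench 5 numbers quoted above (presumed 8cx Gen 3 average and 4700S high score), not verified data:

```python
# Points-per-GHz normalization and the 1.25 GHz projection described above.

def points_per_ghz(score: float, clock_ghz: float) -> float:
    return score / clock_ghz

a78c = points_per_ghz(994.5, 2.69)   # ~369.7 pts/GHz (presumed A78C)
zen2 = points_per_ghz(1077, 3.6)     # ~299.2 pts/GHz (4700S)

# Project the A78 down to a hypothetical 1.25 GHz Dane clock:
projected = a78c * 1.25
print(round(projected, 1))           # ~462.1 pts
print(round(projected / 1077, 2))    # ~0.43 of the 4700S high score
```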
 
The last piece of the Apple ecosystem would be a gaming console, finally pushing devs to make AAA games with the Metal API and using its node advantage to kill off the other console competitors.

Nintendo may not be the sole handheld supplier in two or three years. Cloud gaming may also be more prevalent.
People always mention this, but Nintendo has Nintendo IPs. You can't get those on any other handheld device legally, so they will always beat out the competition. Look at the sales of Zelda, 3D Mario, and Mario Kart on Switch. Add those games to strong third-party support and you're always in a good place. Third parties have always been Nintendo's Achilles heel; if they just focus on making sure their hardware is always in a place to get good third-party support, the first party and the sales take care of themselves.
 
Nintendo won't care about 3rd parties
 
Nintendo does care about third parties; they just put their own ideas first, and then try their best to accommodate third parties. The good thing is that because hardware has homogenized, they are more capable of doing both without sacrificing one or the other.
 
I'm not sure I follow your calculations... But it almost looks like single-thread performance is pretty even per GHz? Although the Series S has the clear advantage of being clocked up to 3.6 GHz, while we can expect Dane to be probably no more than 1.5 GHz (Orin NX's max is 2 GHz anyway).

edit: What Dakhil said on page 49

Assuming that the successor to the Snapdragon 8cx Gen 2 is using the octa-core (8) configuration of the Cortex-A78C, which seems to be the case going by how the CPU cores are described in the rumour from Roland Quandt, which seems to be vindicated by the information written on "CPU Information" (here and here) in the Geekbench 5 benchmarks, and using the Geekbench 5 benchmarks for the AMD 4700S, which is based on the PlayStation 5's APU, the octa-core configuration of the Cortex-A78C is overall theoretically very close to the Zen 2 CPU in the PlayStation 5 in terms of single-core performance, but the octa-core configuration of the Cortex-A78C can theoretically range from being ~43.63% to ~64.36% slower than the Zen 2 CPU in the PlayStation 5 in terms of multi-core performance.

So let's say they are pretty even per GHz, almost like the A57s in the Switch vs the PS4's Jaguars. Switch's 3 CPU cores for gaming at 1 GHz vs the PS4's 6-7 gaming cores gave the PS4 something like a 3.5x speed advantage.

But the thing is that the Series S has even higher clocks than Jaguar: 3.6 GHz. Dane/Switch 2 would have to run its cores at 1 GHz to maintain the same power gap as Switch vs PS4. Let's hope it's higher... 1.25 GHz would narrow it to 2.9x, and 1.5 GHz to 2.4x.

But it's great that we have DynamIQ at least.
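The clock-gap arithmetic above can be sketched quickly, under the post's working assumption of rough per-GHz single-thread parity between the A78 and the Series S's Zen 2 (not a measured result):

```python
# Relative CPU throughput gap vs the Series S's 3.6 GHz Zen 2 at candidate
# Dane clocks, assuming per-GHz single-thread parity.

def cpu_clock_gap(series_s_ghz: float, dane_ghz: float) -> float:
    return round(series_s_ghz / dane_ghz, 1)

for clock in (1.0, 1.25, 1.5):
    print(f"{clock} GHz -> {cpu_clock_gap(3.6, clock)}x gap")
# 1.0 GHz roughly preserves the old Switch-vs-PS4 (~3.5x) gap;
# 1.25 GHz narrows it to 2.9x and 1.5 GHz to 2.4x.
```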
 
Yup, and for those unaware, that's not terribly out of the ordinary; most Arm CPUs don't. SMT seems to be a feature of their "AE" chips primarily, which are seemingly designed for AI and automotive applications.
Actually, Arm officially said that the Cortex-A78AE doesn't support simultaneous multithreading (SMT).
MT, bit [24]
Indicates whether the lowest level of affinity consists of logical cores that are implemented using a multithreading type approach. This value is:

1 = Performance of PEs at the lowest affinity level is very interdependent. Affinity0 represents threads. Cortex-A78AE is not multithreaded, but may be in a system with other cores that are multithreaded.

So far, the Cortex-A65AE seems to be the only CPU from Arm that supports SMT.
MT, bit [24]
Indicates whether the lowest level of affinity consists of logical cores that are implemented using a multithreading type approach. This value is:

1 = Performance of PEs at the lowest affinity level is interdependent. Affinity0 represents threads; Cortex-A65AE is multithreaded.
 
The presumed A78C looks pretty even per GHz in single thread compared to the regular, non-APU desktop Zen 2 CPUs you can buy at retail.
The presumed A78C is better per GHz in single thread compared to the 4700S (which should match what's in the PS5 and Series S/X). (The explanation for the disparity between regular Zen 2 and the 4700S is probably the significant difference in L3 cache.)

Edit:
I'm also making a critical assumption that Geekbench 5 tests are not memory-bound/sensitive, particularly to latency.
The 4700S, as an APU/SoC, uses GDDR as its memory. That's great for bandwidth (graphics) but worse than DDR/LPDDR for latency.
For general computing purposes, the 4700S would actually be even further handicapped by its RAM compared to, say, regular Zen 2 using DDR. It would also be handicapped compared to the usual Zen 2 APU, despite having the same L3 cache, because of the difference in RAM.
 
And assuming Qualcomm is using the octa-core (8) configuration of the Cortex-A78C for the successor to the Snapdragon 8cx Gen 2, I think there's a possibility it skimped on the L3 cache, using 4 MB instead of 8 MB, similar to what it did with the Snapdragon 888, since the Cortex-A78C and the Cortex-X1 can use up to 8 MB of L3 cache.
 
Ooh, yea, interesting point. Guess we'll have to wait for future tests to show up (and actually reveal the memory configuration).

Oh, some more on memory latency since my mind's on it:
You know how in PC gaming space, the conversation about system RAM tends to arrive at latency being higher priority than raw speed/bandwidth (with a discrete graphics card being assumed to exist)? Anybody else think that, assuming sufficient bandwidth for Dane's own graphics purposes, the latency advantage of LPDDR over GDDR offers another bonus for the CPU side of things? Not necessarily that large of one, but still another factor in play.
Or would we expect to still be bandwidth constrained on the graphics front?
 
I'm no expert, but I think we'll be fine on the bandwidth front if we get 102 GB/s. In addition to the increased L2 and L3 cache (8 MB of L3 is going to help out a lot), Nvidia has been really efficient with bandwidth use, and we're 3-4 generations past the TX1 too. So I expect 1:1 PS4 ports to not be a problem at all; 4x the bandwidth of the TX1 alone is huge, and we were already getting decent PS4 ports.

Steam Deck's 88 GB/s shouldn't be a problem either, although I'm not sure if it can output at 1080p on a TV, but the power is still going to be the same.
 