damn, discord link is dead :/
About
We are The Nintendo Pipeline! We are a group of passionate Nintendo fans who are always on top of the latest news and games. This community blog is where you’ll see original content created a…
nintendopipeline.wordpress.com
Could you provide the link for it Oldpuck? I must have missed it. But it's not surprising. The GA10F OFA is also the same as Ampere, right?
and I have done some back-of-the-envelope calculations, and I don’t think that DLSS 3 would be viable regardless of the architecture. Just not fast enough on a little machine.
You mean.....sort of stretched...... like......
butter scraped over too much bread?
I have them both, neither give me Nintendo games.
I'm not suggesting they "go back to home console" and reverse the hybrid success but I'd love them to have a "pro" box as a niche market in addition to Switch 2 hybrid.
It won't happen but I'd love it.
NX2 isn't the codename for the next device, but NX is the name of the platform that includes the Switch successor (we know this because of the Nvidia leak), so NX2 is good shorthand for it, though you might want to explain that it's just a community nickname for the Switch successor until we have a codename or official name. We also have plenty of other nicknames, such as the Succ, Switch 2, REDACTED, and NG Switch.
Saw a power profile for the "devkit" being passed around (all rumor): 11W portable and 28W+ when docked. It's important to remember that devkits will have a higher CPU clock to run the debug software and just for general stability; Switch devkits, for instance, run at 1.2GHz. This could account for 1 or 2 watts, so retail units would draw 9 to 10 watts when portable if this rumor is correct.
The docked number is also easily explained. To me this is a case of a 15W draw for charging the battery, so the power draw for the system itself is 13W to 14W, again with 1 to 2 watts going towards that higher CPU clock. Basically these numbers more or less match launch Switch units. Another thing to consider is that 5000mAh batteries are very common in both high-end and budget phones, so this is a reasonable upgrade to expect for the Switch's successor, as the Switch used a 4310mAh battery.
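The arithmetic behind those estimates can be written out as a quick sketch. Everything here is the rumored devkit figures plus the post's own guesses (the 15W charging share and the 1 to 2 watts of devkit CPU overhead), so treat the results as back-of-the-envelope numbers, not measurements. The names are made up for illustration.

```python
# Rumored devkit figures from the post above (unverified rumor, not measured data).
DEVKIT_PORTABLE_W = 11.0
DEVKIT_DOCKED_W = 28.0
CHARGING_W = 15.0                   # assumed share of the docked draw going to the battery
DEVKIT_CPU_OVERHEAD_W = (1.0, 2.0)  # assumed extra draw from the higher devkit CPU clock


def retail_range(devkit_watts, overhead=DEVKIT_CPU_OVERHEAD_W):
    """Return (low, high) retail draw after removing the devkit CPU overhead."""
    low_overhead, high_overhead = overhead
    return (devkit_watts - high_overhead, devkit_watts - low_overhead)


portable = retail_range(DEVKIT_PORTABLE_W)
docked_system_w = DEVKIT_DOCKED_W - CHARGING_W  # system draw once charging is excluded
docked = retail_range(docked_system_w)

print(f"portable retail estimate: {portable[0]:.0f}-{portable[1]:.0f} W")
print(f"docked devkit system draw: {docked_system_w:.0f} W")
print(f"docked retail estimate: {docked[0]:.0f}-{docked[1]:.0f} W")
```

Since the rumor says "28W+", the docked figures are a floor rather than a point estimate.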
Drake/T239 uses GA10F (GeForce/Graphics Ampere). It is custom and seems to have some upgraded components, including the optical flow accelerator (OFA) that Orin uses, which is the kind of hardware that is vital to DLSS 3.0's frame generation. However, it's separate from Ada Lovelace's, and we don't know which is superior or whether Orin/Drake's OFA is fast enough to support frame generation. We will learn that at a later time.
The OFAs appear the same, yeah.
Could you provide the link for it Oldpuck? I must have missed it. But it's not surprising. The GA10F OFA is also the same as Ampere, right?
I would have to defer to @oldpuck and @Thraktor for any simulations, but since T239 has been developed for Nintendo specifically, whatever L2 amount they have chosen should be more than enough to make sure Redacted isn't memory bandwidth limited.
Has anyone done any calculations on how much bandwidth 1MB of L2 provides (2MB, 4MB, 8MB and so on) on Ampere/Lovelace?
The 4060 Ti, on a 128-bit bus with 288GB/s bandwidth and 32MB of L2 cache, trades blows with the 3070, which has a 256-bit bus with 448GB/s bandwidth.
Would be fun if Switch 2 could have a large L2 cache to mitigate the low bandwidth but that won’t happen.
Cost is also a huge issue for Nintendo though. They're not trying to reduce bottlenecks at all costs, they're trying to be the best they can be within budget.
I would have to defer to @oldpuck and @Thraktor for any simulations, but since T239 has been developed for Nintendo specifically whatever L2 amount they have chosen should be more than enough to make sure Redacted isn't memory bandwidth limited.
Older posts in this thread indicated L2 is either 1MB or 4MB in the NVN2 leak.
Thanks a lot but I'm finished with Gaming PCs.
You should really go for the emulator-route if you want the beefier Switch Experience. I played Tears of the Kingdom in 4K with an emulator for some time and it works. The only catch is the beefier gaming-pc you need for this.
Devkits go out in a number of phases. We don't know what phase these supposed devkits are. If they're initial devkits we could still be looking at a year+, if they're final devkits then we could be looking at weeks before reveal.
How much usually does it take from devkits to console reveal?
there's no "usually". it's whenever the timing permits
How much usually does it take from devkits to console reveal?
Better node, better OFA and power saving features, that's the majority of Lovelace's advantages over Ampere.
All respect to Rich, but he hasn't studied the Nvidia leak.
It has backported a couple of power saving features, and may be on 5/4nm, but other than that it's Ampere.
I'm slightly confused, is this the same Fortnite rumor that came out several months ago?
Would someone working on Fortnite know Nintendo's first party plans? Would they be dumb enough to say things only someone working on Fortnite would know? Would the BC stuff even be relevant to their work?
Imo it's 100% fake and not even worth entertaining.
I think a more rounded approach is in the cards. The joycons, and I think most agree, can be a little bit bigger.
I really think the main new thing of the Switch 2 is gonna be new controllers. "Joycons 2" maybe. I would love the next detachable controller to just be a pro controller split in half, like some fan renders show. Also, I think the console itself is gonna have a more ergonomic design, instead of being just a tablet.
Except it has the same OFA as desktop ampere.
Better node, better OFA and power saving features, that's the majority of Lovelace's advantages over Ampere.
Thank you!
The OFAs appear the same, yeah.
Thraktor starts the discussion here, and we go back and forth in a few replies. But I can summarize.
DLSS 3 frame gen has two phases, to hugely over-simplify. The first is to analyze the two frames it has buffered. The second is to generate a new frame between them. The OFA is critical for that first phase, but the second phase happens on some combination of tensor/shader cores.
It's not clear how much of the process is one phase or the other, but now that there are some 4060 benchmarks out there, we can start to take a guess. And that guess says that Frame Gen could take as long as 40ms on a 3 TFLOPS machine. At that point it's taking longer to make your "fake" frame than it would to just generate a new one natively.
At that point, the OFA from Ada is actually a liability. It's like 3x larger, physically, so that's a decent chunk of wasted space on the chip which could be spent elsewhere. Not huge, really, because the OFA is probably pretty small, but for a mobile chip, every bit counts.
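To put the 40ms guess in context, here is a minimal sketch that compares it against native frame times. It is deliberately simplified: it treats frame generation as a single serial cost and ignores any overlap with rendering, and the 40ms figure is only the rough upper guess from the post. The function names are made up for illustration.

```python
def frame_time_ms(fps):
    """Time budget for one natively rendered frame at the given frame rate."""
    return 1000.0 / fps


def frame_gen_pays_off(gen_cost_ms, native_fps):
    """True only if generating an interpolated frame is cheaper than rendering a native one."""
    return gen_cost_ms < frame_time_ms(native_fps)


GEN_COST_MS = 40.0  # the post's rough upper guess for a ~3 TFLOPS machine

for fps in (30, 60):
    verdict = "pays off" if frame_gen_pays_off(GEN_COST_MS, fps) else "does not pay off"
    print(f"{fps} fps native = {frame_time_ms(fps):.1f} ms per frame; "
          f"a {GEN_COST_MS:.0f} ms generated frame {verdict}")
```

Even against a 30fps target (about 33.3ms per frame), a 40ms generated frame costs more than a native one, which is the point being made above.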
Also better media engines.
Better node, better OFA and power saving features, that's the majority of Lovelace's advantages over Ampere.
Sure, but other than Switch itself, Nintendo's consoles have rarely been limited on RAM or memory bandwidth. Whatever Redacted has should be sufficient to keep the CPU and GPU fed.
Cost is also a huge issue for Nintendo though. They're not trying to reduce bottlenecks at all costs, they're trying to be the best they can be within budget.
The Discord link isn't working for me.
Do we still think Nintendo must rush out a successor?
I can think of a console that's limited by memory bandwidth, it's called the Nintendo Switch.
Sure, but other than Switch itself, Nintendo's consoles have rarely been limited on RAM or memory bandwidth. Whatever Redacted has should be sufficient to keep the CPU and GPU feed.
1MB of L2 seems low to me, but w/o more context (like a comparison of L2 levels to RAM and memory bandwidth configurations on a modern SOC) I can only guess.
We don't exactly know that. What's much more likely is that it inherits the OFA from the existing ORIN devices (which don't have exactly the "same OFA as desktop ampere").
Except it has the same OFA as desktop ampere.
New intel from Nikki
Same Horizon OS as Switch 1 for the NG
Confirms that the Switch 2 chip's performance will be reduced compared to the retail Nvidia Orin chip
Ilikefeet and Oldpuck just contradicted you.
We don't exactly know that. What's much more likely is that it inherits the OFA from the existing ORIN devices (which don't have exactly the "same OFA as desktop ampere").
same OFA as desktop
The OFAs appear the same, yeah.
Note the word "exactly".
Ilikefeet and Oldpuck just contradicted you.
No.
Is this thread even about hardware anymore? Is this the new Paper Mario thread?
The short version is, "I would expect 1MB of L2 to perform roughly like other RTX 30 cards."
Has anyone done any calculations on how much bandwidth 1MB of L2 provides (2MB, 4MB, 8MB and so on) on Ampere/Lovelace?
The 4060 Ti, on a 128-bit bus with 288GB/s bandwidth and 32MB of L2 cache, trades blows with the 3070, which has a 256-bit bus with 448GB/s bandwidth.
Would be fun if Switch 2 could have a large L2 cache to mitigate the low bandwidth but that won’t happen.
It's a pointless discussion anyway. Even if it was the Lovelace OFA, frame generation would be useless in Drake's power envelope.
Note the word "exactly".
Nobody knows really. It's safe to assume that it'll be similar in form factor to the Switch based on the chip they're using, and the idea that they want to carry over Nintendo accounts (and supposedly the OS) seems to suggest an iteration of the Switch concept rather than something brand new.
So I’m clearly out of the loop these past few months.
I’m assuming general speculation has been leading towards a Switch 2 rather than a new idea/gimmick for a console. Would I be correct in assuming that?
Nubia is making gaming tablets, technically the same thing if what you need is a gaming orientated Android tablet. They also have a 3d tablet or something.
I'm still waiting for the Nvidia Shield Tablet 2
Oh, absolutely, yes. At least we can agree there. I'm definitely not getting on DLSS 3.X, at least not the current form of it.
It's a pointless discussion anyway. Even if it was Lovelace OFA frame generation would be useless on Drakes power envelope.
If Nintendo has decided to keep the Joy-Con or something similar for the Switch successor, I hope they do something to solve the dreadful Joy-Con drift. A lot of my friends suffered through this problem.
I'm slightly confused is this the same fortnite rumor that came out several months ago?
I think a more rounded approach is in the cards. The joycons, and I think most agree, can be a little bit bigger.
7/8 years has been the typical console lifecycle for the past two generations. If the Switch successor launches in holiday 2024, as I believe it will, that would be the longest console generation for Nintendo: the Switch would be 7 years old and in its 8th year on the market, which is sufficient time for a successor to launch. Here is an interesting comparison of previous Nintendo consoles' games in their seventh year on the market compared to the Switch's seventh year
Please tell me this is being ironic. Reading anything about Nintendo "rushing out a successor" about 6.5 years after the Switch's release can't be serious, and it tends to hugely trigger me lol
It's been covered a lot in this thread by people much smarter about this stuff than I am, so I'm just regurgitating what I've read here over the years:
Do we still think Nintendo must rush out a successor?
Interesting post
Got in late for the HOS discussion, but I don't think we're going to see them use the exact same HOS used by Switch. More like HOS 2.0, where the baseline for it involves its security. Even though Switch got hacked early on, it was because of the convenience of the entry point being well documented and public from the situation with the Shield TV. Hackers had said that if it wasn't for that point of entry, they would still be looking for another opening, perhaps even now, because they felt that HOS was that secure.
The way HOS works with memory is it partitions it into 4 sections - Application, Applet, System and System Unsafe. The latter two are, imo, the basic portions of the OS, which together allocate roughly 310MB. Application is for the games, taking ~3.2GB. The remaining 467MB is for Applet, which is everything else, like the Home Menu, Album, eShop, Settings, even the All Software list etc, and as you move into each part of the system, the prior stuff is emptied out and newer stuff is loaded in. Areas like the eShop and Album are loading in data as you scroll through them, but aren't retained because there's only a limited amount of memory to work with, so when you scroll back and forth, it's continually reloading. For the eShop, the data isn't located locally, so it has to continually download. While folks suggest that the reason Nintendo didn't include a fully loaded internet browser was to avoid a hacking situation, it could also be because of the limited amount of RAM that would have been available to it. Can't really compare it to the Wii U, because as time goes on, there's expectations from things like that, like using the latest software, and the available RAM likely couldn't cut it.
For the eShop, the CPU is used heavily, using the 3 game cores when a game isn't loaded, but is limited to the OS core when a game is loaded (the latter of these two scenarios brings the eShop to a crawl). Back to the point of the eShop redownloading data as you scroll, this is likely why the eShop uses a good amount of CPU. I think it's having to process this data, which may include decompression. Let's say it is having to decompress data for the eShop. With Switch 2, if they redesign the eShop, they could possibly make use of the FDE so the CPU isn't being pushed, which could make browsing the eShop much faster (besides having the stronger CPU).
What I'm getting at ultimately, is let's not think of Switch 2 using Switch's HOS "as-is", but as a baseline for the important things while also being expandable in content, features, and functionality.
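As a sanity check on the memory split described in that post, the partition sizes can be added up against the Switch's 4GB of RAM. The sizes below are the approximate figures quoted in the post ("~3.2GB", "467MB", "roughly 310MB"), so the small remainder is most likely rounding and not a real fifth region. The variable names are made up for illustration.

```python
TOTAL_RAM_MB = 4 * 1024  # the Switch ships with 4GB of RAM

# Approximate partition sizes as quoted in the post.
partitions_mb = {
    "Application": 3.2 * 1024,      # "~3.2GB", reserved for games
    "Applet": 467,                  # Home Menu, Album, eShop, Settings, ...
    "System + System Unsafe": 310,  # "roughly 310MB" combined
}

total_mb = sum(partitions_mb.values())

for name, size_mb in partitions_mb.items():
    print(f"{name:<24}{size_mb:>8.0f} MB  {size_mb / TOTAL_RAM_MB:6.1%}")
print(f"{'Sum of partitions':<24}{total_mb:>8.0f} MB")
print(f"{'Unaccounted (rounding)':<24}{TOTAL_RAM_MB - total_mb:>8.0f} MB")
```

The Applet share comes out at a little over a tenth of total RAM, which fits the point about the eShop and Album having to reload data constantly.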
I don't think that's what they were implying.
You do realise that not a single person in this thread has suggested even for a second that Nintendo will abandon the hybrid model, don't you?
The short version is, "I would expect 1MB of L2 to perform roughly like other RTX 30 cards."
The longer version is, you can't really simulate the effect without knowing the cache hit rate, but we can compare to other GPUs in the same architecture. To keep it simple, I'm going with full GPU dies, not all the binned variants.
I'm assuming the smaller 1MB L2 cache here, and a lot of this analysis is me just summarizing @Look over there's very smart work.
GPU | Bandwidth/TFLOP | Cache/Bandwidth (KB per GB/s) |
---|---|---|
GA102 | 25 GB/s/TFLOP | 6KB |
GA103 | 22.4 GB/s/TFLOP | 9KB |
GA104 | 28.95 GB/s/TFLOP | 7KB |
GA106 | 30 GB/s/TFLOP | 8KB |
GA107 | 24 GB/s/TFLOP | 9KB |
T239 (3TFLOPS, docked) | 34 GB/s/TFLOP | 10KB |
T239 (1.5TFLOPS portable) | 44 GB/s/TFLOP | 15KB |
You can see that T239 has more bandwidth per TFLOP than any other card in the RTX 30 range. Only the weirdly large GA106 comes close. Now, ideally, a console needs more bandwidth than a standard GPU, because the bandwidth is shared with the CPU, but this looks pretty good.
And you can see that the cache is also slightly on the high end, even with just a piddly 1MB. Considering how big caches are, a 1MB design (instead of the 4MB we've seen elsewhere) makes a ton of sense.
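For anyone who wants to check the T239 rows, here is a rough sketch of how the two ratios relate. The total bandwidth figures are not stated in the post; they are implied here by multiplying the table's TFLOPS by its GB/s per TFLOP, so treat them as derived assumptions. The cache ratio is read as KB of L2 per GB/s of memory bandwidth, and the function names are made up for illustration.

```python
L2_KB = 1024  # the smaller 1MB option assumed in the post; 4MB would be 4096

# (TFLOPS, GB/s per TFLOP) as listed for T239 in the post's table.
t239_modes = {"docked": (3.0, 34.0), "portable": (1.5, 44.0)}


def implied_bandwidth_gbps(tflops, gbps_per_tflop):
    """Total memory bandwidth implied by the table's ratio (an inference, not a spec)."""
    return tflops * gbps_per_tflop


def cache_kb_per_gbps(l2_kb, bandwidth_gbps):
    """KB of L2 cache available per GB/s of memory bandwidth."""
    return l2_kb / bandwidth_gbps


for mode, (tflops, ratio) in t239_modes.items():
    bandwidth = implied_bandwidth_gbps(tflops, ratio)
    print(f"{mode}: ~{bandwidth:.0f} GB/s implied, "
          f"{cache_kb_per_gbps(L2_KB, bandwidth):.1f} KB of L2 per GB/s")
```

With 1MB of L2 this lands at roughly 10KB docked and about 15KB portable, in line with the table. Swapping in 4096 for `L2_KB` shows how far the 4MB option would push the ratio.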
GPU | Die size | CUDA Cores | Transistors (millions) | Cost |
---|---|---|---|---|
GA102 (RTX 3090+) | 628mm^2 | 10752 | 28,300 | $1499 |
AD102 (RTX 4090+) | 609mm^2 | 18432 | 76,300 | $1599 |
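The transistor densities implied by this table are easy to work out, and they help when reading density figures quoted in MTr/mm^2 elsewhere in the thread. A minimal sketch, assuming the transistor column is in millions (GA102 is a 28.3 billion transistor die); the function names are made up for illustration.

```python
# (die area in mm^2, transistors in millions) taken from the table above.
dies = {"GA102": (628, 28_300), "AD102": (609, 76_300)}


def density_mtr_per_mm2(area_mm2, transistors_m):
    """Transistor density in millions of transistors per square millimetre."""
    return transistors_m / area_mm2


def area_mm2_at_density(transistors_m, density):
    """Die area needed to fit a transistor count at a given density."""
    return transistors_m / density


for name, (area, transistors) in dies.items():
    print(f"{name}: {density_mtr_per_mm2(area, transistors):.1f} MTr/mm^2")

# The same AD102 transistor count laid out at a sparser 90 MTr/mm^2:
print(f"AD102 at 90 MTr/mm^2: {area_mm2_at_density(76_300, 90):.0f} mm^2")
```

GA102 comes out around 45 MTr/mm^2 and AD102 around 125 MTr/mm^2, so a sparser layout of the same design grows the die in direct proportion.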
It’s Ampere with some aspects of Lovelace, but for all intents and purposes for gaming it is Ampere. Plus, at the size and the constraints that device would operate in, DLSS3 probably isn’t a good idea.
I'm a little confused tbh, is the T239 fully Ampere with some lovelace features or is it some sort of bogged down lovelace *SoC? Does this completely dismiss DLSS 3 for REDACTED? 'pologies if I missed a post explaining it reading through the trainwreck ,:v
(also my first famiboards post hi o/)
Idk, it seems pretty cut down.
"I don't think anyone has dug into how cut down"?
What the hell has this thread been doing all this time?
And unless she knows something we don't, it's far less cut down than people would guess.
*Almost Half of the PS5 owners in the United States.
During the FTC trial, there were claims that half the PS5 owners also owned a Switch.
I’d like an actual black for the OLED at least, this is just grey.
I’d be happy just with more theme colours than just black or white
Ada is already largely Ampere by 92% of what it is.
All respect to Rich, but he hasn't studied the Nvidia leak.
It has backported a couple of power saving features, and may be on 5/4nm, but other than that it's Ampere.
Do we know Orin or Drake has a different OFA from ampere?
The OFA in ORIN is different from the desktop, and T239 has the same one as ORIN. Due to ORIN’s usecase (automotive, object detection) it’s critical that it has a very performant and adequate OFA especially for this bolded reason. Desktop Ampere is only able to deliver for video playback, but ORIN’s OFA is suitable for real-time object detection.
same OFA as desktop
We can’t figure that out unless we know how much latency the L2 has. Or the clock frequency as it’s tied to the GPU.
Has anyone done any calculations on how much bandwidth 1MB of L2 does (2MB, 4MB, 8MB and so on) on Ampere/Lovelace?
It won’t.
I thought the T239 wouldn't support DLSS 3.
Shouldn’t be more than 25GB/s. Probably about 20GB/s. It’s up to how they choose to develop a game though.
Thanks for taking your time, sounds good that it has higher bandwidth than the other Ampere cards. The CPU should take a chunk of that, I'm guessing it will be similar to what’s reported on the PS4 CPU taking a chunk out of the GDDR5 bandwidth?
At that size, yields also aren’t much of an issue. Even if Nintendo goes sparser than 125MTr/mm^2, and goes with say, 90MTr/mm^2, it would still be really small. Sub-100 range.
I don't know what node Nintendo will choose, but I am increasingly convinced that cost isn't an issue for 5nm
*Assuming yields are the same
On Switch, Nintendo has sometimes used Twitter as a replacement for where they previously used Miiverse for in game integration, and I think it's pretty safe to say now that that experiment is failing. General social media can be useful for boosting reach, but, as many companies are now finding out, the rug can be pulled out from under them quite quickly with little warning. It's best to use it without relying on it.
What else are we referring to? I’m only talking about sharing on the platform here.