
StarTopic Future Nintendo Hardware & Technology Speculation & Discussion |ST| (New Staff Post, Please read)

so some more slide decks went public on GDC Vault, including the Ubisoft Massive set on Snowdrop

though more interestingly, Ubisoft LaForge's deck on Neural Texture Compression

some highlights


30% smaller textures, arguably higher quality, and 1ms overhead on last gen systems? sauce me up, daddy!

Thanks for sharing this. There's also a paper that was already published by the same team on this technique here, which goes into some more detail.

I don't think I'm understanding this part. I would expect two methods of storing/presenting textures of the same 1024x1024 resolution to look very similar, unless one method was especially bad in its compression or whatever. So why does neural just straight up seem higher resolution?

Referring to them both as 1024x1024 glosses over a lot, as they're representing the data in very different ways, although they're both stored as texture data.

In the paper, which uses different texture sets to the AC Mirage ones here, the comparison is to texture sets with 9 channels across 3 textures; a diffuse albedo (RGB) texture, a normal texture, and an ARM texture (ambient occlusion, roughness, metalness). So for the 1k comparison you have three different three channel 1024x1024 textures.

The neural approach combines all 9 channels into a set of textures of varying sizes, one each at 1024, 512, 256 and 128 resolution per side. Using higher resolution originals as a source (the paper uses 2k source textures), they generate both the contents of those textures, and a small neural network which is trained to decode them into the original 9 channels. During runtime the textures are sampled by existing block decompression hardware, then fed into the neural network to produce the output.
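To make that runtime path concrete, here's a rough Python/NumPy sketch of the decode step. The feature channel counts, pyramid levels and network sizes below are stand-ins for illustration (the paper's actual configuration differs), and in a real renderer both the sampling and the tiny MLP would run in a shader, with the texture fetches going through the existing block decompression hardware.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the learned data: a pyramid of feature textures plus the
# weights of a small per-texture-set MLP. Sizes here are illustrative only.
levels = (1024, 512, 256, 128)
features = {r: rng.standard_normal((r, r, 4)).astype(np.float32) for r in levels}
W1 = rng.standard_normal((16, 64)).astype(np.float32) * 0.1
b1 = np.zeros(64, dtype=np.float32)
W2 = rng.standard_normal((64, 9)).astype(np.float32) * 0.1
b2 = np.zeros(9, dtype=np.float32)

def sample(tex, u, v):
    """Nearest-neighbour fetch; on hardware this is just a texture sample
    that goes through the normal block decompression units."""
    h, w, _ = tex.shape
    return tex[int(v * (h - 1)), int(u * (w - 1))]

def decode_material(u, v):
    """Return the 9 material channels (albedo, normal, AO/rough/metal) at (u, v)."""
    # Gather the features from every pyramid level and concatenate them.
    x = np.concatenate([sample(features[r], u, v) for r in levels])
    # The tiny trained MLP maps the 16 feature values to the 9 output channels.
    h = np.maximum(x @ W1 + b1, 0.0)   # ReLU hidden layer
    return h @ W2 + b2

print(decode_material(0.25, 0.75))
```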

Part of the reason that they can produce more detail for a given file size is that the neural representation can leverage correlation across all 9 channels in a way which traditional block texture compression can't, as existing formats are limited to 4 channels per texture. However, as I brought up when discussing Nvidia's similar neural texture compression technique, I'm not yet convinced that there's a significant benefit to the ML component of the implementation. It's quite possible that extending traditional block compression techniques to 10+ channels in a single texture could achieve a similar level of compression with zero performance overhead and a very small silicon cost.

Unfortunately, block compression formats have been unchanged for over a decade now, and the only chance of a new format becoming widely used is if Microsoft mandates it as part of a new DirectX version. In theory a console would be the perfect place to introduce a new format, as developers would be able to rely on every console supporting it, but it seems a bit of a long shot for Switch 2.

That compression time is a beast, though. Having to bake an asset for over half an hour. Woof.

In the paper they specify that they're using an RTX 2070 for the encoding, so even a newer, more powerful desktop GPU should be able to speed that up quite a bit, let alone the kind of server setup you'd expect in production. Still, over the assets required for a single area, let alone a full game, you're looking at pretty hefty training times.

Fake edit: Actually, while reading the paper, they're claiming 140 mins on an RTX 2070, so the 30 min figure is likely already on something like a RTX 4090. Getting a full game into even an overnight process would require some serious hardware in that case.
 
Fake edit: Actually, while reading the paper, they're claiming 140 mins on an RTX 2070, so the 30 min figure is likely already on something like a RTX 4090. Getting a full game into even an overnight process would require some serious hardware in that case.
Much less hot reloading during development. And since there is a small-but-real performance cost, you can't just save the bake for the very end of development.

This is a really cool technique, and the fact that it works on all hardware, right now, with a really simple integration path is pretty amazing. But with the bake times so long, I wouldn't expect to see it anywhere except from teams that can afford the CI/CD tools necessary.
 
Link's Awakening?



Weirder. If you've played Earthbound, you know what I mean
 
Block compression clamps the color space. You take a 4x4 (or 8x8 or whatever) pixel block, take the range of colors represented in that block, and create a smooth gradient between two colors inside that block, and then map all the pixels to points along that gradient.

Over the course of the whole texture, you can represent lots and lots of colors this way, very compressed. But inside one of those blocks, similar colors turn into the same color. And since a color wheel is 2 dimensional space and a color gradient 1 dimensional, you can lose tiny details that might be a wildly different color.

This causes smoothing in noisy areas, like the fine detail of the wood there. While technically it might have as many pixels as the original image, each 4x4 block is really only able to express one color, with a range of values (i.e., green so bright it's nearly white all the way down to green so dark it's basically black). That lowers the perceived resolution.
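For anyone who wants to see that "two endpoints plus a gradient" idea in code, here's a minimal Python/NumPy sketch. It is not the real BC1 bitstream (real encoders quantize endpoints to 5:6:5 and search for them far more cleverly); it just shows how a 4x4 block collapses to a 4-colour gradient.

```python
import numpy as np

def encode_block(block):
    """Crude BC1-style encode of one 4x4 RGB block (values 0-255)."""
    pixels = block.reshape(-1, 3).astype(np.float32)        # 16 x 3

    # Pick two endpoint colours; here simply the darkest and brightest pixel.
    brightness = pixels.sum(axis=1)
    c0, c1 = pixels[brightness.argmin()], pixels[brightness.argmax()]

    # Build the 4-entry gradient: the endpoints plus two interpolated colours.
    palette = np.stack([c0, (2 * c0 + c1) / 3, (c0 + 2 * c1) / 3, c1])

    # Snap every pixel to the nearest gradient entry (a 2-bit index each).
    dists = ((pixels[:, None, :] - palette[None, :, :]) ** 2).sum(axis=2)
    return palette, dists.argmin(axis=1)

def decode_block(palette, indices):
    return palette[indices].reshape(4, 4, 3)

# 4x4x3 bytes = 48 bytes of input become roughly 8 bytes (2 colours plus
# 16 x 2-bit indices), but any colour that doesn't sit near the gradient gets crushed.
block = np.random.randint(0, 256, size=(4, 4, 3))
palette, indices = encode_block(block)
reconstructed = decode_block(palette, indices)
```

The wood comparison above is exactly that failure case: lots of slightly different browns in one block get snapped to the same few gradient entries, which is what reads as lost resolution.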

Neural compression, instead, represents pixels as a set of statistical probabilities, which can express the whole color space. So adjacent pixels can represent wildly different colors. The total number of pixels, and the color range are the same between the two images, it's just the fine detail doesn't get squashed.

Edit: I wondered why you, of all people, would ask this question. Upon reflection, this might be stuff you all knew, and were referring to bad compression in the PDF itself. Sorry if this sounded like I was talking down.

It's worth noting that the neural representations themselves are block-encoded (using BC6H), so have the same limitations in terms of representing pixels within a block as part of a gradient.

The overall benefit of the technique likely comes down to two things. Firstly, as mentioned in my previous post, they're encoding around 9 channels in a single representation, so they can leverage a lot more correlation between channels than traditional block techniques. The second reason is that they can leverage some degree of coarse-grained data across the entire texture. The neural network itself is trained per texture set, so can contain information about the entire texture. Then the lower-resolution textures used as part of the representation can contain data about a much wider area in each block.

For example, the wood texture is pretty much all shades of brown, so having the lowest-resolution texture (or even the network itself) store the fact that it's pretty much all brown allows the full-res texture to store more fine detail instead.

If it's just this, block compression is a worse option than I'd realized. I guess not much has changed about it in a long time, sounds basically like what I remember learning about then-innovative S3TC for GameCube 20+ years back. But I would have thought wood should be a great use case. Since each block basically breaks down into a gradient between two colors, the worst cases would be where there are several very different colors in a small area. But this wood is basically shades of brown.

If that wood on the left is the result, though, seems like you're not really better off than just using a smaller uncompressed texture.

Modern block compression basically is S3TC in evolved form. In fact, the format they use for the normal and ARM textures in the paper's comparisons is BC1, which is literally the S3TC encoding used on the GameCube. The other BC2-BC7 formats used by DirectX (and de facto everywhere else) are all evolutions of the S3TC format.

ASTC is a bit different, and more flexible, but is supported on a relatively limited range of hardware (including the Switch, though). All of these were developed before PBR became the norm, so the most channels you were expected to store in a single texture was 4, for RGBA.
 
Much less hot reloading during development. And since there is a small-but-real performance cost, you can't just save the bake for the very end of development.

This is a really cool technique, and the fact that it works on all hardware, right now, with a really simple integration path is pretty amazing. But with the bake times so long, I wouldn't expect to see it anywhere except from teams that can afford the CI/CD tools necessary.
Wonder if it's reasonable to develop a specialized, many-GPU system to run these sorts of operations overnight?
 
If it's just this, block compression is a worse option than I'd realized. I guess not much has changed about it in a long time, sounds basically like what I remember learning about then-innovative S3TC for GameCube 20+ years back. But I would have thought wood should be a great use case. Since each block basically breaks down into a gradient between two colors, the worst cases would be where there are several very different colors in a small area. But this wood is basically shades of brown.

If that wood on the left is the result, though, seems like you're not really better off than just using a smaller uncompressed texture.
I think this example is cheating a little, as it's zooming in, whereas in game you'd be able to load in a higher-res texture. But PBR materials these days require so many layers that compression is basically required if you want to light them in any sort of interesting way.
 
Much less hot reloading during development. And since there is a small-but-real performance cost, you can't just save the bake for the very end of development.

This is a really cool technique, and the fact that it works on all hardware, right now, with a really simple integration path is pretty amazing. But with the bake times so long, I wouldn't expect to see it anywhere except from teams that can afford the CI/CD tools necessary.

Given that they're using AC Mirage as an example, and it's a game which might get a Switch 2 port at some point, I'd be curious if they could make it work for a late port where all the assets are already fixed. That said, even in that case they would likely need to tweak texture resolutions a lot while optimising for the hardware, which would necessitate frequent re-encoding.

Wonder if it's reasonable to develop a specialized, many-GPU system to run these sorts of operations overnight?

Yes, you could easily run it in parallel across a large number of GPUs. The issue is that it requires an overnight process in the first place, which means if you change any assets, you have to wait a day to see what it will look like in-game.

This is one of the reasons dynamic global illumination is such a big deal. Yes, it can allow fancy dynamic lighting effects for those of us playing the games, but it also means that artists working on the game can see how lighting will look immediately, instead of having to wait for a baked lighting system to generate the lighting for them.
 
It's worth noting that the neural representations themselves are block-encoded (using BC6H), so have the same limitations in terms of representing pixels within a block as part of a gradient.
My understanding - and I've only given the paper a once-over - is that the model is constructed in such a way that BC6 output is its natural output format. While that obviously limits the range of values that can be inputs to the decompression model, that's a product of how the data is quantized, not a post-hoc clamping process. Compression artifacts shouldn't necessarily be aligned to the 4x4 grid. But my understanding there is really fuzzy.
 
Something just occurred to me that I hadn't really connected before regarding GPUs and NPUs. GPUs do all the rendering of stuff on screen so a pretty picture can be presented on your monitor. The CPU does all the general execution and other processing (hence processing unit), and the NPU feels like the middle-ground intermediary: it doesn't replace either chip, but can greatly enhance the abilities of each, allowing resources and cycles to be dedicated where they're needed most in any given frame.

In some ways, NPUs might be the next "GPU" in terms of industry disruption, like how GPUs broke into the mainstream in the 1990s when hardware acceleration of graphics was the next big thing.

I'm sure it's more complicated than how I'm describing it, but that is how I look at NPUs right now, and their potential use cases.
Microsoft is thinking exactly this, hence the NPU mandates for new CPUs.
 
I’m actually kinda shocked that Nintendo hasn’t implemented a feature to listen to Nintendo music on NSO.

It seems like something so simple that would entice people, and with the Switch's portability it seems like a no-brainer.

I honestly doubt many would actually use it. It's way easier, more comfortable, and convenient to just pull up Nintendo OSTs on YouTube with your smartphone. The better option would be to just officially release their tracks on Apple Music/Spotify.
I have actually dreamed about this before, but my dream was Nintendo releasing a standalone music app for both Android and iPhone, and you just use your login credentials. They could also release the app on the Switch, with synced playlists and stuff like that.
 
How many compression techniques built into the uArch does Nvidia Ampere have compared to RDNA2?

I have a vague memory that Nvidia is ahead of AMD in that department. Shouldn't that give T239 a slight edge over RDNA2?
 
I wasn't sure about the retail Switch successor having 12GB of RAM and sort of guessed that was the dev kit, but after the rumor of 16GB RAM dev kits I'm relieved that it may in fact have 12 gigs, and that the rumored dimensions might actually be the retail Switch successor. So 👍🏽

I'm wondering why two separate blocks of 6GB RAM each. Would it perhaps be so that it can save battery when playing Switch games, with one 6GB block remaining powered down or unaccessed while the other block is utilized by the Switch game? Meanwhile both would be used by successor titles.

Also, I saw how DLSS can cost hefty VRAM through a video testing the VRAM usage of PC games at different resolutions + different settings + DLSS usage. Like Richard Leadbetter of DF said, DLSS is not a free lunch. I wonder whether it does have DLSS, and if it does, I hope the component that computes in conjunction with DLSS is included in the Tegra T239.
 
I wasn't sure about the retail Switch successor having 12GB of RAM and sort of guessed that was the dev kit, but after the rumor of 16GB RAM dev kits I'm relieved that it may in fact have 12 gigs, and that the rumored dimensions might actually be the retail Switch successor. So 👍🏽

I'm wondering why two separate blocks of 6GB RAM each. Would it perhaps be so that it can save battery when playing Switch games, with one 6GB block remaining powered down or unaccessed while the other block is utilized by the Switch game?

Also, I saw how DLSS can cost hefty VRAM through a video testing the VRAM usage of PC games at different resolutions + settings + DLSS usage.
It wasn't through the 16GB dev kit rumors that the Switch 2 was confirmed to have 12GB of RAM; it was the fact that all the parts in the May customs data were retail parts.
 
I wasn't sure about the retail Switch successor having 12GB of RAM and sort of guessed that was the dev kit, but after the rumor of 16GB RAM dev kits I'm relieved that it may in fact have 12 gigs, and that the rumored dimensions might actually be the retail Switch successor. So 👍🏽

I'm wondering why two separate blocks of 6GB RAM each. Would it perhaps be so that it can save battery when playing Switch games, with one 6GB block remaining powered down or unaccessed while the other block is utilized by the Switch game?

Also, I saw how DLSS can cost hefty VRAM through a video testing the VRAM usage of PC games at different resolutions + different settings + DLSS usage. Like Richard Leadbetter of DF said, DLSS is not a free lunch. I wonder whether it does have DLSS, and if it does, I hope the component that computes in conjunction with DLSS is included in the Tegra T239.
they use 2 6GB chips because that was what's available when speccing out the system and also filled the 128-bit bus. using one 12GB chip would have left half the bus empty and cut your bandwidth in half

DLSS reduces VRAM usage, it doesn't increase it unless you're using DLSS at 1:1 input/output resolution
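To put rough numbers on the "cut your bandwidth in half" point, the back-of-the-envelope math looks like this (the 6400 MT/s LPDDR5-class data rate is an assumption for illustration, not a confirmed spec):

```python
# Peak bandwidth = data rate (transfers/s per pin) x bus width (bits) / 8
data_rate = 6400e6      # assumed LPDDR5-class rate, transfers per second per pin
full_bus = 128          # two 64-bit-wide 6GB packages filling the bus
half_bus = 64           # a single package leaving half the bus unused

print(f"128-bit bus: {data_rate * full_bus / 8 / 1e9:.1f} GB/s")  # ~102.4 GB/s
print(f" 64-bit bus: {data_rate * half_bus / 8 / 1e9:.1f} GB/s")  # ~51.2 GB/s
```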
 
they use 2 6GB chips because that was what's available when speccing out the system and also filled the 128-bit bus. using one 12GB chip would have left half the bus empty and cut your bandwidth in half

DLSS reduces VRAM usage, it doesn't increase it unless you're using DLSS at 1:1 input/output resolution
oh I see 🤔
 
we still don't know what process node Nintendo is gonna use (whether it's 4nm, 8nm or something else)
There's no way to know that from any customs data, and even the developer docs don't answer that question. And I think it's pointless to dwell on the node, because the clock frequency data is bound to leak around the end of this year, and as long as we have the power consumption data and the clock frequency data, we can basically speculate on the node.

We want to know the node because we'd use it to speculate on possible clock frequencies, but going by the Switch's experience, the clock frequencies will be known much earlier, and at that point speculating on the node won't make much sense.
 
we still don't know what process node Nintendo is gonna use (whether it's 4nm, 8nm or something else)

Unless Nintendo comes out and says it (and 99 times out of 100 they won't), that sort of thing sadly probably won't be known until someone has the Switch 2 in their hands and does a deep dive. Until then, all we can do is make educated guesses.

On the positive side -- as it's been analyzed to kingdom come -- it's extremely likely not to be 8nm.
 
The first trailer for the 2014 Wii U version of BotW was a product of the same mindset as the full version of BotW, both in terms of character models and environment rendering; it's just that the 2014 PV was much more detailed (mainly in lighting and textures), and it's not the same thing as the kind of realism that TP had.



I was referring to this Zelda Wii U:

 
I think they're aware that it was just a Wii U tech demo. They just think it would be nice to have a game that actually utilized that style.
Well, there's only a very low probability that they'll make a realistic-style Zelda like this, both as a matter of personal taste and realistically speaking. They've found a good balance since SS with BotW and TotK's exploits. I don't see them making another realistic-style game like TP.
 
they use 2 6GB chips because that was what's available when speccing out the system and also filled the 128-bit bus. using one 12GB chip would have left half the bus empty and cut your bandwidth in half

DLSS reduces VRAM usage, it doesn't increase it unless you're using DLSS at 1:1 input/output resolution

Does DLSS increase VRAM usage when compared to just using the input resolution? Like, 2160p DLSS performance mode (1080p internal) would need more VRAM than native 1080p?
 
Does DLSS increase VRAM usage when compared to just using the input resolution? Like, 2160p DLSS performance mode (1080p internal) would need more VRAM than native 1080p?
Yes. General rule of thumb with DLSS and VRAM usage is:

DLAA > native output resolution > DLSS > native input resolution
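A crude way to see why the ordering works out that way: the internal-resolution targets (G-buffer and friends) scale with the input resolution, but DLSS still needs output-resolution colour, history and motion buffers. The bytes-per-pixel figures below are invented for illustration, not taken from any real engine.

```python
def render_target_mb(width, height, bytes_per_pixel):
    return width * height * bytes_per_pixel / 1e6

# Invented averages: ~40 B/px for internal-resolution targets (G-buffer, depth,
# intermediates) and ~16 B/px for output-resolution targets (final colour, UI,
# DLSS history/motion buffers).
INTERNAL_BPP, OUTPUT_BPP = 40, 16

native_1080p = render_target_mb(1920, 1080, INTERNAL_BPP + OUTPUT_BPP)
dlss_4k_perf = (render_target_mb(1920, 1080, INTERNAL_BPP)
                + render_target_mb(3840, 2160, OUTPUT_BPP))
native_4k    = render_target_mb(3840, 2160, INTERNAL_BPP + OUTPUT_BPP)

print(f"native 1080p:                {native_1080p:6.0f} MB")
print(f"4K DLSS Performance (1080p): {dlss_4k_perf:6.0f} MB")  # sits between the two
print(f"native 4K:                   {native_4k:6.0f} MB")
```

DLAA lands above native output resolution because it keeps the full-resolution internal targets and then adds the history buffers on top.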
 
You can't put a number on that, especially for a reason so arbitrary.
There's no need to quantify anything to discuss this. TP was even made to satisfy established North American fans back in the day, after TWW2 was shelved, and Eiji Aonuma hasn't done anything in the realist style since then because it just wasn't his preference.
 
There's no need to quantify anything to discuss this. TP was even made to satisfy established North American fans back in the day, after TWW2 was shelved, and Eiji Aonuma hasn't done anything in the realist style since then because it just wasn't his preference.
That doesn't mean they won't go back to the TP style or some other form of realism. The series is unpredictable like that so being definitive is a folly.

Besides, they still made 3 Celdas after Wind Waker
 
That doesn't mean they won't go back to the TP style or some other form of realism. The series is unpredictable like that so being definitive is a folly.
I would agree with this, provided that both Aonuma and Fujibayashi are no longer in charge of the Legend of Zelda follow-up.

I'd just caution that a return to a realistic style usually means they don't innovate further with their gameplay mechanics, and I'd be merciless in criticizing how much TP relies on OOT for its gameplay paths, which I consider to be nothing short of OOT 2.0.
 
I would agree with this, provided that both Aonuma and Fujibayashi are no longer in charge of the Legend of Zelda follow-up.

I'd just caution that a return to a realistic style usually means they don't innovate further with their gameplay mechanics, and I'd be merciless in criticizing how much TP relies on OOT for its gameplay paths, which I consider to be nothing short of OOT 2.0.
I'd argue that will no longer be the case with the device in hand; it can handle the art style and innovate just as the last two did. They're not mutually exclusive.
 
I'd argue that will no longer be the case with the device in hand; it can handle the art style and innovate just as the last two did. They're not mutually exclusive.
In fact, BotW chose this art style precisely because the large number of color-blocked objects lets the player quickly distinguish and understand different objects on a visual level, which mattered once the multiplication design principle was introduced. I don't see why they would go back to the past and choose the ugly realist style of TP while they maintain their wild imagination now; it's not in line with what they've been doing for most of the time since TWW.
 
In fact, BotW chose this art style precisely because the large number of color-blocked objects lets the player quickly distinguish and understand different objects on a visual level, which mattered once the multiplication design principle was introduced. I don't see why they would go back to the past and choose the ugly realist style of TP while they maintain their wild imagination now; it's not in line with what they've been doing for most of the time since TWW.
because realism and wild imagination aren't incompatible. the art style solves multiple birds with one stone. they aren't incapable of having easily discernable objects with a realistic art style. that's an entire game design philosophy that's independent of any particular art style
 
because realism and wild imagination aren't incompatible. the art style solves multiple birds with one stone. they aren't incapable of having easily discernable objects with a realistic art style. that's an entire game design philosophy that's independent of any particular art style
I don't need to know whether it's possible for a realist style to create easily recognizable objects, because we already know that BOTW's art style choices are functional to begin with, and I neither think nor expect them to go for ugly realist-style pieces. I don't want to argue this topic; no one can convince anyone.
 
I don't need to know whether it's possible for a realist style to create easily recognizable objects, because we already know that BOTW's art style choices are functional to begin with, and I neither think nor expect them to go for ugly realist-style pieces. I don't want to argue this topic; no one can convince anyone.
It can, but they might need to do what you mentioned earlier: embrace realism while keeping the asset design colorful and fantastic, extremely detailed and using all the newest rendering paradigms, mind you, but without losing colour at all. There are games that have done this and aren't "ugly"... quite the opposite. TP being ugly is, and has always been, a case of hardware limitations as well as western influence, and neither is the case now.
 
🤔 "Frore"... and air? why does that sound familiar.

(image: Link using Farore's Wind)

hey, this is actually pretty interesting. they specifically mention Orin compatibility

I actually missed your link you posted yesterday. That is definitely quite interesting for sure! Doubt it'll have much to do with Switch 2 at this point, though if an active cooling solution was required on the dock, I wonder if this would be viable?
 
I don't need to know whether it's possible for a realist style to create easily recognizable objects, because we already know that BOTW's art style choices are functional to begin with, and I neither think nor expect them to go for ugly realist-style pieces. I don't want to argue this topic; no one can convince anyone.
this is the crux of this whole discussion. you don't like it, so you say the team won't do it. you don't need to deal in needless absolutes. just say you don't like it and hope they don't do it
 
Y'know, the Zelda team can still take advantage of the new tech by creating even more complex systems while choosing to stay with the same type of art style, or just making minor changes here and there. The only guarantee Drake has for Zelda is that the devs will continue to push the franchise as hard as they can.
 
The war against leakers makes me believe that we won't have any concrete news until at least September/October, assuming that this is the official announcement period. 🥲
it'll cut down on twitter leakers, leaving the website-backed leakers. they tend to have more meat to chew on anyway, with multiple sources giving different perspectives and possibly different games
 
The war against leakers makes me believe that we won't have any concrete news until at least September/October, assuming that this is the official announcement period. 🥲
An article from a very serious source like Nikkei will probably happen a few days (at most 2 weeks) before the official HW announcement. Their sources are in factories/logistics, so they are 95% accurate.
 
I actually missed your link you posted yesterday. That is definitely quite interesting for sure! Doubt it'll have much to do with Switch 2 at this point, though if an active cooling solution was required on the dock, I wonder if this would be viable?
This feels to me like a pretty transparent grab to brand a cooling solution with "AI."

I haven't looked into it in a while, but the Airjet's strength is also its weakness - it's thin. A conventional fan moves an amount of air based on how wide, but also how tall, it is. You can't make Airjets taller in the same way. That's why this Airjet module is rated for Orin Nano and NX, but not AGX. It simply can't move that much heat.

A fan moves air up and away from a chip. The Airjet moves air at a right angle. That angle change matters in something like high density computing. There are setups (I use several at my job) where you have a server about the size of two packs of playing cards stacked together, and they're lined up side by side inside a standard sized server chassis. In order to make this work, the chip is vertical. You can't mount a traditional fan on that sort of device, because hot air would blow out of the server on the left directly into the server on the right. Not good

The Airjet offers the possibility of really tightly packed servers, with all the airflow out of the way of each other. Dozens of Orins, all lined up in an AI compute cluster, in a very small space. Airjet is only slightly more power efficient than a fan, but if you've got 24 servers in a chassis, that adds up.

The dock doesn't have any of these needs. It's connected straight to power, and a tiny amount of inefficiency isn't going to matter. It's not super thin, and it doesn't need to worry about what direction the air is moving. So while you could put one of these things in a dock, a regular old fan is going to be a cheaper solution that is basically as good.
 

