Thraktor
so some more slide decks went public on GDC Vault, including the Ubisoft Massive set on Snowdrop
though more interestingly, Ubisoft LaForge's deck on Neural Texture Compression
some highlights
Machine Learning Summit: Real-time Neural Textures for Materials Compression (With Introduction from Summit Advisor Olivier Pomarez)
30% smaller textures, arguably higher quality, and 1ms overhead on last gen systems? sauce me up, daddy!
Thanks for sharing this. There's also a paper that was already published by the same team on this technique here, which goes into some more detail.
I don't think I'm understanding this part. I would expect two methods of storing/presenting textures of the same 1024x1024 resolution to look very similar, unless one method was especially bad in its compression or whatever. So why does neural just straight up seem higher resolution?
Referring to them both as 1024x1024 glosses over a lot, as they're representing the data in very different ways, although they're both stored as texture data.
In the paper, which uses different texture sets from the AC Mirage ones here, the comparison is to texture sets with 9 channels across 3 textures: a diffuse albedo (RGB) texture, a normal texture, and an ARM texture (ambient occlusion, roughness, metalness). So for the 1k comparison you have three different three-channel 1024x1024 textures.
The neural approach combines all 9 channels into a set of textures of varying sizes, one each at 1024, 512, 256 and 128 resolution per side. Using higher resolution originals as a source (the paper uses 2k source textures), they generate both the contents of those textures, and a small neural network which is trained to decode them into the original 9 channels. During runtime the textures are sampled by existing block decompression hardware, then fed into the neural network to produce the output.
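The runtime decode path described above can be sketched roughly like this. To be clear, the feature-channel counts, layer sizes, and weights below are placeholder assumptions of mine, not figures from the paper; this just illustrates the shape of the pipeline (sample the feature pyramid, concatenate, run a tiny MLP):

```python
import numpy as np

# Hypothetical sketch of the decode path: sample each feature texture at
# a UV coordinate, concatenate the features, then run a small "trained"
# MLP to recover the 9 material channels. All sizes and weights here are
# stand-ins, not values from the paper.
rng = np.random.default_rng(0)

# Stand-in feature pyramid: one texture per resolution, 4 channels each.
pyramid = [rng.standard_normal((s, s, 4)).astype(np.float32)
           for s in (1024, 512, 256, 128)]

# Stand-in weights for a 2-layer MLP: 16 features in, 9 channels out.
W1 = rng.standard_normal((16, 32)).astype(np.float32)
b1 = np.zeros(32, dtype=np.float32)
W2 = rng.standard_normal((32, 9)).astype(np.float32)
b2 = np.zeros(9, dtype=np.float32)

def sample(tex, u, v):
    # Nearest-neighbour sampling for brevity; real hardware would filter,
    # and would also be doing the block decompression step first.
    h, w, _ = tex.shape
    return tex[int(v * (h - 1)), int(u * (w - 1))]

def decode(u, v):
    feats = np.concatenate([sample(t, u, v) for t in pyramid])  # 16 values
    hidden = np.maximum(W1.T @ feats + b1, 0.0)  # ReLU
    return W2.T @ hidden + b2  # 9 output channels

print(decode(0.3, 0.7).shape)  # (9,)
```

The key point is that the network is tiny and runs per-texel, which is why the overhead can be on the order of a millisecond.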
Part of the reason that they can produce more detail for a given file size is that the neural representation can leverage correlation across all 9 channels in a way which traditional block texture compression can't, as existing formats are limited to 4 channels per texture. However, as I brought up when discussing Nvidia's similar neural texture compression technique, I'm not yet convinced that there's a significant benefit to the ML component of the implementation. It's quite possible that extending traditional block compression techniques to 10+ channels in a single texture could achieve a similar level of compression with zero performance overhead and a very small silicon cost.
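A toy illustration of the correlation argument (my own, not from the talk): if the 9 channels are mostly linear mixes of a few underlying signals, the joint data has low effective rank, so an encoder that sees all channels together can spend far fewer bits than 9 independently compressed channels would need:

```python
import numpy as np

# Synthetic example: 9 channels that are noisy linear mixes of just 3
# underlying signals. A joint view of the data reveals the redundancy.
rng = np.random.default_rng(1)
n = 4096  # texels
latent = rng.standard_normal((n, 3))           # 3 underlying signals
mix = rng.standard_normal((3, 9))              # mixed into 9 channels
channels = latent @ mix + 0.01 * rng.standard_normal((n, 9))

# Singular values of the centred data: energy captured by 3 components.
s = np.linalg.svd(channels - channels.mean(0), compute_uv=False)
explained = (s[:3] ** 2).sum() / (s ** 2).sum()
print(explained > 0.99)  # True: 3 components capture almost everything
```

Real material channels aren't this cleanly linear, but the same redundancy is what both a wider block format and the neural encoder would be exploiting.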
Unfortunately, block compression formats have been unchanged for over a decade now, and the only realistic chance of a new format becoming widely used is if Microsoft mandates it as part of a new DirectX version. In theory a console would be the perfect place to introduce a new format, as developers would be able to rely on every console supporting it, but it seems a bit of a long shot for Switch 2.
That compression time is a beast, though. Having to bake an asset for over half an hour. Woof.
In the paper they specify that they're using an RTX 2070 for the encoding, so even a newer, more powerful desktop GPU should be able to speed that up quite a bit, let alone a server setup that you'd expect in production. Still, over the assets required for a single area, let alone a full game, you're still looking at pretty hefty training times.
Fake edit: Actually, reading the paper again, they're claiming 140 mins on an RTX 2070, so the 30 min figure is likely already on something like an RTX 4090. Getting a full game into even an overnight process would require some serious hardware in that case.
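To put a rough number on "serious hardware" (the asset count and overnight window below are hypothetical; only the 30-minutes-per-set figure comes from the discussion above):

```python
# Back-of-envelope: GPUs needed to bake a whole game's material sets
# overnight at 30 minutes of GPU time per set. Asset count and window
# are made-up assumptions for illustration.
material_sets = 2000          # hypothetical asset count for a full game
minutes_per_set = 30          # figure discussed above (fast-GPU case)
overnight_minutes = 12 * 60   # a 12-hour overnight window

gpu_minutes = material_sets * minutes_per_set
gpus_needed = -(-gpu_minutes // overnight_minutes)  # ceiling division
print(gpus_needed)  # 84 GPUs, assuming one set trains per GPU at a time
```

Even with generous parallelism that's a render-farm-scale job, not a single workstation.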