So, just for funsies, how about a little explainer? This loops back to the "physically based rendering" conversation from the Metroid Prime remaster.
TL;DR: @Thraktor is suggesting - and I agree - that this paper actually does one Bleeding Edge AI thing and one More Boring Modernization Thing, and that the cool improvements Nvidia is reporting aren't actually because of the Bleeding Edge AI, but because of the Boring Modernization part.
Longer Version:
One way to think of an image is that it is a map of how light bounces off a surface. Go with me here, it sounds more complicated than it is.
Take a piece of printer paper, a blank white sheet. Go outside in bright, clear sunlight. It practically glows. It can be literally blinding. If you were to make an image of that white printer paper, you'd just have a rectangle of pure, white pixels.
Now say you use that image as a texture in a video game. Just like the white light of the sun bounces off the white piece of paper and projects white light into your eyes, the game engine uses each pixel of the image as a map of the color of light that the texture bounces back.
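If it helps, here's a toy sketch of that idea in Python (all the values are made up for illustration, this isn't real engine code): the texture pixel just decides what color of light gets sent back toward the camera.

```python
# Toy example: a "texture" is just a grid of colors, and the simplest
# possible shading is "incoming light color x texture color at this pixel."
# Numbers are made up; this is not how any real engine is written.

white_light = (1.0, 1.0, 1.0)   # bright, neutral sunlight
paper_pixel = (1.0, 1.0, 1.0)   # a pixel from our blank white paper texture

def bounce(light, texture_color):
    # The texture decides how much of each color of incoming light
    # gets sent back toward your eye / the camera.
    return tuple(l * t for l, t in zip(light, texture_color))

print(bounce(white_light, paper_pixel))  # (1.0, 1.0, 1.0) -> a pure white pixel
```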
But a blank white texture looks... terrible. Like, take that white piece of paper out of the bright sun: it picks up the colors of the light in the room a little, you get little shadows where there are tiny creases in the paper, you can see the grain a little bit. So in classical game engines, the artist wouldn't just make a flat white texture for a piece of paper, they'd add all those little details to the image. Much better looking, but it's still just one channel of data, one map.
The problem, then, is what happens when the light changes? You move the piece of paper around in the game engine, or you move the camera around so you see it from a different angle? In real life, you'd see all the little shadows shift and the color of the light change. If you don't account for that in the game engine, then it doesn't look like a piece of paper anymore, it looks like a picture of a piece of paper, if that makes sense.
So, over time, game engines started to add new channels to the textures. Just like you can think of an image as a map that says "when light hits this pixel, send back this color", you can think of these other channels as a second set of maps that tell the game engine more about how light interacts with the object. One channel might be a "roughness" map: an image of our piece of paper that shows the parts of it that are smooth (and shinier) vs the parts that are rough (and less shiny), or the parts that are crumpled (and cast little micro-shadows) versus the parts that are flat (and don't cast micro-shadows).
Engines can then combine all the information from these various maps to decide how to shade the surface: take the base color from the first channel, then use the other channels to work out how that color should react to the light. This is how you get highlights on shiny objects but not on rough ones, or how a brick wall is a single texture, but when you move the camera, the shadows between the bricks shift and move.
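If you want to see roughly what that combining step looks like, here's an illustrative sketch in Python. It's nowhere near what a real shader does, and all the names and numbers are mine rather than anything from the paper or a specific engine, but it shows the idea: same base color, different roughness value, very different result.

```python
import math

# Illustrative only: a stripped-down version of the "combine the maps" step.
# A real engine does this per pixel in a shader, with fancier math and more
# maps (normals, metalness, ambient occlusion...). Values here are made up.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)

def shade(base_color, roughness, normal, light_dir, view_dir):
    # Diffuse part: how directly the light hits this bit of the surface.
    diffuse = max(0.0, dot(normal, light_dir))

    # Specular part: a smooth pixel (low roughness) gets a tight, bright
    # highlight; a rough pixel spreads it out until it mostly disappears.
    half_vec = normalize(tuple(l + v for l, v in zip(light_dir, view_dir)))
    shininess = 1.0 + (1.0 - roughness) * 63.0
    highlight = (max(0.0, dot(normal, half_vec)) ** shininess) * (1.0 - roughness)

    return tuple(c * diffuse + highlight for c in base_color)

# Same base color, same light, different roughness map value:
light_dir = normalize((0.3, 1.0, 0.2))
view_dir = (0.0, 1.0, 0.0)
normal = (0.0, 1.0, 0.0)

print(shade((0.9, 0.9, 0.9), 0.1, normal, light_dir, view_dir))  # shiny: strong highlight
print(shade((0.9, 0.9, 0.9), 0.9, normal, light_dir, view_dir))  # rough: mostly flat
```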
Modern texture formats know how to compress 4 of these channels in a smart way, but modern game engines actually need 10 channels most of the time. This is the "physically based rendering" that Metroid Prime: Remastered uses. This Nvidia paper compares those formats to a new format that uses AI to compress/decompress the textures, but also supports 10 channels, instead of the usual 4.
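To put a picture on "the usual 4" versus "10 channels", here's a hypothetical list of what those channels might be and how they'd get split up when each compressed texture can only hold 4. The exact set of maps is my guess at a typical setup, not anything quoted from the paper or a specific engine.

```python
# Hypothetical example of a modern PBR material's channels. Names and
# grouping vary by engine; this just shows why "4 channels per texture"
# stops being enough.

material_channels = [
    "base color R", "base color G", "base color B",  # what color light bounces back
    "normal X", "normal Y",                          # which way the surface is facing
    "roughness",                                     # shiny vs. matte
    "metalness",                                     # metal vs. not-metal
    "ambient occlusion",                             # baked-in micro-shadows
    "emissive",                                      # does it glow on its own?
    "height",                                        # for parallax / displacement
]

# Classical block-compressed formats handle up to 4 channels per texture,
# so these 10 channels end up split across three separate textures:
for i in range(0, len(material_channels), 4):
    print(f"texture {i // 4}: {material_channels[i:i + 4]}")
```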
Thraktor's assertion - which I think is correct - is that the advantages Nvidia's new format has are probably not AI related, but just because it smartly handles the number of channels modern game engines need, and that it's more likely we'll see an emerging format that supports 10 channels but doesn't use the AI compression/decompression.