I touched on this a little bit the other day, but it's an interesting topic.
In RAM you need to keep all of your assets, all of the 3D models and textures you're going to use to draw a scene. But, ideally, you only touch each of those once per frame. Instead, you take your 3D scene, generate a bunch of flat images called buffers, and then do your complex shading/lighting/effects work using those flat images.
Here's a (bad) diagram I did, outlining the process:
What you can see here is the way that RAM, bus, and GPU performance all interact. All the arrows are copies over the memory bus. As asset quality goes up, you need more RAM to store the assets. As resolution goes up, the buffers get bigger, more time is spent on the memory bus copying them, and each shading pass takes longer. And the more shading passes (the more elaborate the effects), the more copies, the more buffers, and the more computation it takes.
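To make the resolution scaling concrete, here's a rough back-of-the-envelope sketch. The buffer layout, buffer count, and pass count below are my own illustrative assumptions, not any real console's numbers; only the scaling behavior matters.

```python
# Rough sketch: how render-target size and per-frame copy traffic scale
# with resolution. The layout here (4 buffers at 4 bytes/pixel, 6 passes)
# is an assumption for illustration, not real hardware data.

BYTES_PER_PIXEL = 4   # e.g. one RGBA8 render target
NUM_BUFFERS = 4       # albedo, normals, depth, material params (assumed)
SHADING_PASSES = 6    # each pass moves buffers over the bus (assumed)

def frame_traffic(width, height):
    buffer_bytes = width * height * BYTES_PER_PIXEL
    gbuffer_bytes = buffer_bytes * NUM_BUFFERS
    # Very crude model: every pass copies the whole set of buffers.
    traffic = gbuffer_bytes * SHADING_PASSES
    return gbuffer_bytes, traffic

for w, h in [(1280, 720), (1920, 1080), (3840, 2160)]:
    gbuf, traffic = frame_traffic(w, h)
    print(f"{w}x{h}: buffers {gbuf / 2**20:.0f} MiB, "
          f"~{traffic * 60 / 2**30:.1f} GiB/s at 60 fps")
```

Notice the traffic scales linearly with pixel count: going from 1080p to 4K quadruples both the buffer memory and the bus traffic, before you change a single effect.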
In a well-balanced design, no single part of this is more likely to be a bottleneck than the others. In any single game, you might discover that a specific part of this process is the bottleneck, but across all the high-performance games you're getting on the system, you want to see none of these stick out.
But not only do the prices of each of these individual components vary, they also have price curves, just like performance curves: sometimes doubling the performance of one section of this diagram costs more (or less) than double the price. Here are some considerations.
For the GPU there are two paths to more performance: more cores, or faster clocks.
More cores | Faster clocks |
---|---|
Power efficient, linear power curve (that's good) | Power inefficient, quadratic power curve (that's bad) |
Heat efficient, same thing | Heat inefficient, same thing |
Expensive at first | Cheap at first |
Costs rise at a steady rate | Costs rise at rapidly increasing rate |
No real cap except $$$ | Hard limit before chip just won't function |
Makes chip bigger | Chip stays the same size |
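Here's a toy model of the first two rows of that table: power scales linearly with core count, but (per the rule of thumb above) quadratically with clock speed. The constants are made up; only the shapes of the curves matter.

```python
# Toy power model: doubling performance via cores vs via clocks.
# k is an arbitrary constant; the point is the linear vs quadratic shape.

def gpu_power(cores, clock_ghz, k=1.0):
    return k * cores * clock_ghz ** 2  # arbitrary power units

baseline = gpu_power(cores=8, clock_ghz=1.0)
more_cores = gpu_power(cores=16, clock_ghz=1.0)   # 2x perf via cores
faster_clock = gpu_power(cores=8, clock_ghz=2.0)  # 2x perf via clocks

print(more_cores / baseline)    # 2.0 -- power doubles
print(faster_clock / baseline)  # 4.0 -- power quadruples
```

Same nominal performance doubling, very different power (and therefore heat) bills, which is why shipping consoles tend to favor wider, slower-clocked GPUs.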
For the memory bus there are two paths to faster performance: a bigger bus, or more cache.
Bigger bus | More cache |
---|---|
Limited by memory standards | Unlimited, except by price |
Makes the memory you use more expensive | Makes SOC more expensive |
Memory will get cheaper over time, due to node shrinks | Node shrinks don't affect cache very much, if at all. |
Power hungry | Super efficient |
Makes every copy faster | Only improves some copies |
Bad latency, even when latency is "low" | Latency is nearly zero |
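A quick sketch of the difference between those two levers. Peak bus bandwidth is roughly bus width times data rate, and it speeds up every copy; cache only helps the copies that actually hit it. The numbers below are illustrative, not any real part's specs.

```python
# Sketch of the two memory-bus levers. All figures are made-up examples.

def peak_bandwidth_gbs(bus_bits, data_rate_gtps):
    # Width in bits, rate in gigatransfers/sec -> GB/s of peak bandwidth.
    return bus_bits / 8 * data_rate_gtps

def external_traffic(total_gb, cache_hit_rate):
    # Copies that hit on-die cache never touch the external bus.
    return total_gb * (1 - cache_hit_rate)

print(peak_bandwidth_gbs(128, 8))   # 128-bit bus at 8 GT/s -> 128.0 GB/s
print(peak_bandwidth_gbs(256, 8))   # doubling the bus helps every copy
print(external_traffic(100, 0.4))   # a 40% hit rate only trims some copies
```

That's the "makes every copy faster" vs "only improves some copies" distinction from the table, in arithmetic form.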
For RAM, there are two ways to get more capacity: add more modules, or make the modules bigger.
More modules | Bigger Modules |
---|---|
For cheap RAM, more modules are generally cheaper. | For expensive RAM, bigger modules are generally cheaper |
Increases memory bandwidth, but only if you add more memory controllers to the SOC | Doesn't increase memory bandwidth |
Takes up lots of space | Takes up less space |
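The bandwidth row of that table is worth spelling out: extra modules only add bandwidth if the SOC has a memory controller (channel) to drive each one. A small sketch, with the per-channel figure as an illustrative assumption:

```python
# Sketch of the two capacity levers: more modules vs bigger modules.
# 25.6 GB/s per channel is an illustrative assumption, not a real spec.

def ram_config(modules, gb_per_module, channels, gbs_per_channel=25.6):
    capacity = modules * gb_per_module
    # Bandwidth is capped by the number of controllers on the SOC.
    bandwidth = min(modules, channels) * gbs_per_channel
    return capacity, bandwidth

print(ram_config(modules=4, gb_per_module=4, channels=4))  # (16, 102.4)
print(ram_config(modules=2, gb_per_module=8, channels=4))  # (16, 51.2)
```

Same 16 GB either way, but the four-module layout gets twice the bandwidth, at the cost of board space and more controllers on the SOC.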
Okay, this post is long enough. You can make similar diagrams for the CPU, but adding it into the mix here would complicate things a lot, so I skipped it for now. And I'll probably make a post about how this points to the decisions that Nintendo/Nvidia seem to have made. But that will have to wait until after my afternoon coffee break.