Power Draw TL;DR: The Switch was built under unusual circumstances. Those circumstances aren't repeating, so I think the V1 Switch power draw represents an extremely high number that Nintendo won't repeat. I also think the OLED power draw is an extremely low number that Nintendo won't repeat. We should be really suspicious of estimates that involve pushing up to, much less fudging past, those lines.
I get the impulse to use Switch as a sort of template for what the next hardware might be like. It's totally reasonable - and I've done it - to plug the power draw of the SOC, or even a subset of the SOC (like the CPU), into a calculator and see what the performance of the next hardware might be. Or to assume that there might be wiggle room to push that power draw higher if it solves some problem.
The thing is, the V1 Switch has both a performance and power level that Nintendo was clearly unhappy with. We can be fairly sure of that because of Nintendo's own actions. Up until very close to the system's launch, the handheld GPU clock was ~300MHz. They would raise it not once, but twice - a solid indication that they were trying very hard to keep the clocks down to preserve battery life, but were running into performance problems. The final clock is fully 50% higher than what Nintendo was trying to achieve just a few months before launch.
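For reference, a quick back-of-the-envelope check using the commonly reported handheld GPU clock steps (these are community-circulated figures, not official Nintendo specs):

```python
# Community-reported handheld GPU clock steps for the V1 Switch (MHz), not official specs.
early_target = 307.2          # the clock Nintendo was reportedly targeting pre-launch
intermediate = 384.0          # the first bump
final_clock  = 460.8          # the clock the system actually shipped with in handheld mode

increase = (final_clock / early_target - 1) * 100
print(f"Final handheld GPU clock is {increase:.0f}% above the early target")
# -> Final handheld GPU clock is 50% above the early target
```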
By the time the V2 came along, Nintendo was stuck with that performance level. And they were going to launch the Lite, so making the main unit smaller was probably a waste as well. That meant putting 100% of the node shrink toward battery life. 5.5 hours for your definitional title - a title whose frame rate is still unstable - is almost pathologically stingy with performance.
So really, we should see these things as two extremes that Nintendo would prefer not to hit again, much less exceed. We should also see that the existing allocation of CPU/GPU power wasn't some platonic ideal that will be repeated, but a compromise on a device that was not built for Nintendo's use case.
Let's do a quick thought experiment. We're not experienced hardware engineers, but we know the Switch very well, because we've been using it and dissecting it for 7 years. Now let's imagine an alternate reality. It's late 2014, and Nintendo has decided to go with this Switch concept, using the Tegra chip. But in this alternate reality, Nintendo realizes far earlier that the 2016 launch date is too aggressive. And in this reality, Nvidia uses the extra time to customize the Tegra X1 for Nintendo's needs.
You are Ko Shiota, head of Nintendo's hardware division, and the driving force in the industry for reducing power draw, all the way back to the Wii. You have a game, Breath of the Wild, that is far into development, and whose port will be the launch title for the system, so getting it running well is paramount. What decisions do you make differently?
First, you probably decide that the basic design of Tegra X1 is excellent. It's a very large GPU for a mobile device, and very modern. It has the most modern ARM core available in a cluster of 4, which is small by console standards, but big by the standards of a gaming handheld. Besides, games still lean hard on single-core performance, so not having more cores is probably not a problem.
The battery life is dreck though. You have clocked the CPU and GPU to the absolute bottom - actually below peak efficiency, because you are scrambling for milliwatts. The performance jump over the Wii U is small, and your launch title, while cross gen, is already having trouble there. Your first choice is probably a node shrink to 16nm. Nvidia is about to launch their next gen GPUs on 16nm, they know the process, they already have capacity there, and by 2017, when you are shipping your console, they'll have moved on past it. It'll be nice and cheap. You need that extra power. Go.
Okay, but what to do with the extra power? Well, at minimum, you probably start by setting the clocks to peak efficiency. 500MHz is probably a comfortable spot in handheld. Maybe Breath of the Wild still needs to hit dynamic res, but it's acceptable looking, and the frame rate is consistent. Docked, 1.125GHz is a tiny jump over the TX1's original clock of 1.000GHz, but the node shrink makes heat a non-problem.
The next place to look is the CPU. The GPU is the priority - you've done a node shrink, and raised the clock speed of the GPU over the original TX1. You need to get good battery life, and that's gotta come from somewhere. Still, you're not scraping the bottom of the barrel looking for minutes of battery life anymore. You can afford a 1.2 or even 1.4 GHz clock. It's so close to the bottom of the power curve that it's essentially free.
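To put rough numbers on "essentially free", here's a toy dynamic-power sketch. The voltage figures below are illustrative guesses, not measured Tegra values; the point is just that near the bottom of the voltage/frequency curve, frequency climbs much faster than power does:

```python
# Toy CMOS dynamic power model: P ~ C * V^2 * f, with the capacitance term folded
# into the baseline. Voltages are illustrative guesses, NOT measured Tegra figures.
def rel_power(freq_ghz: float, volts: float, base_freq=1.0, base_volts=0.80) -> float:
    """CPU cluster power relative to running at base_freq / base_volts."""
    return (freq_ghz / base_freq) * (volts / base_volts) ** 2

# Near the bottom of the curve the chip is already at (or near) its minimum voltage,
# so a 20-40% clock bump barely moves the V^2 term.
for f, v in [(1.0, 0.80), (1.2, 0.82), (1.4, 0.85)]:
    print(f"{f:.1f} GHz @ {v:.2f} V -> {rel_power(f, v):.2f}x CPU power")
# 1.0 GHz -> 1.00x, 1.2 GHz -> ~1.26x, 1.4 GHz -> ~1.58x of an already small baseline
```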
You might consider changing the memory controller. 25.6GB/s is a nice leap over the Wii U, and huge for a mobile device. But modern rendering, which BotW has, really hits that bandwidth hard. You stick with 4 GB of RAM, but you add a second memory controller, and go with either 64-bit chips, or with four 1GB 32-bit chips. The result is a doubling of memory bandwidth on the TV, but in handheld that bandwidth is excessive, which means you can actually lower the memory clock there. That's good, because despite the node shrink, you've just sunk a lot of power into bandwidth.
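The bandwidth arithmetic here is simple enough to sketch, assuming LPDDR4-3200, the same data rate the real Switch runs at (the 128-bit configuration is the hypothetical one from this thought experiment):

```python
# Peak theoretical memory bandwidth = bus width (bytes) * data rate (transfers/second).
def bandwidth_gb_s(bus_width_bits: int, data_rate_mtps: int) -> float:
    return bus_width_bits / 8 * data_rate_mtps / 1000  # GB/s

print(bandwidth_gb_s(64, 3200))    # real V1 Switch: 64-bit LPDDR4-3200  -> 25.6 GB/s
print(bandwidth_gb_s(128, 3200))   # hypothetical second controller      -> 51.2 GB/s
print(bandwidth_gb_s(128, 1600))   # handheld mode at half the data rate -> 25.6 GB/s
```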
This alternate-reality version of the Switch costs just as much as the Switch does today. It perhaps cost a bit more at launch, but Nintendo was highly motivated. The battery life is significantly better than the real launch device's - maybe 4-4.5 hours on Zelda - but not OLED good. The performance addresses all of the Switch's major bottlenecks, and launch titles run substantially better. The CPU draws even less juice than the GPU, as a ratio, and the memory clock draws substantially more.
Which brings us to today. This process is what Ko Shiota and Nvidia have already gone through. They will not land at the same ratios of CPU/GPU/Memory/Storage/Screen power draw, not just because the technologies are different, but because the needs of rendering engines are different, and the game being developed as the test bed (probably 3D Mario) is different.
We shouldn't be trying to map the Switch power draw and performance profile on top of T239. Instead, we should be thinking about Breath of the Wild. BotW had a modern PBR rendering engine, a modern open-world design, and physics-based gameplay, all running on hardware with a previous generation's performance range.
The next gen launch title that is the testbed for Switch 2 will almost certainly use 9th generation software techniques, on a device that is in the performance ballpark of the 8th gen, built on the hardware features (AI, mesh shaders) of the 10th gen.
In that case, the CPU needs to make a bigger leap than the GPU. Last generation did not drive the CPU hard, and all the consoles had weak CPUs. Even graphical techniques, like RT, put a big load back on the CPU. Meanwhile, the current gen hardware has a SKU that's in the same realm as a last gen device. A GPU leap is necessary, but we probably need to spend a greater percentage of our power budget on the CPU this time.
Upscaling is paramount. 2x upscaling is basically standard, and 4x upscaling is common. You can absolutely produce beautiful games with last gen performance, you're Nintendo, and PS4 games still look great. But you need enough performance so that 4k upscaling is possible and 1440p is cheap. That's the minimum. If you can't do 4k upscaling at all, you'll never hit 4k, and if 1440p isn't cheap, you'll be stuck in 1080p land for another generation. DLSS performance is tied to the rest of the GPU, but you're not thinking about it as "How much upscaling do I get with X GPU performance," you're thinking "How much GPU performance do I get as a side effect of getting X amount of upscaling."
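In pixel terms, here's a small sketch of what those upscaling factors mean for internal render resolution (factors expressed as ratios of total pixel count, so 4x means rendering a quarter of the output pixels; this is just arithmetic, not a claim about any specific DLSS mode):

```python
import math

# Internal render resolution for a given output resolution and upscaling factor,
# where the factor is a ratio of total pixel counts (4x = render 1/4 of the pixels).
def internal_res(out_w: int, out_h: int, pixel_factor: float) -> tuple:
    axis_scale = math.sqrt(pixel_factor)   # per-axis scale implied by a pixel-count ratio
    return round(out_w / axis_scale), round(out_h / axis_scale)

print(internal_res(3840, 2160, 4))   # 4K output at 4x upscale    -> (1920, 1080)
print(internal_res(2560, 1440, 4))   # 1440p output at 4x upscale -> (1280, 720)
print(internal_res(1920, 1080, 2))   # 1080p output at 2x upscale -> (1358, 764)
```

Which is the crux of it: 4K output at 4x still demands a native 1080p-class render, and cheap 1440p means a native 720p-class render, and both have to fit in the GPU budget.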
RT is necessary. It doesn't need to be mind-blowingly good, but the absence of it will make your games look dated. Again, RT perf is tied to the GPU, like DLSS. But this probably pushes you in the direction of "lots of cores, clocked low" instead of "few cores, clocked fast". It's more expensive, but it's also more power efficient, so it's a battery life win, and DLSS and RT probably slightly prefer the more-cores version, TFLOPS being equal.
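Rough numbers on "lots of cores, clocked low" versus "few cores, clocked fast", using the standard FP32 throughput formula (2 ops per CUDA core per clock). The core counts and voltages here are invented for illustration, not T239 figures:

```python
# FP32 TFLOPS = 2 * cuda_cores * clock (GHz) / 1000. Two hypothetical configs, equal throughput:
def tflops(cuda_cores: int, clock_ghz: float) -> float:
    return 2 * cuda_cores * clock_ghz / 1000

# Dynamic power scales roughly with cores * f * V^2, and higher clocks need higher
# voltage, so the wide-and-slow config wins on power. Voltages are illustrative only.
def rel_power(cores: int, clock_ghz: float, volts: float) -> float:
    return cores * clock_ghz * volts ** 2

wide_slow   = (1536, 1.0, 0.70)   # more cores, lower clock, lower voltage (made up)
narrow_fast = (768,  2.0, 0.90)   # half the cores, double the clock (made up)

print(tflops(1536, 1.0), tflops(768, 2.0))              # 3.072 vs 3.072 -- identical TFLOPS
print(rel_power(*wide_slow) / rel_power(*narrow_fast))  # ~0.60 -> roughly 40% less power
```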
This is probably also why you shell out for a premium RAM setup. Upscaling loves high-res textures, RT loves RAM. Relative to your GPU performance, you want to be rich in RAM to support those features.
You'll also need to spend $$$ and electricity on fast storage. Not because of Ratchet & Clank-style whizzbang fast switching, but because open world games have become the default AAA experience, and one of your biggest franchises has landed solidly in that area. Perhaps two of them. You need fast storage. Again, you might be willing to sacrifice GPU performance for this. Upscaling can cover a lot of ills, but loading stutter is not one of them.
I've gone on too long. I just want to emphasize that the Switch was the product of Nintendo repurposing technology. Not withered tech, but stillborn, a laptop chip whose market collapsed out from under it. The decision process on power draw - and the rest of the hardware - was driven by making the preexisting chip work. That's not the world of the T239, and you'll lead yourself astray if you try to think like it is.