Firstly, I wouldn't say 100mm² is a hard limit on size, just that I expect something around that size. It could be around 120mm², or it could be smaller, but I wouldn't expect anything significantly bigger.
As for the reasons to expect a die around that size, there are two. The first is cost, which should be obvious enough, as the SoC is the most expensive component in a gaming device, and a smaller die is cheaper than a bigger die. The second is power consumption.
Now, you'll probably have heard many people (including myself) point out that larger chips can in general reduce power consumption at the same performance level by allowing for reduced clocks. This is because the dynamic (switching) power of an integrated circuit is roughly proportional to the clock frequency times the square of the supply voltage, so as you increase voltage (to achieve higher clocks) the power consumption increases faster than the performance. Hence, a (let's say) 8SM GPU running at 1GHz should consume less power than a 4SM GPU running at 2GHz, while providing effectively the same performance, all other things being equal.
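To put some rough numbers on that, here's a minimal Python sketch of the 8SM vs 4SM comparison. It uses the textbook dynamic power relation (power scales with capacitance × voltage² × frequency) and nothing else. The two voltages and the per-SM capacitance are made-up values in arbitrary units, purely for illustration; they aren't taken from any real chip.

```python
# Toy model of dynamic (switching) power: P = N * C * V^2 * f.
# All constants are hypothetical and in arbitrary units.

def dynamic_power(num_sms, freq_ghz, voltage, cap_per_sm=1.0):
    """Dynamic power of a GPU with num_sms SMs at a given clock and voltage."""
    return num_sms * cap_per_sm * voltage ** 2 * freq_ghz

def throughput(num_sms, freq_ghz):
    """Relative performance, assuming it scales with SM count x clock."""
    return num_sms * freq_ghz

# Assumed voltages: the 2GHz point needs more voltage than the 1GHz point.
V_AT_1GHZ = 0.70
V_AT_2GHZ = 1.00

wide_slow = dynamic_power(8, 1.0, V_AT_1GHZ)
narrow_fast = dynamic_power(4, 2.0, V_AT_2GHZ)

print(f"8SM @ 1GHz: perf {throughput(8, 1.0):.1f}, power {wide_slow:.2f}")
print(f"4SM @ 2GHz: perf {throughput(4, 2.0):.1f}, power {narrow_fast:.2f}")
```

With these assumed voltages both configurations deliver the same relative performance, but the wider, slower GPU draws roughly half the dynamic power. The exact ratio depends entirely on how much extra voltage the higher clock needs.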
The problem in this case is that this only applies to higher clock speeds. At low clock speeds, there's quite a different behaviour, as there's a minimum voltage required for the chip to actually operate, so as you reduce clocks towards zero the voltage doesn't go down to zero, it goes down to this minimum voltage. In practice there is some maximum clock speed which can be attained at this minimum voltage, and there's very little reason to clock lower than this. For example, if a chip could hit 300MHz at its minimum voltage, then reducing the clock to 150MHz would consume almost as much power, but give you 50% less performance, so would be a net loss in power efficiency. This is the reason why, when you check clock speeds on your PC or phone, you don't see CPU or GPU clocks dropping down to 1MHz when idle. There's some minimum clock speed which they will idle at, below which there are no meaningful power savings to be had by clocking lower.
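The 300MHz vs 150MHz example can be sketched the same way. This is a minimal toy model, assuming total power is static (leakage) power plus dynamic power, and that voltage is pinned at a floor below some "knee" frequency. Every constant here (V_MIN, F_KNEE_MHZ, the slope, the leakage and capacitance factors) is a hypothetical value picked so that leakage dominates at the floor, and treating leakage as linear in voltage is a crude simplification.

```python
# Toy model of a voltage floor. All constants are hypothetical.
V_MIN = 0.60            # minimum operating voltage (V)
F_KNEE_MHZ = 300        # highest clock attainable at V_MIN
VOLTS_PER_MHZ = 0.0005  # extra voltage needed per MHz above the knee
LEAK_PER_VOLT = 2.0     # static power factor (crude: linear in voltage)
CAP = 0.005             # dynamic power factor

def voltage_for(freq_mhz):
    """Voltage needed for a clock: flat at V_MIN below the knee."""
    return V_MIN + max(0.0, freq_mhz - F_KNEE_MHZ) * VOLTS_PER_MHZ

def total_power(freq_mhz):
    """Static (leakage) power plus dynamic power C * V^2 * f."""
    v = voltage_for(freq_mhz)
    return LEAK_PER_VOLT * v + CAP * v ** 2 * freq_mhz

def perf_per_watt(freq_mhz):
    """Efficiency, assuming performance scales with clock."""
    return freq_mhz / total_power(freq_mhz)

for f in (150, 300, 600):
    print(f"{f}MHz: {voltage_for(f):.2f}V, "
          f"power {total_power(f):.2f}, perf/W {perf_per_watt(f):.0f}")
```

Under these assumptions, halving the clock from 300MHz to 150MHz only trims power by around 15%, so efficiency drops sharply. How close "almost as much power" is in reality depends on how large the static share actually is on a given chip.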
Bringing this back to Switch, the most important power limit for Nintendo and Nvidia when designing the new SoC isn't how much power is consumed while at high clock speeds (ie docked), it's how much power is consumed at low clock speeds (ie handheld). Nintendo will want the new model to have some basic level of battery life, which means they're going to set strict design limits on handheld-mode power consumption for the SoC. For the GPU side of things, this makes the minimum voltage a key design constraint; they can't go lower than that, so they can't use a GPU that consumes too much power at that minimum voltage.
More specifically, we can actually see from Nvidia's DVFS tables that the base 384MHz handheld clock speed used by the Switch is already running at the lowest voltage on the 16nm Mariko (although not quite on the original 20nm TX1). With the move to 8nm, it's reasonable to assume that this max-clock-on-min-voltage will increase again, let's say to around 500MHz. Let's assume that Nintendo have budgeted 3W for the GPU in handheld mode. If 4SMs at 500MHz consume around 3W, then they simply can't use a larger GPU without breaking their power budget. Increasing to 6SMs or 8SMs, even if clocks were decreased to 300MHz or 250MHz respectively, would still consume more power, as the savings from the reduced clocks wouldn't be anywhere near enough to offset the larger GPUs.
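Here's a rough sketch of that budget argument in Python. It takes the 3W budget and the 4SM at 500MHz starting point from the scenario above, and assumes all three configurations run at the same minimum voltage, so per-SM dynamic power scales linearly with clock and per-SM static power doesn't change. The 60/40 split between static and dynamic power is my own placeholder, not a measured figure, and the conclusion for the 6SM case does depend on it.

```python
# Handheld GPU power budget sketch. The static share is a guess.
BUDGET_W = 3.0
BASE_SMS = 4
BASE_CLOCK_MHZ = 500
STATIC_SHARE = 0.6  # hypothetical fraction of power that is leakage

# Calibrate per-SM figures so that 4SM @ 500MHz lands exactly on budget.
per_sm_total = BUDGET_W / BASE_SMS
static_per_sm = per_sm_total * STATIC_SHARE
dynamic_per_sm_at_base = per_sm_total - static_per_sm

def gpu_power(num_sms, clock_mhz):
    """Power at minimum voltage: static is fixed, dynamic scales with clock."""
    dynamic = dynamic_per_sm_at_base * clock_mhz / BASE_CLOCK_MHZ
    return num_sms * (static_per_sm + dynamic)

for sms, clock in ((4, 500), (6, 300), (8, 250)):
    power = gpu_power(sms, clock)
    verdict = "within" if power <= BUDGET_W + 1e-9 else "over"
    print(f"{sms}SM @ {clock}MHz: {power:.2f}W ({verdict} budget)")
```

With this split, both of the larger configurations come out over the 3W budget despite their lower clocks. If you set STATIC_SHARE much lower, the 6SM case eventually fits, which is why the static power share at minimum voltage matters so much here.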
Note that the CPU is also impacted by the same behaviour, although it's a bit different as the clocks will likely still be the same across handheld and docked modes. Ditto with basically all other hardware on the SoC, like security coprocessors, DSPs, etc.
Now, I don't know exactly what power budget Nintendo will allocate to the SoC in handheld mode for the new device, although it's likely to be in the same ballpark as the original Switch and the 2019 Mariko revision. I also don't know the exact minimum voltages, the corresponding clock speeds, or the resulting power consumption of the new GPU, CPU, etc. However, it's pretty safe to say that power consumption at those low clocks is roughly proportional to die size, as a large proportion of the power consumption of ICs at low clocks is static power (ie leakage power), which is basically directly proportional to die area. This is why high-end smartphone SoCs are designed as small chips using high-density libraries, as opposed to the low-density libraries used by desktop parts: they spend a lot of time idling at low clocks, and high-density libraries mean smaller chips, which means less static power consumption, which means better battery life.