To try and ELI5:
First, only half the story is told through the GPU. The CPU in the Erista SoC used in the Switch was a big.LITTLE configuration, meaning it had 4 A57 cores for intensive tasks and 4 A53 cores for what I call the "minor busywork" tasks. But a game console typically doesn't have much "minor busywork" to do, so Nvidia permanently disabled the A53 cores to save power. Now, with a custom SoC commissioned explicitly by Nintendo, Nvidia has opted for a CPU with 8 identical A78C cores, and the jump in power efficiency from the A57 (designed by ARM in 2012) to the A78C (designed by ARM in 2020) is quite dramatic, as ARM has been intensely focused on delivering peak performance efficiently.
As for the GPU (and to some extent the CPU as well), some of the performance gains come from accelerators: specially designed units inside a CPU or GPU that handle highly specific types of work, usually paired with a standard core that receives the results when they finish. For example, the reason your phone doesn't overheat and explode when playing compressed 4K video is that GPUs contain accelerators specialized in decoding certain video codecs, like H.264/HEVC (and now AV1), and they make incredibly quick work of the necessary calculations, which means far less raw CPU or GPU power is needed. This is also why brand-new audio and video compression formats tend to make CPUs and GPUs stress and heat up substantially: they lack these little helpers that know how to do the job much faster and more easily.
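To put rough numbers on why decode needs a dedicated helper, here's a back-of-the-envelope sketch (illustrative figures only, not tied to any real codec; actual decoding does far more than one operation per pixel, which only makes the point stronger):

```python
# How many pixels a 4K 60fps video stream throws at a decoder every second.
width, height, fps = 3840, 2160, 60

pixels_per_frame = width * height          # 8,294,400 pixels per frame
pixels_per_second = pixels_per_frame * fps

print(f"{pixels_per_second:,} pixels/second")  # 497,664,000 pixels/second
```

Nearly half a billion pixels per second of work, every second the video plays, which a fixed-function decoder chews through at a fraction of the power a general-purpose core would need.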
One such accelerator in an Nvidia GPU is the ray-tracing core (RT core for short), which handles realistic lighting calculations that would otherwise put a large strain on the GPU. RT cores were introduced with the Turing architecture in 2018, and Nvidia has made great improvements to them in just the past 3 years. AMD chips can do ray tracing as well, but AMD (as of 2021) lags behind Nvidia in efficiency and performance in this regard.
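For a sense of the work an RT core takes over, here's a toy CPU sketch (my own illustrative code, not Nvidia's implementation) of the basic ray-sphere intersection test; a ray-traced frame has to run tests like this for millions of rays, which is exactly the kind of repetitive geometry math worth baking into hardware:

```python
import math

def ray_hits_sphere(origin, direction, center, radius):
    """Solve |origin + t*direction - center|^2 = radius^2 for t >= 0;
    returns True if the ray hits the sphere in front of its origin."""
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)
    b = 2 * sum(o * d for o, d in zip(oc, direction))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4 * a * c        # discriminant of the quadratic
    if disc < 0:
        return False                # ray misses the sphere entirely
    # Hit only counts if at least one intersection is in front of the ray.
    t1 = (-b - math.sqrt(disc)) / (2 * a)
    t2 = (-b + math.sqrt(disc)) / (2 * a)
    return t1 >= 0 or t2 >= 0

# One ray fired down the z-axis toward a unit sphere 5 units away:
print(ray_hits_sphere((0, 0, 0), (0, 0, 1), (0, 0, 5), 1))  # True
```

One such test is trivial; doing it per ray, per bounce, per pixel, sixty times a second is why a dedicated unit matters.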
Another accelerator is the Tensor core, which speeds up the matrix math behind AI algorithms. These are the accelerators that enable DLSS, the tech that takes a lower-res image and up-rezzes it so it looks nearly identical to an image generated natively at the higher resolution. DLSS and Tensor cores have also seen dramatic improvement and iteration by Nvidia, driven by its ambition to be a leading chip maker for autonomous driving tech, but the benefits to visual output have been significant as well. AMD is developing similar technology, but again, not at the same rate as Nvidia.
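A rough sketch of why that matters (assuming, purely for illustration, that shading cost scales with pixel count, and ignoring the upscaler's own overhead, which the Tensor cores are there to absorb):

```python
# Pixel-count math for rendering at a lower internal resolution
# and upscaling to 4K, DLSS-style. Illustrative numbers only.
native_4k = 3840 * 2160        # pixels shaded at native 4K
internal_1440p = 2560 * 1440   # pixels shaded before upscaling

savings = internal_1440p / native_4k
print(f"{savings:.0%} of native 4K shading work")  # 44% of native 4K shading work
```

Less than half the shading work for a near-4K-looking image is the whole trick.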
It is a stroke of good timing on Nintendo's part that they can obtain a custom design built on these well-developed technologies: the accelerators let the device output an image it otherwise could not at a hybrid device's power budget. The PS5 and Xbox Series, meanwhile, were designed before AMD's recent advances in these fields, so while they do have ray-tracing hardware, they have no equivalent of Tensor cores for ML-based upscaling. That is why we are talking about a hybrid console outputting a 4K image and potentially getting close enough to the Xbox Series S that the gap between Nintendo's hybrid hardware and current home consoles is dramatically smaller than the one between the Switch and the PS4.
TL;DR - it's that much more performant because the GPU is "cheating" and was designed to use shortcuts to achieve a similar result in ways current home consoles were not.