There are multiple texture processing clusters (TPC) units within a graphics processing cluster (GPC), each TPC includes two SMs, a Polymorph Engine, two Texture Units, and a Ray Tracing core (RTcore). Each GPC includes a Raster Engine (ROP), which can access all of memory. Each SM is partitioned into four separate processing blocks, each with its own instruction buffer, scheduler and 128 CUDA cores.
The GPC is a dedicated hardware block for rasterization, shading, texturing, and compute. The GPU's core graphics functions are performed inside the GPC. Inside the GPC, the SM CUDA cores perform pixel/vertex/geometry shading and physics/compute calculations. Texture units perform texture filtering and load/store units fetch and save data to memory. Special Function Units (SFUs) handle transcendental and graphics interpolation instructions. Tensor cores perform matrix multiplies to greatly accelerate DL inferencing. The RTcore unit assists ray- tracing by accelerating Bounding Volume Hierarchy (BVH) traversal and intersection of scene geometry during ray tracing.
Finally, the PolyMorph engine handles vertex fetch, tessellation, viewport transform, attribute setup, and stream output. The SM geometry and pixel processing performance make it highly suitable for rendering advanced user interfaces and complex gaming applications. The power efficiency of the Ampere GPU enables this performance on devices with power-limited environments.