One thing I find curious about these leaks is that they keep stating that the card has 30 WGP for a total of 60 CU.
To understand why I find this curious, I want to point out something about RDNA architecture.
- Work Group Processors (WGP) each contain two Compute Units (CU)
- WGPs are then grouped into Shader Engines, each holding a set number of WGPs
Take the PS5, for example. It's based on the Navi 10 / 22 Graphics Compute Die (GCD), with a floorplan consisting of
- 20 WGP / 40 CU - 2 WGP are disabled for yields
- 2 Shader Engines, each with 10 WGP
Now, when looking at RDNA 3 floorplans, there is only ONE GCD they could possibly be using if keeping with AMD's standards: Navi 32.
- 30 WGP / 60 CU
- 3 Shader Engines, each with 10 WGP
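If you want to sanity-check the WGP/CU bookkeeping, here's a minimal Python sketch using only the numbers from the bullets above (2 CU per WGP, 10 WGP per shader engine):

```python
# RDNA bookkeeping: each WGP holds 2 CUs; each shader engine holds a set number of WGPs.
CUS_PER_WGP = 2

def floorplan(name, shader_engines, wgp_per_se, disabled_wgp=0):
    total_wgp = shader_engines * wgp_per_se
    active_wgp = total_wgp - disabled_wgp
    print(f"{name}: {total_wgp} WGP / {total_wgp * CUS_PER_WGP} CU total, "
          f"{active_wgp} WGP / {active_wgp * CUS_PER_WGP} CU enabled")

floorplan("PS5 (Navi 10-class)", shader_engines=2, wgp_per_se=10, disabled_wgp=2)  # 18 WGP / 36 CU enabled
floorplan("Navi 32 (full die)",  shader_engines=3, wgp_per_se=10)                  # 30 WGP / 60 CU
```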
Why is this relevant? Well, because if accurate,
either Sony decided not to accept any APUs with borked WGPs, which would be a moronic decision since it would severely hurt yields,
or their GPU is custom in that it features 30 WGP with somehow more compute units than the architecture allows - which would make no sense.
That said, there are two possible configurations I can see here: it will have either 27 or 28 enabled WGP, for a total of either 54 or 56 CUs. Funny enough,
there is a card that matches the 54 CU number perfectly, the Radeon RX 7700 XT.
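The salvage math in two lines, assuming whole WGPs get fused off the full 30-WGP die:

```python
# Disabling whole WGPs (2 CUs each) from a full 30-WGP Navi 32 die:
for disabled in (2, 3):
    wgp = 30 - disabled
    print(f"{disabled} WGP disabled -> {wgp} WGP / {wgp * 2} CU")
# 2 disabled -> 28 WGP / 56 CU
# 3 disabled -> 27 WGP / 54 CU
```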
Now we know the compute unit targets, what else? Well, we don't know the clockspeed, but we do know the supposed teraflop numbers: 67 TF FP16 / 33.5 TF FP32, both dual-issue figures. RDNA 3's dual issue doubles the on-paper FP32 rate, so to compare like for like with the standard PS5 you halve it, which puts us at around 16.75 TF. That lets us calculate the actual clockspeed for the GPU:
Compute Units x Shaders per CU x Clockspeed (THz) x 2 = Teraflops
- 54 x 64 x ? x 2 = 16.75 | ? = 16.75 / (54 x 64 x 2) | ? ≈ 0.00242 THz, or 2.42 GHz
- 56 x 64 x ? x 2 = 16.75 | ? = 16.75 / (56 x 64 x 2) | ? ≈ 0.00234 THz, or 2.34 GHz
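The same solve in code, so you can plug in other CU counts if you disagree with mine (16.75 TF being the leak's 33.5 TF dual-issue FP32 figure halved):

```python
# Solve TF = CUs * shaders_per_CU * clock_GHz * 2 (FMA = 2 ops) for the clock.
TARGET_GFLOPS = 16.75 * 1000  # 16.75 TF, PS5-comparable single-issue figure
SHADERS_PER_CU = 64

for cus in (54, 56):
    clock_ghz = TARGET_GFLOPS / (cus * SHADERS_PER_CU * 2)
    print(f"{cus} CU -> {clock_ghz:.2f} GHz")
# 54 CU -> 2.42 GHz
# 56 CU -> 2.34 GHz
```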
My guess here could be completely wrong, but I don't see how the math in the leaked table adds up, either TF-wise or clockspeed-wise.
Funny enough, when "Oberon" was announced, people were saying the PS5 was 9.2 TF because they assumed the console would have 36 CU, based on the Navi 10 nomenclature. What's even funnier is that if you calculate teraflops based on the PS5 "Oberon" die having its full 40 CU, you get...
40 x 64 x 2 GHz x 2 = 10.24 TF, which is incredibly close to the final 10.28 TF figure we got from Sony.
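That retrodiction as a one-liner:

```python
# 40 CU x 64 shaders x 2.0 GHz x 2 ops (FMA), in teraflops:
print(40 * 64 * 2.0 * 2 / 1000)  # 10.24 TF, vs Sony's final 10.28 TF (36 CU @ 2.23 GHz)
```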
My personal bet goes to this console having 54 enabled Compute Units running at 2.42 GHz.
Edit - I also find it funny that the "expert" Kepler_l2 mentioned that the card's full config has 64 CU, organised differently from what we'd expect from RDNA 2 or 3. To put things in perspective: each shader engine usually has 10 WGP, but 64 CU (32 WGP) split across just 2 shader engines would require 16 WGP in each. That would be a terrible engineering decision considering how caches are laid out in RDNA cards - and unless Cerny found some gold somewhere, it would be antithetical to his efficiency-focused designs.
However, he then says that 300 TOPS suggests a clockspeed of 2.45 GHz (pure coincidence that this pretty much matches my math).
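For what it's worth, 300 TOPS does back-solve to roughly that clock if you assume the full 60 CU and 2048 8-bit ops per CU per clock - both of those are my assumptions, not anything from the leak:

```python
# Back-solve the clock implied by 300 TOPS (8-bit), under assumed throughput:
TOPS = 300
CUS = 60                     # assumption: full die enabled
OPS_PER_CU_PER_CLOCK = 2048  # assumption: hypothetical INT8 rate, not a confirmed spec
clock_ghz = TOPS * 1000 / (CUS * OPS_PER_CU_PER_CLOCK)
print(f"{clock_ghz:.2f} GHz")  # ~2.44 GHz
```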
So which is it? I'll keep calling BS on the 2 shader engine configuration, since no RDNA 3 card uses it other than the tiny Navi 33.
Edit - If anyone asks, one of the main issues with the Series X hardware architecture is that the shader arrays are simply "too long" for the caches they have - there are efficiency losses the further away a compute unit sits from its L1 cache. The PS5 has 4 shader arrays with 10 CU each, while the Series X has 4 shader arrays with 14 CU each, leading to further efficiency losses. I can only begin to imagine 4 shader arrays with 16 compute units each and how dumb that would be.
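Putting those array lengths side by side (numbers from the paragraph above; the third row is the hypothetical 64 CU / 2 shader engine layout):

```python
# CUs hanging off each L1-backed shader array:
configs = {
    "PS5":                       (4, 10),
    "Series X":                  (4, 14),
    "Hypothetical 64 CU / 2 SE": (4, 16),
}
for name, (arrays, cus_per_array) in configs.items():
    print(f"{name}: {arrays} arrays x {cus_per_array} CU = {arrays * cus_per_array} CU")
```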
I'm perfectly OK with being wrong, but I'm far more curious about the configuration itself than about tittyflops.
Edit again - Someone shared an image a while back, apparently from AMD, and if it's real, it gives credence to my theory of a 7700 XT-like, 54 CU machine.