Requirement to use internal tools blamed for the state of Halo and Fable's delays

24 Jun 2022
3,314
5,743
Isn't that the same as having more cache? Jsyk coreteks is often full of shit. There's a reason chip makers don't put a lot of memory on-die right now: the silicon area it would take isn't worth the performance it buys. That's why most CPUs have kilobytes of L1/L2 cache instead of megabytes or gigabytes, and it's rare for anything outside of server-class hardware to have large L3 caches. If there's potential for "memory near compute" it'll be something like what AMD is doing with their vertical cache stacking on desktop/server CPUs, or maybe the Infinity Cache on their GPU line.

Well, there are chips that do a lot of on-chip cache, but they're very specific for big data and server markets it'd appear, not in the consumer space due to the cost/performance reasons you touched on. But I don't think Coreteks was actually talking about loading up chips with a ton of cache, and hopefully my response wasn't insinuating that either.

Actually, I'm looking at it more from the POV of having processing closer to bigger pools of data so that you don't have as much movement across the memory bus, which tends to take up a majority of the energy consumption in processing systems. Nvidia and other companies have touched on it conceptually in the past, where compared to moving data around in a design, the actual arithmetic is "free" in terms of power consumption.

I actually think the memory I/O subsystem Sony has in the PS5 is an implementation of processing-near-memory, because that I/O subsystem does a lot with the data coming in before it even reaches main memory for the CPU & GPU to access (or at least it can, depending on what the game needs from that data). When I was referring to PNM I meant something more along those lines, but with even better granularity and better memory technologies (i.e., shifting from GDDR to HBM).

It's really more about having data processing closer to the main memory, not the caches of the chips, so the data doesn't need to be moved across the bus or through so many parts of the system to be fully processed, saving on power consumption. And I think that's going to be important with the next systems, because they'll have to find ways to provide substantially more performance without exploding the power consumption budget. Node shrinks are only going to bring so much, and some of the GPU architecture changes are going to be outside of Sony or Microsoft's control. So they'll have to make some smart choices in other areas to complement any GPU advances to get that additional power at reasonable production budgets, and I think further innovations with PNM (and, if the pricing is affordable, PIM) will help with that.

Which is the main reason I thought about stuff like having smaller, efficiency cores tuned for certain functions closer to the main system memory for PNM, if not integrated directly into the memory modules for PIM. And that was the particular thing with Coreteks I was pointing to, not anything to do with buffing caches, necessarily.

Let me explain further why having a ton of memory on-die isn't that efficient. Let's say you have a processing unit devoted to adding two 32-bit floating-point numbers together. Well, yes, if you had an infinite amount of cache/memory right next to the processing unit, you would never have to wait for the system to go further out to system RAM or storage for more numbers to add together. But your processing unit can only add 2 numbers together at a time. So you would only be using 2 of your cache lines at once, and the rest would sit idle. So in reality, the more cache you have sitting next to your processors, the more of your die space is idle. Now, there are some things that definitely benefit from a bigger cache, but not everything does - because sometimes the stuff you're working on fits in cache already, or is simply too large to cache - and then you're better off running the processor faster or having more compute units.

I agree with this. I think smarter cache organization and data movement systems are where more benefits would come if just talking caches, rather than increasing the cache sizes. And again back to PS5, that's why I think things like the cache scrubbers are underappreciated, they help the caches work smarter while keeping the sizes manageable.
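To make the working-set point above concrete, here's a toy Python sketch (my own illustration with made-up numbers: a plain LRU cache, no real cache geometry). Once the cache is big enough to hold the hot set, extra capacity stops improving the hit rate and just sits idle:

```python
from collections import OrderedDict
import random

def lru_hit_rate(trace, cache_size):
    """Replay an address trace through a toy LRU cache; return hit fraction."""
    cache = OrderedDict()
    hits = 0
    for addr in trace:
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)  # refresh recency on a hit
        else:
            cache[addr] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict least-recently-used line
    return hits / len(trace)

random.seed(0)
# Working set of 64 "hot" lines: once the cache holds all 64,
# quadrupling the capacity buys nothing.
trace = [random.randrange(64) for _ in range(10_000)]
for size in (16, 64, 256):
    print(size, round(lru_hit_rate(trace, size), 2))
```

With these numbers, the 64-line and 256-line caches give essentially identical hit rates; only the undersized 16-line cache misses often.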
 
P

peter42O

Guest
I'm expecting half that list to be cancelled or delayed further, as is Microsoft tradition.

Oddly enough, I'm not expecting any of the six games I listed to be delayed out of the year I said, or cancelled.
 

KiryuRealty

Cambridge Dictionary High Priest of Grammar
28 Nov 2022
6,646
8,165
Where it’s at.
Well, there are chips that do a lot of on-chip cache, but they're very specific for big data and server markets it'd appear, not in the consumer space due to the cost/performance reasons you touched on. But I don't think Coreteks was actually talking about loading up chips with a ton of cache, and hopefully my response wasn't insinuating that either.

Actually, I'm looking at it more from the POV of having processing closer to bigger pools of data so that you don't have as much movement across the memory bus, which tends to take up a majority of the energy consumption in processing systems. Nvidia and other companies have touched on it conceptually in the past, where compared to moving data around in a design, the actual arithmetic is "free" in terms of power consumption.

I actually think the memory I/O subsystem Sony has in the PS5 is an implementation of processing-near-memory, because that I/O subsystem does a lot with the data coming in before it even reaches main memory for the CPU & GPU to access (or at least it can, depending on what the game needs from that data). When I was referring to PNM I meant something more along those lines, but with even better granularity and better memory technologies (i.e., shifting from GDDR to HBM).

It's really more about having data processing closer to the main memory, not the caches of the chips, so the data doesn't need to be moved across the bus or through so many parts of the system to be fully processed, saving on power consumption. And I think that's going to be important with the next systems, because they'll have to find ways to provide substantially more performance without exploding the power consumption budget. Node shrinks are only going to bring so much, and some of the GPU architecture changes are going to be outside of Sony or Microsoft's control. So they'll have to make some smart choices in other areas to complement any GPU advances to get that additional power at reasonable production budgets, and I think further innovations with PNM (and, if the pricing is affordable, PIM) will help with that.

Which is the main reason I thought about stuff like having smaller, efficiency cores tuned for certain functions closer to the main system memory for PNM, if not integrated directly into the memory modules for PIM. And that was the particular thing with Coreteks I was pointing to, not anything to do with buffing caches, necessarily.



I agree with this. I think smarter cache organization and data movement systems are where more benefits would come if just talking caches, rather than increasing the cache sizes. And again back to PS5, that's why I think things like the cache scrubbers are underappreciated, they help the caches work smarter while keeping the sizes manageable.
Cache scrubbers in an on-die-RAM package would be an incredible boost in efficiency compared to current designs, but even just having active cache scrubbers does more for efficiency than larger caches.

The ability to fully clear a cache can allow quite a few common operations to be processed faster, and even allow improvements that sound small but have the potential for bigger practical impact, like true random number generation with no influence from old data in the cache.
 
24 Jun 2022
3,314
5,743
RAM-on-die packages still have CPU cache, it’s just that they don’t have to go outside of the package to access memory, which reduces latency tremendously.

That's another solution that can be implemented with next-gen systems. I might be getting crazy here, but just imagine: 3D-packaged designs with on-package HBM-PIM (or at least HBM), maybe some implementation of UCIe spec, fully featured data I/O active interposer base die, PNM, low-power efficiency-type cores nested in or near main HBM memory.

Not only a huge reduction in latency, but in power consumption too, more than what you can get with just a smaller node shrink while trying to stay in a console-friendly TDP. And that leaves more power budget for a stronger CPU & GPU. All of that in a unified package, at least on the hardware side of things, could be very interesting for next-gen systems.

Though personally I think they will need more than just increased performance to stand out.

For 2023, I'm expecting Redfall, Starfield, Forza Motorsport and a bunch of AA console exclusives. For 2024, I'm expecting Avowed, Hellblade 2, Contraband and a few other smaller console exclusives. Thus far, the vast majority of Microsoft's first-party titles have been great or better. Their OpenCritic average is 85+, which is already better than what they had through the first two years of the Xbox One generation.

The thing about a lot of MS's GamePass launch exclusives is they don't stay exclusive for very long. STALKER 2 for example has a 3-month exclusivity window IIRC; The Medium's was like, what, six months?

Redfall I think is 2023 for sure. Starfield I'm like 85% sure, but it's looking like H2 2023 at this rate. Forza I'm like 70% sure is 2023. I guess Avowed, Hellblade II & Contraband for 2024 isn't actually too unrealistic; Hellblade II & Avowed NEED to release by then, though, because the longer those take to come out, the longer it'll take Project Mara (TBH I don't think this is going to become an actual game anymore) and Outer Worlds 2 to release.
 

anonpuffs

Veteran
Icon Extra
29 Nov 2022
8,287
9,510
That's another solution that can be implemented with next-gen systems. I might be getting crazy here, but just imagine: 3D-packaged designs with on-package HBM-PIM (or at least HBM), maybe some implementation of UCIe spec, fully featured data I/O active interposer base die, PNM, low-power efficiency-type cores nested in or near main HBM memory.

Not only a huge reduction in latency, but in power consumption too, more than what you can get with just a smaller node shrink while trying to stay in a console-friendly TDP. And that leaves more power budget for a stronger CPU & GPU. All of that in a unified package, at least on the hardware side of things, could be very interesting for next-gen systems.

Though personally I think they will need more than just increased performance to stand out.

I guess my question is, what does this do for developers to give us new experiences? I can't really think of anything that just having lower-latency/higher-bandwidth memory would do besides letting graphics be a bit better. Maybe AMD could stack some Xilinx AI-accelerator FPGAs onto the memory package and do on-the-fly LOD scaling like Nanite, and crush pop-in for good.
 
P

peter42O

Guest
The thing about a lot of MS's GamePass launch exclusives is they don't stay exclusive for very long. STALKER 2 for example has a 3-month exclusivity window IIRC; The Medium's was like, what, six months?

Redfall I think is 2023 for sure. Starfield I'm like 85% sure, but it's looking like H2 2023 at this rate. Forza I'm like 70% sure is 2023. I guess Avowed, Hellblade II & Contraband for 2024 isn't actually too unrealistic; Hellblade II & Avowed NEED to release by then, though, because the longer those take to come out, the longer it'll take Project Mara (TBH I don't think this is going to become an actual game anymore) and Outer Worlds 2 to release.

True. Microsoft doesn't do long timed-exclusivity deals, which shows they're not really looking to keep these games off PlayStation for any long period of time, because what's the incentive to do so? The vast majority of timed-exclusive games don't impact anything unless it's a major franchise like Final Fantasy. I prefer either multi-platform day one on Game Pass with no timed exclusivity, or full exclusivity because Microsoft funded the game.

I do believe the games I listed for 2023 and 2024 will actually release in their respective years. Project Mara isn't a game; it's an experience or something to do with mental health. It has a small team on it, though, that's separate from Hellblade 2. Obsidian has multiple teams and is basically Microsoft's equivalent of Insomniac: very efficient, very effective, and very high-quality releases.
 

ethomaz

Rebolation!
21 Jun 2022
8,515
7,220
Brasil 🇧🇷
PSN ID
ethomaz
Isn't that the same as having more cache? Jsyk coreteks is often full of shit. There's a reason chip makers don't put a lot of memory on-die right now: the silicon area it would take isn't worth the performance it buys. That's why most CPUs have kilobytes of L1/L2 cache instead of megabytes or gigabytes, and it's rare for anything outside of server-class hardware to have large L3 caches. If there's potential for "memory near compute" it'll be something like what AMD is doing with their vertical cache stacking on desktop/server CPUs, or maybe the Infinity Cache on their GPU line.

Let me explain further why having a ton of memory on-die isn't that efficient. Let's say you have a processing unit devoted to adding two 32-bit floating-point numbers together. Well, yes, if you had an infinite amount of cache/memory right next to the processing unit, you would never have to wait for the system to go further out to system RAM or storage for more numbers to add together. But your processing unit can only add 2 numbers together at a time. So you would only be using 2 of your cache lines at once, and the rest would sit idle. So in reality, the more cache you have sitting next to your processors, the more of your die space is idle. Now, there are some things that definitely benefit from a bigger cache, but not everything does - because sometimes the stuff you're working on fits in cache already, or is simply too large to cache - and then you're better off running the processor faster or having more compute units.
More cache only works and performs better if the prediction algorithm can correctly predict what needs to stay in cache and what needs to be fetched from memory into cache.

The bigger the cache, the harder that prediction gets, and then you get cache misses, which are worse for performance than going directly to memory without any cache.

Of course, there is a way for a cache to be perfect: when it is the same size as the memory :D That way it will always have everything in cache, but it shouldn't be called a cache anymore.

I just wanted to say that a bigger cache doesn't necessarily improve performance… it can even drop it if it has a lot of misses.

Predicting what needs to be in cache is really hard and you will eventually fail… it's hard even on a PC used for many purposes… on a server that does the same job 24/7 that prediction is easy and a bigger cache can help a lot, but for desktop the chance of a miss is really high.
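The point about misses hurting can be put in one line of arithmetic. A toy model (my own made-up latencies: 4 ns cache lookup, 100 ns memory) shows that if the hit rate is low enough, the cache lookup is pure overhead and going straight to memory would be faster:

```python
def avg_latency(hit_rate, cache_ns=4, mem_ns=100):
    """Average access time: every access pays the cache lookup,
    and misses additionally pay the trip to memory."""
    return hit_rate * cache_ns + (1 - hit_rate) * (cache_ns + mem_ns)

# Direct-to-memory baseline is 100 ns; with these numbers the cache
# only breaks even at a 4% hit rate and is a net loss below that.
for hr in (0.0, 0.02, 0.04, 0.5, 0.95):
    print(f"hit rate {hr:.0%}: {avg_latency(hr):.1f} ns")
```

The break-even point follows from solving cache_ns + (1 - h) * mem_ns = mem_ns for h; with realistic hit rates the cache wins easily, which is exactly why low hit rates on a huge cache are the failure case.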
 

KiryuRealty

Cambridge Dictionary High Priest of Grammar
28 Nov 2022
6,646
8,165
Where it’s at.
That's another solution that can be implemented with next-gen systems. I might be getting crazy here, but just imagine: 3D-packaged designs with on-package HBM-PIM (or at least HBM), maybe some implementation of UCIe spec, fully featured data I/O active interposer base die, PNM, low-power efficiency-type cores nested in or near main HBM memory.

Not only a huge reduction in latency, but in power consumption too, more than what you can get with just a smaller node shrink while trying to stay in a console-friendly TDP. And that leaves more power budget for a stronger CPU & GPU. All of that in a unified package, at least on the hardware side of things, could be very interesting for next-gen systems.

Though personally I think they will need more than just increased performance to stand out.



The thing about a lot of MS's GamePass launch exclusives is they don't stay exclusive for very long. STALKER 2 for example has a 3-month exclusivity window IIRC; The Medium's was like, what, six months?

Redfall I think is 2023 for sure. Starfield I'm like 85% sure, but it's looking like H2 2023 at this rate. Forza I'm like 70% sure is 2023. I guess Avowed, Hellblade II & Contraband for 2024 isn't actually too unrealistic; Hellblade II & Avowed NEED to release by then, though, because the longer those take to come out, the longer it'll take Project Mara (TBH I don't think this is going to become an actual game anymore) and Outer Worlds 2 to release.
It inspires nothing but awe that the world's biggest software publisher has such rock-solid release scheduling.
 

Alabtrosmyster

Veteran
26 Jun 2022
3,216
2,836
More cache only works and performs better if the prediction algorithm can correctly predict what needs to stay in cache and what needs to be fetched from memory into cache.

The bigger the cache, the harder that prediction gets, and then you get cache misses, which are worse for performance than going directly to memory without any cache.

Of course, there is a way for a cache to be perfect: when it is the same size as the memory :D That way it will always have everything in cache, but it shouldn't be called a cache anymore.

I just wanted to say that a bigger cache doesn't necessarily improve performance… it can even drop it if it has a lot of misses.

Predicting what needs to be in cache is really hard and you will eventually fail.
I think that when people say more cache improves performance, they assume the prediction algorithm is equivalent or better. The existence of worse prediction algorithms doesn't negate the benefits of a good implementation of bigger cache sizes (see the new Epyc CPU that features 768MB of cache, if I remember right; that has to help it).
 

KiryuRealty

Cambridge Dictionary High Priest of Grammar
28 Nov 2022
6,646
8,165
Where it’s at.
I think that when people say more cache improves performance, they assume the prediction algorithm is equivalent or better. The existence of worse prediction algorithms doesn't negate the benefits of a good implementation of bigger cache sizes (see the new Epyc CPU that features 768MB of cache, if I remember right; that has to help it).
A lot of people don't look at the realities of the new techniques and implementations of system elements that are emerging, and think we're just seeing the same implementations pumped up in size or clock speed.
 
P

peter42O

Guest
I never said that they DIDN'T do them. I said they don't do them as in now, present day. The longest for this generation is like 6 months.
 

ethomaz

Rebolation!
21 Jun 2022
8,515
7,220
Brasil 🇧🇷
PSN ID
ethomaz
Well, I'll be damned. I stand corrected. Is this the only one?
I didn't check the others... I just listed Xbox Series exclusives... the first name was Valheim, which is still set to launch in 2023... and the second was Sable, which has already released on both, a bit over a year apart.
 

anonpuffs

Veteran
Icon Extra
29 Nov 2022
8,287
9,510
More cache only works and performs better if the prediction algorithm can correctly predict what needs to stay in cache and what needs to be fetched from memory into cache.

The bigger the cache, the harder that prediction gets, and then you get cache misses, which are worse for performance than going directly to memory without any cache.

Of course, there is a way for a cache to be perfect: when it is the same size as the memory :D That way it will always have everything in cache, but it shouldn't be called a cache anymore.

I just wanted to say that a bigger cache doesn't necessarily improve performance… it can even drop it if it has a lot of misses.

Predicting what needs to be in cache is really hard and you will eventually fail… it's hard even on a PC used for many purposes… on a server that does the same job 24/7 that prediction is easy and a bigger cache can help a lot, but for desktop the chance of a miss is really high.
The way AMD uses their vertically stacked cache in CPUs is as a victim cache: basically, L3 stores cache lines that are evicted from L2 due to age. So there's a high probability that they're cache lines that will be used again.
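A victim cache is easy to sketch. Here's a toy Python model (my own illustration, not AMD's actual design; it ignores sets, ways, and real sizes): L3 is filled only with lines evicted from L2, so a recently evicted line can still be caught before a full trip to memory.

```python
from collections import OrderedDict

class VictimCache:
    """Toy two-level hierarchy where L3 acts as a victim cache:
    it holds only lines evicted from L2, never lines fetched fresh."""
    def __init__(self, l2_size, l3_size):
        self.l2 = OrderedDict()
        self.l3 = OrderedDict()
        self.l2_size, self.l3_size = l2_size, l3_size

    def access(self, addr):
        if addr in self.l2:                     # L2 hit
            self.l2.move_to_end(addr)
            return "L2"
        level = "L3" if self.l3.pop(addr, None) else "MEM"
        self.l2[addr] = True                    # install in L2 either way
        if len(self.l2) > self.l2_size:
            victim, _ = self.l2.popitem(last=False)
            self.l3[victim] = True              # evicted L2 line becomes L3 content
            if len(self.l3) > self.l3_size:
                self.l3.popitem(last=False)     # oldest victim falls out of L3
        return level

c = VictimCache(l2_size=2, l3_size=4)
print([c.access(a) for a in [1, 2, 3, 1]])  # → ['MEM', 'MEM', 'MEM', 'L3']
```

Line 1 gets pushed out of the tiny L2 by lines 2 and 3, but the second access to it still hits in L3 instead of going all the way to memory, which is the whole appeal of the design.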
 

ethomaz

Rebolation!
21 Jun 2022
8,515
7,220
Brasil 🇧🇷
PSN ID
ethomaz
The way AMD uses their vertically stacked cache in CPUs is as a victim cache: basically, L3 stores cache lines that are evicted from L2 due to age. So there's a high probability that they're cache lines that will be used again.
So they don't try to predict? It's just junk (whatever was discarded by L2)? :unsure:
 
P

peter42O

Guest
I didn't check the others... I just listed Xbox Series exclusives... the first name was Valheim, which is still set to launch in 2023... and the second was Sable, which has already released on both, a bit over a year apart.

Okay. So there are a few, but rarely does Microsoft do timed exclusivity for a year or more. Luckily for me, the only timed exclusive outside of Final Fantasy that I care about on either side was Kena. lol