Isn't that the same as having more cache? Just so you know, Coreteks is often full of shit. There's a reason chip makers don't put a lot of memory on-die right now: the die area it takes costs more in silicon than the performance it buys back. That's why most CPUs have kilobytes to a few megabytes of L1/L2 cache rather than gigabytes, and it's rare for anything outside of server-class hardware to have a large L3 cache. If there's potential for "memory near compute", it'll be something like what AMD is doing with vertical cache stacking on their desktop/server CPUs, or maybe the Infinity Cache on their GPU line.
Well, there are chips with a lot of on-chip cache, but they appear to be aimed squarely at big data and server markets, not the consumer space, for the cost/performance reasons you touched on. But I don't think Coreteks was actually talking about loading up chips with a ton of cache, and hopefully my response wasn't insinuating that either.
Actually, I'm looking at it more from the POV of having processing closer to bigger pools of data, so that you don't have as much movement across the memory bus, which tends to account for most of the energy consumption in processing systems. Nvidia and other companies have touched on it conceptually in the past: compared to moving data around in a design, the actual arithmetic is "free" in terms of power consumption.
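To put rough shape on the "arithmetic is free compared to data movement" point, here's a toy energy model in Python. Everything in it is my own assumption for illustration: the `EnergyCosts` class, both function names, and especially the per-add and per-byte costs, which are placeholders and not measurements of any real chip or bus. It also ignores what the near-memory core itself spends reading the values out of the memory arrays, so treat it as a sketch of the argument and nothing more.

```python
# Toy energy model. All per-operation costs are placeholder assumptions,
# not measurements of any real chip, memory type, or bus.
from dataclasses import dataclass


@dataclass
class EnergyCosts:
    pj_per_add: float = 1.0            # assumed cost of one 32-bit add, in picojoules
    pj_per_byte_moved: float = 100.0   # assumed cost of moving one byte across the memory bus


def host_side_sum_energy(n_values, costs, bytes_per_value=4):
    """Every value crosses the memory bus, then the CPU/GPU adds them up."""
    movement = n_values * bytes_per_value * costs.pj_per_byte_moved
    compute = n_values * costs.pj_per_add
    return movement + compute


def near_memory_sum_energy(n_values, costs, bytes_per_value=4):
    """A small core next to memory does the adds; only the final result crosses the bus."""
    movement = bytes_per_value * costs.pj_per_byte_moved
    compute = n_values * costs.pj_per_add
    return movement + compute


costs = EnergyCosts()
n = 1_000_000
host = host_side_sum_energy(n, costs)
near = near_memory_sum_energy(n, costs)
print(f"host-side: {host:.3e} pJ, near-memory: {near:.3e} pJ, ratio: {host / near:.1f}x")
```

The arithmetic term is identical in both cases; the whole gap comes from how many bytes cross the bus. Change the two assumed costs and the ratio moves with them, but as long as moving a byte costs much more than an add, the near-memory version wins for reduction-style work.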
I actually think the memory I/O subsystem Sony have for the PS5 is an implementation of processing-near-memory (PNM), because that I/O subsystem does a lot with the incoming data before it even goes into main memory for the CPU & GPU to access (or at least it can, depending on what the game needs to do with that data). When I was referring to PNM, I meant something more along those lines, but with even better granularity and with better memory technologies (e.g. shifting from GDDR to HBM).
It's really more about having data processing closer to the main memory, not the caches of the chips. That way the data doesn't need to be moved across the bus or through so many parts of the system to be fully processed, which saves on power consumption. And I think that's going to be important with the next systems, because they'll have to find ways to provide substantially more performance without blowing up the power budget. Node shrinks are only going to bring so much, and some of the GPU architecture changes are going to be outside of Sony's or Microsoft's control. So they'll have to make some smart choices in other areas to complement any GPU advances and get that additional performance at reasonable production budgets, and I think further innovations with PNM (and, if the pricing is affordable, processing-in-memory, PIM) will help with that.
That's the main reason I brought up things like having smaller efficiency cores, tuned for certain functions, sitting closer to the main system memory for PNM, if not integrated directly into the memory modules for PIM. And that was the particular thing from Coreteks I was pointing to, not anything to do with buffing caches, necessarily.
Let me explain further why having a ton of memory on-die isn't that efficient. Let's say you have a processing unit devoted to adding two 32-bit floating-point numbers together. Yes, if you had an infinite amount of cache/memory right next to the processing unit, you would never have to wait for the system to go out to system RAM or storage for more numbers to add together. But your processing unit can only add two numbers together at a time. So you would only be using two of your cache lines at once, and the rest would be sitting idle. So in reality, the more cache you have sitting next to your processors, the more of your die space is idle. Now, there are some things that definitely benefit from a bigger cache, but not everything does, because sometimes the stuff you're working on already fits in cache, or is simply too large to cache, and then you're better off running the processor faster or having more compute units.
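Here's a quick way to see the "already fits, or is too large to cache" part, using a small Python simulation of a fully associative LRU cache. This is a sketch under my own assumptions: the helper names `lru_hit_rate` and `looping_scan` are made up for this post, the access pattern is a plain loop over the same block of addresses, and real caches (set-associative, with prefetchers and smarter replacement) won't behave exactly like this.

```python
from collections import OrderedDict


def lru_hit_rate(accesses, capacity):
    """Simulate a fully associative LRU cache and return the fraction of hits."""
    cache = OrderedDict()
    hits = 0
    for addr in accesses:
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)        # mark as most recently used
        else:
            cache[addr] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / len(accesses)


def looping_scan(working_set, passes):
    """Hypothetical workload: walk the same block of addresses over and over."""
    return [addr for _ in range(passes) for addr in range(working_set)]


trace = looping_scan(working_set=1000, passes=10)
for capacity in (500, 1000, 4000):
    print(f"capacity={capacity:5d}  hit rate={lru_hit_rate(trace, capacity):.2f}")
```

With this particular trace, the cache that is too small for the loop gets no hits at all under LRU, because every line is evicted just before it is needed again. Once the working set fits, quadrupling the capacity changes nothing: the extra entries are never touched, which is the idle die space described above.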
I agree with this. If we're just talking caches, I think more of the benefit would come from smarter cache organization and data movement than from increasing cache sizes. And again, back to PS5, that's why I think things like the cache scrubbers are underappreciated: they help the caches work smarter while keeping the sizes manageable.