What kind of techniques will be introduced to games that will make a substantial boost to visuals when consoles can't even render native 4K/60FPS? Everything is upscaled from low-resolution framebuffers. RT would be the go-to tech for a substantial visual jump, but the consoles don't have dedicated RT cores nor tensor cores whose only job is to run AI upscaling to reconstruct the final image. The consoles are simply very underpowered for RT-heavy games. And we aren't just waiting for developers to get familiar with the hardware, as it's very akin to last gen's SDKs.
To make it easier to understand, the upcoming next gen engines are adding Nanite-like and Lumen-like tech.
In the past an artist had to manually make 3D objects, let's say a game character, multiple times, each one with a different level of detail, so the engine could use one or another depending on their distance from the camera. All the ones that were potentially going to be used in the next half a minute or so were loaded into memory. The programmer had to control a memory budget, deciding what was kept in memory, what was streamed into it and what was removed from it.
This meant that in memory there were a lot of assets that didn't need to be there and that in many cases were never used: space that could have been used to include extra detail on them, or to add other objects. Level design took these budgets into consideration, limiting the density or variety of stuff in each level, or on screen.
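To make the old workflow concrete, here's a minimal sketch of distance-based LOD selection plus a hand-tuned memory budget. All the names are hypothetical, it's not any particular engine's API, just the shape of the idea:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical illustration of the traditional approach: an artist authors
// several versions of the same mesh, and the engine picks one by camera distance.
struct LodLevel {
    std::string meshFile;     // e.g. "hero_lod0.mesh", "hero_lod1.mesh", ...
    float       maxDistance;  // use this LOD while the object is closer than this
    std::size_t memoryBytes;  // cost of keeping it resident in memory
};

struct StreamableObject {
    std::vector<LodLevel> lods;      // ordered from most to least detailed
    float                 distance;  // current distance to the camera
};

// Pick the most detailed LOD allowed at the current distance.
const LodLevel& selectLod(const StreamableObject& obj) {
    for (const LodLevel& lod : obj.lods) {
        if (obj.distance <= lod.maxDistance) return lod;
    }
    return obj.lods.back();  // farthest fallback
}

// The programmer's hand-tuned budget: load objects nearest-first until the
// fixed memory pool runs out; everything else stays on disk (or gets evicted).
std::vector<const LodLevel*> buildResidentSet(std::vector<StreamableObject>& objects,
                                              std::size_t budgetBytes) {
    std::sort(objects.begin(), objects.end(),
              [](const StreamableObject& a, const StreamableObject& b) {
                  return a.distance < b.distance;
              });
    std::vector<const LodLevel*> resident;
    std::size_t usedBytes = 0;
    for (const StreamableObject& obj : objects) {
        const LodLevel& lod = selectLod(obj);
        if (usedBytes + lod.memoryBytes > budgetBytes) continue;  // over budget
        usedBytes += lod.memoryBytes;
        resident.push_back(&lod);
    }
    return resident;
}
```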
Stuff like Nanite handles this whole process in a far more optimized way. First, thanks to the new, way faster streaming, devs won't need to keep in memory what may or may not appear on screen in the next minute or half a minute; they only need to have there what will show up in the next second or half a second. That means all those props, environment art, enemies etc. that are in a 'distant' part of the level won't need to be in memory. You only have to keep in memory what's seen on screen and a bit more around you, the stuff you might see in the next second or so.
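A minimal sketch of that shorter streaming horizon, assuming a simple chunk-based world and made-up load/evict calls (the real systems are of course far more granular and GPU-driven):

```cpp
#include <cmath>
#include <vector>

// Hypothetical sketch: with fast SSD streaming, the engine only keeps resident
// what could plausibly be on screen within the next fraction of a second,
// instead of everything the player might reach in the next half minute.
struct Vec3 { float x, y, z; };

struct Chunk {
    Vec3 center;
    bool resident = false;
};

float distance(const Vec3& a, const Vec3& b) {
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Keep a chunk resident only if the camera could see or reach it within
// `horizonSeconds`, given how fast the camera is moving plus a view margin.
void updateResidency(std::vector<Chunk>& chunks,
                     const Vec3& cameraPos,
                     float cameraSpeed,       // units per second
                     float viewRadius,        // roughly how far we can see
                     float horizonSeconds) {  // e.g. 0.5s-1s instead of 30s
    const float keepRadius = viewRadius + cameraSpeed * horizonSeconds;
    for (Chunk& c : chunks) {
        const bool shouldBeResident = distance(c.center, cameraPos) <= keepRadius;
        if (shouldBeResident && !c.resident) {
            // requestLoadFromSsd(c);   // hypothetical async load call
            c.resident = true;
        } else if (!shouldBeResident && c.resident) {
            // releaseMemory(c);        // hypothetical eviction
            c.resident = false;
        }
    }
}
```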
This means a lot of stuff gets removed from memory, and having more free memory means you can show more detailed stuff in front of your face. On top of that, new GPUs like the PS5's allow the engines to, let's say, take super detailed 3D models and scale them down on the fly, super fast, something they couldn't do with older hardware. That means artists won't have to make each object multiple times at different levels of detail, and the engine won't have to juggle which version to show and hand-optimize them to keep decent FPS.
They will only need a single super detailed one, let's say the one that in the past was made for cinematics. The engine will stream that super detailed model and, thanks to the new GPU hardware, scale its detail down to exactly what's needed according to the target FPS, the target render resolution (which will be way smaller than the TV's thanks to the new and upcoming upscaling techniques) and the minimum distance from the camera expected in the next second or so. So that object will only be in memory once, at the most optimized detail and size for that situation, without requiring the coder, artist or level designer to worry about specific budgets.
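Here's a rough sketch of the sizing math behind that idea: keep one very detailed source mesh and decide per frame how many of its triangles are worth drawing, aiming for roughly one triangle per pixel the object covers. Actual cluster/meshlet selection (as in Nanite) is far more involved; this is just an illustration with made-up names:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical sketch of on-the-fly detail scaling: store one very detailed
// mesh and decide each frame how much of its detail is actually worth drawing.
struct DetailRequest {
    float    objectRadius;     // bounding-sphere radius of the object (world units)
    float    distance;         // distance from the camera
    float    verticalFovRad;   // camera vertical field of view
    uint32_t renderHeightPx;   // internal render resolution (before upscaling)
    uint64_t sourceTriangles;  // triangle count of the single authored mesh
};

uint64_t targetTriangleCount(const DetailRequest& r) {
    // Approximate projected height of the object on screen, in pixels.
    const float angular = 2.0f * std::atan(r.objectRadius / std::max(r.distance, 0.001f));
    const float screenFraction = angular / r.verticalFovRad;
    const float projectedPx = std::max(screenFraction * r.renderHeightPx, 1.0f);

    // Aim for about one triangle per covered pixel (square of projected size),
    // but never exceed the detail that was actually authored.
    const float wantedTris = projectedPx * projectedPx;
    return std::min<uint64_t>(static_cast<uint64_t>(wantedTris), r.sourceTriangles);
}
```

With a number like that in hand, the streamer only pulls in the geometry needed to reach it, so the memory cost tracks what's actually on screen instead of what was authored.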
Let's say the engine 'automatically' handles all of this in a far more optimized way than before, leaving more GPU and CPU compute and more memory free for other stuff. That allows more, and more detailed, stuff to be included, because they are taking much better advantage of the hardware's capabilities, which with current/past engines weren't milked enough because of multiple bottlenecks.
This also means that many tricks used in the past to simulate detail, like fake bump mapping or baked lighting/shadows, won't be needed anymore: the GPU will have a super detailed model of the scene with proper materials/textures. So all the GPU and CPU compute and memory spent on those tricks gets freed up in the next-gen-only engines/versions of games.
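For reference, the kind of 'fake detail' trick being retired looks roughly like classic tangent-space normal mapping: the geometry stays flat, but a per-pixel normal fetched from a texture makes the lighting react as if fine bumps were there. A small sketch of that math, written as plain C++ rather than an actual shader:

```cpp
#include <cmath>

// Rough illustration of classic tangent-space normal mapping: the surface is
// flat, but a per-pixel normal from a texture makes lighting behave as if
// fine bumps existed. With dense real geometry this whole step can go away.
struct V3 { float x, y, z; };

V3 normalize(V3 v) {
    const float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return {v.x / len, v.y / len, v.z / len};
}

// Transform a normal-map sample (components in [-1, 1]) from tangent space
// into world space using the surface's tangent frame.
V3 perturbNormal(V3 tangent, V3 bitangent, V3 geometricNormal, V3 sampledNormal) {
    return normalize({
        sampledNormal.x * tangent.x + sampledNormal.y * bitangent.x + sampledNormal.z * geometricNormal.x,
        sampledNormal.x * tangent.y + sampledNormal.y * bitangent.y + sampledNormal.z * geometricNormal.y,
        sampledNormal.x * tangent.z + sampledNormal.y * bitangent.z + sampledNormal.z * geometricNormal.z,
    });
}
```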
This, again, means there is more compute and memory available to show more detailed stuff (or a denser, more populated scene) on screen during the next second or so.
Then there's the lighting, shadowing and reflections part. Like the previous part, they are still working on it because these things aren't yet as ready and optimized as commercial games need, but they are working on multiple Lumen-like approaches.
The goal is, again, to get rid of old baked tricks that fake lighting, shadows or reflections (again, freeing more CPU and GPU compute and memory) and to replace them with a single real-time global illumination method covering the entire scene that could potentially be shown in the next second or so, taking into consideration the behavior of 'real' light sources and materials.
They are approaching it differently from how RT is typically used today, again in a far more optimized way: calculating it at, let's say, a much smaller "resolution", considering the whole scene in real time with a single solution to optimize the time spent on it, and reusing each traced ray at the same time for lighting, shadowing, reflections and 3D audio.
On top of that they are using and building different techniques (relatively similar to the upscaling ones mentioned before) to give the result more detail. They are also testing updating parts of it at different rates and other tricks to achieve better quality. The goal, again, is much more natural and realistic lighting, shadowing and reflections with a far more optimized approach than before, replacing many faked effects with a single, more realistic global solution.
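A very simplified sketch of that "trace few rays, reuse them for everything, accumulate over time" idea. This is not how Lumen is actually implemented, just the shape of the optimization being described, with placeholder tracing:

```cpp
#include <cstddef>
#include <vector>

// Not Lumen's real implementation: just the shape of "few rays at low
// resolution, reused for several outputs, blended over frames".
struct RayHit {
    float radiance[3];   // light carried back along the ray
    float hitDistance;   // also useful for shadows/AO and even audio occlusion
    bool  occluded;
};

struct GiPixel {
    float indirect[3] = {0.0f, 0.0f, 0.0f};  // accumulated indirect lighting
};

// Placeholder for the expensive part: a real engine would trace against a
// BVH / SDF / surface cache here. This stub just returns a dummy hit.
RayHit traceSceneRay(std::size_t /*pixelIndex*/, std::size_t /*rayIndex*/) {
    return RayHit{{0.5f, 0.5f, 0.5f}, 10.0f, false};
}

// Each frame: trace only a handful of rays per low-resolution pixel, reuse the
// same hits for diffuse light and occlusion (and in principle reflections or
// audio), then blend the new result into what previous frames already computed.
void updateGi(std::vector<GiPixel>& lowResBuffer,
              std::size_t raysPerPixel,  // tiny budget, e.g. 1-4 rays
              float temporalBlend) {     // e.g. 0.1 = 10% new, 90% history
    for (std::size_t i = 0; i < lowResBuffer.size(); ++i) {
        float newLight[3] = {0.0f, 0.0f, 0.0f};
        for (std::size_t r = 0; r < raysPerPixel; ++r) {
            const RayHit hit = traceSceneRay(i, r);
            if (!hit.occluded) {
                for (int c = 0; c < 3; ++c) newLight[c] += hit.radiance[c];
            }
            // hit.hitDistance could also feed soft shadows, reflections or
            // audio occlusion, so the cost of the ray is paid only once.
        }
        for (int c = 0; c < 3; ++c) {
            newLight[c] /= static_cast<float>(raysPerPixel);
            lowResBuffer[i].indirect[c] =
                (1.0f - temporalBlend) * lowResBuffer[i].indirect[c] +
                temporalBlend * newLight[c];
        }
    }
    // The low-resolution result would then be filtered/upscaled to the final
    // render resolution, similar in spirit to the image upscalers above.
}
```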
But well, they are still working on it, testing a lot of stuff within that broadly similar approach, and they still have a lot of work to do. It will take years to see all of this, and the earlier versions will be unoptimized and not that polished, but early tests suggest we'll get games looking better than the original UE5 demo and the Matrix demo on PS5 by the second half, and especially the late years, of the PS5's life. In fact I assume the properly polished and optimized versions will be released on PS5 after the PS6's release.
Btw, the idea is that all of this will be scalable, meaning that more powerful hardware like a PS6 or current high-end PC hardware would allow more detailed and dense scenes, higher FPS, better lighting/shadowing/reflections/3D audio, and higher native resolution (even if that's not very noticeable). The idea with these kinds of engines is that, given the output resolution of the TV, the scene and the target FPS, the engine will push as much detail as it can, with the TV's resolution as the limit.
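A tiny sketch of what that kind of scaling loop could look like, assuming a simple controller that watches frame time against the target FPS and adjusts the internal render resolution, never going above the display's native resolution (real engines scale many more things than just resolution):

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical "scale to whatever the hardware allows" controller: compare the
// last frame's cost to the budget implied by the target FPS and nudge the
// internal render resolution up or down, capped at the display's resolution.
struct ScalabilityState {
    uint32_t displayWidth;    // the TV / monitor resolution is the ceiling
    uint32_t displayHeight;
    float    targetFps;       // e.g. 60 on PS5, 120 on stronger hardware
    float    resolutionScale; // 0.5 .. 1.0 of the display resolution
};

void adjustForLastFrame(ScalabilityState& s, float lastFrameMs) {
    const float budgetMs = 1000.0f / s.targetFps;
    if (lastFrameMs > budgetMs * 1.05f) {
        s.resolutionScale *= 0.95f;        // over budget: render fewer pixels
    } else if (lastFrameMs < budgetMs * 0.85f) {
        s.resolutionScale *= 1.02f;        // headroom: add detail back
    }
    s.resolutionScale = std::clamp(s.resolutionScale, 0.5f, 1.0f);
}

// The internal render size the upscaler would then bring up to the display.
void internalResolution(const ScalabilityState& s, uint32_t& outW, uint32_t& outH) {
    outW = static_cast<uint32_t>(s.displayWidth * s.resolutionScale);
    outH = static_cast<uint32_t>(s.displayHeight * s.resolutionScale);
}
```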
As I remember, with the early, non-final implementation of UE5 they were already able to get 4K at 30fps on a PS5 with high-end visuals, or 1440p at 60fps, which means future, more optimized versions will very likely achieve 4K 60fps or beyond, especially considering the upcoming upscaling techniques.
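As a rough back-of-the-envelope check (ignoring that frame cost doesn't scale purely with pixel count), 4K30 and 1440p60 push a similar number of pixels per second, while native 4K60 needs roughly double, which is the gap upscaling is expected to close:

```cpp
#include <cstdio>

// 4K has 2.25x the pixels of 1440p, while 30 fps allows 2x the frame time of
// 60 fps, so the two targets demand a similar pixel throughput per second.
int main() {
    const double px4k    = 3840.0 * 2160.0;   // ~8.29 Mpixels
    const double px1440p = 2560.0 * 1440.0;   // ~3.69 Mpixels

    std::printf("pixels per second, 4K30:    %.0f M\n", px4k * 30.0 / 1e6);
    std::printf("pixels per second, 1440p60: %.0f M\n", px1440p * 60.0 / 1e6);
    std::printf("pixels per second, 4K60:    %.0f M\n", px4k * 60.0 / 1e6);
    return 0;
}
```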
In the next gen (PS6) we could expect somewhat similar high-end visuals in terms of detail at 120fps or 8K (even if that's only for supersampling), with more detailed lighting thanks to more RT capability being available, better upscaling, faster streaming and bigger memory, and so even more detail, density or variety in the scenes and in the related lighting/shadowing/reflections.