[WC] PC gamers experiencing crashes with 13/14th gen Intel Core i9 CPUs | UP: Intel issues root cause findings, further stability update inbound

anonpuffs

Veteran
Icon Extra
29 Nov 2022
10,501
11,942
link

The problem reportedly hits games built on Unreal Engine especially hard.

What you need to know

  • Intel's 13th and 14th Gen Core i9 CPUs are reportedly causing problems when it comes to PC gaming.
  • People are claiming these CPUs are unstable and can cause crashing, especially with games that were created in Unreal Engine.
  • These CPUs also reportedly create "out of video memory" errors when a system clearly has plenty of memory.
  • This has led PC gamers in South Korea to return the Intel CPUs and seek out alternatives like AMD Ryzen.
  • Intel is now looking into the matter.
It seems that PC gamers using both 13th and 14th Gen Intel Core i9 CPUs (both Raptor Lake and Raptor Lake Refresh) have reported stability issues when playing video games (thanks TechRadar). This is reportedly a particular issue for games that run on Unreal Engine, where crashing isn't out of the question. Some players are also getting an error that claims their system is "out of video memory" even though their system has plenty of memory to use.

According to Wccftech, this has led PC gamers in South Korea to return the Intel Core i9 processors and seek out alternatives like AMD Ryzen. Intel is reportedly looking into the matter.

More at the link

Basically, the TL;DR is that Intel's highest-tier CPUs are drawing so much power that they're becoming unstable and experiencing silicon degradation within days, weeks, or months of purchase.

If you have an Intel CPU, I highly recommend undervolting and power-limiting it to preserve its lifespan.
_____________________________________________
UPDATE: Intel issues statement
tom's hardware: intel issues statement

Intel issues statement about CPU crashes, blames motherboard makers — BIOSes disable thermal and power protection, causing issues


Igor's Lab seems to have obtained a message originally destined for motherboard manufacturers concerning a prolonged stability issue on the company's 13th Generation Raptor Lake and 14th Generation Raptor Lake Refresh chips, which rank among the best CPUs. It made sense for the company to clarify the issue, since many blamed motherboard manufacturers for setting over-aggressive voltages to allow higher clock speeds in a race to become 'the fastest' performer.

The company specifically points out the issue with 600/700 series motherboard manufacturers that disable thermal and power protection to achieve the highest possible overclocks, even at the cost of instability. The chipmaker said in the message:

Intel has observed that this issue may be related to out of specification operating conditions resulting in sustained high voltage and frequency during periods of elevated heat.

Analysis of affected processors shows some parts experience shifts in minimum operating voltages which may be related to operation outside of Intel® specified operating conditions.

While the root cause has not yet been identified, Intel® has observed the majority of reports of this issue are from users with unlocked/overclock capable motherboards.

Intel has observed 600/700 Series chipset boards often set BIOS defaults to disable thermal and power delivery safeguards designed to limit processor exposure to sustained periods of high voltage and frequency, for example:
– Disabling Current Excursion Protection (CEP)
– Enabling the IccMax Unlimited bit
– Disabling Thermal Velocity Boost (TVB) and/or Enhanced Thermal Velocity Boost (eTVB)

Additional settings which may increase the risk of system instability:
– Disabling C-states
– Using Windows Ultimate Performance mode
– Increasing PL1 and PL2 beyond Intel® recommended limits

Intel requests system and motherboard manufacturers to provide end users with a default BIOS profile that matches Intel recommended settings.

Intel strongly recommends customer’s default BIOS settings should ensure operation within Intel’s recommended settings.

In addition, Intel strongly recommends motherboard manufacturers to implement warnings for end users alerting them to any unlocked or overclocking feature usage.

Intel is continuing to actively investigate this issue to determine the root cause and will provide additional updates as relevant information becomes available.


Intel will be publishing a public statement regarding issue status and Intel recommended BIOS setting recommendations targeted for May 2024.
UP:
Intel issues microcode patch
BIOS updates rolling out via mobo vendors

UP: 9/25/24
Root cause confirmed

Following extensive investigation of the Intel® Core™ 13th and 14th Gen desktop processor Vmin Shift Instability issue, Intel can now confirm the root cause diagnosis for the issue. This post will cover Intel’s understanding of the root cause, as well as additional mitigations and next steps for Intel® Core™ 13th and 14th Gen desktop users.

Vmin Shift Instability Root Cause

Intel® has localized the Vmin Shift Instability issue to a clock tree circuit within the IA core which is particularly vulnerable to reliability aging under elevated voltage and temperature. Intel has observed these conditions can lead to a duty cycle shift of the clocks and observed system instability.

Intel® has identified four (4) operating scenarios that can lead to Vmin shift in affected processors:

  1. Motherboard power delivery settings exceeding Intel power guidance.
    a. Mitigation: Intel® Default Settings recommendations for Intel® Core™ 13th and 14th Gen desktop processors.
  2. eTVB Microcode algorithm which was allowing Intel® Core™ 13th and 14th Gen i9 desktop processors to operate at higher performance states even at high temperatures.
    a. Mitigation: microcode 0x125 (June 2024) addresses eTVB algorithm issue.
  3. Microcode SVID algorithm requesting high voltages at a frequency and duration which can cause Vmin shift.
    a. Mitigation: microcode 0x129 (August 2024) addresses high voltages requested by the processor.
  4. Microcode and BIOS code requesting elevated core voltages which can cause Vmin shift especially during periods of idle and/or light activity.
    a. Mitigation: Intel® is releasing microcode 0x12B, which encompasses 0x125 and 0x129 microcode updates, and addresses elevated voltage requests by the processor during idle and/or light activity periods.
Regarding the 0x12B update, Intel® is working with its partners to roll out the relevant BIOS update to the public.

Intel’s internal testing comparing 0x12B microcode to 0x125 microcode – on Intel® Core™ i9-14900K with DDR5 5200MT/s memory - indicates performance impact is within run-to-run variation (ie. Cinebench* R23, Speedometer*, WebXPRT4*, Crossmark*). For gaming workloads on Intel® Core™ i9-14900K with DDR5 5600MT/s memory, performance is also within run-to-run variation (ie. Shadow of the Tomb Raider*, Cyberpunk* 2077, Hitman 3: Dartmoor*, Total War: Warhammer III – Mirrors of Madness*). However, system performance is dependent on configuration and several other factors.

Intel® reaffirms that both Intel® Core™ 13th and 14th Gen mobile processors and future client product families – including the codename Lunar Lake and Arrow Lake families - are unaffected by the Vmin Shift Instability issue. We appreciate our customers’ patience throughout the investigation, as well as our partners’ support in the analysis and relevant mitigations.

Next Steps

For all Intel® Core™ 13th/14th Gen desktop processor users: the 0x12B microcode update must be loaded via BIOS update and has been distributed to system and motherboard manufacturers to incorporate into their BIOS. Intel is working with its partners to encourage timely validation and rollout of the BIOS update for systems currently in service. This process may take several weeks.
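
For anyone who wants to check which of the microcode mitigations listed above a system already has, here is a minimal Python sketch. It assumes Linux, where /proc/cpuinfo exposes a "microcode" field, and it assumes that a higher revision number includes the earlier fixes (Intel's statement only says this explicitly for 0x12B). The function names are made up for illustration, and the check does not verify that the CPU is actually an affected 13th/14th Gen desktop part.

```python
import re

# Microcode revisions named in Intel's statement, with what each addresses.
MITIGATIONS = {
    0x125: "eTVB algorithm fix (June 2024)",
    0x129: "SVID high-voltage request fix (August 2024)",
    0x12B: "idle/light-load voltage fix, includes 0x125 and 0x129",
}


def parse_microcode_revision(cpuinfo_text):
    """Return the first 'microcode' revision in /proc/cpuinfo-style text, or None."""
    match = re.search(
        r"^microcode\s*:\s*(0x[0-9a-fA-F]+)", cpuinfo_text, re.MULTILINE
    )
    return int(match.group(1), 16) if match else None


def missing_mitigations(revision):
    """List mitigations tied to revisions newer than `revision`.

    Assumption: revisions are cumulative, so anything at or below the
    loaded revision is treated as already applied.
    """
    return [desc for rev, desc in sorted(MITIGATIONS.items()) if revision < rev]


if __name__ == "__main__":
    # On a Linux machine, read the real file; elsewhere this will fail.
    with open("/proc/cpuinfo") as handle:
        loaded = parse_microcode_revision(handle.read())
    if loaded is None:
        print("No microcode field found")
    else:
        print(hex(loaded), missing_mitigations(loaded) or "all listed fixes present")
```

An empty list means the loaded revision is at least 0x12B; anything else suggests a BIOS update is still pending from the board vendor.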
 
anonpuffs

Funnily enough, Windows Central is pretty out of touch with the DIY PC gamer market. The article claims that Intel is seen as top dog... that hasn't been true for about 2-3 years; it was truly over after the 5800X3D came out. These days AMD holds an 80% market share or more in the DIY CPU space.
 

Polyh3dron

Veteran
31 Jan 2024
759
531
After my Ryzen 3950X and 5950X being kinda shitshows in terms of stability I am quite impressed by how relatively stable my 7950X build is, especially after hearing about how much of a shitshow 14th Gen Intel can be for some people. I was considering going Intel after my current platform dies out but that may not end up happening if things stay on their current course.

Also, can't wait for the usual suspects to say that the Unreal Engine issues aren't the fault of the engine itself, but of shitty devs lmao
 
anonpuffs

After my Ryzen 3950X and 5950X being kinda shitshows in terms of stability I am quite impressed by how relatively stable my 7950X build is, especially after hearing about how much of a shitshow 14th Gen Intel can be for some people. I was considering going Intel after my current platform dies out but that may not end up happening if things stay on their current course.

Also, can't wait for the usual suspects to say that the Unreal Engine issues aren't the fault of the engine itself, but of shitty devs lmao
Were the Zen 2/3 CPU issues due to motherboards? I remember early on they had a really rough time with BIOS settings and the quality of the voltage regulator modules on the AMD motherboards.
 

Polyh3dron

Were the Zen 2/3 CPU issues due to motherboards? I remember early on they had a really rough time with BIOS settings and the quality of the voltage regulator modules on the AMD motherboards.
From what I understand it was mostly down to certain games and apps not playing well with the Infinity Fabric and Windows moving processes from one CCD to the other. Now that the Infinity Fabric has seemingly been improved, or the scheduling is better, it seems to not be as much of an issue. I would get games freezing up randomly with buzzing noises, background apps/processes crapping out, half-second long random freezes in game... lots of frustrating stuff.

Now that Intel has a similar scheduling situation with the disparity between their P-Cores and E-Cores, I guess more attention is being paid to this and is helping both product lines.
 

Polyh3dron

Were the Zen 2/3 CPU issues due to motherboards? I remember early on they had a really rough time with BIOS settings and the quality of the voltage regulator modules on the AMD motherboards.
Also, the boards I was using for Zen 2 and 4 were the Crosshair VIII Formula and Dark Hero, respectively. I don't think VRM quality was a problem on those. Now my 3950X/Formula build has been relegated to a Plex server and OBS streaming/capture role, and with the mature BIOS and chipset drivers it is relatively stable, aside from booting back to BIOS after auto updates sometimes, which requires a power cycle.
 
anonpuffs

Oh, ASUS boards. Maybe they had the core voltage set too high on them; I remember that was something ASUS did a lot, and it caused some of the melting issues with Zen 4 X3D parts.
 
Deleted member 223

Guest
Funnily enough, Windows Central is pretty out of touch with the DIY PC gamer market. The article claims that Intel is seen as top dog... that hasn't been true for about 2-3 years; it was truly over after the 5800X3D came out. These days AMD holds an 80% market share or more in the DIY CPU space.
13600k can hold its own just fine, as an after the fact response of course.

I do agree that it's just not competitive on the power draw front and eventually this was going to lead to all sorts of issues.
 
anonpuffs

13600k can hold its own just fine, as an after the fact response of course.

I do agree that it's just not competitive on the power draw front and eventually this was going to lead to all sorts of issues.
Yeah intel has been doing good on the low end parts, they have competitive stuff in the -400/-600 line, but it's still a bummer that effectively you only ever get 1 CPU gen out of a motherboard. 12th to 14th gen are basically the exact same with slightly higher clocks
 
Deleted member 223

Yeah intel has been doing good on the low end parts, they have competitive stuff in the -400/-600 line, but it's still a bummer that effectively you only ever get 1 CPU gen out of a motherboard. 12th to 14th gen are basically the exact same with slightly higher clocks
Agree. They def need a conceptual redesign of how they do gens to be more in line with AMD, but then, that market share will cockblock all common sense, which is why we are where we are.

On the bright side, their GPU business is such a great refresher. The tyranny of 2 must be brought to heel.
 
anonpuffs

Intel has issued a response

videocardz link

Intel has observed that this issue may be related to out of specification operating conditions resulting in sustained high voltage and frequency during periods of elevated heat.

Analysis of affected processors shows some parts experience shifts in minimum operating voltages which may be related to operation outside of Intel® specified operating conditions.

While the root cause has not yet been identified, Intel® has observed the majority of reports of this issue are from users with unlocked/overclock capable motherboards.

Intel has observed 600/700 Series chipset boards often set BIOS defaults to disable thermal and power delivery safeguards designed to limit processor exposure to sustained periods of high voltage and frequency, for example:
– Disabling Current Excursion Protection (CEP)
– Enabling the IccMax Unlimited bit
– Disabling Thermal Velocity Boost (TVB) and/or Enhanced Thermal Velocity Boost (eTVB)

Additional settings which may increase the risk of system instability:
– Disabling C-states
– Using Windows Ultimate Performance mode
– Increasing PL1 and PL2 beyond Intel® recommended limits

Intel requests system and motherboard manufacturers to provide end users with a default BIOS profile that matches Intel recommended settings.

Intel strongly recommends customer’s default BIOS settings should ensure operation within Intel’s recommended settings.

In addition, Intel strongly recommends motherboard manufacturers to implement warnings for end users alerting them to any unlocked or overclocking feature usage.

Intel is continuing to actively investigate this issue to determine the root cause and will provide additional updates as relevant information becomes available.

Intel will be publishing a public statement regarding issue status and Intel recommended BIOS setting recommendations targeted for May 2024.

Basically blaming motherboard vendors for not respecting Intel's recommended CPU specs.

Meanwhile, motherboard vendors have pushed out a BIOS update that costs ~10% performance to combat the issue.

article link

Some Intel CPUs lost 9% of their performance almost overnight


Over the past few weeks, we’ve seen an increasing number of reports of instability on high-end Intel CPUs like the Core i9-14900K. Asus has released a BIOS update for its Z790 motherboards aimed at addressing the problem, but it carries a performance loss of upwards of 9% in some workloads.


The most recent BIOS update from Asus includes the Intel Baseline Profile. This profile disables various optimizations that are automatically applied on Asus Z790 motherboards and runs high-end Intel chips within Intel’s specific limits. Hardwareluxx tested the new profile with the Core i9-14900K and found that the CPU ran around 9% slower in multiple tests.


In Cinebench R23, for example, the German publication found that the Intel Baseline Profile slashed performance by 9%. In Y-Cruncher, a benchmark that calculates Pi, the performance drop was 11%. Even games were affected, with Starfield, Shadow of the Tomb Raider, and F1 2023 showing an 8% drop in performance when tested at 720p (these differences should disappear at higher resolutions).
One of the main reasons behind the instability, it seems, is the unlimited power budget available to high-end Intel CPUs on some motherboards. With the proper BIOS settings, the maximum turbo power available to a chip like the Core i9-14900K is 4,095 watts. Your CPU will never draw that much power, but such a high limit allows the chip to draw as much power as it needs for brief spurts, even if that results in a crash.


These settings follow some BIOS adjustments we’ve seen experts recommend over the past few days. Asus presumably released the BIOS update in response to Intel’s investigation of the problem, but it’s not clear if other motherboard vendors will follow suit.

The performance drop only applies if you were using the various enhancements available on Asus Z790 motherboards. However, you might have been using those enhancements without even knowing it. By default, Asus automatically applies whatever enhancements it deems best for your CPU within the BIOS, potentially causing instability. If you haven’t messed with your BIOS settings, there’s a good chance your CPU will run slower with the Intel Baseline Profile applied.


Thankfully, that shouldn’t impact gaming performance much. The performance drop mainly shows up in other applications, while the instability that this BIOS update addresses shows up mainly in games. It’s not an ideal trade-off regardless, but hopefully it addresses the problems with crashing for Core i9-14900K owners.


It’s still not clear what the scope of the problem is with Intel’s high-end CPUs. Right now, it appears that the Core i9-13900K and Core i9-14900K are the main culprits, with the Core i7-13700K and Core i7-14700K affected to a lesser degree. Only some of these chips show instability issues, while other Intel CPUs shouldn’t have any problems. If you aren’t experiencing instability, you don’t need to apply the Intel Baseline Profile.
 
anonpuffs



Gigabyte motherboard spiking to 1.7 (!!!) volts on core voltage with Intel baseline specs

that's crazy
 
anonpuffs

Intel points at motherboard vendors for recent CPU instability issues but the chip maker isn't entirely blame-free

You might recall reports of Unreal-based games crashing when running on gaming PCs using high-end Intel processors toward the start of this year. As the problem was getting flagged up around the world, things were serious enough for Intel to investigate the problem formally. That's still ongoing but Intel has made an early statement on the matter, in which it's essentially blaming motherboard manufacturers for having default settings in the BIOS/UEFI that allow the CPU to run well past the recommended limits for power and current.

The full statement was relayed by Igor's Lab but the most telling part is this comment by Intel: "600/700 Series chipset boards often set BIOS defaults to disable thermal and power delivery safeguards designed to limit processor exposure to sustained periods of high voltage and frequency."


Anyone who bought an Intel-based motherboard in the last few years will probably already be aware of this. For example, by default, Asus enables its Multi Core Enhancement feature in the BIOS which simply sets the power and current limits to their maximum possible values.

Intel's CPUs have two primary power limits, PL1 and PL2, and the idea behind them is that the former is the maximum power the chip can consume under 'normal' circumstances, whereas the latter allows for more energy to be used, for a limited duration.

Take the Core i9 13900K. That has a PL1 and PL2 of 125 and 253 W respectively, but stick that into most Asus 600/700 series motherboards with MCE enabled, and the processor is given limits of 4095 W. The same is true for the current—the chip is permitted to draw well over Intel's recommended maximum value of 307 A. It's over 100 A more than the 'extreme config' limit of 400 A.
That particular CPU won't ever use that much power, but for the 13th and 14th Gen i9 models, the excessively high power and current limits—along with overly high operating voltages—all result in a processor being unstable in demanding scenarios, such as playing games. And it's not just Asus that does this; ASRock, Gigabyte, MSI, and others all have BIOS defaults that ignore Intel's recommended limits.
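
To make the comparison above concrete, here is a minimal Python sketch that flags which configured limits exceed the recommended ones. The 125 W, 253 W, and 307 A figures are the Core i9 13900K values quoted in the article; the `Limits` class, the function name, and the 512 A current figure in the usage example are illustrative assumptions, not values read from any real BIOS.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    pl1_watts: float      # sustained power limit
    pl2_watts: float      # short-duration turbo power limit
    icc_max_amps: float   # maximum current limit


# Recommended values for the Core i9 13900K as quoted in the article.
I9_13900K_RECOMMENDED = Limits(pl1_watts=125, pl2_watts=253, icc_max_amps=307)


def exceeded_limits(configured, recommended):
    """Return the names of limits configured above the recommended values."""
    names = ("pl1_watts", "pl2_watts", "icc_max_amps")
    return [
        name
        for name in names
        if getattr(configured, name) > getattr(recommended, name)
    ]


if __name__ == "__main__":
    # 4095 W is the figure the article gives for MCE-enabled boards;
    # 512 A is a hypothetical stand-in for "over 100 A more than 400 A".
    mce_defaults = Limits(pl1_watts=4095, pl2_watts=4095, icc_max_amps=512)
    print(exceeded_limits(mce_defaults, I9_13900K_RECOMMENDED))
```

A profile that matches the recommended PL2 and current limits but raises PL1 would be flagged for `pl1_watts` only.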

Asus responded to the reports by releasing a new BIOS for most of its Intel 600/700 series motherboard chipsets, and that includes an 'Intel Baseline Profile' that forces the PL2 and current limits to the recommended values. However, it still sets the PL1 higher than what it should be. That's not a problem for most Intel CPUs, provided they have a decent cooler on them, but it's an example of Intel saying one thing and a motherboard vendor doing something different.
Intel's datasheets for its 13th and 14th Generation Core processors do clearly show what the maximum limits for everything are, and contain statements such as "long term reliability cannot be assured in conditions above or below Maximum/Minimum functional limits." If the end user wants to push their hardware beyond these, then that's absolutely fine, but it should never be the case that one's motherboard defaults to such a scenario.

So, is this all a case of Intel being the victim of motherboard vendors seemingly doing whatever they like? No, not really.

Motherboard manufacturers have been setting BIOS default values for Intel processors well outside the recommended values for a good while now and I can't imagine that Intel is unaware of this. In fact, there's just no way it could miss this as every time the chip giant has released a new CPU, the marketing material always contains benchmarks comparing the new chip to previous models, or ones by AMD, that have been carried out on a test PC using a motherboard by Asus, ASRock, MSI, etc—it even has a dedicated webpage for "Truth and Transparency" that clearly shows what motherboard the tests were done with.

Even if Intel forced its power and current limits before commencing the testing, it would have clearly seen what the motherboard's BIOS was defaulting to before changing anything.

In other words, Intel has known about this for long enough and has been happy to enjoy the fruits of such settings, i.e. chart-topping performance figures. But now, with customers and retailers complaining about games crashing and chips completely failing, it would seem that Intel is trying to point the finger at motherboard vendors. The reality is both are to blame here.
 
anonpuffs

Update: Hardware Unboxed claims board partners have confirmed silicon degradation in these CPUs, believed to be due to unlimited current draw and Intel not enforcing its power limits and specs.