public inbox for linux-kernel@vger.kernel.org
* alderlake crashes (random memory corruption?) with 6.0 i915 / ucode related
From: Hans de Goede @ 2022-10-13 20:33 UTC
  To: intel-gfx, Linux Kernel Mailing List,
	Thorsten Leemhuis (regressions address)

Hi All,

Yesterday I got a new Lenovo ThinkPad X1 Yoga Gen 7 laptop. Since I plan
to make this my new day-to-day laptop, I have copied over the entire
rootfs, /home, etc. from my current laptop to avoid having to tweak
everything to my liking again.

This meant I was booting with an initramfs generated for the other laptop,
which should be fine since both are Intel machines, and the old 5.19.y
initramfs-es worked fine. But 6.0.0 crashed with what seems like random
memory corruption (list integrity checks failing) until I regenerated
the initrd ...

Comparing the old vs regenerated initrds showed no relevant differences,
which made me think this is a CPU ucode issue (the ucode is prefixed
to the initrd for early microcode loading).
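
A minimal sketch of how the microcode carried by an initrd could be
inspected. It assumes the early microcode sits in an uncompressed newc
cpio archive at the very start of the image, under kernel/x86/microcode/,
which is the layout the kernel's early loader expects. The function name
is made up for illustration and is not from any existing tool.

```python
import sys

NEWC_MAGIC = b"070701"
HEADER_LEN = 110  # 6-byte magic + 13 fields of 8 hex digits each


def _align4(n):
    """Round n up to the next multiple of 4 (newc padding rule)."""
    return (n + 3) & ~3


def list_leading_cpio(data):
    """Return (name, size) pairs from an uncompressed newc cpio at the start of data.

    Parsing stops at the TRAILER!!! entry or at the first offset that does not
    hold a newc header, so a compressed main initramfs that follows is ignored.
    """
    entries = []
    pos = 0
    while data[pos:pos + 6] == NEWC_MAGIC and pos + HEADER_LEN <= len(data):
        header = data[pos:pos + HEADER_LEN]
        fields = [int(header[6 + 8 * i:14 + 8 * i], 16) for i in range(13)]
        filesize, namesize = fields[6], fields[11]
        name_start = pos + HEADER_LEN
        # namesize counts the terminating NUL, which is dropped here
        name = data[name_start:name_start + namesize - 1].decode("ascii", "replace")
        if name == "TRAILER!!!":
            break
        entries.append((name, filesize))
        data_start = _align4(name_start + namesize)
        pos = _align4(data_start + filesize)
    return entries


if __name__ == "__main__" and len(sys.argv) > 1:
    # The initrd path is given by the caller, e.g. an image under /boot.
    with open(sys.argv[1], "rb") as f:
        for name, size in list_leading_cpio(f.read()):
            print(f"{size:>10}  {name}")
```

Running this on both the old and the regenerated initrd and comparing the
size of the GenuineIntel.bin entry (if present) would show whether the two
images really carry different microcode blobs.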

After some tests I have the following observations with 6.0.0:

1. The least stable is the old initrd (so with the wrong
ucode prefixed); this crashes before ever reaching gdm.
I believe that this is caused by late microcode loading
kicking in in this case (I thought that was being removed?),
and doing late microcode loading on the i7-1260P with its
mix of P + E cores seems to seriously mess things up.

2. Slightly more stable, lasting at least a few minutes
before crashing, is using dis_ucode_ldr on the kernel
command line

3. Using nomodeset seems to stabilize things, even with
the old initrd with the wrong microcode prefixed

4. 5.19, with an old initrd and with normal modesetting
enabled, works fine, so in a way this is a 6.0.0 regression

5. Using 6.0 with the new initrd with the new microcode
seems mostly stable, although sometimes it seems to
hang very early during boot, esp. if a previous boot
crashed, and I have not run this for a long time yet.

6. After crashes it seems to be necessary to power-cycle
the machine to get things back in working condition.
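
Regarding observation 1, a rough way to check whether all cores ended up
on the same microcode revision is sketched below. It assumes an x86
/proc/cpuinfo with "processor" and "microcode" fields; the function name
is hypothetical. More than one revision in the output would at least be
consistent with a microcode update that did not apply uniformly across
the P and E cores, though it would not prove that this is the cause.

```python
import os
from collections import defaultdict


def microcode_revisions(cpuinfo_text):
    """Map each microcode revision string to the logical CPUs reporting it."""
    revisions = defaultdict(list)
    cpu = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between CPU blocks
        key, value = key.strip(), value.strip()
        if key == "processor":
            cpu = int(value)
        elif key == "microcode" and cpu is not None:
            revisions[value].append(cpu)
    return dict(revisions)


if __name__ == "__main__" and os.path.exists("/proc/cpuinfo"):
    with open("/proc/cpuinfo") as f:
        for rev, cpus in microcode_revisions(f.read()).items():
            print(f"{rev}: CPUs {cpus}")
```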


With 6.0 the following WARN triggers:
drivers/gpu/drm/i915/display/intel_bios.c:477:

        drm_WARN(&i915->drm, min_size == 0,
                 "Block %d min_size is zero\n", section_id);

Since nomodeset helps, this might be quite relevant. In 5.19.13
this does not happen, but I'm not sure if 5.19 has this check
at all.
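
To compare kernels on this point, the block IDs named by the WARN could be
pulled out of a saved kernel log. This is only a sketch: the pattern is
taken from the format string quoted above, and the function name is
illustrative.

```python
import re

# Matches the message produced by the drm_WARN() format string quoted above.
WARN_RE = re.compile(r"Block (\d+) min_size is zero")


def zero_min_size_blocks(log_text):
    """Return the sorted, de-duplicated block IDs named in the WARN message."""
    return sorted({int(m.group(1)) for m in WARN_RE.finditer(log_text)})
```

Feeding it the saved dmesg output of a 6.0 boot and of a 5.19 boot would
show whether the same blocks are reported on both, or on neither.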


There is a 2022/10/07 BIOS update available from Lenovo which includes
a CPU microcode update. I have not applied this yet in case
people want to investigate this further first.

Regards,

Hans


