public inbox for regressions@lists.linux.dev
From: Michal Pecio <michal.pecio@gmail.com>
To: x86@kernel.org
Cc: regressions@lists.linux.dev, Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Yazen Ghannam <yazen.ghannam@amd.com>,
	Mario Limonciello <mario.limonciello@amd.com>,
	Eric DeVolder <eric.devolder@oracle.com>
Subject: AMD topology broken on various 754/AM2+/AM3/AM3+ systems causes NB/EDAC/GART regression since 6.14
Date: Fri, 24 Oct 2025 20:46:58 +0200
Message-ID: <20251024204658.3da9bf3f.michal.pecio@gmail.com>

Hi,

This report is related to the discussion here:
https://lore.kernel.org/all/20251022011610.60d0ba6e.michal.pecio@gmail.com/

Commit bc7b2e629e0c ("x86/amd_nb: Use topology info to get AMD node
count") bails out if it can't find the NB of each node reported by
topology. NB features like EDAC or GART IOMMU then aren't available.
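To make the failure mode concrete, here is a toy model of the
all-or-nothing behaviour described above. It is not the kernel code:
cache_northbridges(), topology_nodes and nb_present are hypothetical
names, and the 2-node/1-NB input is only an illustrative case.

```python
def cache_northbridges(topology_nodes, nb_present):
    """Toy model: return the usable node IDs, or [] if any NB is missing.

    topology_nodes: node count as reported by the CPU topology code.
    nb_present:     set of node IDs for which an NB device was found.
    """
    found = []
    for node in range(topology_nodes):
        if node not in nb_present:
            # One missing NB disables NB features for every node.
            return []
        found.append(node)
    return found


# Topology claims 2 nodes, but only node 0 has a northbridge.
print(cache_northbridges(2, {0}))
# Topology claims 1 node and its northbridge is found.
print(cache_northbridges(1, {0}))
```

With an inflated node count the first call yields nothing, so EDAC and
GART would have no NB to work with even though node 0 is fine.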

That was maybe not a bad idea in itself: nobody expects those things
to work on selected nodes only (I think?). But it relies on the
optimistic assumption that topology knows the true number of nodes.

Today I tested 5 older AMD64 systems with socket 754/AM2+/AM3/AM3+
on MSI/ASUS motherboards. *All* of them report more than one node if
the CPU has fewer cores than supported by the BIOS.

(I also have one AM4 system which is OK, but can't speak for others).

This is due to a peculiarity of their MADT tables - they report as many
LAPICs as the BIOS can support and excess LAPICs are simply disabled.
FWIW, it's also a pattern that disabled APIC IDs have the 0x80 bit set.
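For anyone who wants to check their own board, a minimal sketch that
reads iasl-style MADT disassembly lines and reports, per LAPIC, whether
it is enabled and whether the 0x80 bit is set in its APIC ID. The sample
text is a shortened copy of the AM3+ dump from this mail; parse_lapics()
and describe() are hypothetical helper names.

```python
import re

# Shortened sample in the iasl disassembly style (first three LAPICs only).
MADT_DUMP = """\
[02Fh 0047 001h]               Local Apic ID : 00
[030h 0048 004h]       Flags (decoded below) : 00000001
--
[037h 0055 001h]               Local Apic ID : 01
[038h 0056 004h]       Flags (decoded below) : 00000001
--
[03Fh 0063 001h]               Local Apic ID : 82
[040h 0064 004h]       Flags (decoded below) : 00000000
"""

ENABLED = 0x1  # bit 0 of the LAPIC flags


def parse_lapics(text):
    """Return (apic_id, flags) pairs, pairing each ID with the next Flags line."""
    entries, pending_id = [], None
    for line in text.splitlines():
        m = re.search(r"Local Apic ID : ([0-9A-Fa-f]+)", line)
        if m:
            pending_id = int(m.group(1), 16)
            continue
        m = re.search(r"Flags \(decoded below\) : ([0-9A-Fa-f]+)", line)
        if m and pending_id is not None:
            entries.append((pending_id, int(m.group(1), 16)))
            pending_id = None
    return entries


def describe(entries):
    for apic_id, flags in entries:
        state = "enabled" if flags & ENABLED else "disabled"
        high = "0x80 set" if apic_id & 0x80 else "0x80 clear"
        print(f"APIC ID {apic_id:#04x}: {state}, {high}")


lapics = parse_lapics(MADT_DUMP)
describe(lapics)
```

Flags lines that do not follow a Local Apic ID line are ignored, so
other MADT subtables in a full dump should not confuse the pairing.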

The kernel counts this as "hotpluggable CPUs", since supposedly it's
indistinguishable from actual multi-socket systems before ACPI 6.3,
where the "online capable" flag was added to disambiguate hotplug and
nonexistent but theoretically supported CPUs.

Or at least that's what commit fed8d8773b8e ("x86/acpi/boot: Correct
acpi_is_processor_usable() check") seems to imply.
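My reading of that logic, as a small sketch rather than the actual
kernel function: lapic_is_usable() and madt_supports_online_capable are
hypothetical names, and the bit positions (enabled = bit 0, online
capable = bit 1) are what I understand the ACPI spec to define.

```python
LAPIC_ENABLED = 0x1         # bit 0: processor is ready for use
LAPIC_ONLINE_CAPABLE = 0x2  # bit 1: only meaningful from ACPI 6.3 on


def lapic_is_usable(flags, madt_supports_online_capable):
    """Decide whether a MADT LAPIC entry counts as a (possible) CPU."""
    if flags & LAPIC_ENABLED:
        return True
    if not madt_supports_online_capable:
        # Old firmware: a disabled entry might be a hotpluggable CPU,
        # so it is counted anyway.
        return True
    # Newer firmware: a disabled entry counts only if marked online capable.
    return bool(flags & LAPIC_ONLINE_CAPABLE)


# A disabled LAPIC on a pre-6.3 board versus a 6.3+ board.
print(lapic_is_usable(0x0, madt_supports_online_capable=False))
print(lapic_is_usable(0x0, madt_supports_online_capable=True))
```

The first case is the one these old boards hit: the entry is disabled,
but nothing says it can never come online, so it is kept.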

On pre-ACPI 6.3 systems those disabled LAPICs inflate topology size
and result in breakage on recent kernels. A few examples below give
an idea of what those MADTs look like and how the kernel reads them.

Regards,
Michal


Athlon 3000+ on S754:

[02Fh 0047 001h]               Local Apic ID : 00
[030h 0048 004h]       Flags (decoded below) : 00000001	# enabled
--
[037h 0055 001h]               Local Apic ID : 81
[038h 0056 004h]       Flags (decoded below) : 00000000

[    0.027690] CPU topo: Max. logical packages:   2
[    0.027691] CPU topo: Max. logical dies:       2
[    0.027692] CPU topo: Max. dies per package:   1
[    0.027703] CPU topo: Max. threads per core:   1
[    0.027704] CPU topo: Num. cores per package:     1
[    0.027705] CPU topo: Num. threads per package:   1
[    0.027706] CPU topo: Allowing 1 present CPUs plus 1 hotplug CPUs

Athlon II X2 250 on AM3+:

[02Fh 0047 001h]               Local Apic ID : 00
[030h 0048 004h]       Flags (decoded below) : 00000001 # enabled
--
[037h 0055 001h]               Local Apic ID : 01
[038h 0056 004h]       Flags (decoded below) : 00000001 # enabled
--
[03Fh 0063 001h]               Local Apic ID : 82
[040h 0064 004h]       Flags (decoded below) : 00000000
--
[047h 0071 001h]               Local Apic ID : 83
[048h 0072 004h]       Flags (decoded below) : 00000000
--
[04Fh 0079 001h]               Local Apic ID : 84
[050h 0080 004h]       Flags (decoded below) : 00000000
--
[057h 0087 001h]               Local Apic ID : 85
[058h 0088 004h]       Flags (decoded below) : 00000000
--
[05Fh 0095 001h]               Local Apic ID : 86
[060h 0096 004h]       Flags (decoded below) : 00000000
--
[067h 0103 001h]               Local Apic ID : 87
[068h 0104 004h]       Flags (decoded below) : 00000000

[    0.147372] CPU topo: Max. logical packages:   3 # not sure why not 4
[    0.147372] CPU topo: Max. logical dies:       3
[    0.147373] CPU topo: Max. dies per package:   1
[    0.147379] CPU topo: Max. threads per core:   1
[    0.147379] CPU topo: Num. cores per package:     2
[    0.147380] CPU topo: Num. threads per package:   2
[    0.147381] CPU topo: Allowing 2 present CPUs plus 6 hotplug CPUs

Phenom II X4 965 on AM3:

[02Fh 0047   1]                Local Apic ID : 00
[030h 0048   4]        Flags (decoded below) : 00000001 # enabled
--
[037h 0055   1]                Local Apic ID : 01
[038h 0056   4]        Flags (decoded below) : 00000001 # enabled
--
[03Fh 0063   1]                Local Apic ID : 02
[040h 0064   4]        Flags (decoded below) : 00000001 # enabled
--
[047h 0071   1]                Local Apic ID : 03
[048h 0072   4]        Flags (decoded below) : 00000001 # enabled
--
[04Fh 0079   1]                Local Apic ID : 84
[050h 0080   4]        Flags (decoded below) : 00000000
--
[057h 0087   1]                Local Apic ID : 85
[058h 0088   4]        Flags (decoded below) : 00000000

[    0.072112] CPU topo: Max. logical packages:   2
[    0.072112] CPU topo: Max. logical dies:       2
[    0.072113] CPU topo: Max. dies per package:   1
[    0.072118] CPU topo: Max. threads per core:   1
[    0.072118] CPU topo: Num. cores per package:     4
[    0.072119] CPU topo: Num. threads per package:   4
[    0.072120] CPU topo: Allowing 4 present CPUs plus 2 hotplug CPUs
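The "present plus hotplug" numbers in the three logs above line up with
simply counting enabled and disabled LAPIC entries. A quick sketch of
that arithmetic, assuming every disabled entry is treated as hotpluggable
(the pre-ACPI 6.3 case); the flag lists are transcribed from the dumps
above and count_cpus() is a hypothetical helper.

```python
# LAPIC flags per system, in MADT order (1 = enabled, 0 = disabled).
SYSTEMS = {
    "Athlon 3000+ on S754": [1, 0],
    "Athlon II X2 250 on AM3+": [1, 1, 0, 0, 0, 0, 0, 0],
    "Phenom II X4 965 on AM3": [1, 1, 1, 1, 0, 0],
}


def count_cpus(lapic_flags):
    """Return (present, hotplug): enabled entries and disabled entries."""
    present = sum(1 for flags in lapic_flags if flags & 0x1)
    hotplug = len(lapic_flags) - present
    return present, hotplug


for name, lapic_flags in SYSTEMS.items():
    present, hotplug = count_cpus(lapic_flags)
    print(f"{name}: {present} present CPUs plus {hotplug} hotplug CPUs")
```

This only reproduces the CPU counts; it does not explain the package
count of 3 on the AM3+ board.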
