From: "Zhang, Rui" <rui.zhang@intel.com>
To: "jmattson@google.com" <jmattson@google.com>
Cc: "ajorgens@google.com" <ajorgens@google.com>,
"myrade@google.com" <myrade@google.com>,
"bp@alien8.de" <bp@alien8.de>, "x86@kernel.org" <x86@kernel.org>,
"peterz@infradead.org" <peterz@infradead.org>,
"Tang, Feng" <feng.tang@intel.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"tglx@linutronix.de" <tglx@linutronix.de>,
"Wysocki, Rafael J" <rafael.j.wysocki@intel.com>,
"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
"jay.chen@amd.com" <jay.chen@amd.com>,
"vladteodor@google.com" <vladteodor@google.com>,
"jon.grimm@amd.com" <jon.grimm@amd.com>
Subject: Re: [RFC PATCH] x86/acpi: Ignore invalid x2APIC entries
Date: Fri, 11 Oct 2024 01:37:47 +0000
Message-ID: <f5962c02ea46c3180e7c0e6e5e1f08f4209a1ca2.camel@intel.com>
In-Reply-To: <20241010213136.668672-1-jmattson@google.com>
On Thu, 2024-10-10 at 14:31 -0700, Jim Mattson wrote:
> > Currently, the kernel enumerates the possible CPUs by parsing both
> > ACPI MADT Local APIC entries and x2APIC entries. So CPUs with "valid"
> > APIC IDs, even if they have duplicated APIC IDs in Local APIC and
> > x2APIC, are always enumerated.
> >
> > Below is what the ACPI MADT Local APIC and x2APIC entries describe on
> > an Ivy Bridge-EP system:
> >
> > [02Ch 0044 1] Subtable Type : 00 [Processor Local APIC]
> > [02Fh 0047 1] Local Apic ID : 00
> > ...
> > [164h 0356 1] Subtable Type : 00 [Processor Local APIC]
> > [167h 0359 1] Local Apic ID : 39
> > [16Ch 0364 1] Subtable Type : 00 [Processor Local APIC]
> > [16Fh 0367 1] Local Apic ID : FF
> > ...
> > [3ECh 1004 1] Subtable Type : 09 [Processor Local x2APIC]
> > [3F0h 1008 4] Processor x2Apic ID : 00000000
> > ...
> > [B5Ch 2908 1] Subtable Type : 09 [Processor Local x2APIC]
> > [B60h 2912 4] Processor x2Apic ID : 00000077
> >
> > As a result, the kernel shows "smpboot: Allowing 168 CPUs, 120 hotplug
> > CPUs". This wastes a significant amount of memory on per-CPU data.
> > It also breaks https://lore.kernel.org/all/87edm36qqb.ffs@tglx/,
> > because __max_logical_packages is over-estimated by the APIC IDs in
> > the x2APIC entries.
> >
> > According to
> > https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#processor-local-x2apic-structure,
> > "[Compatibility note] On some legacy OSes, Logical processors with
> > APIC ID values less than 255 (whether in XAPIC or X2APIC mode) must
> > use the Processor Local APIC structure to convey their APIC
> > information to OSPM, and those processors must be declared in the
> > DSDT using the Processor() keyword. Logical processors with APIC ID
> > values 255 and greater must use the Processor Local x2APIC structure
> > and be declared using the Device() keyword.".
> >
> > When a valid CPU has already been detected from a Local APIC entry,
> > enumerate CPUs only from x2APIC entries with APIC ID values of 255 or
> > greater.
> >
> > Signed-off-by: Zhang Rui <rui.zhang@intel.com>
> > ---
> > I didn't find any clear statement in the ACPI spec about whether a
> > mixture of Local APIC and x2APIC entries is allowed. So it would be
> > great if this could be clarified.
>
> Has this been clarified?
>
> The reason that I ask is that Google Cloud has a 360 vCPU Zen4 VM
> occupying two virtual sockets, and the corresponding MADT table has a
> mixture of Local APIC and X2APIC entries.
>
> All of the LPUs in virtual socket 0 have extended APIC IDs below 255,
> and they have Local APIC entries. All of the LPUs in virtual socket 1
> have extended APIC IDs above 255, and they have X2APIC entries.
>
> Prior to this change, Linux assigned CPU numbers to all even-numbered
> LPUs on virtual socket 0, followed by all even-numbered LPUs on
> virtual socket 1, followed by all odd-numbered LPUs on virtual socket
> 0, followed by all odd-numbered LPUs on virtual socket 1.
>
> node #0, CPUs: #1 #2 ... #87 #88 #89
> node #1, CPUs: #90 #91 #92 ... #177 #178 #179
> node #0, CPUs: #180 #181 #182 ... #267 #268 #269
> node #1, CPUs: #270 #271 #272 ... #357 #358 #359
>
> After this change, however, Linux assigns CPU numbers to all LPUs on
> virtual socket 0 before assigning any CPU numbers to LPUs on virtual
> socket 1.
>
> node #0, CPUs: #1 #2 ... #87 #88 #89
> node #1, CPUs: #180 #181 #182 ... #267 #268 #269
> node #0, CPUs: #90 #91 #92 ... #177 #178 #179
> node #1, CPUs: #270 #271 #272 ... #357 #358 #359
>
> I suspect that this is because all Local APIC MADT entries are now
> processed before all X2APIC MADT entries, whereas they may have been
> interleaved before.
Agreed.
Can you attach the acpidump to confirm this?
>
> TBH, I'm not sure that there is actually anything wrong with the new
> numbering scheme.
> The topology is reported correctly (e.g. in
> /sys/devices/system/cpu/cpu0/topology/thread_siblings_list). Yet, the
> new enumeration does seem to contradict user expectations.
>
Well, we can say this is a violation of the ACPI spec, which states
"OSPM should initialize processors in the order that they appear in the
MADT." That applies even to interleaved LAPIC and X2APIC entries.

Maybe we need two steps for LAPIC/X2APIC parsing:
1. Check whether there is a valid LAPIC entry by going through all the
   LAPIC entries first.
2. Parse the LAPIC/X2APIC entries strictly in the order they appear in
   the MADT (as we did before).
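
The two steps above could be sketched roughly as follows. This is a
simplified, standalone C illustration and not the kernel's actual MADT
parsing code: struct madt_entry, has_valid_lapic() and enumerate_cpus()
are hypothetical names, the enabled flag and the 0xFF validity check are
assumptions, and the rule for skipping x2APIC IDs below 255 is taken from
the patch description quoted earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for MADT subtables (not kernel types). */
enum madt_type { MADT_LAPIC = 0x00, MADT_X2APIC = 0x09 };

struct madt_entry {
	enum madt_type type;
	uint32_t apic_id;
	bool enabled;		/* assumption: models the "enabled" flag */
};

/* Step 1: scan only the LAPIC entries and report whether any is usable. */
static bool has_valid_lapic(const struct madt_entry *e, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		/* Assumption: a LAPIC ID of 0xFF marks an unused entry. */
		if (e[i].type == MADT_LAPIC && e[i].enabled &&
		    e[i].apic_id != 0xFF)
			return true;
	}
	return false;
}

/*
 * Step 2: walk LAPIC and X2APIC entries together, strictly in MADT order.
 * Accepted APIC IDs are written to out[]; the return value is their count.
 */
static size_t enumerate_cpus(const struct madt_entry *e, size_t n,
			     uint32_t *out, size_t out_cap)
{
	bool lapic_found = has_valid_lapic(e, n);
	size_t count = 0;

	assert(out != NULL);

	for (size_t i = 0; i < n && count < out_cap; i++) {
		if (!e[i].enabled)
			continue;
		if (e[i].type == MADT_LAPIC && e[i].apic_id == 0xFF)
			continue;
		/* Rule from the patch: ignore low x2APIC IDs if LAPIC CPUs exist. */
		if (e[i].type == MADT_X2APIC && lapic_found &&
		    e[i].apic_id < 255)
			continue;
		out[count++] = e[i].apic_id;
	}
	return count;
}
```

With an interleaved table (LAPIC 0, X2APIC 256, LAPIC 1, X2APIC 257), this
sketch keeps the MADT order 0, 256, 1, 257 instead of grouping all LAPIC
entries ahead of the X2APIC ones.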
thanks,
rui