From: Xiaoyao Li <xiaoyao.li@intel.com>
To: Chuang Xu <xuchuangxclwt@bytedance.com>
Cc: pbonzini@redhat.com, imammedo@redhat.com,
xieyongji@bytedance.com, chaiwen.cc@bytedance.com,
zhao1.liu@intel.com, qemu-stable@nongnu.org,
Guixiong Wei <weiguixiong@bytedance.com>,
Yipeng Yin <yinyipeng@bytedance.com>,
qemu-devel@nongnu.org
Subject: Re: [PATCH v6] i386/cpu: fixup number of addressable IDs for logical processors in the physical package
Date: Mon, 14 Oct 2024 08:36:13 +0800
Message-ID: <a48fcd78-d1c4-4359-bc18-d04147a93f50@intel.com>
In-Reply-To: <2f6b952d-4c21-4db5-9a8a-84a0c10feca8@bytedance.com>
On 10/12/2024 5:35 PM, Chuang Xu wrote:
>
> On 10/12/24 4:21 PM, Xiaoyao Li wrote:
>> On 10/9/2024 11:56 AM, Chuang Xu wrote:
>>> When QEMU is started with:
>>> -cpu host,migratable=on,host-cache-info=on,l3-cache=off
>>> -smp 180,sockets=2,dies=1,cores=45,threads=2
>>>
>>> On Intel platform:
>>> CPUID.01H.EBX[23:16] is defined as "max number of addressable IDs for
>>> logical processors in the physical package".
>>>
>>> When executing "cpuid -1 -l 1 -r" in the guest, we obtain a value of 90
>>> for CPUID.01H.EBX[23:16], whereas the expected value is 128. Additionally,
>>> executing "cpuid -1 -l 4 -r" in the guest yields a value of 63 for
>>> CPUID.04H.EAX[31:26], which matches the expected result.
>>>
>>> As (1+CPUID.04H.EAX[31:26]) rounds up to the nearest power-of-2 integer,
>>> we'd better round up CPUID.01H.EBX[23:16] to the nearest power-of-2
>>> integer too. Otherwise we may encounter unexpected results in the guest.
>>>
>>> For example, when QEMU is started with the CLI above and xtopology is
>>> disabled, guest kernel 5.15.120 uses
>>> CPUID.01H.EBX[23:16]/(1+CPUID.04H.EAX[31:26]) to calculate
>>> threads-per-core in detect_ht(). The guest then gets "90/(1+63)=1" as
>>> the result, even though threads-per-core should actually be 2.
>>
>> It's a kernel bug instead.
>>
>> In section 1.5.3 "Sub ID Extraction Parameters for initial APIC ID" of
>> "Intel 64 Architecture Processor Topology Enumeration" [1], the formula is
>>
>> - SMT_Mask_Width = Log2(RoundToNearestPof2(CPUID.1:EBX[23:16]) /
>> ((CPUID.(EAX=4,ECX=0):EAX[31:26]) + 1))
>>
>> The value of CPUID.1:EBX[23:16] needs to be *rounded* to the nearest
>> power-of-two integer instead of itself being the power-of-two.
>>
>> This is also consistent with the SDM, where the comment for bits 23-16
>> of CPUID.1:EBX is:
>>
>> The nearest power-of-2 integer that is not smaller than EBX[23:16] is
>> the number of unique initial APIC IDs reserved for addressing
>> different logical processors in a physical package.
>>
>> What I read from this is that the nearest power-of-2 integer that is not
>> smaller than EBX[23:16] is a different thing from EBX[23:16], i.e.,
>
> Yes, when I read the SDM, I also thought it was a kernel bug. But on my
> 192ht SPR host, the value of CPUID.1:EBX[23:16] was indeed rounded up
> to the nearest power of 2 by the hardware. After communicating with
> Intel technical support staff, we thought that perhaps the description
> in the SDM is not so accurate, and that rounding up CPUID.1:EBX[23:16]
> to a power of 2 in QEMU may be more in line with the hardware behavior.

I think the above justification is important. We need to justify our
changes with facts and the correct reasoning.

I somewhat agree with setting EBX[23:16] to a power-of-2 value, because
section 1.5.3 "Sub ID Extraction Parameters for initial APIC ID" of the
"Intel 64 Architecture Processor Topology Enumeration" spec says:

  CPUID.1:EBX[23:16] represents the maximum number of addressable IDs
  (initial APIC ID) that can be assigned to logical processors in a
  physical package. The value may not be the same as the number of
  logical processors that are present in the hardware of a physical
  package.

Note that it uses the words "may not".

However, the justification for the change cannot be "it leads to
unexpected results in the guest", because the guest implementation is
incorrect.

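To make the difference concrete, here is a minimal sketch in C comparing
the formula from section 1.5.3 with a plain division that skips the
rounding step. The helper names (pow2ceil_u32, log2_pow2, smt_mask_width,
threads_per_core_unrounded) are made up for illustration; they are not
kernel or QEMU functions, and the second helper only mirrors the division
described in the commit message, not the actual detect_ht() code.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to the nearest power of two (x must be in 1..2^31). */
uint32_t pow2ceil_u32(uint32_t x)
{
    uint32_t p = 1;

    assert(x >= 1 && x <= (1u << 31));
    while (p < x) {
        p <<= 1;
    }
    return p;
}

/* Integer log2 of a value that is already a power of two. */
unsigned log2_pow2(uint32_t x)
{
    unsigned n = 0;

    while (x > 1) {
        x >>= 1;
        n++;
    }
    return n;
}

/*
 * SMT_Mask_Width as written in section 1.5.3:
 *   Log2(RoundToNearestPof2(CPUID.1:EBX[23:16]) /
 *        (CPUID.(EAX=4,ECX=0):EAX[31:26] + 1))
 */
unsigned smt_mask_width(uint32_t ebx_23_16, uint32_t eax4_31_26)
{
    return log2_pow2(pow2ceil_u32(ebx_23_16) / (eax4_31_26 + 1));
}

/* Hypothetical: the same division without rounding EBX[23:16] first. */
unsigned threads_per_core_unrounded(uint32_t ebx_23_16, uint32_t eax4_31_26)
{
    return ebx_23_16 / (eax4_31_26 + 1);
}
```

With the values reported in this thread (EBX[23:16] = 90, EAX[31:26] = 63),
smt_mask_width(90, 63) is 1, i.e. 2 threads per core, while
threads_per_core_unrounded(90, 63) is 1.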
>>
>> - EBX[23:16]: Maximum number of addressable IDs for logical processors
>> in this physical package
>>
>> - pow2ceil(EBX[23:16]): the number of unique initial APIC IDs reserved
>> for addressing different logical processors in a physical package.
>>
>> [1] https://cdrdv2-public.intel.com/759067/intel-64-architecture-processor-topology-enumeration.pdf
>>
>>> And on AMD platform:
>>> CPUID.01H.EBX[23:16] is defined as "Logical processor count". Current
>>> result meets our expectation.
>>>
>>> So let us round up CPUID.01H.EBX[23:16] to the nearest power-of-2 integer
>>> only on Intel platforms to resolve the unexpected result.
>>>
>>> Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
>>> Acked-by: Igor Mammedov <imammedo@redhat.com>
>>> Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com>
>>> Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
>>> Signed-off-by: Chuang Xu <xuchuangxclwt@bytedance.com>
>>> ---
>>> target/i386/cpu.c | 10 +++++++++-
>>> 1 file changed, 9 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>> index ff227a8c5c..641d4577b0 100644
>>> --- a/target/i386/cpu.c
>>> +++ b/target/i386/cpu.c
>>> @@ -6462,7 +6462,15 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>>          }
>>>          *edx = env->features[FEAT_1_EDX];
>>>          if (threads_per_pkg > 1) {
>>> -            *ebx |= threads_per_pkg << 16;
>>> +            /*
>>> +             * AMD requires logical processor count, but Intel needs maximum
>>> +             * number of addressable IDs for logical processors per package.
>>> +             */
>>> +            if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
>>> +                *ebx |= threads_per_pkg << 16;
>>> +            } else {
>>> +                *ebx |= 1 << apicid_pkg_offset(&topo_info) << 16;
>>> +            }
>>>              *edx |= CPUID_HT;
>>>          }
>>>          if (!cpu->enable_pmu) {
>>
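
For reference, a rough sketch of what the "1 << apicid_pkg_offset()"
expression in the hunk above evaluates to for the -smp configuration in
the commit message. This assumes the package offset is simply the sum of
the thread, core and die ID field widths, each being ceil(log2(count));
the helpers below are simplified stand-ins written for this mail, not
QEMU's actual topology code.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to encode `count` distinct IDs: ceil(log2(count)). */
unsigned id_field_width(uint32_t count)
{
    unsigned width = 0;

    assert(count >= 1);
    while ((1ull << width) < count) {
        width++;
    }
    return width;
}

/*
 * Hypothetical stand-in for the package offset: the bit position where
 * the package ID starts, assuming only thread, core and die levels.
 */
unsigned pkg_offset(uint32_t dies, uint32_t cores, uint32_t threads)
{
    return id_field_width(threads) + id_field_width(cores) +
           id_field_width(dies);
}

/* Number of addressable logical processor IDs per package. */
uint32_t addressable_ids_per_pkg(uint32_t dies, uint32_t cores,
                                 uint32_t threads)
{
    return 1u << pkg_offset(dies, cores, threads);
}
```

For dies=1, cores=45, threads=2 the field widths are 0, 6 and 1, so the
offset is 7 and addressable_ids_per_pkg(1, 45, 2) is 128, whereas
threads_per_pkg is 45 * 2 = 90.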