qemu-devel.nongnu.org archive mirror
From: Xiaoyao Li <xiaoyao.li@intel.com>
To: Chuang Xu <xuchuangxclwt@bytedance.com>
Cc: pbonzini@redhat.com, imammedo@redhat.com,
	xieyongji@bytedance.com, chaiwen.cc@bytedance.com,
	zhao1.liu@intel.com, qemu-stable@nongnu.org,
	Guixiong Wei <weiguixiong@bytedance.com>,
	Yipeng Yin <yinyipeng@bytedance.com>,
	qemu-devel@nongnu.org
Subject: Re: [PATCH v6] i386/cpu: fixup number of addressable IDs for logical processors in the physical package
Date: Mon, 14 Oct 2024 08:36:13 +0800
Message-ID: <a48fcd78-d1c4-4359-bc18-d04147a93f50@intel.com>
In-Reply-To: <2f6b952d-4c21-4db5-9a8a-84a0c10feca8@bytedance.com>

On 10/12/2024 5:35 PM, Chuang Xu wrote:
> 
> On 10/12/24 4:21 PM, Xiaoyao Li wrote:
>> On 10/9/2024 11:56 AM, Chuang Xu wrote:
>>> When QEMU is started with:
>>> -cpu host,migratable=on,host-cache-info=on,l3-cache=off
>>> -smp 180,sockets=2,dies=1,cores=45,threads=2
>>>
>>> On Intel platform:
>>> CPUID.01H.EBX[23:16] is defined as "max number of addressable IDs for
>>> logical processors in the physical package".
>>>
>>> When executing "cpuid -1 -l 1 -r" in the guest, we obtain a value of 90
>>> for CPUID.01H.EBX[23:16], whereas the expected value is 128. Additionally,
>>> executing "cpuid -1 -l 4 -r" in the guest yields a value of 63 for
>>> CPUID.04H.EAX[31:26], which matches the expected result.
>>>
>>> As (1+CPUID.04H.EAX[31:26]) rounds up to the nearest power-of-2 integer,
>>> we'd better round up CPUID.01H.EBX[23:16] to the nearest power-of-2
>>> integer too. Otherwise we may encounter unexpected results in guest.
>>>
>>> For example, when QEMU is started with the CLI above and xtopology is
>>> disabled, guest kernel 5.15.120 uses
>>> CPUID.01H.EBX[23:16]/(1+CPUID.04H.EAX[31:26]) to calculate
>>> threads-per-core in detect_ht(). The guest will then get "90/(1+63)=1"
>>> as the result, even though threads-per-core should actually be 2.
>>
>> It's a kernel bug instead.
>>
>> In 1.5.3 "Sub ID Extraction Parameters for initial APIC ID" of "Intel
>> 64 Architecture Processor Topology Enumeration" [1], it is
>>
>>   - SMT_Mask_Width = Log2(RoundToNearestPof2(CPUID.1:EBX[23:16]) /
>>     ((CPUID.(EAX=4,ECX=0):EAX[31:26]) + 1))
>>
>> The value of CPUID.1:EBX[23:16] needs to be *rounded* to the nearest
>> power-of-two integer, rather than itself being a power of two.
>>
>> This is also consistent with the SDM, where the comment for bits 23-16
>> of CPUID.1:EBX is:
>>
>>   The nearest power-of-2 integer that is not smaller than EBX[23:16] is
>>   the number of unique initial APIC IDs reserved for addressing
>>   different logical processors in a physical package.
>>
>> What I read from this is, the nearest power-of-2 integer that is not 
>> smaller than EBX[23:16] is a different thing than EBX[23:16]. i.e.,
> 
> Yes, when I read the SDM, I also thought it was a kernel bug. But on my
> 192-HT SPR host, the value of CPUID.1:EBX[23:16] was indeed rounded up
> to the nearest power of 2 by the hardware. After communicating with
> Intel technical support staff, we thought that perhaps the description
> in the SDM is not so accurate, and rounding up CPUID.1:EBX[23:16] to a
> power of 2 in QEMU may be more in line with the hardware behavior.

I think the above justification is important. We need to justify our
changes with facts and the correct reasoning.

I somewhat agree with setting EBX[23:16] to a power-of-2 value, because
section 1.5.3 "Sub ID Extraction Parameters for initial APIC ID" of the
"Intel 64 Architecture Processor Topology Enumeration" spec says

     CPUID.1:EBX[23:16] represents the maximum number of addressable IDs
     (initial APIC ID) that can be assigned to logical processors in a
     physical package. The value may not be the same as the number of
     logical processors that are present in the hardware of a physical
     package.

It uses the word "may not".

However, the justification of the change cannot be "it leads to 
unexpected results in guest" because the guest implementation is not 
correct.
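
To make the arithmetic concrete, here is a minimal sketch of the two
calculations discussed above. It is an illustration only, not kernel or
QEMU code: all function names are hypothetical, and the "naive" variant
just follows the detect_ht() behavior as described in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Round n up to the nearest power of two (hypothetical helper, n >= 1). */
static uint32_t round_up_pow2(uint32_t n)
{
    assert(n >= 1 && n <= (UINT32_C(1) << 31));
    uint32_t p = 1;
    while (p < n) {
        p <<= 1;
    }
    return p;
}

/* Integer log2 of a value that is already a power of two. */
static unsigned log2_pow2(uint32_t p)
{
    unsigned w = 0;
    while (p > 1) {
        p >>= 1;
        w++;
    }
    return w;
}

/*
 * Naive threads-per-core as described in the commit message:
 * raw EBX[23:16] divided by (1 + EAX[31:26]), with no rounding.
 */
static uint32_t naive_threads_per_core(uint32_t ebx_23_16, uint32_t eax_31_26)
{
    return ebx_23_16 / (eax_31_26 + 1);
}

/*
 * SMT_Mask_Width following section 1.5.3: round EBX[23:16] up to a
 * power of two first, then divide, then take log2.
 */
static unsigned smt_mask_width(uint32_t ebx_23_16, uint32_t eax_31_26)
{
    return log2_pow2(round_up_pow2(ebx_23_16) / (eax_31_26 + 1));
}
```

With the values from this thread (EBX[23:16] = 90, EAX[31:26] = 63),
naive_threads_per_core() gives 90 / 64 = 1, while smt_mask_width() gives
log2(128 / 64) = 1, i.e. 2 threads per core. If EBX[23:16] is already 128,
both approaches agree on 2 threads per core.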

>>
>> - EBX[23:16]: Maximum number of addressable IDs for logical processors
>>   in this physical package
>>
>> - pow2ceil(EBX[23:16]): the number of unique initial APIC IDs reserved
>>   for addressing different logical processors in a physical package.
>>
>> [1] https://cdrdv2-public.intel.com/759067/intel-64-architecture-processor-topology-enumeration.pdf
>>
>>> And on AMD platform:
>>> CPUID.01H.EBX[23:16] is defined as "Logical processor count". Current
>>> result meets our expectation.
>>>
>>> So let us round up CPUID.01H.EBX[23:16] to the nearest power-of-2 integer
>>> only for Intel platform to solve the unexpected result.
>>>
>>> Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
>>> Acked-by: Igor Mammedov <imammedo@redhat.com>
>>> Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com>
>>> Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
>>> Signed-off-by: Chuang Xu <xuchuangxclwt@bytedance.com>
>>> ---
>>>   target/i386/cpu.c | 10 +++++++++-
>>>   1 file changed, 9 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>> index ff227a8c5c..641d4577b0 100644
>>> --- a/target/i386/cpu.c
>>> +++ b/target/i386/cpu.c
>>> @@ -6462,7 +6462,15 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>>           }
>>>           *edx = env->features[FEAT_1_EDX];
>>>           if (threads_per_pkg > 1) {
>>> -            *ebx |= threads_per_pkg << 16;
>>> +            /*
>>> +             * AMD requires logical processor count, but Intel needs maximum
>>> +             * number of addressable IDs for logical processors per package.
>>> +             */
>>> +            if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
>>> +                *ebx |= threads_per_pkg << 16;
>>> +            } else {
>>> +                *ebx |= 1 << apicid_pkg_offset(&topo_info) << 16;
>>> +            }
>>>               *edx |= CPUID_HT;
>>>           }
>>>           if (!cpu->enable_pmu) {
>>
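
As a rough illustration of what the non-AMD branch of the patch computes,
the sketch below derives a package offset from per-level ID bit widths and
turns it into a count of addressable IDs. This is a simplified stand-in,
not QEMU's apicid_pkg_offset(): the struct, the function names, and the
assumption that only thread, core and die levels contribute are all
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bits needed to encode `count` distinct IDs, i.e. ceil(log2(count)). */
static unsigned id_bit_width(uint32_t count)
{
    assert(count >= 1);
    unsigned width = 0;
    while ((UINT64_C(1) << width) < count) {
        width++;
    }
    return width;
}

/* Hypothetical, simplified topology description (not QEMU's type). */
struct topo_sketch {
    uint32_t dies_per_pkg;
    uint32_t cores_per_die;
    uint32_t threads_per_core;
};

/* Bit position where the package ID would start in the APIC ID. */
static unsigned pkg_offset_sketch(const struct topo_sketch *t)
{
    return id_bit_width(t->threads_per_core) +
           id_bit_width(t->cores_per_die) +
           id_bit_width(t->dies_per_pkg);
}

/* Number of addressable IDs per package: 1 << package offset. */
static uint32_t addressable_ids_sketch(const struct topo_sketch *t)
{
    return UINT32_C(1) << pkg_offset_sketch(t);
}
```

For the topology in the commit message (dies=1, cores=45, threads=2) the
widths are 0, 6 and 1 bits, so the offset is 7 and the result is 128,
whereas threads_per_pkg is 45 * 2 = 90.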


