qemu-devel.nongnu.org archive mirror
From: Xiaoyao Li <xiaoyao.li@intel.com>
To: Chuang Xu <xuchuangxclwt@bytedance.com>
Cc: pbonzini@redhat.com, imammedo@redhat.com,
	xieyongji@bytedance.com, chaiwen.cc@bytedance.com,
	zhao1.liu@intel.com, qemu-stable@nongnu.org,
	Guixiong Wei <weiguixiong@bytedance.com>,
	Yipeng Yin <yinyipeng@bytedance.com>,
	qemu-devel@nongnu.org
Subject: Re: [PATCH v6] i386/cpu: fixup number of addressable IDs for logical processors in the physical package
Date: Mon, 14 Oct 2024 09:32:45 +0800
Message-ID: <34bc76ae-4d8f-430e-a6f5-5e4c73606644@intel.com>
In-Reply-To: <a48fcd78-d1c4-4359-bc18-d04147a93f50@intel.com>

On 10/14/2024 8:36 AM, Xiaoyao Li wrote:
> On 10/12/2024 5:35 PM, Chuang Xu wrote:
>>
>> On 10/12/24 4:21 PM, Xiaoyao Li wrote:
>>> On 10/9/2024 11:56 AM, Chuang Xu wrote:
>>>> When QEMU is started with:
>>>> -cpu host,migratable=on,host-cache-info=on,l3-cache=off
>>>> -smp 180,sockets=2,dies=1,cores=45,threads=2
>>>>
>>>> On Intel platform:
>>>> CPUID.01H.EBX[23:16] is defined as "max number of addressable IDs for
>>>> logical processors in the physical package".
>>>>
>>>> When executing "cpuid -1 -l 1 -r" in the guest, we obtain a value of 90
>>>> for CPUID.01H.EBX[23:16], whereas the expected value is 128.
>>>> Additionally, executing "cpuid -1 -l 4 -r" in the guest yields a value
>>>> of 63 for CPUID.04H.EAX[31:26], which matches the expected result.
>>>>
>>>> As (1+CPUID.04H.EAX[31:26]) rounds up to the nearest power-of-2 integer,
>>>> we'd better round up CPUID.01H.EBX[23:16] to the nearest power-of-2
>>>> integer too. Otherwise we may encounter unexpected results in the guest.
>>>>
>>>> For example, when QEMU is started with the CLI above and xtopology is
>>>> disabled, guest kernel 5.15.120 uses
>>>> CPUID.01H.EBX[23:16]/(1+CPUID.04H.EAX[31:26]) to calculate
>>>> threads-per-core in detect_ht(). The guest then gets "90/(1+63)=1" as
>>>> the result, even though threads-per-core should actually be 2.
>>>
>>> It's a kernel bug instead.
>>>
>>> Section 1.5.3 "Sub ID Extraction Parameters for initial APIC ID" of
>>> "Intel 64 Architecture Processor Topology Enumeration" [1] gives
>>>
>>>   - SMT_Mask_Width = Log2(RoundToNearestPof2(CPUID.1:EBX[23:16]) /
>>>     ((CPUID.(EAX=4,ECX=0):EAX[31:26]) + 1))
>>>
>>> The value of CPUID.1:EBX[23:16] needs to be *rounded* to the nearest
>>> power-of-two integer, rather than itself being a power of two.
>>>
>>> This is also consistent with the SDM, where the comment for bits
>>> 23-16 of CPUID.1:EBX is:
>>>
>>>   The nearest power-of-2 integer that is not smaller than EBX[23:16] is
>>>   the number of unique initial APIC IDs reserved for addressing
>>>   different logical processors in a physical package.
>>>
>>> What I read from this is that the nearest power-of-2 integer that is
>>> not smaller than EBX[23:16] is a different thing from EBX[23:16], i.e.,
>>
>> Yes, when I read the SDM, I also thought it was a kernel bug. But on my
>> 192-HT SPR host, the value of CPUID.1:EBX[23:16] was indeed rounded up
>> to the nearest power of 2 by the hardware. After communicating with
>> Intel technical support staff, we thought that perhaps the description
>> in the SDM is not so accurate, and that rounding up CPUID.1:EBX[23:16]
>> to a power of 2 in QEMU may be more in line with the hardware behavior.
> 
> I think the above justification is important. We need to justify our
> changes with facts and the correct reasoning.
> 
> I somewhat agree with setting EBX[23:16] to a power-of-2 value, because
> Section 1.5.3 "Sub ID Extraction Parameters for initial APIC ID" of the
> "Intel 64 Architecture Processor Topology Enumeration" spec says
> 
>      CPUID.1:EBX[23:16] represents the maximum number of addressable IDs
>      (initial APIC ID) that can be assigned to logical processors in a
>      physical package. The value may not be the same as the number of
>      logical processors that are present in the hardware of a physical
>      package.
> 
> It uses the word "may not".
> 
> However, the justification of the change cannot be "it leads to 
> unexpected results in guest" because the guest implementation is not 
> correct.

FYI, the latest Linux already fixes the issue; it calculates the shift via

   tscan->ebx1_nproc_shift = get_count_order(ebx.nproc);
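
To make the difference concrete, here is a small standalone sketch (not the
kernel source). count_order() is a hypothetical stand-in for what I
understand get_count_order() to compute, i.e. ceil(log2(n)), and the two
threads_per_core_*() helpers are illustrative names only. The first mimics
the plain division described in the commit message; the second follows the
SMT_Mask_Width formula from the topology enumeration spec.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for get_count_order(): returns ceil(log2(n)). */
static unsigned int count_order(uint32_t n)
{
    unsigned int order = 0;

    assert(n > 0);
    /* Check the bound first so the shift never reaches 32 bits. */
    while (order < 32 && (UINT32_C(1) << order) < n) {
        order++;
    }
    return order;
}

/* Plain division of the raw fields, as described for detect_ht(). */
static uint32_t threads_per_core_by_division(uint32_t ebx_nproc,
                                             uint32_t eax_max_core_id)
{
    return ebx_nproc / (eax_max_core_id + 1);
}

/* Round both fields up to a power of two first, then take the ratio. */
static uint32_t threads_per_core_by_order(uint32_t ebx_nproc,
                                          uint32_t eax_max_core_id)
{
    unsigned int pkg_shift = count_order(ebx_nproc);
    unsigned int core_shift = count_order(eax_max_core_id + 1);

    return UINT32_C(1) << (pkg_shift > core_shift ? pkg_shift - core_shift : 0);
}
```

With the values from the commit message (EBX[23:16] = 90, EAX[31:26] = 63),
the division gives 90 / 64 = 1, while the order-based variant gives
1 << (7 - 6) = 2, so a guest that rounds first is not affected by the
unrounded 90.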

>>>
>>> - EBX[23:16]: Maximum number of addressable IDs for logical processors
>>>   in this physical package
>>>
>>> - pow2ceil(EBX[23:16]): the number of unique initial APIC IDs reserved
>>>   for addressing different logical processors in a physical package.
>>>
>>> [1] https://cdrdv2-public.intel.com/759067/intel-64-architecture-processor-topology-enumeration.pdf
>>>
>>>> And on AMD platform:
>>>> CPUID.01H.EBX[23:16] is defined as "Logical processor count". Current
>>>> result meets our expectation.
>>>>
>>>> So let us round up CPUID.01H.EBX[23:16] to the nearest power-of-2
>>>> integer only for the Intel platform to solve the unexpected result.
>>>>
>>>> Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
>>>> Acked-by: Igor Mammedov <imammedo@redhat.com>
>>>> Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com>
>>>> Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
>>>> Signed-off-by: Chuang Xu <xuchuangxclwt@bytedance.com>
>>>> ---
>>>>   target/i386/cpu.c | 10 +++++++++-
>>>>   1 file changed, 9 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>>> index ff227a8c5c..641d4577b0 100644
>>>> --- a/target/i386/cpu.c
>>>> +++ b/target/i386/cpu.c
>>>> @@ -6462,7 +6462,15 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>>>           }
>>>>           *edx = env->features[FEAT_1_EDX];
>>>>           if (threads_per_pkg > 1) {
>>>> -            *ebx |= threads_per_pkg << 16;
>>>> +            /*
>>>> +             * AMD requires logical processor count, but Intel needs maximum
>>>> +             * number of addressable IDs for logical processors per package.
>>>> +             */
>>>> +            if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
>>>> +                *ebx |= threads_per_pkg << 16;
>>>> +            } else {
>>>> +                *ebx |= 1 << apicid_pkg_offset(&topo_info) << 16;
>>>> +            }
>>>>               *edx |= CPUID_HT;
>>>>           }
>>>>           if (!cpu->enable_pmu) {
>>>
> 
> 
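
For reference, the two values the patch chooses between can be worked out
for the command line in the commit message with a short standalone sketch.
This is not QEMU code: struct topo_sketch, id_width() and the other helpers
are hypothetical names, and the topology is simplified to threads, cores
and dies only. It assumes each level's APIC ID field is ceil(log2(count))
bits wide and that the package offset is the sum of those widths.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: APIC ID bits needed to number `count` items. */
static unsigned int id_width(uint32_t count)
{
    unsigned int width = 0;

    assert(count > 0);
    while (width < 32 && (UINT32_C(1) << width) < count) {
        width++;
    }
    return width;
}

/* Simplified topology (assumption: no levels other than these three). */
struct topo_sketch {
    uint32_t dies_per_pkg;
    uint32_t cores_per_die;
    uint32_t threads_per_core;
};

/* Bit position where the package ID starts in the APIC ID. */
static unsigned int pkg_offset(const struct topo_sketch *t)
{
    return id_width(t->threads_per_core) +
           id_width(t->cores_per_die) +
           id_width(t->dies_per_pkg);
}

/* Value for the non-AMD branch: 1 << package offset (assumes offset < 32). */
static uint32_t addressable_ids_per_pkg(const struct topo_sketch *t)
{
    return UINT32_C(1) << pkg_offset(t);
}

/* Value for the AMD branch: the plain logical processor count. */
static uint32_t logical_cpus_per_pkg(const struct topo_sketch *t)
{
    return t->dies_per_pkg * t->cores_per_die * t->threads_per_core;
}
```

For -smp 180,sockets=2,dies=1,cores=45,threads=2 this gives widths of 1
(threads), 6 (cores) and 0 (dies), so the package offset is 7 and the
non-AMD branch would report 128, while the logical processor count stays
at 90.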



