qemu-devel.nongnu.org archive mirror
* [PATCH v5] i386/cpu: fixup number of addressable IDs for logical processors in the physical package
@ 2024-10-08 13:33 Chuang Xu
  2024-10-08 14:21 ` Zhao Liu
  0 siblings, 1 reply; 2+ messages in thread
From: Chuang Xu @ 2024-10-08 13:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: pbonzini, imammedo, xieyongji, chaiwen.cc, zhao1.liu, qemu-stable,
	Chuang Xu, Guixiong Wei, Yipeng Yin

When QEMU is started with:
-cpu host,migratable=on,host-cache-info=on,l3-cache=off
-smp 180,sockets=2,dies=1,cores=45,threads=2

executing "cpuid -1 -l 1 -r" in the guest returns a value of 90 for
CPUID.01H.EBX[23:16], whereas the expected value is 128. Additionally,
executing "cpuid -1 -l 4 -r" in the guest yields a value of 63 for
CPUID.04H.EAX[31:26], which matches the expected result.

As (1+CPUID.04H.EAX[31:26]) is rounded up to the nearest power-of-2 integer,
we had better round up CPUID.01H.EBX[23:16] to the nearest power-of-2
integer too. Otherwise the guest may see unexpected results.

For example, when QEMU is started with the command line above and xtopology
is disabled, guest kernel 5.15.120 uses
CPUID.01H.EBX[23:16]/(1+CPUID.04H.EAX[31:26]) to calculate threads-per-core
in detect_ht(). The guest then gets "90/(1+63)=1" as the result, even though
threads-per-core should actually be 2.

So let us round up CPUID.01H.EBX[23:16] to the nearest power-of-2 integer
to avoid this unexpected result.

In addition, we introduce max_thread_number_in_package() instead of
using pow2ceil() to be compatible with smp and hybrid.

Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com>
Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
Signed-off-by: Chuang Xu <xuchuangxclwt@bytedance.com>
---
 target/i386/cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ff227a8c5c..0749efc52c 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6462,7 +6462,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         }
         *edx = env->features[FEAT_1_EDX];
         if (threads_per_pkg > 1) {
-            *ebx |= threads_per_pkg << 16;
+            *ebx |= 1 << apicid_pkg_offset(&topo_info) << 16;
             *edx |= CPUID_HT;
         }
         if (!cpu->enable_pmu) {
-- 
2.20.1




* Re: [PATCH v5] i386/cpu: fixup number of addressable IDs for logical processors in the physical package
  2024-10-08 13:33 [PATCH v5] i386/cpu: fixup number of addressable IDs for logical processors in the physical package Chuang Xu
@ 2024-10-08 14:21 ` Zhao Liu
  0 siblings, 0 replies; 2+ messages in thread
From: Zhao Liu @ 2024-10-08 14:21 UTC (permalink / raw)
  To: Chuang Xu
  Cc: qemu-devel, pbonzini, imammedo, xieyongji, chaiwen.cc,
	qemu-stable, Guixiong Wei, Yipeng Yin, Babu Moger

Hi Chuang,

Many thanks for the quick action! But we still need a bit more patience
to consider the AMD case. (Cc Babu)

I just realized AMD and Intel have different definitions for this field...

On Tue, Oct 08, 2024 at 09:33:26PM +0800, Chuang Xu wrote:
> Date: Tue,  8 Oct 2024 21:33:26 +0800
> From: Chuang Xu <xuchuangxclwt@bytedance.com>
> Subject: [PATCH v5] i386/cpu: fixup number of addressable IDs for logical
>  processors in the physical package
> X-Mailer: git-send-email 2.39.3 (Apple Git-146)
> 
> When QEMU is started with:
> -cpu host,migratable=on,host-cache-info=on,l3-cache=off
> -smp 180,sockets=2,dies=1,cores=45,threads=2
> 
> When executing "cpuid -1 -l 1 -r" in the guest, we obtain a value of 90 for
> CPUID.01H.EBX[23:16], whereas the expected value is 128. Additionally,
> executing "cpuid -1 -l 4 -r" in the guest yields a value of 63 for
> CPUID.04H.EAX[31:26], which matches the expected result.
> 
> As (1+CPUID.04H.EAX[31:26]) is rounded up to the nearest power-of-2 integer,
> we had better round up CPUID.01H.EBX[23:16] to the nearest power-of-2
> integer too. Otherwise the guest may see unexpected results.
> 
> For example, when QEMU is started with CLI above and xtopology is disabled,
> guest kernel 5.15.120 uses CPUID.01H.EBX[23:16]/(1+CPUID.04H.EAX[31:26]) to
> calculate threads-per-core in detect_ht(). Then guest will get "90/(1+63)=1"
> as the result, even though threads-per-core should actually be 2.
> 
> So let us round up CPUID.01H.EBX[23:16] to the nearest power-of-2 integer
> to solve the unexpected result.
> 
> In addition, we introduce max_thread_number_in_package() instead of
> using pow2ceil() to be compatible with smp and hybrid.
> 
> Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
> Acked-by: Igor Mammedov <imammedo@redhat.com>
> Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com>
> Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
> Signed-off-by: Chuang Xu <xuchuangxclwt@bytedance.com>
> ---
>  target/i386/cpu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index ff227a8c5c..0749efc52c 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -6462,7 +6462,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>          }
>          *edx = env->features[FEAT_1_EDX];
>          if (threads_per_pkg > 1) {
> -            *ebx |= threads_per_pkg << 16;
> +            *ebx |= 1 << apicid_pkg_offset(&topo_info) << 16;

... I checked AMD's APM: for AMD, this field is "Logical processor
count", not the maximum number of addressable IDs (please refer to the
APM, vol 3, E.3.2 and E.5.1).

So we need to check the vendor here, like this (with an explanatory
comment):

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ff227a8c5c87..1f144b30e98e 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6462,7 +6462,15 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         }
         *edx = env->features[FEAT_1_EDX];
         if (threads_per_pkg > 1) {
-            *ebx |= threads_per_pkg << 16;
+            /*
+             * AMD requires logical processor count, but Intel needs maximum
+             * number of addressable IDs for logical processors per package.
+             */
+            if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
+                *ebx |= threads_per_pkg << 16;
+            } else {
+                *ebx |= 1 << apicid_pkg_offset(&topo_info) << 16;
+            }
+
             *edx |= CPUID_HT;
         }
         if (!cpu->enable_pmu) {

In addition, it's necessary to briefly mention the differences between
AMD and Intel for this field in the commit message, similar to my comment
example, and mention that the case you're comparing is on an Intel platform.

Thanks,
Zhao


