qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] i386/cpu: Fix topological field encoding & overflow
@ 2025-02-27  6:25 Zhao Liu
  2025-02-27  6:25 ` [PATCH 1/4] i386/cpu: Fix number of addressable IDs field for CPUID.01H.EBX[23:16] Zhao Liu
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Zhao Liu @ 2025-02-27  6:25 UTC (permalink / raw)
  To: Paolo Bonzini, Igor Mammedov, Daniel P . Berrangé, Chuang Xu,
	Xiaoyao Li, Isaku Yamahata, Babu Moger
  Cc: qemu-devel, Zhao Liu

Hi,

This series collects and organizes several topology-related cleanups and
fixes, based on b69801dd6b1e ("Merge tag 'for_upstream' of
https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging").

Patch 1 is picked from Chuang's v6 [1].

Patch 2-3 are picked from Qian's v4 [2], though it had previously gone
through sufficient review (got R/b tags), I dropped its R/b tags because
of my code change.

Patch 4 is newly added, inspired by patch 3, to also perform a check on
AMD's cache CPUID. This is to consider the current maximum number of
supported CPUs, which is approaching the overflow boundary.

In addition to the 0x1, 0x4, and 0x8000001d leaves involved in the patch
series, there is also the 0x1f leaf related to topology. However, the
upper limit for CPUID.1FH.EBX[bits 15:0] is 65,535 threads, which
provides enough room. Therefore, this field does not currently require
overflow checks.

This series correct the CPUIDs, but it doesn't affect the Guest's live
migration. Therefore, I did not add the compat property for this.

[1]: https://lore.kernel.org/qemu-devel/20241009035638.59330-1-xuchuangxclwt@bytedance.com/
[2]: https://lore.kernel.org/qemu-devel/20230829042405.932523-2-qian.wen@intel.com/

Thanks and Best Regards,
Zhao
---
Chuang Xu (1):
  i386/cpu: Fix number of addressable IDs field for CPUID.01H.EBX[23:16]

Qian Wen (2):
  i386/cpu: Fix cpu number overflow in CPUID.01H.EBX[23:16]
  i386/cpu: Fix overflow of cache topology fields in CPUID.04H

Zhao Liu (1):
  i386/cpu: Honor maximum value for CPUID.8000001DH.EAX[25:14]

 target/i386/cpu.c | 35 ++++++++++++++++++++++++++++-------
 1 file changed, 28 insertions(+), 7 deletions(-)

-- 
2.34.1



^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/4] i386/cpu: Fix number of addressable IDs field for CPUID.01H.EBX[23:16]
  2025-02-27  6:25 [PATCH 0/4] i386/cpu: Fix topological field encoding & overflow Zhao Liu
@ 2025-02-27  6:25 ` Zhao Liu
  2025-05-12  9:32   ` Michael Tokarev
  2025-02-27  6:25 ` [PATCH 2/4] i386/cpu: Fix cpu number overflow in CPUID.01H.EBX[23:16] Zhao Liu
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: Zhao Liu @ 2025-02-27  6:25 UTC (permalink / raw)
  To: Paolo Bonzini, Igor Mammedov, Daniel P . Berrangé, Chuang Xu,
	Xiaoyao Li, Isaku Yamahata, Babu Moger
  Cc: qemu-devel, Zhao Liu, qemu-stable, Guixiong Wei, Yipeng Yin

From: Chuang Xu <xuchuangxclwt@bytedance.com>

When QEMU is started with:
-cpu host,migratable=on,host-cache-info=on,l3-cache=off
-smp 180,sockets=2,dies=1,cores=45,threads=2

On Intel platform:
CPUID.01H.EBX[23:16] is defined as "max number of addressable IDs for
logical processors in the physical package".

When executing "cpuid -1 -l 1 -r" in the guest, we obtain a value of 90 for
CPUID.01H.EBX[23:16], whereas the expected value is 128. Additionally,
executing "cpuid -1 -l 4 -r" in the guest yields a value of 63 for
CPUID.04H.EAX[31:26], which matches the expected result.

As (1+CPUID.04H.EAX[31:26]) rounds up to the nearest power-of-2 integer,
it's necessary to round up CPUID.01H.EBX[23:16] to the nearest power-of-2
integer too. Otherwise there would be unexpected results in guest with
older kernel.

For example, when QEMU is started with CLI above and xtopology is disabled,
guest kernel 5.15.120 uses CPUID.01H.EBX[23:16]/(1+CPUID.04H.EAX[31:26]) to
calculate threads-per-core in detect_ht(). Then guest will get "90/(1+63)=1"
as the result, even though threads-per-core should actually be 2.

And on AMD platform:
CPUID.01H.EBX[23:16] is defined as "Logical processor count". Current
result meets our expectation.

So round up CPUID.01H.EBX[23:16] to the nearest power-of-2 integer only
for Intel platform to solve the unexpected result.

This change doesn't need to add compat property since it does not affect
live migration between different versions of pc machines.

Cc: qemu-stable@nongnu.org
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com>
Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
Signed-off-by: Chuang Xu <xuchuangxclwt@bytedance.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since original v6 [*] :
 * Rebase on the b69801dd6b1e ("Merge tag 'for_upstream' of
   https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging").
 * Polish the comment in code.
 * Explain the change doesn't need extra compat property.

[*] original v6: https://lore.kernel.org/qemu-devel/20241009035638.59330-1-xuchuangxclwt@bytedance.com/
---
 target/i386/cpu.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 72ab147e851a..b8a78276cd50 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6691,7 +6691,16 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         }
         *edx = env->features[FEAT_1_EDX];
         if (threads_per_pkg > 1) {
-            *ebx |= threads_per_pkg << 16;
+            /*
+             * For CPUID.01H.EBX[Bits 23-16], AMD requires logical processor
+             * count, but Intel needs maximum number of addressable IDs for
+             * logical processors per package.
+             */
+            if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
+                *ebx |= threads_per_pkg << 16;
+            } else {
+                *ebx |= 1 << apicid_pkg_offset(topo_info) << 16;
+            }
         }
         if (!cpu->enable_pmu) {
             *ecx &= ~CPUID_EXT_PDCM;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/4] i386/cpu: Fix cpu number overflow in CPUID.01H.EBX[23:16]
  2025-02-27  6:25 [PATCH 0/4] i386/cpu: Fix topological field encoding & overflow Zhao Liu
  2025-02-27  6:25 ` [PATCH 1/4] i386/cpu: Fix number of addressable IDs field for CPUID.01H.EBX[23:16] Zhao Liu
@ 2025-02-27  6:25 ` Zhao Liu
  2025-02-27  7:13   ` Xiaoyao Li
  2025-02-27  6:25 ` [PATCH 3/4] i386/cpu: Fix overflow of cache topology fields in CPUID.04H Zhao Liu
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: Zhao Liu @ 2025-02-27  6:25 UTC (permalink / raw)
  To: Paolo Bonzini, Igor Mammedov, Daniel P . Berrangé, Chuang Xu,
	Xiaoyao Li, Isaku Yamahata, Babu Moger
  Cc: qemu-devel, Zhao Liu, Qian Wen, qemu-stable

From: Qian Wen <qian.wen@intel.com>

The legacy topology enumerated by CPUID.1.EBX[23:16] is defined in SDM
Vol2:

Bits 23-16: Maximum number of addressable IDs for logical processors in
this physical package.

When threads_per_socket > 255, it will 1) overwrite bits[31:24] which is
apic_id, 2) bits [23:16] get truncated.

Specifically, if launching the VM with -smp 256, the value written to
EBX[23:16] is 0 because of data overflow. If the guest only supports
legacy topology, without V2 Extended Topology enumerated by CPUID.0x1f
or Extended Topology enumerated by CPUID.0x0b to support over 255 CPUs,
the return of the kernel invoking cpu_smt_allowed() is false and APs
(application processors) will fail to bring up. Then only CPU 0 is online,
and others are offline.

For example, launch VM via:
qemu-system-x86_64 -M q35,accel=kvm,kernel-irqchip=split \
    -cpu qemu64,cpuid-0xb=off -smp 256 -m 32G \
    -drive file=guest.img,if=none,id=virtio-disk0,format=raw \
    -device virtio-blk-pci,drive=virtio-disk0,bootindex=1 --nographic

The guest shows:
    CPU(s):               256
    On-line CPU(s) list:  0
    Off-line CPU(s) list: 1-255

To avoid this issue caused by overflow, limit the max value written to
EBX[23:16] to 255 as the HW does.

Cc: qemu-stable@nongnu.org
Signed-off-by: Qian Wen <qian.wen@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since original v4 [*]:
 * Rebase on addressable ID fixup.
 * Drop R/b tags since the code base changes.

[*] original v4: https://lore.kernel.org/qemu-devel/20230829042405.932523-2-qian.wen@intel.com/
---
 target/i386/cpu.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index b8a78276cd50..ae6c8bfd8b5e 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6691,16 +6691,21 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         }
         *edx = env->features[FEAT_1_EDX];
         if (threads_per_pkg > 1) {
+            uint32_t num;
+
             /*
              * For CPUID.01H.EBX[Bits 23-16], AMD requires logical processor
              * count, but Intel needs maximum number of addressable IDs for
              * logical processors per package.
              */
             if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
-                *ebx |= threads_per_pkg << 16;
+                num = threads_per_pkg;
             } else {
-                *ebx |= 1 << apicid_pkg_offset(topo_info) << 16;
+                num = 1 << apicid_pkg_offset(topo_info);
             }
+
+            /* Fixup overflow: max value for bits 23-16 is 255. */
+            *ebx |= MIN(num, 255) << 16;
         }
         if (!cpu->enable_pmu) {
             *ecx &= ~CPUID_EXT_PDCM;
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/4] i386/cpu: Fix overflow of cache topology fields in CPUID.04H
  2025-02-27  6:25 [PATCH 0/4] i386/cpu: Fix topological field encoding & overflow Zhao Liu
  2025-02-27  6:25 ` [PATCH 1/4] i386/cpu: Fix number of addressable IDs field for CPUID.01H.EBX[23:16] Zhao Liu
  2025-02-27  6:25 ` [PATCH 2/4] i386/cpu: Fix cpu number overflow in CPUID.01H.EBX[23:16] Zhao Liu
@ 2025-02-27  6:25 ` Zhao Liu
  2025-02-27  7:14   ` Xiaoyao Li
  2025-02-27  6:25 ` [PATCH 4/4] i386/cpu: Honor maximum value for CPUID.8000001DH.EAX[25:14] Zhao Liu
  2025-05-21  8:53 ` [PATCH 0/4] i386/cpu: Fix topological field encoding & overflow Zhao Liu
  4 siblings, 1 reply; 10+ messages in thread
From: Zhao Liu @ 2025-02-27  6:25 UTC (permalink / raw)
  To: Paolo Bonzini, Igor Mammedov, Daniel P . Berrangé, Chuang Xu,
	Xiaoyao Li, Isaku Yamahata, Babu Moger
  Cc: qemu-devel, Zhao Liu, Qian Wen

From: Qian Wen <qian.wen@intel.com>

According to SDM, CPUID.0x4:EAX[31:26] indicates the Maximum number of
addressable IDs for processor cores in the physical package. If we
launch over 64 cores VM, the 6-bit field will overflow, and the wrong
core_id number will be reported.

Since the HW reports 0x3f when the intel processor has over 64 cores,
limit the max value written to EAX[31:26] to 63, so max num_cores should
be 64.

For EAX[14:25], though at present Q35 supports up to 4096 CPUs, to
prevent potential overflow issues from further increasing the number of
CPUs in the future, check and honor the maximum value for EAX[14:25] as
well.

In addition, for host-cache-info case, also apply the same checks and
fixes.

Signed-off-by: Qian Wen <qian.wen@intel.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
Changes since original v4 [*]:
 * Rebase on addressable ID fixup.
 * Drop R/b tags since the code base changes.
 * Teak bits 25-14 as well and add the comment.
 * Fix overflow for host-cache-info case.

[*]: original v4: https://lore.kernel.org/qemu-devel/20230829042405.932523-3-qian.wen@intel.com/
---
 target/i386/cpu.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ae6c8bfd8b5e..d75175b0850a 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -280,11 +280,17 @@ static void encode_cache_cpuid4(CPUCacheInfo *cache,
     assert(cache->size == cache->line_size * cache->associativity *
                           cache->partitions * cache->sets);
 
+    /*
+     * The following fields have bit-width limitations, so consider the
+     * maximum values to avoid overflow:
+     * Bits 25-14: maximum 4095.
+     * Bits 31-26: maximum 63.
+     */
     *eax = CACHE_TYPE(cache->type) |
            CACHE_LEVEL(cache->level) |
            (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0) |
-           (max_core_ids_in_package(topo_info) << 26) |
-           (max_thread_ids_for_cache(topo_info, cache->share_level) << 14);
+           (MIN(max_core_ids_in_package(topo_info), 63) << 26) |
+           (MIN(max_thread_ids_for_cache(topo_info, cache->share_level), 4095) << 14);
 
     assert(cache->line_size > 0);
     assert(cache->partitions > 0);
@@ -6743,13 +6749,13 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
                 int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
 
                 *eax &= ~0xFC000000;
-                *eax |= max_core_ids_in_package(topo_info) << 26;
+                *eax |= MIN(max_core_ids_in_package(topo_info), 63) << 26;
                 if (host_vcpus_per_cache > threads_per_pkg) {
                     *eax &= ~0x3FFC000;
 
                     /* Share the cache at package level. */
-                    *eax |= max_thread_ids_for_cache(topo_info,
-                                CPU_TOPOLOGY_LEVEL_SOCKET) << 14;
+                    *eax |= MIN(max_thread_ids_for_cache(topo_info,
+                                CPU_TOPOLOGY_LEVEL_SOCKET), 4095) << 14;
                 }
             }
         } else if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/4] i386/cpu: Honor maximum value for CPUID.8000001DH.EAX[25:14]
  2025-02-27  6:25 [PATCH 0/4] i386/cpu: Fix topological field encoding & overflow Zhao Liu
                   ` (2 preceding siblings ...)
  2025-02-27  6:25 ` [PATCH 3/4] i386/cpu: Fix overflow of cache topology fields in CPUID.04H Zhao Liu
@ 2025-02-27  6:25 ` Zhao Liu
  2025-05-21  8:53 ` [PATCH 0/4] i386/cpu: Fix topological field encoding & overflow Zhao Liu
  4 siblings, 0 replies; 10+ messages in thread
From: Zhao Liu @ 2025-02-27  6:25 UTC (permalink / raw)
  To: Paolo Bonzini, Igor Mammedov, Daniel P . Berrangé, Chuang Xu,
	Xiaoyao Li, Isaku Yamahata, Babu Moger
  Cc: qemu-devel, Zhao Liu

CPUID.8000001DH:EAX[25:14] is "NumSharingCache", and the number of
logical processors sharing this cache is the value of this field
incremented by 1. Because of its width limitation, the maximum value
currently supported is 4095.

Though at present Q35 supports up to 4096 CPUs, to prevent potential
overflow issues from further increasing the number of CPUs in the
future, check and honor the maximum value as CPUID.04H did.

Cc: Babu Moger <babu.moger@amd.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
RFC:
 * Although there are currently no overflow cases, to avoid any
   potential issue, add the overflow check, just as I did for Intel.
---
 target/i386/cpu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index d75175b0850a..7ca9740e8c97 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -493,7 +493,8 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo *cache,
 
     *eax = CACHE_TYPE(cache->type) | CACHE_LEVEL(cache->level) |
                (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0);
-    *eax |= max_thread_ids_for_cache(topo_info, cache->share_level) << 14;
+    /* Bits 25:14 - NumSharingCache: maximum 4095. */
+    *eax |= MIN(max_thread_ids_for_cache(topo_info, cache->share_level), 4095) << 14;
 
     assert(cache->line_size > 0);
     assert(cache->partitions > 0);
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/4] i386/cpu: Fix cpu number overflow in CPUID.01H.EBX[23:16]
  2025-02-27  6:25 ` [PATCH 2/4] i386/cpu: Fix cpu number overflow in CPUID.01H.EBX[23:16] Zhao Liu
@ 2025-02-27  7:13   ` Xiaoyao Li
  0 siblings, 0 replies; 10+ messages in thread
From: Xiaoyao Li @ 2025-02-27  7:13 UTC (permalink / raw)
  To: Zhao Liu, Paolo Bonzini, Igor Mammedov, Daniel P . Berrangé,
	Chuang Xu, Isaku Yamahata, Babu Moger
  Cc: qemu-devel, Qian Wen, qemu-stable

On 2/27/2025 2:25 PM, Zhao Liu wrote:
> From: Qian Wen <qian.wen@intel.com>
> 
> The legacy topology enumerated by CPUID.1.EBX[23:16] is defined in SDM
> Vol2:
> 
> Bits 23-16: Maximum number of addressable IDs for logical processors in
> this physical package.
> 
> When threads_per_socket > 255, it will 1) overwrite bits[31:24] which is
> apic_id, 2) bits [23:16] get truncated.
> 
> Specifically, if launching the VM with -smp 256, the value written to
> EBX[23:16] is 0 because of data overflow. If the guest only supports
> legacy topology, without V2 Extended Topology enumerated by CPUID.0x1f
> or Extended Topology enumerated by CPUID.0x0b to support over 255 CPUs,
> the return of the kernel invoking cpu_smt_allowed() is false and APs
> (application processors) will fail to bring up. Then only CPU 0 is online,
> and others are offline.
> 
> For example, launch VM via:
> qemu-system-x86_64 -M q35,accel=kvm,kernel-irqchip=split \
>      -cpu qemu64,cpuid-0xb=off -smp 256 -m 32G \
>      -drive file=guest.img,if=none,id=virtio-disk0,format=raw \
>      -device virtio-blk-pci,drive=virtio-disk0,bootindex=1 --nographic
> 
> The guest shows:
>      CPU(s):               256
>      On-line CPU(s) list:  0
>      Off-line CPU(s) list: 1-255
> 
> To avoid this issue caused by overflow, limit the max value written to
> EBX[23:16] to 255 as the HW does.
> 
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Qian Wen <qian.wen@intel.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>

> ---
> Changes since original v4 [*]:
>   * Rebase on addressable ID fixup.
>   * Drop R/b tags since the code base changes.
> 
> [*] original v4: https://lore.kernel.org/qemu-devel/20230829042405.932523-2-qian.wen@intel.com/
> ---
>   target/i386/cpu.c | 9 +++++++--
>   1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index b8a78276cd50..ae6c8bfd8b5e 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -6691,16 +6691,21 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>           }
>           *edx = env->features[FEAT_1_EDX];
>           if (threads_per_pkg > 1) {
> +            uint32_t num;
> +
>               /*
>                * For CPUID.01H.EBX[Bits 23-16], AMD requires logical processor
>                * count, but Intel needs maximum number of addressable IDs for
>                * logical processors per package.
>                */
>               if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {
> -                *ebx |= threads_per_pkg << 16;
> +                num = threads_per_pkg;
>               } else {
> -                *ebx |= 1 << apicid_pkg_offset(topo_info) << 16;
> +                num = 1 << apicid_pkg_offset(topo_info);
>               }
> +
> +            /* Fixup overflow: max value for bits 23-16 is 255. */
> +            *ebx |= MIN(num, 255) << 16;
>           }
>           if (!cpu->enable_pmu) {
>               *ecx &= ~CPUID_EXT_PDCM;



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/4] i386/cpu: Fix overflow of cache topology fields in CPUID.04H
  2025-02-27  6:25 ` [PATCH 3/4] i386/cpu: Fix overflow of cache topology fields in CPUID.04H Zhao Liu
@ 2025-02-27  7:14   ` Xiaoyao Li
  0 siblings, 0 replies; 10+ messages in thread
From: Xiaoyao Li @ 2025-02-27  7:14 UTC (permalink / raw)
  To: Zhao Liu, Paolo Bonzini, Igor Mammedov, Daniel P . Berrangé,
	Chuang Xu, Isaku Yamahata, Babu Moger
  Cc: qemu-devel, Qian Wen

On 2/27/2025 2:25 PM, Zhao Liu wrote:
> From: Qian Wen <qian.wen@intel.com>
> 
> According to SDM, CPUID.0x4:EAX[31:26] indicates the Maximum number of
> addressable IDs for processor cores in the physical package. If we
> launch over 64 cores VM, the 6-bit field will overflow, and the wrong
> core_id number will be reported.
> 
> Since the HW reports 0x3f when the intel processor has over 64 cores,
> limit the max value written to EAX[31:26] to 63, so max num_cores should
> be 64.
> 
> For EAX[14:25], though at present Q35 supports up to 4096 CPUs, to
> prevent potential overflow issues from further increasing the number of
> CPUs in the future, check and honor the maximum value for EAX[14:25] as
> well.
> 
> In addition, for host-cache-info case, also apply the same checks and
> fixes.
> 
> Signed-off-by: Qian Wen <qian.wen@intel.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>

> ---
> Changes since original v4 [*]:
>   * Rebase on addressable ID fixup.
>   * Drop R/b tags since the code base changes.
>   * Teak bits 25-14 as well and add the comment.
>   * Fix overflow for host-cache-info case.
> 
> [*]: original v4: https://lore.kernel.org/qemu-devel/20230829042405.932523-3-qian.wen@intel.com/
> ---
>   target/i386/cpu.c | 16 +++++++++++-----
>   1 file changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index ae6c8bfd8b5e..d75175b0850a 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -280,11 +280,17 @@ static void encode_cache_cpuid4(CPUCacheInfo *cache,
>       assert(cache->size == cache->line_size * cache->associativity *
>                             cache->partitions * cache->sets);
>   
> +    /*
> +     * The following fields have bit-width limitations, so consider the
> +     * maximum values to avoid overflow:
> +     * Bits 25-14: maximum 4095.
> +     * Bits 31-26: maximum 63.
> +     */
>       *eax = CACHE_TYPE(cache->type) |
>              CACHE_LEVEL(cache->level) |
>              (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0) |
> -           (max_core_ids_in_package(topo_info) << 26) |
> -           (max_thread_ids_for_cache(topo_info, cache->share_level) << 14);
> +           (MIN(max_core_ids_in_package(topo_info), 63) << 26) |
> +           (MIN(max_thread_ids_for_cache(topo_info, cache->share_level), 4095) << 14);
>   
>       assert(cache->line_size > 0);
>       assert(cache->partitions > 0);
> @@ -6743,13 +6749,13 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>                   int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
>   
>                   *eax &= ~0xFC000000;
> -                *eax |= max_core_ids_in_package(topo_info) << 26;
> +                *eax |= MIN(max_core_ids_in_package(topo_info), 63) << 26;
>                   if (host_vcpus_per_cache > threads_per_pkg) {
>                       *eax &= ~0x3FFC000;
>   
>                       /* Share the cache at package level. */
> -                    *eax |= max_thread_ids_for_cache(topo_info,
> -                                CPU_TOPOLOGY_LEVEL_SOCKET) << 14;
> +                    *eax |= MIN(max_thread_ids_for_cache(topo_info,
> +                                CPU_TOPOLOGY_LEVEL_SOCKET), 4095) << 14;
>                   }
>               }
>           } else if (cpu->vendor_cpuid_only && IS_AMD_CPU(env)) {



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/4] i386/cpu: Fix number of addressable IDs field for CPUID.01H.EBX[23:16]
  2025-02-27  6:25 ` [PATCH 1/4] i386/cpu: Fix number of addressable IDs field for CPUID.01H.EBX[23:16] Zhao Liu
@ 2025-05-12  9:32   ` Michael Tokarev
  2025-05-13  3:42     ` Zhao Liu
  0 siblings, 1 reply; 10+ messages in thread
From: Michael Tokarev @ 2025-05-12  9:32 UTC (permalink / raw)
  To: Zhao Liu, Paolo Bonzini, Igor Mammedov, Daniel P . Berrangé,
	Chuang Xu, Xiaoyao Li, Isaku Yamahata, Babu Moger
  Cc: qemu-devel, qemu-stable, Guixiong Wei, Yipeng Yin

On 27.02.2025 09:25, Zhao Liu wrote:
> From: Chuang Xu <xuchuangxclwt@bytedance.com>
> 
> When QEMU is started with:
> -cpu host,migratable=on,host-cache-info=on,l3-cache=off
> -smp 180,sockets=2,dies=1,cores=45,threads=2
> 
> On Intel platform:
> CPUID.01H.EBX[23:16] is defined as "max number of addressable IDs for
> logical processors in the physical package".
> 
> When executing "cpuid -1 -l 1 -r" in the guest, we obtain a value of 90 for
> CPUID.01H.EBX[23:16], whereas the expected value is 128. Additionally,
> executing "cpuid -1 -l 4 -r" in the guest yields a value of 63 for
> CPUID.04H.EAX[31:26], which matches the expected result.
> 
> As (1+CPUID.04H.EAX[31:26]) rounds up to the nearest power-of-2 integer,
> it's necessary to round up CPUID.01H.EBX[23:16] to the nearest power-of-2
> integer too. Otherwise there would be unexpected results in guest with
> older kernel.
> 
> For example, when QEMU is started with CLI above and xtopology is disabled,
> guest kernel 5.15.120 uses CPUID.01H.EBX[23:16]/(1+CPUID.04H.EAX[31:26]) to
> calculate threads-per-core in detect_ht(). Then guest will get "90/(1+63)=1"
> as the result, even though threads-per-core should actually be 2.
> 
> And on AMD platform:
> CPUID.01H.EBX[23:16] is defined as "Logical processor count". Current
> result meets our expectation.
> 
> So round up CPUID.01H.EBX[23:16] to the nearest power-of-2 integer only
> for Intel platform to solve the unexpected result.
> 
> This change doesn't need to add compat property since it does not affect
> live migration between different versions of pc machines.
> 
> Cc: qemu-stable@nongnu.org
> Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
> Acked-by: Igor Mammedov <imammedo@redhat.com>
> Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com>
> Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
> Signed-off-by: Chuang Xu <xuchuangxclwt@bytedance.com>
> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>

Ping?

Is this series still needed ?

Thanks,

/mjt


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/4] i386/cpu: Fix number of addressable IDs field for CPUID.01H.EBX[23:16]
  2025-05-12  9:32   ` Michael Tokarev
@ 2025-05-13  3:42     ` Zhao Liu
  0 siblings, 0 replies; 10+ messages in thread
From: Zhao Liu @ 2025-05-13  3:42 UTC (permalink / raw)
  To: Michael Tokarev, Paolo Bonzini
  Cc: Igor Mammedov, Daniel P . Berrangé, Chuang Xu, Xiaoyao Li,
	Isaku Yamahata, Babu Moger, qemu-devel, qemu-stable, Guixiong Wei,
	Yipeng Yin

Hi Paolo,

Kindly ping. I think this series is still needed. Could you pls take
a look at here?

Thanks,
Zhao

On Mon, May 12, 2025 at 12:32:31PM +0300, Michael Tokarev wrote:
> Date: Mon, 12 May 2025 12:32:31 +0300
> From: Michael Tokarev <mjt@tls.msk.ru>
> Subject: Re: [PATCH 1/4] i386/cpu: Fix number of addressable IDs field for
>  CPUID.01H.EBX[23:16]
> 
> On 27.02.2025 09:25, Zhao Liu wrote:
> > From: Chuang Xu <xuchuangxclwt@bytedance.com>
> > 
> > When QEMU is started with:
> > -cpu host,migratable=on,host-cache-info=on,l3-cache=off
> > -smp 180,sockets=2,dies=1,cores=45,threads=2
> > 
> > On Intel platform:
> > CPUID.01H.EBX[23:16] is defined as "max number of addressable IDs for
> > logical processors in the physical package".
> > 
> > When executing "cpuid -1 -l 1 -r" in the guest, we obtain a value of 90 for
> > CPUID.01H.EBX[23:16], whereas the expected value is 128. Additionally,
> > executing "cpuid -1 -l 4 -r" in the guest yields a value of 63 for
> > CPUID.04H.EAX[31:26], which matches the expected result.
> > 
> > As (1+CPUID.04H.EAX[31:26]) rounds up to the nearest power-of-2 integer,
> > it's necessary to round up CPUID.01H.EBX[23:16] to the nearest power-of-2
> > integer too. Otherwise there would be unexpected results in guest with
> > older kernel.
> > 
> > For example, when QEMU is started with CLI above and xtopology is disabled,
> > guest kernel 5.15.120 uses CPUID.01H.EBX[23:16]/(1+CPUID.04H.EAX[31:26]) to
> > calculate threads-per-core in detect_ht(). Then guest will get "90/(1+63)=1"
> > as the result, even though threads-per-core should actually be 2.
> > 
> > And on AMD platform:
> > CPUID.01H.EBX[23:16] is defined as "Logical processor count". Current
> > result meets our expectation.
> > 
> > So round up CPUID.01H.EBX[23:16] to the nearest power-of-2 integer only
> > for Intel platform to solve the unexpected result.
> > 
> > This change doesn't need to add compat property since it does not affect
> > live migration between different versions of pc machines.
> > 
> > Cc: qemu-stable@nongnu.org
> > Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
> > Acked-by: Igor Mammedov <imammedo@redhat.com>
> > Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com>
> > Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
> > Signed-off-by: Chuang Xu <xuchuangxclwt@bytedance.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> 
> Ping?
> 
> Is this series still needed ?
> 
> Thanks,
> 
> /mjt


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/4] i386/cpu: Fix topological field encoding & overflow
  2025-02-27  6:25 [PATCH 0/4] i386/cpu: Fix topological field encoding & overflow Zhao Liu
                   ` (3 preceding siblings ...)
  2025-02-27  6:25 ` [PATCH 4/4] i386/cpu: Honor maximum value for CPUID.8000001DH.EAX[25:14] Zhao Liu
@ 2025-05-21  8:53 ` Zhao Liu
  4 siblings, 0 replies; 10+ messages in thread
From: Zhao Liu @ 2025-05-21  8:53 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Igor Mammedov, Daniel P . Berrangé, Chuang Xu, Xiaoyao Li,
	Isaku Yamahata, Babu Moger, qemu-devel

Hi Paolo,

A gentle poke. There's no conflict for now. Do you agree with these
fixes?

Thanks,
Zhao

On Thu, Feb 27, 2025 at 02:25:19PM +0800, Zhao Liu wrote:
> Date: Thu, 27 Feb 2025 14:25:19 +0800
> From: Zhao Liu <zhao1.liu@intel.com>
> Subject: [PATCH 0/4] i386/cpu: Fix topological field encoding & overflow
> X-Mailer: git-send-email 2.34.1
> 
> Hi,
> 
> This series collects and organizes several topology-related cleanups and
> fixes, based on b69801dd6b1e ("Merge tag 'for_upstream' of
> https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging").
> 
> Patch 1 is picked from Chuang's v6 [1].
> 
> Patch 2-3 are picked from Qian's v4 [2], though it had previously gone
> through sufficient review (got R/b tags), I dropped its R/b tags because
> of my code change.
> 
> Patch 4 is newly added, inspired by patch 3, to also perform a check on
> AMD's cache CPUID. This is to consider the current maximum number of
> supported CPUs, which is approaching the overflow boundary.
> 
> In addition to the 0x1, 0x4, and 0x8000001d leaves involved in the patch
> series, there is also the 0x1f leaf related to topology. However, the
> upper limit for CPUID.1FH.EBX[bits 15:0] is 65,535 threads, which
> provides enough room. Therefore, this field does not currently require
> overflow checks.
> 
> This series correct the CPUIDs, but it doesn't affect the Guest's live
> migration. Therefore, I did not add the compat property for this.
> 
> [1]: https://lore.kernel.org/qemu-devel/20241009035638.59330-1-xuchuangxclwt@bytedance.com/
> [2]: https://lore.kernel.org/qemu-devel/20230829042405.932523-2-qian.wen@intel.com/
> 
> Thanks and Best Regards,
> Zhao
> ---
> Chuang Xu (1):
>   i386/cpu: Fix number of addressable IDs field for CPUID.01H.EBX[23:16]
> 
> Qian Wen (2):
>   i386/cpu: Fix cpu number overflow in CPUID.01H.EBX[23:16]
>   i386/cpu: Fix overflow of cache topology fields in CPUID.04H
> 
> Zhao Liu (1):
>   i386/cpu: Honor maximum value for CPUID.8000001DH.EAX[25:14]
> 
>  target/i386/cpu.c | 35 ++++++++++++++++++++++++++++-------
>  1 file changed, 28 insertions(+), 7 deletions(-)
> 
> -- 
> 2.34.1
> 
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2025-05-21  8:32 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-27  6:25 [PATCH 0/4] i386/cpu: Fix topological field encoding & overflow Zhao Liu
2025-02-27  6:25 ` [PATCH 1/4] i386/cpu: Fix number of addressable IDs field for CPUID.01H.EBX[23:16] Zhao Liu
2025-05-12  9:32   ` Michael Tokarev
2025-05-13  3:42     ` Zhao Liu
2025-02-27  6:25 ` [PATCH 2/4] i386/cpu: Fix cpu number overflow in CPUID.01H.EBX[23:16] Zhao Liu
2025-02-27  7:13   ` Xiaoyao Li
2025-02-27  6:25 ` [PATCH 3/4] i386/cpu: Fix overflow of cache topology fields in CPUID.04H Zhao Liu
2025-02-27  7:14   ` Xiaoyao Li
2025-02-27  6:25 ` [PATCH 4/4] i386/cpu: Honor maximum value for CPUID.8000001DH.EAX[25:14] Zhao Liu
2025-05-21  8:53 ` [PATCH 0/4] i386/cpu: Fix topological field encoding & overflow Zhao Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).