qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1 0/4] Initialize nr_cores and nr_threads early and related clearup
@ 2024-11-08  7:06 Xiaoyao Li
  2024-11-08  7:06 ` [PATCH v1 1/4] cpu: Introduce qemu_early_init_vcpu() to initialize nr_cores and nr_threads inside it Xiaoyao Li
                   ` (5 more replies)
  0 siblings, 6 replies; 26+ messages in thread
From: Xiaoyao Li @ 2024-11-08  7:06 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Maydell, Michael Rolnik, Brian Cain, Song Gao,
	Laurent Vivier, Edgar E. Iglesias, Aurelien Jarno, Palmer Dabbelt,
	Alistair Francis, Bin Meng, Thomas Huth, David Hildenbrand,
	Mark Cave-Ayland, Artyom Tarasenko, Bastian Koppelmann,
	Max Filippov, qemu-devel, xiaoyao.li

This series is extracted from TDX QEMU v6[1] series per Paolo's request.

It is originally motivated by x86 TDX to track CPUID_HT in env->features[]
which requires nr_cores and nr_cores being initialized earlier than in
qemu_init_vcpu().

Initialize of nr_cores and nr_threads earlier in x86's cpu_realizefn()
can make it work for x86 but it's duplicated with the initialization in
qemu_init_vcpu() which are used by all the ARCHes. Since initialize them
earlier also work for other ARCHes, introduce qemu_init_early_vcpu() to
hold the initialization of nr_cores and nr_threads and call it at the
beginning in realizefn() for each ARCH.

Note, I only tested it for x86 ARCH. Please help test on other ARCHes.

The following patch 2 - 4 are x86 specific.

[1] https://lore.kernel.org/qemu-devel/CABgObfZVxaQL4FSJX396kAJ67Qp=XhEWwcmv+NQZCbdpfbV9xg@mail.gmail.com/

Xiaoyao Li (4):
  cpu: Introduce qemu_early_init_vcpu() to initialize nr_cores and
    nr_threads inside it
  i386/cpu: Set up CPUID_HT in x86_cpu_expand_features() instead of
    cpu_x86_cpuid()
  i386/cpu: Set and track CPUID_EXT3_CMP_LEG in
    env->features[FEAT_8000_0001_ECX]
  i386/cpu: Rectify the comment on order dependency on qemu_init_vcpu()

 accel/tcg/user-exec-stub.c |  4 +++
 hw/core/cpu-common.c       |  2 +-
 include/hw/core/cpu.h      |  8 +++++
 system/cpus.c              |  6 +++-
 target/alpha/cpu.c         |  2 ++
 target/arm/cpu.c           |  2 ++
 target/avr/cpu.c           |  2 ++
 target/hexagon/cpu.c       |  2 ++
 target/hppa/cpu.c          |  2 ++
 target/i386/cpu.c          | 61 +++++++++++++++++++-------------------
 target/loongarch/cpu.c     |  2 ++
 target/m68k/cpu.c          |  2 ++
 target/microblaze/cpu.c    |  2 ++
 target/mips/cpu.c          |  2 ++
 target/openrisc/cpu.c      |  2 ++
 target/ppc/cpu_init.c      |  2 ++
 target/riscv/cpu.c         |  2 ++
 target/rx/cpu.c            |  2 ++
 target/s390x/cpu.c         |  2 ++
 target/sh4/cpu.c           |  2 ++
 target/sparc/cpu.c         |  2 ++
 target/tricore/cpu.c       |  2 ++
 target/xtensa/cpu.c        |  2 ++
 23 files changed, 85 insertions(+), 32 deletions(-)

-- 
2.34.1



^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v1 1/4] cpu: Introduce qemu_early_init_vcpu() to initialize nr_cores and nr_threads inside it
  2024-11-08  7:06 [PATCH v1 0/4] Initialize nr_cores and nr_threads early and related clearup Xiaoyao Li
@ 2024-11-08  7:06 ` Xiaoyao Li
  2024-11-22 16:03   ` [PATCH] cpu: Initialize nr_cores and nr_threads in cpu_common_initfn() Xiaoyao Li
  2024-11-08  7:06 ` [PATCH v1 2/4] i386/cpu: Set up CPUID_HT in x86_cpu_expand_features() instead of cpu_x86_cpuid() Xiaoyao Li
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 26+ messages in thread
From: Xiaoyao Li @ 2024-11-08  7:06 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Maydell, Michael Rolnik, Brian Cain, Song Gao,
	Laurent Vivier, Edgar E. Iglesias, Aurelien Jarno, Palmer Dabbelt,
	Alistair Francis, Bin Meng, Thomas Huth, David Hildenbrand,
	Mark Cave-Ayland, Artyom Tarasenko, Bastian Koppelmann,
	Max Filippov, qemu-devel, xiaoyao.li

Currently cpu->nr_cores and cpu->nr_threads are initialized in
qemu_init_vcpu(), which is called a bit late in *cpu_realizefn() for
each ARCHes.

x86 arch would like to use nr_cores and nr_threads earlier in its
realizefn(). Introduce qemu_early_init_vcpu() and move the
initialization of nr_cores and nr_threads from qemu_init_vcpu() to it.
Call qemu_early_init_vcpu() at the beginning of realizefn() for each
ARCH.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
This patch only is testes on x86 machine, please help test on other ARCHes.
---
 accel/tcg/user-exec-stub.c | 4 ++++
 hw/core/cpu-common.c       | 2 +-
 include/hw/core/cpu.h      | 8 ++++++++
 system/cpus.c              | 6 +++++-
 target/alpha/cpu.c         | 2 ++
 target/arm/cpu.c           | 2 ++
 target/avr/cpu.c           | 2 ++
 target/hexagon/cpu.c       | 2 ++
 target/hppa/cpu.c          | 2 ++
 target/i386/cpu.c          | 2 ++
 target/loongarch/cpu.c     | 2 ++
 target/m68k/cpu.c          | 2 ++
 target/microblaze/cpu.c    | 2 ++
 target/mips/cpu.c          | 2 ++
 target/openrisc/cpu.c      | 2 ++
 target/ppc/cpu_init.c      | 2 ++
 target/riscv/cpu.c         | 2 ++
 target/rx/cpu.c            | 2 ++
 target/s390x/cpu.c         | 2 ++
 target/sh4/cpu.c           | 2 ++
 target/sparc/cpu.c         | 2 ++
 target/tricore/cpu.c       | 2 ++
 target/xtensa/cpu.c        | 2 ++
 23 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/accel/tcg/user-exec-stub.c b/accel/tcg/user-exec-stub.c
index 4fbe2dbdc883..64baf917b55c 100644
--- a/accel/tcg/user-exec-stub.c
+++ b/accel/tcg/user-exec-stub.c
@@ -10,6 +10,10 @@ void cpu_remove_sync(CPUState *cpu)
 {
 }
 
+void qemu_early_init_vcpu(CPUState *cpu)
+{
+}
+
 void qemu_init_vcpu(CPUState *cpu)
 {
 }
diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c
index 09c79035949b..3b60e62228e4 100644
--- a/hw/core/cpu-common.c
+++ b/hw/core/cpu-common.c
@@ -242,7 +242,7 @@ static void cpu_common_initfn(Object *obj)
     cpu->cpu_index = UNASSIGNED_CPU_INDEX;
     cpu->cluster_index = UNASSIGNED_CLUSTER_INDEX;
     /* user-mode doesn't have configurable SMP topology */
-    /* the default value is changed by qemu_init_vcpu() for system-mode */
+    /* the default value is changed by qemu_early_init_vcpu() for system-mode */
     cpu->nr_cores = 1;
     cpu->nr_threads = 1;
     cpu->cflags_next_tb = -1;
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index c3ca0babcb3f..99ecf18eec02 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -1063,6 +1063,14 @@ void start_exclusive(void);
  */
 void end_exclusive(void);
 
+/**
+ * qemu_early_init_vcpu:
+ * @cpu: The vCPU to initialize.
+ *
+ * Early initialize a vCPU.
+ */
+void qemu_early_init_vcpu(CPUState *cpu);
+
 /**
  * qemu_init_vcpu:
  * @cpu: The vCPU to initialize.
diff --git a/system/cpus.c b/system/cpus.c
index 1c818ff6828c..a1b46f68476a 100644
--- a/system/cpus.c
+++ b/system/cpus.c
@@ -662,12 +662,16 @@ const AccelOpsClass *cpus_get_accel(void)
     return cpus_accel;
 }
 
-void qemu_init_vcpu(CPUState *cpu)
+void qemu_early_init_vcpu(CPUState *cpu)
 {
     MachineState *ms = MACHINE(qdev_get_machine());
 
     cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
     cpu->nr_threads =  ms->smp.threads;
+}
+
+void qemu_init_vcpu(CPUState *cpu)
+{
     cpu->stopped = true;
     cpu->random_seed = qemu_guest_random_seed_thread_part1();
 
diff --git a/target/alpha/cpu.c b/target/alpha/cpu.c
index 9db1dffc03ec..56309a647763 100644
--- a/target/alpha/cpu.c
+++ b/target/alpha/cpu.c
@@ -93,6 +93,8 @@ static void alpha_cpu_realizefn(DeviceState *dev, Error **errp)
     AlphaCPUClass *acc = ALPHA_CPU_GET_CLASS(dev);
     Error *local_err = NULL;
 
+    qemu_early_init_vcpu(cs);
+
 #ifndef CONFIG_USER_ONLY
     /* Use pc-relative instructions in system-mode */
     cs->tcg_cflags |= CF_PCREL;
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 5b751439bdc7..98dcc0855868 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1971,6 +1971,8 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
     CPUARMState *env = &cpu->env;
     Error *local_err = NULL;
 
+    qemu_early_init_vcpu(cs);
+
 #if defined(CONFIG_TCG) && !defined(CONFIG_USER_ONLY)
     /* Use pc-relative instructions in system-mode */
     tcg_cflags_set(cs, CF_PCREL);
diff --git a/target/avr/cpu.c b/target/avr/cpu.c
index 3132842d5654..b8beaf9682c0 100644
--- a/target/avr/cpu.c
+++ b/target/avr/cpu.c
@@ -111,6 +111,8 @@ static void avr_cpu_realizefn(DeviceState *dev, Error **errp)
     AVRCPUClass *mcc = AVR_CPU_GET_CLASS(dev);
     Error *local_err = NULL;
 
+    qemu_early_init_vcpu(cs);
+
     cpu_exec_realizefn(cs, &local_err);
     if (local_err != NULL) {
         error_propagate(errp, local_err);
diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c
index 020038fc4902..5931dce05bec 100644
--- a/target/hexagon/cpu.c
+++ b/target/hexagon/cpu.c
@@ -299,6 +299,8 @@ static void hexagon_cpu_realize(DeviceState *dev, Error **errp)
     HexagonCPUClass *mcc = HEXAGON_CPU_GET_CLASS(dev);
     Error *local_err = NULL;
 
+    qemu_early_init_vcpu(cs);
+
     cpu_exec_realizefn(cs, &local_err);
     if (local_err != NULL) {
         error_propagate(errp, local_err);
diff --git a/target/hppa/cpu.c b/target/hppa/cpu.c
index c38439c1800e..cf7b46e4ff42 100644
--- a/target/hppa/cpu.c
+++ b/target/hppa/cpu.c
@@ -169,6 +169,8 @@ static void hppa_cpu_realizefn(DeviceState *dev, Error **errp)
     HPPACPUClass *acc = HPPA_CPU_GET_CLASS(dev);
     Error *local_err = NULL;
 
+    qemu_early_init_vcpu(cs);
+
     cpu_exec_realizefn(cs, &local_err);
     if (local_err != NULL) {
         error_propagate(errp, local_err);
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 3baa95481fbc..1179b7a3ce62 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -7757,6 +7757,8 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
     Error *local_err = NULL;
     unsigned requested_lbr_fmt;
 
+    qemu_early_init_vcpu(cs);
+
 #if defined(CONFIG_TCG) && !defined(CONFIG_USER_ONLY)
     /* Use pc-relative instructions in system-mode */
     tcg_cflags_set(cs, CF_PCREL);
diff --git a/target/loongarch/cpu.c b/target/loongarch/cpu.c
index 57cc4f314bf7..9572cbd2a4f5 100644
--- a/target/loongarch/cpu.c
+++ b/target/loongarch/cpu.c
@@ -598,6 +598,8 @@ static void loongarch_cpu_realizefn(DeviceState *dev, Error **errp)
     LoongArchCPUClass *lacc = LOONGARCH_CPU_GET_CLASS(dev);
     Error *local_err = NULL;
 
+    qemu_early_init_vcpu(cs);
+
     cpu_exec_realizefn(cs, &local_err);
     if (local_err != NULL) {
         error_propagate(errp, local_err);
diff --git a/target/m68k/cpu.c b/target/m68k/cpu.c
index 1d49f4cb2382..77acbadeb65f 100644
--- a/target/m68k/cpu.c
+++ b/target/m68k/cpu.c
@@ -304,6 +304,8 @@ static void m68k_cpu_realizefn(DeviceState *dev, Error **errp)
     M68kCPUClass *mcc = M68K_CPU_GET_CLASS(dev);
     Error *local_err = NULL;
 
+    qemu_early_init_vcpu(cs);
+
     register_m68k_insns(&cpu->env);
 
     cpu_exec_realizefn(cs, &local_err);
diff --git a/target/microblaze/cpu.c b/target/microblaze/cpu.c
index 135947ee8004..0850043ca1a2 100644
--- a/target/microblaze/cpu.c
+++ b/target/microblaze/cpu.c
@@ -226,6 +226,8 @@ static void mb_cpu_realizefn(DeviceState *dev, Error **errp)
     int i = 0;
     Error *local_err = NULL;
 
+    qemu_early_init_vcpu(cs);
+
     cpu_exec_realizefn(cs, &local_err);
     if (local_err != NULL) {
         error_propagate(errp, local_err);
diff --git a/target/mips/cpu.c b/target/mips/cpu.c
index 9724e71a5e0f..8cd1b794e0af 100644
--- a/target/mips/cpu.c
+++ b/target/mips/cpu.c
@@ -460,6 +460,8 @@ static void mips_cpu_realizefn(DeviceState *dev, Error **errp)
     MIPSCPUClass *mcc = MIPS_CPU_GET_CLASS(dev);
     Error *local_err = NULL;
 
+    qemu_early_init_vcpu(cs);
+
     if (!clock_get(cpu->clock)) {
 #ifndef CONFIG_USER_ONLY
         if (!qtest_enabled()) {
diff --git a/target/openrisc/cpu.c b/target/openrisc/cpu.c
index 6ec54ad7a6cf..071239dd3134 100644
--- a/target/openrisc/cpu.c
+++ b/target/openrisc/cpu.c
@@ -147,6 +147,8 @@ static void openrisc_cpu_realizefn(DeviceState *dev, Error **errp)
     OpenRISCCPUClass *occ = OPENRISC_CPU_GET_CLASS(dev);
     Error *local_err = NULL;
 
+    qemu_early_init_vcpu(cs);
+
     cpu_exec_realizefn(cs, &local_err);
     if (local_err != NULL) {
         error_propagate(errp, local_err);
diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index 23881d09e9f3..5c2ab87319e7 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -6949,6 +6949,8 @@ static void ppc_cpu_realize(DeviceState *dev, Error **errp)
     PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
     Error *local_err = NULL;
 
+    qemu_early_init_vcpu(cs);
+
     cpu_exec_realizefn(cs, &local_err);
     if (local_err != NULL) {
         error_propagate(errp, local_err);
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index f219f0c3b527..40cf24d2e759 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -1166,6 +1166,8 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
     RISCVCPUClass *mcc = RISCV_CPU_GET_CLASS(dev);
     Error *local_err = NULL;
 
+    qemu_early_init_vcpu(cs);
+
     cpu_exec_realizefn(cs, &local_err);
     if (local_err != NULL) {
         error_propagate(errp, local_err);
diff --git a/target/rx/cpu.c b/target/rx/cpu.c
index 36d2a6f18906..a24b3f28b455 100644
--- a/target/rx/cpu.c
+++ b/target/rx/cpu.c
@@ -117,6 +117,8 @@ static void rx_cpu_realize(DeviceState *dev, Error **errp)
     RXCPUClass *rcc = RX_CPU_GET_CLASS(dev);
     Error *local_err = NULL;
 
+    qemu_early_init_vcpu(cs);
+
     cpu_exec_realizefn(cs, &local_err);
     if (local_err != NULL) {
         error_propagate(errp, local_err);
diff --git a/target/s390x/cpu.c b/target/s390x/cpu.c
index 4e41a3dff59b..20f083834309 100644
--- a/target/s390x/cpu.c
+++ b/target/s390x/cpu.c
@@ -247,6 +247,8 @@ static void s390_cpu_realizefn(DeviceState *dev, Error **errp)
     S390CPUClass *scc = S390_CPU_GET_CLASS(dev);
     Error *err = NULL;
 
+    qemu_early_init_vcpu(cs);
+
     /* the model has to be realized before qemu_init_vcpu() due to kvm */
     s390_realize_cpu_model(cs, &err);
     if (err) {
diff --git a/target/sh4/cpu.c b/target/sh4/cpu.c
index 8f07261dcfd5..74488a1c0ba4 100644
--- a/target/sh4/cpu.c
+++ b/target/sh4/cpu.c
@@ -212,6 +212,8 @@ static void superh_cpu_realizefn(DeviceState *dev, Error **errp)
     SuperHCPUClass *scc = SUPERH_CPU_GET_CLASS(dev);
     Error *local_err = NULL;
 
+    qemu_early_init_vcpu(cs);
+
     cpu_exec_realizefn(cs, &local_err);
     if (local_err != NULL) {
         error_propagate(errp, local_err);
diff --git a/target/sparc/cpu.c b/target/sparc/cpu.c
index 54cb269e0af1..57cd7911263e 100644
--- a/target/sparc/cpu.c
+++ b/target/sparc/cpu.c
@@ -788,6 +788,8 @@ static void sparc_cpu_realizefn(DeviceState *dev, Error **errp)
     Error *local_err = NULL;
     CPUSPARCState *env = cpu_env(cs);
 
+    qemu_early_init_vcpu(cs);
+
 #if defined(CONFIG_USER_ONLY)
     /* We are emulating the kernel, which will trap and emulate float128. */
     env->def.features |= CPU_FEATURE_FLOAT128;
diff --git a/target/tricore/cpu.c b/target/tricore/cpu.c
index 1a261715907d..d58271696631 100644
--- a/target/tricore/cpu.c
+++ b/target/tricore/cpu.c
@@ -88,6 +88,8 @@ static void tricore_cpu_realizefn(DeviceState *dev, Error **errp)
     CPUTriCoreState *env = &cpu->env;
     Error *local_err = NULL;
 
+    qemu_early_init_vcpu(cs);
+
     cpu_exec_realizefn(cs, &local_err);
     if (local_err != NULL) {
         error_propagate(errp, local_err);
diff --git a/target/xtensa/cpu.c b/target/xtensa/cpu.c
index a08c7a0b1f20..92120141fee2 100644
--- a/target/xtensa/cpu.c
+++ b/target/xtensa/cpu.c
@@ -163,6 +163,8 @@ static void xtensa_cpu_realizefn(DeviceState *dev, Error **errp)
     XtensaCPUClass *xcc = XTENSA_CPU_GET_CLASS(dev);
     Error *local_err = NULL;
 
+    qemu_early_init_vcpu(cs);
+
 #ifndef CONFIG_USER_ONLY
     xtensa_irq_init(&XTENSA_CPU(dev)->env);
 #endif
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v1 2/4] i386/cpu: Set up CPUID_HT in x86_cpu_expand_features() instead of cpu_x86_cpuid()
  2024-11-08  7:06 [PATCH v1 0/4] Initialize nr_cores and nr_threads early and related clearup Xiaoyao Li
  2024-11-08  7:06 ` [PATCH v1 1/4] cpu: Introduce qemu_early_init_vcpu() to initialize nr_cores and nr_threads inside it Xiaoyao Li
@ 2024-11-08  7:06 ` Xiaoyao Li
  2024-12-05  7:19   ` Zhao Liu
  2024-11-08  7:06 ` [PATCH v1 3/4] i386/cpu: Set and track CPUID_EXT3_CMP_LEG in env->features[FEAT_8000_0001_ECX] Xiaoyao Li
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 26+ messages in thread
From: Xiaoyao Li @ 2024-11-08  7:06 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Maydell, Michael Rolnik, Brian Cain, Song Gao,
	Laurent Vivier, Edgar E. Iglesias, Aurelien Jarno, Palmer Dabbelt,
	Alistair Francis, Bin Meng, Thomas Huth, David Hildenbrand,
	Mark Cave-Ayland, Artyom Tarasenko, Bastian Koppelmann,
	Max Filippov, qemu-devel, xiaoyao.li

Track CPUID_HT in env->features[FEAT_1_EDX] instead of evaluating it
each time in cpu_x86_cpuid(). env->features[] should be set up in cpu's
realizefn() and cpu_x86_cpuid() should be the consumer of it.

Beside, TDX support also depends on it because TDX is going to validate
the feature configuration by comparing what TDX module reports with
env->features[]. If not tracking CPUID_HT in env->features[], it gets
warnings like below when number of vcpus > 1:

  warning: TDX enforces set the feature: CPUID.01H:EDX.ht [bit 28]

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/cpu.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 1179b7a3ce62..e0c5a61ff615 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6544,7 +6544,6 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         *edx = env->features[FEAT_1_EDX];
         if (threads_per_pkg > 1) {
             *ebx |= threads_per_pkg << 16;
-            *edx |= CPUID_HT;
         }
         if (!cpu->enable_pmu) {
             *ecx &= ~CPUID_EXT_PDCM;
@@ -7490,6 +7489,7 @@ static void x86_cpu_enable_xsave_components(X86CPU *cpu)
 void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
 {
     CPUX86State *env = &cpu->env;
+    CPUState *cs = CPU(cpu);
     FeatureWord w;
     int i;
     GList *l;
@@ -7531,6 +7531,10 @@ void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
         }
     }
 
+    if (cs->nr_cores * cs->nr_threads > 1) {
+        env->features[FEAT_1_EDX] |= CPUID_HT;
+    }
+
     for (i = 0; i < ARRAY_SIZE(feature_dependencies); i++) {
         FeatureDep *d = &feature_dependencies[i];
         if (!(env->features[d->from.index] & d->from.mask)) {
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v1 3/4] i386/cpu: Set and track CPUID_EXT3_CMP_LEG in env->features[FEAT_8000_0001_ECX]
  2024-11-08  7:06 [PATCH v1 0/4] Initialize nr_cores and nr_threads early and related clearup Xiaoyao Li
  2024-11-08  7:06 ` [PATCH v1 1/4] cpu: Introduce qemu_early_init_vcpu() to initialize nr_cores and nr_threads inside it Xiaoyao Li
  2024-11-08  7:06 ` [PATCH v1 2/4] i386/cpu: Set up CPUID_HT in x86_cpu_expand_features() instead of cpu_x86_cpuid() Xiaoyao Li
@ 2024-11-08  7:06 ` Xiaoyao Li
  2024-11-08  7:06 ` [PATCH v1 4/4] i386/cpu: Rectify the comment on order dependency on qemu_init_vcpu() Xiaoyao Li
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 26+ messages in thread
From: Xiaoyao Li @ 2024-11-08  7:06 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Maydell, Michael Rolnik, Brian Cain, Song Gao,
	Laurent Vivier, Edgar E. Iglesias, Aurelien Jarno, Palmer Dabbelt,
	Alistair Francis, Bin Meng, Thomas Huth, David Hildenbrand,
	Mark Cave-Ayland, Artyom Tarasenko, Bastian Koppelmann,
	Max Filippov, qemu-devel, xiaoyao.li

... instead of manually set it in cpu_x86_cpuid().

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/cpu.c | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index e0c5a61ff615..015e085fa66c 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6959,17 +6959,6 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         *ecx = env->features[FEAT_8000_0001_ECX];
         *edx = env->features[FEAT_8000_0001_EDX];
 
-        /* The Linux kernel checks for the CMPLegacy bit and
-         * discards multiple thread information if it is set.
-         * So don't set it here for Intel to make Linux guests happy.
-         */
-        if (threads_per_pkg > 1) {
-            if (env->cpuid_vendor1 != CPUID_VENDOR_INTEL_1 ||
-                env->cpuid_vendor2 != CPUID_VENDOR_INTEL_2 ||
-                env->cpuid_vendor3 != CPUID_VENDOR_INTEL_3) {
-                *ecx |= 1 << 1;    /* CmpLegacy bit */
-            }
-        }
         if (tcg_enabled() && env->cpuid_vendor1 == CPUID_VENDOR_INTEL_1 &&
             !(env->hflags & HF_LMA_MASK)) {
             *edx &= ~CPUID_EXT2_SYSCALL;
@@ -7533,6 +7522,15 @@ void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
 
     if (cs->nr_cores * cs->nr_threads > 1) {
         env->features[FEAT_1_EDX] |= CPUID_HT;
+
+        /*
+         * The Linux kernel checks for the CMPLegacy bit and
+         * discards multiple thread information if it is set.
+         * So don't set it here for Intel to make Linux guests happy.
+         */
+        if (!IS_INTEL_CPU(env)) {
+            env->features[FEAT_8000_0001_ECX] |= CPUID_EXT3_CMP_LEG;
+        }
     }
 
     for (i = 0; i < ARRAY_SIZE(feature_dependencies); i++) {
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH v1 4/4] i386/cpu: Rectify the comment on order dependency on qemu_init_vcpu()
  2024-11-08  7:06 [PATCH v1 0/4] Initialize nr_cores and nr_threads early and related clearup Xiaoyao Li
                   ` (2 preceding siblings ...)
  2024-11-08  7:06 ` [PATCH v1 3/4] i386/cpu: Set and track CPUID_EXT3_CMP_LEG in env->features[FEAT_8000_0001_ECX] Xiaoyao Li
@ 2024-11-08  7:06 ` Xiaoyao Li
  2024-11-11 10:49 ` [PATCH v1 0/4] Initialize nr_cores and nr_threads early and related clearup David Hildenbrand
  2024-12-05  7:30 ` Zhao Liu
  5 siblings, 0 replies; 26+ messages in thread
From: Xiaoyao Li @ 2024-11-08  7:06 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Maydell, Michael Rolnik, Brian Cain, Song Gao,
	Laurent Vivier, Edgar E. Iglesias, Aurelien Jarno, Palmer Dabbelt,
	Alistair Francis, Bin Meng, Thomas Huth, David Hildenbrand,
	Mark Cave-Ayland, Artyom Tarasenko, Bastian Koppelmann,
	Max Filippov, qemu-devel, xiaoyao.li

Now cs->nr_threads is initialized in qemu_early_init_vcpu() which is
called at the begining of realizef(). Drop the comment of the order
dependcy on qemu_init_vcpu() and hoist code to put it together with
other feature checking.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 target/i386/cpu.c | 33 +++++++++++++++------------------
 1 file changed, 15 insertions(+), 18 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 015e085fa66c..98910fa49250 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -7887,6 +7887,21 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
      */
     cpu->mwait.ecx |= CPUID_MWAIT_EMX | CPUID_MWAIT_IBE;
 
+    /*
+     * Most Intel and certain AMD CPUs support hyperthreading. Even though QEMU
+     * fixes this issue by adjusting CPUID_0000_0001_EBX and CPUID_8000_0008_ECX
+     * based on inputs (sockets,cores,threads), it is still better to give
+     * users a warning.
+     */
+    if (IS_AMD_CPU(env) &&
+        !(env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_TOPOEXT) &&
+        cs->nr_threads > 1) {
+            warn_report_once("This family of AMD CPU doesn't support "
+                             "hyperthreading(%d). Please configure -smp "
+                             "options properly or try enabling topoext "
+                             "feature.", cs->nr_threads);
+    }
+
     /* For 64bit systems think about the number of physical bits to present.
      * ideally this should be the same as the host; anything other than matching
      * the host can cause incorrect guest behaviour.
@@ -7991,24 +8006,6 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
     x86_cpu_gdb_init(cs);
     qemu_init_vcpu(cs);
 
-    /*
-     * Most Intel and certain AMD CPUs support hyperthreading. Even though QEMU
-     * fixes this issue by adjusting CPUID_0000_0001_EBX and CPUID_8000_0008_ECX
-     * based on inputs (sockets,cores,threads), it is still better to give
-     * users a warning.
-     *
-     * NOTE: the following code has to follow qemu_init_vcpu(). Otherwise
-     * cs->nr_threads hasn't be populated yet and the checking is incorrect.
-     */
-    if (IS_AMD_CPU(env) &&
-        !(env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_TOPOEXT) &&
-        cs->nr_threads > 1) {
-            warn_report_once("This family of AMD CPU doesn't support "
-                             "hyperthreading(%d). Please configure -smp "
-                             "options properly or try enabling topoext "
-                             "feature.", cs->nr_threads);
-    }
-
 #ifndef CONFIG_USER_ONLY
     x86_cpu_apic_realize(cpu, &local_err);
     if (local_err != NULL) {
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Initialize nr_cores and nr_threads early and related clearup
  2024-11-08  7:06 [PATCH v1 0/4] Initialize nr_cores and nr_threads early and related clearup Xiaoyao Li
                   ` (3 preceding siblings ...)
  2024-11-08  7:06 ` [PATCH v1 4/4] i386/cpu: Rectify the comment on order dependency on qemu_init_vcpu() Xiaoyao Li
@ 2024-11-11 10:49 ` David Hildenbrand
  2024-11-21 16:24   ` Xiaoyao Li
  2024-12-05  7:30 ` Zhao Liu
  5 siblings, 1 reply; 26+ messages in thread
From: David Hildenbrand @ 2024-11-11 10:49 UTC (permalink / raw)
  To: Xiaoyao Li, Paolo Bonzini
  Cc: Peter Maydell, Michael Rolnik, Brian Cain, Song Gao,
	Laurent Vivier, Edgar E. Iglesias, Aurelien Jarno, Palmer Dabbelt,
	Alistair Francis, Bin Meng, Thomas Huth, Mark Cave-Ayland,
	Artyom Tarasenko, Bastian Koppelmann, Max Filippov, qemu-devel

On 08.11.24 08:06, Xiaoyao Li wrote:
> This series is extracted from TDX QEMU v6[1] series per Paolo's request.
> 
> It is originally motivated by x86 TDX to track CPUID_HT in env->features[]
> which requires nr_cores and nr_cores being initialized earlier than in

"and nr_threads"

> qemu_init_vcpu().
> 
> Initialize of nr_cores and nr_threads earlier in x86's cpu_realizefn()
> can make it work for x86 but it's duplicated with the initialization in
> qemu_init_vcpu() which are used by all the ARCHes. Since initialize them
> earlier also work for other ARCHes, introduce qemu_init_early_vcpu() to
> hold the initialization of nr_cores and nr_threads and call it at the
> beginning in realizefn() for each ARCH.
> 
> Note, I only tested it for x86 ARCH. Please help test on other ARCHes.

[...]

>   accel/tcg/user-exec-stub.c |  4 +++
>   hw/core/cpu-common.c       |  2 +-
>   include/hw/core/cpu.h      |  8 +++++
>   system/cpus.c              |  6 +++-
>   target/alpha/cpu.c         |  2 ++
>   target/arm/cpu.c           |  2 ++
>   target/avr/cpu.c           |  2 ++
>   target/hexagon/cpu.c       |  2 ++
>   target/hppa/cpu.c          |  2 ++
>   target/i386/cpu.c          | 61 +++++++++++++++++++-------------------
>   target/loongarch/cpu.c     |  2 ++
>   target/m68k/cpu.c          |  2 ++
>   target/microblaze/cpu.c    |  2 ++
>   target/mips/cpu.c          |  2 ++
>   target/openrisc/cpu.c      |  2 ++
>   target/ppc/cpu_init.c      |  2 ++
>   target/riscv/cpu.c         |  2 ++
>   target/rx/cpu.c            |  2 ++
>   target/s390x/cpu.c         |  2 ++
>   target/sh4/cpu.c           |  2 ++
>   target/sparc/cpu.c         |  2 ++
>   target/tricore/cpu.c       |  2 ++
>   target/xtensa/cpu.c        |  2 ++
>   23 files changed, 85 insertions(+), 32 deletions(-)

Hm. It looks like this belongs into the parent realize function. But the 
"bad thing" is that we call the parent realize function after we try 
realizing the derived CPU.

Could it go into cpu_common_initfn()?

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Initialize nr_cores and nr_threads early and related clearup
  2024-11-11 10:49 ` [PATCH v1 0/4] Initialize nr_cores and nr_threads early and related clearup David Hildenbrand
@ 2024-11-21 16:24   ` Xiaoyao Li
  2024-11-21 17:39     ` Philippe Mathieu-Daudé
  2024-11-21 18:52     ` Paolo Bonzini
  0 siblings, 2 replies; 26+ messages in thread
From: Xiaoyao Li @ 2024-11-21 16:24 UTC (permalink / raw)
  To: David Hildenbrand, Paolo Bonzini
  Cc: Peter Maydell, Michael Rolnik, Brian Cain, Song Gao,
	Laurent Vivier, Edgar E. Iglesias, Aurelien Jarno, Palmer Dabbelt,
	Alistair Francis, Bin Meng, Thomas Huth, Mark Cave-Ayland,
	Artyom Tarasenko, Bastian Koppelmann, Max Filippov, qemu-devel

On 11/11/2024 6:49 PM, David Hildenbrand wrote:
> On 08.11.24 08:06, Xiaoyao Li wrote:
>> This series is extracted from TDX QEMU v6[1] series per Paolo's request.
>>
>> It is originally motivated by x86 TDX to track CPUID_HT in env- 
>> >features[]
>> which requires nr_cores and nr_cores being initialized earlier than in
> 
> "and nr_threads"
> 
>> qemu_init_vcpu().
>>
>> Initialize of nr_cores and nr_threads earlier in x86's cpu_realizefn()
>> can make it work for x86 but it's duplicated with the initialization in
>> qemu_init_vcpu() which are used by all the ARCHes. Since initialize them
>> earlier also work for other ARCHes, introduce qemu_init_early_vcpu() to
>> hold the initialization of nr_cores and nr_threads and call it at the
>> beginning in realizefn() for each ARCH.
>>
>> Note, I only tested it for x86 ARCH. Please help test on other ARCHes.
> 
> [...]
> 
>>   accel/tcg/user-exec-stub.c |  4 +++
>>   hw/core/cpu-common.c       |  2 +-
>>   include/hw/core/cpu.h      |  8 +++++
>>   system/cpus.c              |  6 +++-
>>   target/alpha/cpu.c         |  2 ++
>>   target/arm/cpu.c           |  2 ++
>>   target/avr/cpu.c           |  2 ++
>>   target/hexagon/cpu.c       |  2 ++
>>   target/hppa/cpu.c          |  2 ++
>>   target/i386/cpu.c          | 61 +++++++++++++++++++-------------------
>>   target/loongarch/cpu.c     |  2 ++
>>   target/m68k/cpu.c          |  2 ++
>>   target/microblaze/cpu.c    |  2 ++
>>   target/mips/cpu.c          |  2 ++
>>   target/openrisc/cpu.c      |  2 ++
>>   target/ppc/cpu_init.c      |  2 ++
>>   target/riscv/cpu.c         |  2 ++
>>   target/rx/cpu.c            |  2 ++
>>   target/s390x/cpu.c         |  2 ++
>>   target/sh4/cpu.c           |  2 ++
>>   target/sparc/cpu.c         |  2 ++
>>   target/tricore/cpu.c       |  2 ++
>>   target/xtensa/cpu.c        |  2 ++
>>   23 files changed, 85 insertions(+), 32 deletions(-)
> 
> Hm. It looks like this belongs into the parent realize function. But the 
> "bad thing" is that we call the parent realize function after we try 
> realizing the derived CPU.
> 
> Could it go into cpu_common_initfn()?
> 

It can, I think.

I'll move them into cpu_common_initfn() in v2 to avoid touching all the 
ARCHes.



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Initialize nr_cores and nr_threads early and related clearup
  2024-11-21 16:24   ` Xiaoyao Li
@ 2024-11-21 17:39     ` Philippe Mathieu-Daudé
  2024-11-21 18:52     ` Paolo Bonzini
  1 sibling, 0 replies; 26+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-11-21 17:39 UTC (permalink / raw)
  To: Xiaoyao Li, David Hildenbrand, Paolo Bonzini, Igor Mammedov
  Cc: Peter Maydell, Michael Rolnik, Brian Cain, Song Gao,
	Laurent Vivier, Edgar E. Iglesias, Aurelien Jarno, Palmer Dabbelt,
	Alistair Francis, Bin Meng, Thomas Huth, Mark Cave-Ayland,
	Artyom Tarasenko, Bastian Koppelmann, Max Filippov, qemu-devel

On 21/11/24 17:24, Xiaoyao Li wrote:
> On 11/11/2024 6:49 PM, David Hildenbrand wrote:
>> On 08.11.24 08:06, Xiaoyao Li wrote:
>>> This series is extracted from TDX QEMU v6[1] series per Paolo's request.
>>>
>>> It is originally motivated by x86 TDX to track CPUID_HT in env- 
>>> >features[]
>>> which requires nr_cores and nr_cores being initialized earlier than in
>>
>> "and nr_threads"
>>
>>> qemu_init_vcpu().
>>>
>>> Initialize of nr_cores and nr_threads earlier in x86's cpu_realizefn()
>>> can make it work for x86 but it's duplicated with the initialization in
>>> qemu_init_vcpu() which are used by all the ARCHes. Since initialize them
>>> earlier also work for other ARCHes, introduce qemu_init_early_vcpu() to
>>> hold the initialization of nr_cores and nr_threads and call it at the
>>> beginning in realizefn() for each ARCH.
>>>
>>> Note, I only tested it for x86 ARCH. Please help test on other ARCHes.
>>
>> [...]
>>
>>>   accel/tcg/user-exec-stub.c |  4 +++
>>>   hw/core/cpu-common.c       |  2 +-
>>>   include/hw/core/cpu.h      |  8 +++++
>>>   system/cpus.c              |  6 +++-
>>>   target/alpha/cpu.c         |  2 ++
>>>   target/arm/cpu.c           |  2 ++
>>>   target/avr/cpu.c           |  2 ++
>>>   target/hexagon/cpu.c       |  2 ++
>>>   target/hppa/cpu.c          |  2 ++
>>>   target/i386/cpu.c          | 61 +++++++++++++++++++-------------------
>>>   target/loongarch/cpu.c     |  2 ++
>>>   target/m68k/cpu.c          |  2 ++
>>>   target/microblaze/cpu.c    |  2 ++
>>>   target/mips/cpu.c          |  2 ++
>>>   target/openrisc/cpu.c      |  2 ++
>>>   target/ppc/cpu_init.c      |  2 ++
>>>   target/riscv/cpu.c         |  2 ++
>>>   target/rx/cpu.c            |  2 ++
>>>   target/s390x/cpu.c         |  2 ++
>>>   target/sh4/cpu.c           |  2 ++
>>>   target/sparc/cpu.c         |  2 ++
>>>   target/tricore/cpu.c       |  2 ++
>>>   target/xtensa/cpu.c        |  2 ++
>>>   23 files changed, 85 insertions(+), 32 deletions(-)
>>
>> Hm. It looks like this belongs into the parent realize function. But 
>> the "bad thing" is that we call the parent realize function after we 
>> try realizing the derived CPU.
>>
>> Could it go into cpu_common_initfn()?
>>
> 
> It can, I think.
> 
> I'll move them into cpu_common_initfn() in v2 to avoid touching all the 
> ARCHes.


This seems a x86 issue, it also bugs me:
https://lore.kernel.org/qemu-devel/20231128171239.69b6d7b1@imammedo.users.ipa.redhat.com/

IMO we need to make vCPU creation steps more explicit.



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Initialize nr_cores and nr_threads early and related clearup
  2024-11-21 16:24   ` Xiaoyao Li
  2024-11-21 17:39     ` Philippe Mathieu-Daudé
@ 2024-11-21 18:52     ` Paolo Bonzini
  2024-11-22  2:40       ` Xiaoyao Li
  1 sibling, 1 reply; 26+ messages in thread
From: Paolo Bonzini @ 2024-11-21 18:52 UTC (permalink / raw)
  To: Xiaoyao Li, David Hildenbrand
  Cc: Peter Maydell, Michael Rolnik, Brian Cain, Song Gao,
	Laurent Vivier, Edgar E. Iglesias, Aurelien Jarno, Palmer Dabbelt,
	Alistair Francis, Bin Meng, Thomas Huth, Mark Cave-Ayland,
	Artyom Tarasenko, Bastian Koppelmann, Max Filippov, qemu-devel

On 11/21/24 17:24, Xiaoyao Li wrote:
>> Could it go into cpu_common_initfn()?
> 
> It can, I think.
> 
> I'll move them into cpu_common_initfn() in v2 to avoid touching all the 
> ARCHes.

It does look better than the alternative of duplicating code.

On the other hand qemu_init_vcpu is already duplicated and I'm not sure 
I like relying on qdev_get_machine() inside the instance_init function...

Paolo



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Initialize nr_cores and nr_threads early and related clearup
  2024-11-21 18:52     ` Paolo Bonzini
@ 2024-11-22  2:40       ` Xiaoyao Li
  2024-11-22  9:44         ` David Hildenbrand
  0 siblings, 1 reply; 26+ messages in thread
From: Xiaoyao Li @ 2024-11-22  2:40 UTC (permalink / raw)
  To: Paolo Bonzini, David Hildenbrand
  Cc: Peter Maydell, Michael Rolnik, Brian Cain, Song Gao,
	Laurent Vivier, Edgar E. Iglesias, Aurelien Jarno, Palmer Dabbelt,
	Alistair Francis, Bin Meng, Thomas Huth, Mark Cave-Ayland,
	Artyom Tarasenko, Bastian Koppelmann, Max Filippov, qemu-devel

On 11/22/2024 2:52 AM, Paolo Bonzini wrote:
> On 11/21/24 17:24, Xiaoyao Li wrote:
>>> Could it go into cpu_common_initfn()?
>>
>> It can, I think.
>>
>> I'll move them into cpu_common_initfn() in v2 to avoid touching all 
>> the ARCHes.
> 
> It does look better than the alternative of duplicating code.
> 
> On the other hand qemu_init_vcpu is already duplicated and I'm not sure 
> I like relying on qdev_get_machine() inside the instance_init function...

I had the same concern.

But it seems all the ARCHes should create MACHINE before VCPUs. So it 
seems OK to qdev_get_machine() inside the instance_init function. But 
I'm not sure if there is any case to create VCPU standalone.

I think we can check if qdev_get_machine() gets a valid result. If not, 
fall back to assign nr_cores and nr_threads to 1.

Anyway, please let me know your preference, this series or a v2 to 
implement it inside cpu_common_initfn().

> Paolo
> 



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Initialize nr_cores and nr_threads early and related clearup
  2024-11-22  2:40       ` Xiaoyao Li
@ 2024-11-22  9:44         ` David Hildenbrand
  2024-11-22  9:53           ` Paolo Bonzini
  0 siblings, 1 reply; 26+ messages in thread
From: David Hildenbrand @ 2024-11-22  9:44 UTC (permalink / raw)
  To: Xiaoyao Li, Paolo Bonzini
  Cc: Peter Maydell, Michael Rolnik, Brian Cain, Song Gao,
	Laurent Vivier, Edgar E. Iglesias, Aurelien Jarno, Palmer Dabbelt,
	Alistair Francis, Bin Meng, Thomas Huth, Mark Cave-Ayland,
	Artyom Tarasenko, Bastian Koppelmann, Max Filippov, qemu-devel

On 22.11.24 03:40, Xiaoyao Li wrote:
> On 11/22/2024 2:52 AM, Paolo Bonzini wrote:
>> On 11/21/24 17:24, Xiaoyao Li wrote:
>>>> Could it go into cpu_common_initfn()?
>>>
>>> It can, I think.
>>>
>>> I'll move them into cpu_common_initfn() in v2 to avoid touching all
>>> the ARCHes.
>>
>> It does look better than the alternative of duplicating code.
>>
>> On the other hand qemu_init_vcpu is already duplicated and I'm not sure
>> I like relying on qdev_get_machine() inside the instance_init function...
> 

Good point.

> I had the same concern.
> 
> But it seems all the ARCHes should create MACHINE before VCPUs. So it
> seems OK to qdev_get_machine() inside the instance_init function. But
> I'm not sure if there is any case to create VCPU standalone.

There are, for example on s390x in create_cpu_model_list(). I recall 
there are ways to start QEMU without any machine and trigger that code. 
(or maybe this was just for the test environment with a special test 
machine ...)

> 
> I think we can check if qdev_get_machine() gets a valid result. If not,
> fall back to assign nr_cores and nr_threads to 1.

That sounds reasonable to me.

-- 
Cheers,

David / dhildenb



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Initialize nr_cores and nr_threads early and related clearup
  2024-11-22  9:44         ` David Hildenbrand
@ 2024-11-22  9:53           ` Paolo Bonzini
  0 siblings, 0 replies; 26+ messages in thread
From: Paolo Bonzini @ 2024-11-22  9:53 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Xiaoyao Li, Peter Maydell, Michael Rolnik, Brian Cain, Song Gao,
	Laurent Vivier, Edgar E. Iglesias, Aurelien Jarno, Palmer Dabbelt,
	Alistair Francis, Bin Meng, Thomas Huth, Mark Cave-Ayland,
	Artyom Tarasenko, Bastian Koppelmann, Max Filippov, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 582 bytes --]

Il ven 22 nov 2024, 10:44 David Hildenbrand <david@redhat.com> ha scritto:

> > I think we can check if qdev_get_machine() gets a valid result. If not,
> > fall back to assign nr_cores and nr_threads to 1.
>
> That sounds reasonable to me.
>

Another possibility is to add a cpu_realize() function that sets two
properties and then calls qdev_realize(). It touches all architectures but
not in the sense of adding new code, just changing it. And it is more code
to write though, so if Xiaoyao prefers that I apply v1 I am okay with that.

Paolo


> Cheers,
>
> David / dhildenb
>
>

[-- Attachment #2: Type: text/html, Size: 1210 bytes --]

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH] cpu: Initialize nr_cores and nr_threads in cpu_common_initfn()
  2024-11-08  7:06 ` [PATCH v1 1/4] cpu: Introduce qemu_early_init_vcpu() to initialize nr_cores and nr_threads inside it Xiaoyao Li
@ 2024-11-22 16:03   ` Xiaoyao Li
  2024-11-22 17:26     ` Philippe Mathieu-Daudé
  2024-11-25  9:38     ` Igor Mammedov
  0 siblings, 2 replies; 26+ messages in thread
From: Xiaoyao Li @ 2024-11-22 16:03 UTC (permalink / raw)
  To: David Hildenbrand, Paolo Bonzini; +Cc: qemu-devel, xiaoyao.li

Currently cpu->nr_cores and cpu->nr_threads are initialized in
qemu_init_vcpu(), which is called a bit late in *cpu_realizefn() for
each ARCHes.

x86 arch would like to use nr_cores and nr_threads earlier in its
realizefn(). To serve this purpose, initialize nr_cores and nr_threads
in cpu_common_initfn(), which is earlier than *cpu_realizefn().

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 hw/core/cpu-common.c | 10 +++++++++-
 system/cpus.c        |  4 ----
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c
index 09c79035949b..6de92ed854bc 100644
--- a/hw/core/cpu-common.c
+++ b/hw/core/cpu-common.c
@@ -237,14 +237,22 @@ static void cpu_common_unrealizefn(DeviceState *dev)
 static void cpu_common_initfn(Object *obj)
 {
     CPUState *cpu = CPU(obj);
+    Object *machine = qdev_get_machine();
+    MachineState *ms;
 
     gdb_init_cpu(cpu);
     cpu->cpu_index = UNASSIGNED_CPU_INDEX;
     cpu->cluster_index = UNASSIGNED_CLUSTER_INDEX;
     /* user-mode doesn't have configurable SMP topology */
-    /* the default value is changed by qemu_init_vcpu() for system-mode */
     cpu->nr_cores = 1;
     cpu->nr_threads = 1;
+#ifndef CONFIG_USER_ONLY
+    if (object_dynamic_cast(machine, TYPE_MACHINE)) {
+        ms = MACHINE(machine);
+        cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
+        cpu->nr_threads = ms->smp.threads;
+    }
+#endif
     cpu->cflags_next_tb = -1;
 
     /* allocate storage for thread info, initialise condition variables */
diff --git a/system/cpus.c b/system/cpus.c
index 1c818ff6828c..c1547fbfd39b 100644
--- a/system/cpus.c
+++ b/system/cpus.c
@@ -664,10 +664,6 @@ const AccelOpsClass *cpus_get_accel(void)
 
 void qemu_init_vcpu(CPUState *cpu)
 {
-    MachineState *ms = MACHINE(qdev_get_machine());
-
-    cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
-    cpu->nr_threads =  ms->smp.threads;
     cpu->stopped = true;
     cpu->random_seed = qemu_guest_random_seed_thread_part1();
 
-- 
2.34.1



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH] cpu: Initialize nr_cores and nr_threads in cpu_common_initfn()
  2024-11-22 16:03   ` [PATCH] cpu: Initialize nr_cores and nr_threads in cpu_common_initfn() Xiaoyao Li
@ 2024-11-22 17:26     ` Philippe Mathieu-Daudé
  2024-11-22 19:17       ` Richard Henderson
  2024-11-25  9:38     ` Igor Mammedov
  1 sibling, 1 reply; 26+ messages in thread
From: Philippe Mathieu-Daudé @ 2024-11-22 17:26 UTC (permalink / raw)
  To: Xiaoyao Li, David Hildenbrand, Paolo Bonzini, Richard Henderson
  Cc: qemu-devel, Eduardo Habkost

On 22/11/24 17:03, Xiaoyao Li wrote:
> Currently cpu->nr_cores and cpu->nr_threads are initialized in
> qemu_init_vcpu(), which is called a bit late in *cpu_realizefn() for
> each ARCHes.
> 
> x86 arch would like to use nr_cores and nr_threads earlier in its
> realizefn(). To serve this purpose, initialize nr_cores and nr_threads
> in cpu_common_initfn(), which is earlier than *cpu_realizefn().
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>   hw/core/cpu-common.c | 10 +++++++++-
>   system/cpus.c        |  4 ----
>   2 files changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c
> index 09c79035949b..6de92ed854bc 100644
> --- a/hw/core/cpu-common.c
> +++ b/hw/core/cpu-common.c
> @@ -237,14 +237,22 @@ static void cpu_common_unrealizefn(DeviceState *dev)
>   static void cpu_common_initfn(Object *obj)
>   {
>       CPUState *cpu = CPU(obj);
> +    Object *machine = qdev_get_machine();
> +    MachineState *ms;
>   
>       gdb_init_cpu(cpu);
>       cpu->cpu_index = UNASSIGNED_CPU_INDEX;
>       cpu->cluster_index = UNASSIGNED_CLUSTER_INDEX;
>       /* user-mode doesn't have configurable SMP topology */
> -    /* the default value is changed by qemu_init_vcpu() for system-mode */
>       cpu->nr_cores = 1;
>       cpu->nr_threads = 1;
> +#ifndef CONFIG_USER_ONLY

Is CONFIG_USER_ONLY available in an common_ss[] object? I don't recall.

Anyway, can we not use CONFIG_USER_ONLY in cpu-common.c?

> +    if (object_dynamic_cast(machine, TYPE_MACHINE)) {
> +        ms = MACHINE(machine);
> +        cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
> +        cpu->nr_threads = ms->smp.threads;
> +    }
> +#endif
>       cpu->cflags_next_tb = -1;
>   
>       /* allocate storage for thread info, initialise condition variables */
> diff --git a/system/cpus.c b/system/cpus.c
> index 1c818ff6828c..c1547fbfd39b 100644
> --- a/system/cpus.c
> +++ b/system/cpus.c
> @@ -664,10 +664,6 @@ const AccelOpsClass *cpus_get_accel(void)
>   
>   void qemu_init_vcpu(CPUState *cpu)
>   {
> -    MachineState *ms = MACHINE(qdev_get_machine());
> -
> -    cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
> -    cpu->nr_threads =  ms->smp.threads;
>       cpu->stopped = true;
>       cpu->random_seed = qemu_guest_random_seed_thread_part1();
>   



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] cpu: Initialize nr_cores and nr_threads in cpu_common_initfn()
  2024-11-22 17:26     ` Philippe Mathieu-Daudé
@ 2024-11-22 19:17       ` Richard Henderson
  0 siblings, 0 replies; 26+ messages in thread
From: Richard Henderson @ 2024-11-22 19:17 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, Xiaoyao Li, David Hildenbrand,
	Paolo Bonzini
  Cc: qemu-devel, Eduardo Habkost

On 11/22/24 11:26, Philippe Mathieu-Daudé wrote:
> On 22/11/24 17:03, Xiaoyao Li wrote:
>> Currently cpu->nr_cores and cpu->nr_threads are initialized in
>> qemu_init_vcpu(), which is called a bit late in *cpu_realizefn() for
>> each ARCHes.
>>
>> x86 arch would like to use nr_cores and nr_threads earlier in its
>> realizefn(). To serve this purpose, initialize nr_cores and nr_threads
>> in cpu_common_initfn(), which is earlier than *cpu_realizefn().
>>
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> ---
>>   hw/core/cpu-common.c | 10 +++++++++-
>>   system/cpus.c        |  4 ----
>>   2 files changed, 9 insertions(+), 5 deletions(-)
>>
>> diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c
>> index 09c79035949b..6de92ed854bc 100644
>> --- a/hw/core/cpu-common.c
>> +++ b/hw/core/cpu-common.c
>> @@ -237,14 +237,22 @@ static void cpu_common_unrealizefn(DeviceState *dev)
>>   static void cpu_common_initfn(Object *obj)
>>   {
>>       CPUState *cpu = CPU(obj);
>> +    Object *machine = qdev_get_machine();
>> +    MachineState *ms;
>>       gdb_init_cpu(cpu);
>>       cpu->cpu_index = UNASSIGNED_CPU_INDEX;
>>       cpu->cluster_index = UNASSIGNED_CLUSTER_INDEX;
>>       /* user-mode doesn't have configurable SMP topology */
>> -    /* the default value is changed by qemu_init_vcpu() for system-mode */
>>       cpu->nr_cores = 1;
>>       cpu->nr_threads = 1;
>> +#ifndef CONFIG_USER_ONLY
> 
> Is CONFIG_USER_ONLY available in an common_ss[] object? I don't recall.

No.


r~



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] cpu: Initialize nr_cores and nr_threads in cpu_common_initfn()
  2024-11-22 16:03   ` [PATCH] cpu: Initialize nr_cores and nr_threads in cpu_common_initfn() Xiaoyao Li
  2024-11-22 17:26     ` Philippe Mathieu-Daudé
@ 2024-11-25  9:38     ` Igor Mammedov
  2024-11-29  7:12       ` Xiaoyao Li
  2024-12-05 11:53       ` Xiaoyao Li
  1 sibling, 2 replies; 26+ messages in thread
From: Igor Mammedov @ 2024-11-25  9:38 UTC (permalink / raw)
  To: Xiaoyao Li; +Cc: David Hildenbrand, Paolo Bonzini, qemu-devel

On Fri, 22 Nov 2024 11:03:17 -0500
Xiaoyao Li <xiaoyao.li@intel.com> wrote:

> Currently cpu->nr_cores and cpu->nr_threads are initialized in
> qemu_init_vcpu(), which is called a bit late in *cpu_realizefn() for
> each ARCHes.
> 
> x86 arch would like to use nr_cores and nr_threads earlier in its
> realizefn(). To serve this purpose, initialize nr_cores and nr_threads
> in cpu_common_initfn(), which is earlier than *cpu_realizefn().
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>  hw/core/cpu-common.c | 10 +++++++++-
>  system/cpus.c        |  4 ----
>  2 files changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c
> index 09c79035949b..6de92ed854bc 100644
> --- a/hw/core/cpu-common.c
> +++ b/hw/core/cpu-common.c
> @@ -237,14 +237,22 @@ static void cpu_common_unrealizefn(DeviceState *dev)
>  static void cpu_common_initfn(Object *obj)
>  {
>      CPUState *cpu = CPU(obj);
> +    Object *machine = qdev_get_machine();
> +    MachineState *ms;
>  
>      gdb_init_cpu(cpu);
>      cpu->cpu_index = UNASSIGNED_CPU_INDEX;
>      cpu->cluster_index = UNASSIGNED_CLUSTER_INDEX;
>      /* user-mode doesn't have configurable SMP topology */
> -    /* the default value is changed by qemu_init_vcpu() for system-mode */
>      cpu->nr_cores = 1;
>      cpu->nr_threads = 1;
> +#ifndef CONFIG_USER_ONLY
> +    if (object_dynamic_cast(machine, TYPE_MACHINE)) {
> +        ms = MACHINE(machine);
> +        cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
> +        cpu->nr_threads = ms->smp.threads;
> +    }
> +#endif

Can't say, that I'm fond of adding/moving hack to access MachineState
from CPU context. Granted we did/still do it elsewhere, But I'd rather
prefer getting rid of those remnants that access globals.
It's basically technical debt we are carrying since 2009 (dc6b1c09849).
Moving that around doesn't help with getting rid of arbitrary access to globals.

As Paolo've noted there are other ways to set cores/threads,
albeit at expense of adding more code. And that could be fine
if it's done within expected cpu initialization flow.

Instead of accessing MachineState directly from CPU code (which is
essentially a layer violation), I'd suggest to set cores_nr/threads_nr
from pre_plug handler (which is machine code).
We do similar thing for nr_dies/nr_modules already, and we should do
same for cores/trheads.

Quick hack would be do the same for cores/threads in x86_cpu_pre_plug(),
and make qemu_init_vcpu() conditional to avoid touching other targets/machines.

I'd even ack that, however that's just leaves us with the same
old technical debt. So I'd like to coax a promise to fix it properly
(i.e. get rid of access to machine from CPU code).

(here goes typical ask to rewrite whole QEMU before doing thing you
actually need)

To do that we would need to:
  1. audit usage of cpu->nr_cores/cpu->nr_threads across QEMU, to figure out
     what targets/machines need them
  2. then add pre_plug() handlers to those machines to set them.
  3. In the end get rid of initializing them in cpu_common_initfn().

With that done we can then add a common helper to generalize topo config
based on -smp from pre_plug() handlers to reduce duplication caused by
per machine pre_plug handlers.

Or introduce a generic cpu_pre_plug() handler at Machine level and make
_pre_plug call chain to call it (sort of what we do with nested realize calls);

I'd prefer the 1st option (#2) as it explicitly documents what
targets/machines care about cores/threads at expense of some boiler plate code
duplication, instead of blanket generic fallback like we have now (regardless of
if it's actually needed).

>      cpu->cflags_next_tb = -1;
>  
>      /* allocate storage for thread info, initialise condition variables */
> diff --git a/system/cpus.c b/system/cpus.c
> index 1c818ff6828c..c1547fbfd39b 100644
> --- a/system/cpus.c
> +++ b/system/cpus.c
> @@ -664,10 +664,6 @@ const AccelOpsClass *cpus_get_accel(void)
>  
>  void qemu_init_vcpu(CPUState *cpu)
>  {
> -    MachineState *ms = MACHINE(qdev_get_machine());
> -
> -    cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
> -    cpu->nr_threads =  ms->smp.threads;
>      cpu->stopped = true;
>      cpu->random_seed = qemu_guest_random_seed_thread_part1();
>  



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] cpu: Initialize nr_cores and nr_threads in cpu_common_initfn()
  2024-11-25  9:38     ` Igor Mammedov
@ 2024-11-29  7:12       ` Xiaoyao Li
  2024-12-05 11:53       ` Xiaoyao Li
  1 sibling, 0 replies; 26+ messages in thread
From: Xiaoyao Li @ 2024-11-29  7:12 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: David Hildenbrand, Paolo Bonzini, qemu-devel

On 11/25/2024 5:38 PM, Igor Mammedov wrote:
> On Fri, 22 Nov 2024 11:03:17 -0500
> Xiaoyao Li <xiaoyao.li@intel.com> wrote:
> 
>> Currently cpu->nr_cores and cpu->nr_threads are initialized in
>> qemu_init_vcpu(), which is called a bit late in *cpu_realizefn() for
>> each ARCHes.
>>
>> x86 arch would like to use nr_cores and nr_threads earlier in its
>> realizefn(). To serve this purpose, initialize nr_cores and nr_threads
>> in cpu_common_initfn(), which is earlier than *cpu_realizefn().
>>
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> ---
>>   hw/core/cpu-common.c | 10 +++++++++-
>>   system/cpus.c        |  4 ----
>>   2 files changed, 9 insertions(+), 5 deletions(-)
>>
>> diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c
>> index 09c79035949b..6de92ed854bc 100644
>> --- a/hw/core/cpu-common.c
>> +++ b/hw/core/cpu-common.c
>> @@ -237,14 +237,22 @@ static void cpu_common_unrealizefn(DeviceState *dev)
>>   static void cpu_common_initfn(Object *obj)
>>   {
>>       CPUState *cpu = CPU(obj);
>> +    Object *machine = qdev_get_machine();
>> +    MachineState *ms;
>>   
>>       gdb_init_cpu(cpu);
>>       cpu->cpu_index = UNASSIGNED_CPU_INDEX;
>>       cpu->cluster_index = UNASSIGNED_CLUSTER_INDEX;
>>       /* user-mode doesn't have configurable SMP topology */
>> -    /* the default value is changed by qemu_init_vcpu() for system-mode */
>>       cpu->nr_cores = 1;
>>       cpu->nr_threads = 1;
>> +#ifndef CONFIG_USER_ONLY
>> +    if (object_dynamic_cast(machine, TYPE_MACHINE)) {
>> +        ms = MACHINE(machine);
>> +        cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
>> +        cpu->nr_threads = ms->smp.threads;
>> +    }
>> +#endif
> 
> Can't say, that I'm fond of adding/moving hack to access MachineState
> from CPU context. Granted we did/still do it elsewhere, But I'd rather
> prefer getting rid of those remnants that access globals.
> It's basically technical debt we are carrying since 2009 (dc6b1c09849).
> Moving that around doesn't help with getting rid of arbitrary access to globals.
> 
> As Paolo've noted there are other ways to set cores/threads,
> albeit at expense of adding more code. And that could be fine
> if it's done within expected cpu initialization flow.
> 
> Instead of accessing MachineState directly from CPU code (which is
> essentially a layer violation), I'd suggest to set cores_nr/threads_nr
> from pre_plug handler (which is machine code).
> We do similar thing for nr_dies/nr_modules already, and we should do
> same for cores/trheads.
> 
> Quick hack would be do the same for cores/threads in x86_cpu_pre_plug(),
> and make qemu_init_vcpu() conditional to avoid touching other targets/machines.
> 
> I'd even ack that, however that's just leaves us with the same
> old technical debt. So I'd like to coax a promise to fix it properly
> (i.e. get rid of access to machine from CPU code).
> 
> (here goes typical ask to rewrite whole QEMU before doing thing you
> actually need)
> 
> To do that we would need to:
>    1. audit usage of cpu->nr_cores/cpu->nr_threads across QEMU, to figure out
>       what targets/machines need them
 >    2. then add pre_plug() handlers to those machines to set them.> 
3. In the end get rid of initializing them in cpu_common_initfn().

It looks sane to me.

I'm wondering how to audit usage of cpu->nr_cores/cpu->nr_threads for 
other target than x86. I haven't played with them.

> With that done we can then add a common helper to generalize topo config
> based on -smp from pre_plug() handlers to reduce duplication caused by
> per machine pre_plug handlers.
> 
> Or introduce a generic cpu_pre_plug() handler at Machine level and make
> _pre_plug call chain to call it (sort of what we do with nested realize calls);
> 
> I'd prefer the 1st option (#2) as it explicitly documents what
> targets/machines care about cores/threads at expense of some boiler plate code
> duplication, instead of blanket generic fallback like we have now (regardless of
> if it's actually needed).
> 
>>       cpu->cflags_next_tb = -1;
>>   
>>       /* allocate storage for thread info, initialise condition variables */
>> diff --git a/system/cpus.c b/system/cpus.c
>> index 1c818ff6828c..c1547fbfd39b 100644
>> --- a/system/cpus.c
>> +++ b/system/cpus.c
>> @@ -664,10 +664,6 @@ const AccelOpsClass *cpus_get_accel(void)
>>   
>>   void qemu_init_vcpu(CPUState *cpu)
>>   {
>> -    MachineState *ms = MACHINE(qdev_get_machine());
>> -
>> -    cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
>> -    cpu->nr_threads =  ms->smp.threads;
>>       cpu->stopped = true;
>>       cpu->random_seed = qemu_guest_random_seed_thread_part1();
>>   
> 



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 2/4] i386/cpu: Set up CPUID_HT in x86_cpu_expand_features() instead of cpu_x86_cpuid()
  2024-11-08  7:06 ` [PATCH v1 2/4] i386/cpu: Set up CPUID_HT in x86_cpu_expand_features() instead of cpu_x86_cpuid() Xiaoyao Li
@ 2024-12-05  7:19   ` Zhao Liu
  2024-12-05  7:54     ` Xiaoyao Li
  0 siblings, 1 reply; 26+ messages in thread
From: Zhao Liu @ 2024-12-05  7:19 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Peter Maydell, Michael Rolnik, Brian Cain,
	Song Gao, Laurent Vivier, Edgar E. Iglesias, Aurelien Jarno,
	Palmer Dabbelt, Alistair Francis, Bin Meng, Thomas Huth,
	David Hildenbrand, Mark Cave-Ayland, Artyom Tarasenko,
	Bastian Koppelmann, Max Filippov, qemu-devel

Hi Xiaoyao,

Sorry for late reply.

> @@ -7490,6 +7489,7 @@ static void x86_cpu_enable_xsave_components(X86CPU *cpu)
>  void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
>  {
>      CPUX86State *env = &cpu->env;
> +    CPUState *cs = CPU(cpu);
>      FeatureWord w;
>      int i;
>      GList *l;
> @@ -7531,6 +7531,10 @@ void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
>          }
>      }
>  
> +    if (cs->nr_cores * cs->nr_threads > 1) {
> +        env->features[FEAT_1_EDX] |= CPUID_HT;
> +    }
> +

We shouldn't place any CLI-configurable features here,
especially after expanding plus_features and minus_features.

HT has been made configurable since the commit 83629b1 ("target/i386/
cpu: Fix CPUID_HT exposure"), so if you want palce HT here, you
should make it un-configurable first.

Regarding commit 83629b1, in what cases do we need to actively set HT?

That commit even introduces more issues. Ideally, the hardware being
emulated by setting or masking feature bits should be feature-consistent.

However, "-cpu *,-ht -smp 2" does not remove the HT flag (which is
unexpected), and "-cpu *,+ht -smp 1" forcibly sets HT (which results in
buggy emulation). :(

In fact, HT should not be freely configurable in hardware emulation;
users should configure it in the BIOS.


>      for (i = 0; i < ARRAY_SIZE(feature_dependencies); i++) {
>          FeatureDep *d = &feature_dependencies[i];
>          if (!(env->features[d->from.index] & d->from.mask)) {
> -- 
> 2.34.1
> 
> 


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Initialize nr_cores and nr_threads early and related clearup
  2024-11-08  7:06 [PATCH v1 0/4] Initialize nr_cores and nr_threads early and related clearup Xiaoyao Li
                   ` (4 preceding siblings ...)
  2024-11-11 10:49 ` [PATCH v1 0/4] Initialize nr_cores and nr_threads early and related clearup David Hildenbrand
@ 2024-12-05  7:30 ` Zhao Liu
  2024-12-05  8:05   ` Xiaoyao Li
  5 siblings, 1 reply; 26+ messages in thread
From: Zhao Liu @ 2024-12-05  7:30 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Peter Maydell, Michael Rolnik, Brian Cain,
	Song Gao, Laurent Vivier, Edgar E. Iglesias, Aurelien Jarno,
	Palmer Dabbelt, Alistair Francis, Bin Meng, Thomas Huth,
	David Hildenbrand, Mark Cave-Ayland, Artyom Tarasenko,
	Bastian Koppelmann, Max Filippov, qemu-devel

I'm also very sorry, but I have a slightly different opinion...

>  accel/tcg/user-exec-stub.c |  4 +++
>  hw/core/cpu-common.c       |  2 +-
>  include/hw/core/cpu.h      |  8 +++++
>  system/cpus.c              |  6 +++-
>  target/alpha/cpu.c         |  2 ++
>  target/arm/cpu.c           |  2 ++
>  target/avr/cpu.c           |  2 ++
>  target/hexagon/cpu.c       |  2 ++
>  target/hppa/cpu.c          |  2 ++
>  target/i386/cpu.c          | 61 +++++++++++++++++++-------------------
>  target/loongarch/cpu.c     |  2 ++
>  target/m68k/cpu.c          |  2 ++
>  target/microblaze/cpu.c    |  2 ++
>  target/mips/cpu.c          |  2 ++
>  target/openrisc/cpu.c      |  2 ++
>  target/ppc/cpu_init.c      |  2 ++
>  target/riscv/cpu.c         |  2 ++
>  target/rx/cpu.c            |  2 ++
>  target/s390x/cpu.c         |  2 ++
>  target/sh4/cpu.c           |  2 ++
>  target/sparc/cpu.c         |  2 ++
>  target/tricore/cpu.c       |  2 ++
>  target/xtensa/cpu.c        |  2 ++
>  23 files changed, 85 insertions(+), 32 deletions(-)
> 

I have some doubts about the necessity of changing the initialization of
nr_cores/nr_threads, because you can access the machine's topology info
via machine_topo_get_threads_per_socket(), which gives the same result as
`nr_cores * nr_threads`.

Especially, the TDX feature check hook is also within the context of
`current_machine`, so why not check if TDX's HT is consistent with QEMU's
emulation in the TDX hook?

For this reason, and based on my comment on patch 2, I think checking HT
in the TDX hook or even ignoring HT, would be a more straightforward and
less impactful solution.

-Zhao



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 2/4] i386/cpu: Set up CPUID_HT in x86_cpu_expand_features() instead of cpu_x86_cpuid()
  2024-12-05  7:19   ` Zhao Liu
@ 2024-12-05  7:54     ` Xiaoyao Li
  2024-12-05  8:31       ` Zhao Liu
  0 siblings, 1 reply; 26+ messages in thread
From: Xiaoyao Li @ 2024-12-05  7:54 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Paolo Bonzini, Peter Maydell, Michael Rolnik, Brian Cain,
	Song Gao, Laurent Vivier, Edgar E. Iglesias, Aurelien Jarno,
	Palmer Dabbelt, Alistair Francis, Bin Meng, Thomas Huth,
	David Hildenbrand, Mark Cave-Ayland, Artyom Tarasenko,
	Bastian Koppelmann, Max Filippov, qemu-devel

On 12/5/2024 3:19 PM, Zhao Liu wrote:
> Hi Xiaoyao,
> 
> Sorry for late reply.
> 
>> @@ -7490,6 +7489,7 @@ static void x86_cpu_enable_xsave_components(X86CPU *cpu)
>>   void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
>>   {
>>       CPUX86State *env = &cpu->env;
>> +    CPUState *cs = CPU(cpu);
>>       FeatureWord w;
>>       int i;
>>       GList *l;
>> @@ -7531,6 +7531,10 @@ void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
>>           }
>>       }
>>   
>> +    if (cs->nr_cores * cs->nr_threads > 1) {
>> +        env->features[FEAT_1_EDX] |= CPUID_HT;
>> +    }
>> +
> 
> We shouldn't place any CLI-configurable features here,
> especially after expanding plus_features and minus_features.

yah, it needs to be placed before manipulation of plus_features and 
minus_features.

> HT has been made configurable since the commit 83629b1 ("target/i386/
> cpu: Fix CPUID_HT exposure"), so if you want palce HT here, you
> should make it un-configurable first.

No, commit 83629b1 doesn't make HT configurable but fix the warning of

    warning: host doesn't support requested feature: CPUID.01H:EDX.ht 
[bit 28]

when "-cpu *,+ht"

> Regarding commit 83629b1, in what cases do we need to actively set HT?

when users want to do so. QEMU allows users to so do.

> That commit even introduces more issues. Ideally, the hardware being
> emulated by setting or masking feature bits should be feature-consistent.
> 
> However, "-cpu *,-ht -smp 2" does not remove the HT flag (which is
> unexpected), and "-cpu *,+ht -smp 1" forcibly sets HT (which results in
> buggy emulation). :(

For the case "-cpu *,-ht -smp 2" we can add some warn like what for AMD:

     if (IS_AMD_CPU(env) &&
         !(env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_TOPOEXT) &&
         cs->nr_threads > 1) {
             warn_report_once("This family of AMD CPU doesn't support "
                              "hyperthreading(%d). Please configure -smp "
                              "options properly or try enabling topoext "
                              "feature.", cs->nr_threads);
     }

for the case of "-cpu *,+ht, -smp 1", we can add a dependency between 
"HT" and "smp > 1", similar as feature_dependencies[]

> In fact, HT should not be freely configurable in hardware emulation;
> users should configure it in the BIOS.

How users configure it in the BIOS? Or do you mean the BIOS will 
set/clear it based on the actual (v)cpus get activated? Any reference to 
teh BIOS spec?

> 
>>       for (i = 0; i < ARRAY_SIZE(feature_dependencies); i++) {
>>           FeatureDep *d = &feature_dependencies[i];
>>           if (!(env->features[d->from.index] & d->from.mask)) {
>> -- 
>> 2.34.1
>>
>>



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Initialize nr_cores and nr_threads early and related clearup
  2024-12-05  7:30 ` Zhao Liu
@ 2024-12-05  8:05   ` Xiaoyao Li
  2024-12-05  8:48     ` Zhao Liu
  0 siblings, 1 reply; 26+ messages in thread
From: Xiaoyao Li @ 2024-12-05  8:05 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Paolo Bonzini, Peter Maydell, Michael Rolnik, Brian Cain,
	Song Gao, Laurent Vivier, Edgar E. Iglesias, Aurelien Jarno,
	Palmer Dabbelt, Alistair Francis, Bin Meng, Thomas Huth,
	David Hildenbrand, Mark Cave-Ayland, Artyom Tarasenko,
	Bastian Koppelmann, Max Filippov, qemu-devel

On 12/5/2024 3:30 PM, Zhao Liu wrote:
> I'm also very sorry, but I have a slightly different opinion...
> 
>>   accel/tcg/user-exec-stub.c |  4 +++
>>   hw/core/cpu-common.c       |  2 +-
>>   include/hw/core/cpu.h      |  8 +++++
>>   system/cpus.c              |  6 +++-
>>   target/alpha/cpu.c         |  2 ++
>>   target/arm/cpu.c           |  2 ++
>>   target/avr/cpu.c           |  2 ++
>>   target/hexagon/cpu.c       |  2 ++
>>   target/hppa/cpu.c          |  2 ++
>>   target/i386/cpu.c          | 61 +++++++++++++++++++-------------------
>>   target/loongarch/cpu.c     |  2 ++
>>   target/m68k/cpu.c          |  2 ++
>>   target/microblaze/cpu.c    |  2 ++
>>   target/mips/cpu.c          |  2 ++
>>   target/openrisc/cpu.c      |  2 ++
>>   target/ppc/cpu_init.c      |  2 ++
>>   target/riscv/cpu.c         |  2 ++
>>   target/rx/cpu.c            |  2 ++
>>   target/s390x/cpu.c         |  2 ++
>>   target/sh4/cpu.c           |  2 ++
>>   target/sparc/cpu.c         |  2 ++
>>   target/tricore/cpu.c       |  2 ++
>>   target/xtensa/cpu.c        |  2 ++
>>   23 files changed, 85 insertions(+), 32 deletions(-)
>>
> 
> I have some doubts about the necessity of changing the initialization of
> nr_cores/nr_threads, because you can access the machine's topology info
> via machine_topo_get_threads_per_socket(), which gives the same result as
> `nr_cores * nr_threads`.

yeah, it works. The goal is to maintain HT in env->features[].

The problem is, as Igor objects, accessing MachineState from CPU 
context. This is what I'm working on to avoid for the next version.

> Especially, the TDX feature check hook is also within the context of
> `current_machine`, so why not check if TDX's HT is consistent with QEMU's
> emulation in the TDX hook?
> 
> For this reason, and based on my comment on patch 2, I think checking HT
> in the TDX hook or even ignoring HT, would be a more straightforward and
> less impactful solution.

Though it's motivated by TDX series, the goal is not TDX specific. The 
goal is to track features in env->features[] instead of manually 
generating a one-off value in cpu_x86_cpuid().

> -Zhao
> 



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 2/4] i386/cpu: Set up CPUID_HT in x86_cpu_expand_features() instead of cpu_x86_cpuid()
  2024-12-05  7:54     ` Xiaoyao Li
@ 2024-12-05  8:31       ` Zhao Liu
  2024-12-05  8:34         ` Xiaoyao Li
  0 siblings, 1 reply; 26+ messages in thread
From: Zhao Liu @ 2024-12-05  8:31 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Peter Maydell, Michael Rolnik, Brian Cain,
	Song Gao, Laurent Vivier, Edgar E. Iglesias, Aurelien Jarno,
	Palmer Dabbelt, Alistair Francis, Bin Meng, Thomas Huth,
	David Hildenbrand, Mark Cave-Ayland, Artyom Tarasenko,
	Bastian Koppelmann, Max Filippov, qemu-devel

Hi Xiaoyao,

Thanks for your reply!

On Thu, Dec 05, 2024 at 03:54:36PM +0800, Xiaoyao Li wrote:
> Date: Thu, 5 Dec 2024 15:54:36 +0800
> From: Xiaoyao Li <xiaoyao.li@intel.com>
> Subject: Re: [PATCH v1 2/4] i386/cpu: Set up CPUID_HT in
>  x86_cpu_expand_features() instead of cpu_x86_cpuid()
> 
> On 12/5/2024 3:19 PM, Zhao Liu wrote:
> > Hi Xiaoyao,
> > 
> > Sorry for late reply.
> > 
> > > @@ -7490,6 +7489,7 @@ static void x86_cpu_enable_xsave_components(X86CPU *cpu)
> > >   void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
> > >   {
> > >       CPUX86State *env = &cpu->env;
> > > +    CPUState *cs = CPU(cpu);
> > >       FeatureWord w;
> > >       int i;
> > >       GList *l;
> > > @@ -7531,6 +7531,10 @@ void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
> > >           }
> > >       }
> > > +    if (cs->nr_cores * cs->nr_threads > 1) {
> > > +        env->features[FEAT_1_EDX] |= CPUID_HT;
> > > +    }
> > > +
> > 
> > We shouldn't place any CLI-configurable features here,
> > especially after expanding plus_features and minus_features.
> 
> yah, it needs to be placed before manipulation of plus_features and
> minus_features.

Please refer my comment at cover letter, you should do such thing in TDX
context.

> > HT has been made configurable since the commit 83629b1 ("target/i386/
> > cpu: Fix CPUID_HT exposure"), so if you want palce HT here, you
> > should make it un-configurable first.
> 
> No, commit 83629b1 doesn't make HT configurable but fix the warning of
> 
>    warning: host doesn't support requested feature: CPUID.01H:EDX.ht [bit
> 28]
> 
> when "-cpu *,+ht"
>
> > Regarding commit 83629b1, in what cases do we need to actively set HT?
> 
> when users want to do so. QEMU allows users to so do.

You haven't told the exact case you are fixing with HT.

HT is inherently tied to the topology, and custom modifications to HT
has already broken this. And I think we shouldn't go any further.

> > That commit even introduces more issues. Ideally, the hardware being
> > emulated by setting or masking feature bits should be feature-consistent.
> > 
> > However, "-cpu *,-ht -smp 2" does not remove the HT flag (which is
> > unexpected), and "-cpu *,+ht -smp 1" forcibly sets HT (which results in
> > buggy emulation). :(
> 
> For the case "-cpu *,-ht -smp 2" we can add some warn like what for AMD:
> 
>     if (IS_AMD_CPU(env) &&
>         !(env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_TOPOEXT) &&
>         cs->nr_threads > 1) {
>             warn_report_once("This family of AMD CPU doesn't support "
>                              "hyperthreading(%d). Please configure -smp "
>                              "options properly or try enabling topoext "
>                              "feature.", cs->nr_threads);
>     }
> 
> for the case of "-cpu *,+ht, -smp 1", we can add a dependency between "HT"
> and "smp > 1", similar as feature_dependencies[]

So I think the 83629b1 just masked the issue, a thorough fix should be
to avoid CLI configuration.

> > In fact, HT should not be freely configurable in hardware emulation;
> > users should configure it in the BIOS.
> 
> How users configure it in the BIOS? Or do you mean the BIOS will set/clear
> it based on the actual (v)cpus get activated? Any reference to teh BIOS
> spec?

Sorry, I think we should focus more on this issue. Such rhetorical
questions are not very helpful...

Thanks,
Zhao



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 2/4] i386/cpu: Set up CPUID_HT in x86_cpu_expand_features() instead of cpu_x86_cpuid()
  2024-12-05  8:31       ` Zhao Liu
@ 2024-12-05  8:34         ` Xiaoyao Li
  0 siblings, 0 replies; 26+ messages in thread
From: Xiaoyao Li @ 2024-12-05  8:34 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Paolo Bonzini, Peter Maydell, Michael Rolnik, Brian Cain,
	Song Gao, Laurent Vivier, Edgar E. Iglesias, Aurelien Jarno,
	Palmer Dabbelt, Alistair Francis, Bin Meng, Thomas Huth,
	David Hildenbrand, Mark Cave-Ayland, Artyom Tarasenko,
	Bastian Koppelmann, Max Filippov, qemu-devel

On 12/5/2024 4:31 PM, Zhao Liu wrote:
> Hi Xiaoyao,
> 
> Thanks for your reply!
> 
> On Thu, Dec 05, 2024 at 03:54:36PM +0800, Xiaoyao Li wrote:
>> Date: Thu, 5 Dec 2024 15:54:36 +0800
>> From: Xiaoyao Li <xiaoyao.li@intel.com>
>> Subject: Re: [PATCH v1 2/4] i386/cpu: Set up CPUID_HT in
>>   x86_cpu_expand_features() instead of cpu_x86_cpuid()
>>
>> On 12/5/2024 3:19 PM, Zhao Liu wrote:
>>> Hi Xiaoyao,
>>>
>>> Sorry for late reply.
>>>
>>>> @@ -7490,6 +7489,7 @@ static void x86_cpu_enable_xsave_components(X86CPU *cpu)
>>>>    void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
>>>>    {
>>>>        CPUX86State *env = &cpu->env;
>>>> +    CPUState *cs = CPU(cpu);
>>>>        FeatureWord w;
>>>>        int i;
>>>>        GList *l;
>>>> @@ -7531,6 +7531,10 @@ void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
>>>>            }
>>>>        }
>>>> +    if (cs->nr_cores * cs->nr_threads > 1) {
>>>> +        env->features[FEAT_1_EDX] |= CPUID_HT;
>>>> +    }
>>>> +
>>>
>>> We shouldn't place any CLI-configurable features here,
>>> especially after expanding plus_features and minus_features.
>>
>> yah, it needs to be placed before manipulation of plus_features and
>> minus_features.
> 
> Please refer my comment at cover letter, you should do such thing in TDX
> context.

TDX is one of the reasons for this patch, but not the whole reason.

>>> HT has been made configurable since the commit 83629b1 ("target/i386/
>>> cpu: Fix CPUID_HT exposure"), so if you want palce HT here, you
>>> should make it un-configurable first.
>>
>> No, commit 83629b1 doesn't make HT configurable but fix the warning of
>>
>>     warning: host doesn't support requested feature: CPUID.01H:EDX.ht [bit
>> 28]
>>
>> when "-cpu *,+ht"
>>
>>> Regarding commit 83629b1, in what cases do we need to actively set HT?
>>
>> when users want to do so. QEMU allows users to so do.
> 
> You haven't told the exact case you are fixing with HT.

just when users try to boot a VM with "-cpu *,+ht -smp n" where n > 1.

This is a valid user configuration. Before commit 83629b1, it got 
warning, and commit 83629b1 tried to fix the warning.

> HT is inherently tied to the topology, and custom modifications to HT
> has already broken this. And I think we shouldn't go any further.

I don't object on this direction.

But it has nothing to do with this patch. This patch is trying to track 
HT in env->features[].

>>> That commit even introduces more issues. Ideally, the hardware being
>>> emulated by setting or masking feature bits should be feature-consistent.
>>>
>>> However, "-cpu *,-ht -smp 2" does not remove the HT flag (which is
>>> unexpected), and "-cpu *,+ht -smp 1" forcibly sets HT (which results in
>>> buggy emulation). :(
>>
>> For the case "-cpu *,-ht -smp 2" we can add some warn like what for AMD:
>>
>>      if (IS_AMD_CPU(env) &&
>>          !(env->features[FEAT_8000_0001_ECX] & CPUID_EXT3_TOPOEXT) &&
>>          cs->nr_threads > 1) {
>>              warn_report_once("This family of AMD CPU doesn't support "
>>                               "hyperthreading(%d). Please configure -smp "
>>                               "options properly or try enabling topoext "
>>                               "feature.", cs->nr_threads);
>>      }
>>
>> for the case of "-cpu *,+ht, -smp 1", we can add a dependency between "HT"
>> and "smp > 1", similar as feature_dependencies[]
> 
> So I think the 83629b1 just masked the issue, a thorough fix should be
> to avoid CLI configuration.

Again, I don't object on this direction. But it's another thing against 
this patch.

>>> In fact, HT should not be freely configurable in hardware emulation;
>>> users should configure it in the BIOS.
>>
>> How users configure it in the BIOS? Or do you mean the BIOS will set/clear
>> it based on the actual (v)cpus get activated? Any reference to teh BIOS
>> spec?
> 
> Sorry, I think we should focus more on this issue. Such rhetorical
> questions are not very helpful...

Sorry? it is you said "users should configure it in the BIOS". I was 
curious on it

> Thanks,
> Zhao
> 



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Initialize nr_cores and nr_threads early and related clearup
  2024-12-05  8:05   ` Xiaoyao Li
@ 2024-12-05  8:48     ` Zhao Liu
  2024-12-05  8:50       ` Xiaoyao Li
  0 siblings, 1 reply; 26+ messages in thread
From: Zhao Liu @ 2024-12-05  8:48 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Peter Maydell, Michael Rolnik, Brian Cain,
	Song Gao, Laurent Vivier, Edgar E. Iglesias, Aurelien Jarno,
	Palmer Dabbelt, Alistair Francis, Bin Meng, Thomas Huth,
	David Hildenbrand, Mark Cave-Ayland, Artyom Tarasenko,
	Bastian Koppelmann, Max Filippov, qemu-devel

On Thu, Dec 05, 2024 at 04:05:22PM +0800, Xiaoyao Li wrote:
> Date: Thu, 5 Dec 2024 16:05:22 +0800
> From: Xiaoyao Li <xiaoyao.li@intel.com>
> Subject: Re: [PATCH v1 0/4] Initialize nr_cores and nr_threads early and
>  related clearup
> 
> On 12/5/2024 3:30 PM, Zhao Liu wrote:
> > I'm also very sorry, but I have a slightly different opinion...
> > 
> > >   accel/tcg/user-exec-stub.c |  4 +++
> > >   hw/core/cpu-common.c       |  2 +-
> > >   include/hw/core/cpu.h      |  8 +++++
> > >   system/cpus.c              |  6 +++-
> > >   target/alpha/cpu.c         |  2 ++
> > >   target/arm/cpu.c           |  2 ++
> > >   target/avr/cpu.c           |  2 ++
> > >   target/hexagon/cpu.c       |  2 ++
> > >   target/hppa/cpu.c          |  2 ++
> > >   target/i386/cpu.c          | 61 +++++++++++++++++++-------------------
> > >   target/loongarch/cpu.c     |  2 ++
> > >   target/m68k/cpu.c          |  2 ++
> > >   target/microblaze/cpu.c    |  2 ++
> > >   target/mips/cpu.c          |  2 ++
> > >   target/openrisc/cpu.c      |  2 ++
> > >   target/ppc/cpu_init.c      |  2 ++
> > >   target/riscv/cpu.c         |  2 ++
> > >   target/rx/cpu.c            |  2 ++
> > >   target/s390x/cpu.c         |  2 ++
> > >   target/sh4/cpu.c           |  2 ++
> > >   target/sparc/cpu.c         |  2 ++
> > >   target/tricore/cpu.c       |  2 ++
> > >   target/xtensa/cpu.c        |  2 ++
> > >   23 files changed, 85 insertions(+), 32 deletions(-)
> > > 
> > 
> > I have some doubts about the necessity of changing the initialization of
> > nr_cores/nr_threads, because you can access the machine's topology info
> > via machine_topo_get_threads_per_socket(), which gives the same result as
> > `nr_cores * nr_threads`.
> 
> yeah, it works. The goal is to maintain HT in env->features[].
> 
> The problem is, as Igor objects, accessing MachineState from CPU context.
> This is what I'm working on to avoid for the next version.

:( I also noticed that patch. Adding #ifndef CONFIG_USER_ONLY is not
optimal there, so that approach is not general enough.

But your TDX case is different, TDX hook has been already under
`current_machine` context. So you can directly access any topo info
without CONFIG_USER_ONLY.

> > Especially, the TDX feature check hook is also within the context of
> > `current_machine`, so why not check if TDX's HT is consistent with QEMU's
> > emulation in the TDX hook?
> > 
> > For this reason, and based on my comment on patch 2, I think checking HT
> > in the TDX hook or even ignoring HT, would be a more straightforward and
> > less impactful solution.
> 
> Though it's motivated by TDX series, the goal is not TDX specific. The goal
> is to track features in env->features[] instead of manually generating a
> one-off value in cpu_x86_cpuid().

I agree, the principle of designing code should be like this. HT already
has some issues, and adding this check in x86_cpu_expand_features() is
also a one-off approach with very much effort. Checking HT in TDX is a
less costly and less impactful one-off solution.

Even, why not ignore HT in the TDX check? Isn't placing the TDX feature
check before build_cpuid() implicitly assuming that any subsequent
updates to CPUID by QEMU are valid?

Based on this, why can't TDX trust that QEMU will set HT correctly?
(However, this assumes that HT cannot be arbitrarily set by the user.)

I'm not sure how the next version will be, but it would definitely be
better if it helps with generalization.

Based on the current approach, affecting all architectures, it would be
better to directly obtain the topology information from current_machine
in TDX.

-Zhao




^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v1 0/4] Initialize nr_cores and nr_threads early and related clearup
  2024-12-05  8:48     ` Zhao Liu
@ 2024-12-05  8:50       ` Xiaoyao Li
  0 siblings, 0 replies; 26+ messages in thread
From: Xiaoyao Li @ 2024-12-05  8:50 UTC (permalink / raw)
  To: Zhao Liu
  Cc: Paolo Bonzini, Peter Maydell, Michael Rolnik, Brian Cain,
	Song Gao, Laurent Vivier, Edgar E. Iglesias, Aurelien Jarno,
	Palmer Dabbelt, Alistair Francis, Bin Meng, Thomas Huth,
	David Hildenbrand, Mark Cave-Ayland, Artyom Tarasenko,
	Bastian Koppelmann, Max Filippov, qemu-devel

On 12/5/2024 4:48 PM, Zhao Liu wrote:
> On Thu, Dec 05, 2024 at 04:05:22PM +0800, Xiaoyao Li wrote:
>> Date: Thu, 5 Dec 2024 16:05:22 +0800
>> From: Xiaoyao Li <xiaoyao.li@intel.com>
>> Subject: Re: [PATCH v1 0/4] Initialize nr_cores and nr_threads early and
>>   related clearup
>>
>> On 12/5/2024 3:30 PM, Zhao Liu wrote:
>>> I'm also very sorry, but I have a slightly different opinion...
>>>
>>>>    accel/tcg/user-exec-stub.c |  4 +++
>>>>    hw/core/cpu-common.c       |  2 +-
>>>>    include/hw/core/cpu.h      |  8 +++++
>>>>    system/cpus.c              |  6 +++-
>>>>    target/alpha/cpu.c         |  2 ++
>>>>    target/arm/cpu.c           |  2 ++
>>>>    target/avr/cpu.c           |  2 ++
>>>>    target/hexagon/cpu.c       |  2 ++
>>>>    target/hppa/cpu.c          |  2 ++
>>>>    target/i386/cpu.c          | 61 +++++++++++++++++++-------------------
>>>>    target/loongarch/cpu.c     |  2 ++
>>>>    target/m68k/cpu.c          |  2 ++
>>>>    target/microblaze/cpu.c    |  2 ++
>>>>    target/mips/cpu.c          |  2 ++
>>>>    target/openrisc/cpu.c      |  2 ++
>>>>    target/ppc/cpu_init.c      |  2 ++
>>>>    target/riscv/cpu.c         |  2 ++
>>>>    target/rx/cpu.c            |  2 ++
>>>>    target/s390x/cpu.c         |  2 ++
>>>>    target/sh4/cpu.c           |  2 ++
>>>>    target/sparc/cpu.c         |  2 ++
>>>>    target/tricore/cpu.c       |  2 ++
>>>>    target/xtensa/cpu.c        |  2 ++
>>>>    23 files changed, 85 insertions(+), 32 deletions(-)
>>>>
>>>
>>> I have some doubts about the necessity of changing the initialization of
>>> nr_cores/nr_threads, because you can access the machine's topology info
>>> via machine_topo_get_threads_per_socket(), which gives the same result as
>>> `nr_cores * nr_threads`.
>>
>> yeah, it works. The goal is to maintain HT in env->features[].
>>
>> The problem is, as Igor objects, accessing MachineState from CPU context.
>> This is what I'm working on to avoid for the next version.
> 
> :( I also noticed that patch. Adding #ifndef CONFIG_USER_ONLY is not
> optimal there, so that approach is not general enough.
> 
> But your TDX case is different, TDX hook has been already under
> `current_machine` context. So you can directly access any topo info
> without CONFIG_USER_ONLY.
> 
>>> Especially, the TDX feature check hook is also within the context of
>>> `current_machine`, so why not check if TDX's HT is consistent with QEMU's
>>> emulation in the TDX hook?
>>>
>>> For this reason, and based on my comment on patch 2, I think checking HT
>>> in the TDX hook or even ignoring HT, would be a more straightforward and
>>> less impactful solution.
>>
>> Though it's motivated by TDX series, the goal is not TDX specific. The goal
>> is to track features in env->features[] instead of manually generating a
>> one-off value in cpu_x86_cpuid().
> 
> I agree, the principle of designing code should be like this. HT already
> has some issues, and adding this check in x86_cpu_expand_features() is
> also a one-off approach with very much effort. Checking HT in TDX is a
> less costly and less impactful one-off solution.

Please disconnect it from TDX. Even without TDX, it's reasonable enough 
to make the change to track HT in env->features[].

> Even, why not ignore HT in the TDX check? Isn't placing the TDX feature
> check before build_cpuid() implicitly assuming that any subsequent
> updates to CPUID by QEMU are valid?

Why choose to ignore HT in the TDX check? HT isn't special. The problem 
is just HT is not tracked in env->features[], which is a incorrect design.

It assumes the env->features[] is finalized.

> Based on this, why can't TDX trust that QEMU will set HT correctly?
> (However, this assumes that HT cannot be arbitrarily set by the user.)

It's not TDX can't trust QEMU. It's all QEMU's business that QEMU to 
ensure what it is going to set for TD guest is consistent with what TD 
guest sees.

You are suggesting that when QEMU checks the features, it makes HT a 
special case that HT's value is finalized later than the check so please 
ignore HT.

What I want to achieve is to get rid of the special case, because HT 
isn't special. It can be tracked in env->features[] and finalized earlier.

> I'm not sure how the next version will be, but it would definitely be
> better if it helps with generalization.

The goal of this series is exactly generalization to make HT not special.

> Based on the current approach, affecting all architectures, it would be
> better to directly obtain the topology information from current_machine
> in TDX.

Imagine why Paolo suggested to post it separately? IIUC, because it's 
not TDX specific thing.

> -Zhao
> 
> 



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] cpu: Initialize nr_cores and nr_threads in cpu_common_initfn()
  2024-11-25  9:38     ` Igor Mammedov
  2024-11-29  7:12       ` Xiaoyao Li
@ 2024-12-05 11:53       ` Xiaoyao Li
  1 sibling, 0 replies; 26+ messages in thread
From: Xiaoyao Li @ 2024-12-05 11:53 UTC (permalink / raw)
  To: Igor Mammedov; +Cc: David Hildenbrand, Paolo Bonzini, qemu-devel

On 11/25/2024 5:38 PM, Igor Mammedov wrote:
> On Fri, 22 Nov 2024 11:03:17 -0500
> Xiaoyao Li <xiaoyao.li@intel.com> wrote:
> 
>> Currently cpu->nr_cores and cpu->nr_threads are initialized in
>> qemu_init_vcpu(), which is called a bit late in *cpu_realizefn() for
>> each ARCHes.
>>
>> x86 arch would like to use nr_cores and nr_threads earlier in its
>> realizefn(). To serve this purpose, initialize nr_cores and nr_threads
>> in cpu_common_initfn(), which is earlier than *cpu_realizefn().
>>
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>> ---
>>   hw/core/cpu-common.c | 10 +++++++++-
>>   system/cpus.c        |  4 ----
>>   2 files changed, 9 insertions(+), 5 deletions(-)
>>
>> diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c
>> index 09c79035949b..6de92ed854bc 100644
>> --- a/hw/core/cpu-common.c
>> +++ b/hw/core/cpu-common.c
>> @@ -237,14 +237,22 @@ static void cpu_common_unrealizefn(DeviceState *dev)
>>   static void cpu_common_initfn(Object *obj)
>>   {
>>       CPUState *cpu = CPU(obj);
>> +    Object *machine = qdev_get_machine();
>> +    MachineState *ms;
>>   
>>       gdb_init_cpu(cpu);
>>       cpu->cpu_index = UNASSIGNED_CPU_INDEX;
>>       cpu->cluster_index = UNASSIGNED_CLUSTER_INDEX;
>>       /* user-mode doesn't have configurable SMP topology */
>> -    /* the default value is changed by qemu_init_vcpu() for system-mode */
>>       cpu->nr_cores = 1;
>>       cpu->nr_threads = 1;
>> +#ifndef CONFIG_USER_ONLY
>> +    if (object_dynamic_cast(machine, TYPE_MACHINE)) {
>> +        ms = MACHINE(machine);
>> +        cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
>> +        cpu->nr_threads = ms->smp.threads;
>> +    }
>> +#endif
> 
> Can't say, that I'm fond of adding/moving hack to access MachineState
> from CPU context. Granted we did/still do it elsewhere, But I'd rather
> prefer getting rid of those remnants that access globals.
> It's basically technical debt we are carrying since 2009 (dc6b1c09849).
> Moving that around doesn't help with getting rid of arbitrary access to globals.
> 
> As Paolo've noted there are other ways to set cores/threads,
> albeit at expense of adding more code. And that could be fine
> if it's done within expected cpu initialization flow.
> 
> Instead of accessing MachineState directly from CPU code (which is
> essentially a layer violation), I'd suggest to set cores_nr/threads_nr
> from pre_plug handler (which is machine code).
> We do similar thing for nr_dies/nr_modules already, and we should do
> same for cores/trheads.
> 
> Quick hack would be do the same for cores/threads in x86_cpu_pre_plug(),
> and make qemu_init_vcpu() conditional to avoid touching other targets/machines.
> 
> I'd even ack that, however that's just leaves us with the same
> old technical debt. So I'd like to coax a promise to fix it properly
> (i.e. get rid of access to machine from CPU code).
> 
> (here goes typical ask to rewrite whole QEMU before doing thing you
> actually need)
> 
> To do that we would need to:
>    1. audit usage of cpu->nr_cores/cpu->nr_threads across QEMU, to figure out
>       what targets/machines need them
>    2. then add pre_plug() handlers to those machines to set them.
>    3. In the end get rid of initializing them in cpu_common_initfn().

here is the update:

For cpu->nr_cores, it's only used by x86 ARCH. We can remove it and 
maintain one for x86 separately.

For cpu->nr_threads, besides x86, it's also used by

1) hw/mips/malta.c

     env->mvp->CP0_MVPConf0 = deposit32(env->mvp->CP0_MVPConf0,
                                        CP0MVPC0_PTC, 8,
                                        smp_cpus * cs->nr_threads - 1);

2) target/mips/tcg/sysemu

     vpe_idx = tc_idx / cs->nr_threads;
     *tc = tc_idx % cs->nr_threads;

3) target/ppc/compat.c

     int n_threads = CPU(cpu)->nr_threads;

There are no existing CPU pre_plug() function for above cases, and I 
don't know how to add it because I know nothing about MIPS/PPC at all.

If desire is still to go with direction, I need someone else to help 
MIPS/PPC. Or is it OK that only change the X86 implementation to 
initialize cpu->nr_threads earlier in  x86_cpu_pre_plug() and leave 
other ARCHes as todo?
	

> With that done we can then add a common helper to generalize topo config
> based on -smp from pre_plug() handlers to reduce duplication caused by
> per machine pre_plug handlers.
> 
> Or introduce a generic cpu_pre_plug() handler at Machine level and make
> _pre_plug call chain to call it (sort of what we do with nested realize calls);
> 
> I'd prefer the 1st option (#2) as it explicitly documents what
> targets/machines care about cores/threads at expense of some boiler plate code
> duplication, instead of blanket generic fallback like we have now (regardless of
> if it's actually needed).
> 
>>       cpu->cflags_next_tb = -1;
>>   
>>       /* allocate storage for thread info, initialise condition variables */
>> diff --git a/system/cpus.c b/system/cpus.c
>> index 1c818ff6828c..c1547fbfd39b 100644
>> --- a/system/cpus.c
>> +++ b/system/cpus.c
>> @@ -664,10 +664,6 @@ const AccelOpsClass *cpus_get_accel(void)
>>   
>>   void qemu_init_vcpu(CPUState *cpu)
>>   {
>> -    MachineState *ms = MACHINE(qdev_get_machine());
>> -
>> -    cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
>> -    cpu->nr_threads =  ms->smp.threads;
>>       cpu->stopped = true;
>>       cpu->random_seed = qemu_guest_random_seed_thread_part1();
>>   
> 



^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2024-12-05 11:54 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-08  7:06 [PATCH v1 0/4] Initialize nr_cores and nr_threads early and related clearup Xiaoyao Li
2024-11-08  7:06 ` [PATCH v1 1/4] cpu: Introduce qemu_early_init_vcpu() to initialize nr_cores and nr_threads inside it Xiaoyao Li
2024-11-22 16:03   ` [PATCH] cpu: Initialize nr_cores and nr_threads in cpu_common_initfn() Xiaoyao Li
2024-11-22 17:26     ` Philippe Mathieu-Daudé
2024-11-22 19:17       ` Richard Henderson
2024-11-25  9:38     ` Igor Mammedov
2024-11-29  7:12       ` Xiaoyao Li
2024-12-05 11:53       ` Xiaoyao Li
2024-11-08  7:06 ` [PATCH v1 2/4] i386/cpu: Set up CPUID_HT in x86_cpu_expand_features() instead of cpu_x86_cpuid() Xiaoyao Li
2024-12-05  7:19   ` Zhao Liu
2024-12-05  7:54     ` Xiaoyao Li
2024-12-05  8:31       ` Zhao Liu
2024-12-05  8:34         ` Xiaoyao Li
2024-11-08  7:06 ` [PATCH v1 3/4] i386/cpu: Set and track CPUID_EXT3_CMP_LEG in env->features[FEAT_8000_0001_ECX] Xiaoyao Li
2024-11-08  7:06 ` [PATCH v1 4/4] i386/cpu: Rectify the comment on order dependency on qemu_init_vcpu() Xiaoyao Li
2024-11-11 10:49 ` [PATCH v1 0/4] Initialize nr_cores and nr_threads early and related clearup David Hildenbrand
2024-11-21 16:24   ` Xiaoyao Li
2024-11-21 17:39     ` Philippe Mathieu-Daudé
2024-11-21 18:52     ` Paolo Bonzini
2024-11-22  2:40       ` Xiaoyao Li
2024-11-22  9:44         ` David Hildenbrand
2024-11-22  9:53           ` Paolo Bonzini
2024-12-05  7:30 ` Zhao Liu
2024-12-05  8:05   ` Xiaoyao Li
2024-12-05  8:48     ` Zhao Liu
2024-12-05  8:50       ` Xiaoyao Li

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).