qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/5] spapr: fix VCPU ids miscalculation
@ 2018-02-14 19:40 Greg Kurz
  2018-02-14 19:40 ` [Qemu-devel] [PATCH 1/5] spapr: use spapr->vsmt to compute VCPU ids Greg Kurz
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Greg Kurz @ 2018-02-14 19:40 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-ppc, David Gibson, Sam Bobroff, Laurent Vivier

After the review of Laurent's patch "spapr: set vsmt to MAX(8, smp_threads)",
I realized that most of the machine code that deals with VCPU ids assume the
spacing between cores is kvmppc_smt_threads() instead of spapr->vsmt. Most
of the time this is ok because both are equals, and QEMU used to exit if
it failed to set the VSMT mode into KVM.

This changed with the recent commit 8904e5a75005 "spapr: Adjust default
VSMT value for better migration compatibility", and really hurts setups
where kvmppc_smt_threads() and spapr->vsmt can have a different value:
older KVMs or even a recent KVM on a POWER8 host with subcores. The most
notable effects are DRCs associated to wrong VCPUs, which makes CPU
hot-plug completely unusable.

The fix is as simple as using spapr->vsmt everywhere. This is patch 1.
The rest of the series is an effort to consolidate the numbering logic
in a single place. This will avoid future breakage if the VCPU id logic
needs to be changed again later. Changes in patches 2 to 4 are rather
mechanical. Patch 5 is a bit more controversial as it breaks the
migration of unusual CPU topologies that used to be supported by
pre-2.7 machine types.

Please comment.

--
Greg

---

Greg Kurz (5):
      spapr: use spapr->vsmt to compute VCPU ids
      spapr: move VCPU calculation to core machine code
      spapr: rename spapr_vcpu_id() to spapr_get_vcpu_id()
      spapr: consolidate the VCPU id numbering logic in a single place
      spapr: drop DIV_ROUND_UP() from xics_max_server_number()


 hw/ppc/spapr.c          |   74 +++++++++++++++++++++++++++++++++--------------
 hw/ppc/spapr_cpu_core.c |    9 +-----
 include/hw/ppc/spapr.h  |    3 +-
 3 files changed, 56 insertions(+), 30 deletions(-)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 1/5] spapr: use spapr->vsmt to compute VCPU ids
  2018-02-14 19:40 [Qemu-devel] [PATCH 0/5] spapr: fix VCPU ids miscalculation Greg Kurz
@ 2018-02-14 19:40 ` Greg Kurz
  2018-02-15  3:42   ` David Gibson
  2018-02-14 19:40 ` [Qemu-devel] [PATCH 2/5] spapr: move VCPU calculation to core machine code Greg Kurz
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Greg Kurz @ 2018-02-14 19:40 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-ppc, David Gibson, Sam Bobroff, Laurent Vivier

Since the introduction of VSMT in 2.11, the spacing of VCPU ids
between cores is controllable through a machine property instead
of being only dictated by the SMT mode of the host:

    cpu->vcpu_id = (cc->core_id * spapr->vsmt / smp_threads) + i

Until recently, the machine code would try to change the SMT mode
of the host to be equal to VSMT or exit. This allowed the rest of
the code to assume that kvmppc_smt_threads() == spapr->vsmt is
always true.

Recent commit "8904e5a75005 spapr: Adjust default VSMT value for
better migration compatibility" relaxed the rule. If the VSMT
mode cannot be set in KVM for some reasons, but the requested
CPU topology is compatible with the current SMT mode, then we
let the guest run with  kvmppc_smt_threads() != spapr->vsmt.

This breaks quite a few places in the code, in particular when
calculating DRC indexes.

This is what happens on a POWER host with subcores-per-core=2 (ie,
supports up to SMT4) when passing the following topology:

    -smp threads=4,maxcpus=16 \
    -device host-spapr-cpu-core,core-id=4,id=core1 \
    -device host-spapr-cpu-core,core-id=8,id=core2

qemu-system-ppc64: warning: Failed to set KVM's VSMT mode to 8 (errno -22)

This is expected since KVM is limited to SMT4, but the guest is started
anyway because this topology can run on SMT4 even with a VSMT8 spacing.

But when we look at the DT, things get nastier:

cpus {
        ...
        ibm,drc-indexes = <0x4 0x10000000 0x10000004 0x10000008 0x1000000c>;

This means that we have the following association:

 CPU core device |     DRC    | VCPU id
-----------------+------------+---------
   boot core     | 0x10000000 | 0
   core1         | 0x10000004 | 4
   core2         | 0x10000008 | 8
   core3         | 0x1000000c | 12

But since the spacing of VCPU ids is 8, the DRC for core1 points to a
VCPU that doesn't exist, the DRC for core2 points to the first VCPU of
core1 and and so on...

        ...

        PowerPC,POWER8@0 {
                ...
                ibm,my-drc-index = <0x10000000>;
                ...
        };

        PowerPC,POWER8@8 {
                ...
                ibm,my-drc-index = <0x10000008>;
                ...
        };

        PowerPC,POWER8@10 {
                ...

No ibm,my-drc-index property for this core since 0x10000010 doesn't
exist in ibm,drc-indexes above.

                ...
        };
};

...

interrupt-controller {
        ...
        ibm,interrupt-server-ranges = <0x0 0x10>;

With a spacing of 8, the highest VCPU id for the given topology should be:
        16 * 8 / 4 = 32 and not 16

        ...
        linux,phandle = <0x7e7323b8>;
        interrupt-controller;
};

And CPU hot-plug/unplug is broken:

(qemu) device_del core1
pseries-hotplug-cpu: Cannot find CPU (drc index 10000004) to remove

(qemu) device_del core2
cpu 4 (hwid 8) Ready to die...
cpu 5 (hwid 9) Ready to die...
cpu 6 (hwid 10) Ready to die...
cpu 7 (hwid 11) Ready to die...

These are the VCPU ids of core1 actually

(qemu) device_add host-spapr-cpu-core,core-id=12,id=core3
(qemu) device_del core3
pseries-hotplug-cpu: Cannot find CPU (drc index 1000000c) to remove

This patches all the code in hw/ppc/spapr.c to assume the VSMT
spacing when manipulating VCPU ids.

Fixes: 8904e5a75005
Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/ppc/spapr.c |   24 ++++++++++--------------
 1 file changed, 10 insertions(+), 14 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 9f29434819bd..ea7429c92a97 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -160,9 +160,9 @@ static void pre_2_10_vmstate_unregister_dummy_icp(int i)
                        (void *)(uintptr_t) i);
 }
 
-static inline int xics_max_server_number(void)
+static int xics_max_server_number(sPAPRMachineState *spapr)
 {
-    return DIV_ROUND_UP(max_cpus * kvmppc_smt_threads(), smp_threads);
+    return DIV_ROUND_UP(max_cpus * spapr->vsmt, smp_threads);
 }
 
 static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
@@ -194,7 +194,7 @@ static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
     if (smc->pre_2_10_has_unused_icps) {
         int i;
 
-        for (i = 0; i < xics_max_server_number(); i++) {
+        for (i = 0; i < xics_max_server_number(spapr); i++) {
             /* Dummy entries get deregistered when real ICPState objects
              * are registered during CPU core hotplug.
              */
@@ -337,7 +337,6 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPRMachineState *spapr)
     int ret = 0, offset, cpus_offset;
     CPUState *cs;
     char cpu_model[32];
-    int smt = kvmppc_smt_threads();
     uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)};
 
     CPU_FOREACH(cs) {
@@ -346,7 +345,7 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPRMachineState *spapr)
         int index = spapr_vcpu_id(cpu);
         int compat_smt = MIN(smp_threads, ppc_compat_max_vthreads(cpu));
 
-        if ((index % smt) != 0) {
+        if (index % spapr->vsmt != 0) {
             continue;
         }
 
@@ -614,7 +613,6 @@ static void spapr_populate_cpus_dt_node(void *fdt, sPAPRMachineState *spapr)
     CPUState *cs;
     int cpus_offset;
     char *nodename;
-    int smt = kvmppc_smt_threads();
 
     cpus_offset = fdt_add_subnode(fdt, 0, "cpus");
     _FDT(cpus_offset);
@@ -632,7 +630,7 @@ static void spapr_populate_cpus_dt_node(void *fdt, sPAPRMachineState *spapr)
         DeviceClass *dc = DEVICE_GET_CLASS(cs);
         int offset;
 
-        if ((index % smt) != 0) {
+        if (index % spapr->vsmt != 0) {
             continue;
         }
 
@@ -1131,7 +1129,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
     _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
 
     /* /interrupt controller */
-    spapr_dt_xics(xics_max_server_number(), fdt, PHANDLE_XICP);
+    spapr_dt_xics(xics_max_server_number(spapr), fdt, PHANDLE_XICP);
 
     ret = spapr_populate_memory(spapr, fdt);
     if (ret < 0) {
@@ -2224,7 +2222,6 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
     MachineState *machine = MACHINE(spapr);
     MachineClass *mc = MACHINE_GET_CLASS(machine);
     const char *type = spapr_get_cpu_core_type(machine->cpu_type);
-    int smt = kvmppc_smt_threads();
     const CPUArchIdList *possible_cpus;
     int boot_cores_nr = smp_cpus / smp_threads;
     int i;
@@ -2254,7 +2251,7 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
 
         if (mc->has_hotpluggable_cpus) {
             spapr_dr_connector_new(OBJECT(spapr), TYPE_SPAPR_DRC_CPU,
-                                   (core_id / smp_threads) * smt);
+                                   (core_id / smp_threads) * spapr->vsmt);
         }
 
         if (i < boot_cores_nr) {
@@ -3281,10 +3278,10 @@ static
 void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
                                Error **errp)
 {
+    sPAPRMachineState *spapr = SPAPR_MACHINE(OBJECT(hotplug_dev));
     int index;
     sPAPRDRConnector *drc;
     CPUCore *cc = CPU_CORE(dev);
-    int smt = kvmppc_smt_threads();
 
     if (!spapr_find_cpu_slot(MACHINE(hotplug_dev), cc->core_id, &index)) {
         error_setg(errp, "Unable to find CPU core with core-id: %d",
@@ -3296,7 +3293,7 @@ void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
         return;
     }
 
-    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_CPU, index * smt);
+    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_CPU, index * spapr->vsmt);
     g_assert(drc);
 
     spapr_drc_detach(drc);
@@ -3315,7 +3312,6 @@ static void spapr_core_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
     CPUState *cs = CPU(core->threads[0]);
     sPAPRDRConnector *drc;
     Error *local_err = NULL;
-    int smt = kvmppc_smt_threads();
     CPUArchId *core_slot;
     int index;
     bool hotplugged = spapr_drc_hotplugged(dev);
@@ -3326,7 +3322,7 @@ static void spapr_core_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
                    cc->core_id);
         return;
     }
-    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_CPU, index * smt);
+    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_CPU, index * spapr->vsmt);
 
     g_assert(drc || !mc->has_hotpluggable_cpus);
 

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 2/5] spapr: move VCPU calculation to core machine code
  2018-02-14 19:40 [Qemu-devel] [PATCH 0/5] spapr: fix VCPU ids miscalculation Greg Kurz
  2018-02-14 19:40 ` [Qemu-devel] [PATCH 1/5] spapr: use spapr->vsmt to compute VCPU ids Greg Kurz
@ 2018-02-14 19:40 ` Greg Kurz
  2018-02-14 19:40 ` [Qemu-devel] [PATCH 3/5] spapr: rename spapr_vcpu_id() to spapr_get_vcpu_id() Greg Kurz
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Greg Kurz @ 2018-02-14 19:40 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-ppc, David Gibson, Sam Bobroff, Laurent Vivier

The VCPU ids are currently computed and assigned to each individual
CPU threads in spapr_cpu_core_realize(). But the numbering logic
of VCPU ids is actually a machine-level concept, and many places
in hw/ppc/spapr.c also have to compute VCPU ids out of CPU indexes.

The current formula used in spapr_cpu_core_realize() is:

    vcpu_id = (cc->core_id * spapr->vsmt / smp_threads) + i

where:

    cc->core_id is a multiple of smp_threads
    cpu_index = cc->core_id + i
    0 <= i < smp_threads

So we have:

    cpu_index % smp_threads == i
    cc->core_id / smp_threads == cpu_index / smp_threads

hence:

    vcpu_id =
        (cpu_index / smp_threads) * spapr->vsmt + cpu_index % smp_threads;

This formula was used before VSMT at the time VCPU ids where computed
at the target emulation level. It has the advantage of being useable
to derive a VPCU id out of a CPU index only. It is fitted for all the
places where the machine code has to compute a VCPU id.

This patch introduces an accessor to set the VCPU id in a PowerPCCPU object
using the above formula. It is a first step to consolidate all the VCPU id
logic in a single place.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/ppc/spapr.c          |   19 +++++++++++++++++++
 hw/ppc/spapr_cpu_core.c |    9 ++-------
 include/hw/ppc/spapr.h  |    1 +
 3 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index ea7429c92a97..30cc48fd5264 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3802,6 +3802,25 @@ int spapr_vcpu_id(PowerPCCPU *cpu)
     }
 }
 
+void spapr_set_vcpu_id(PowerPCCPU *cpu, int cpu_index, Error **errp)
+{
+    sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
+    int vcpu_id;
+
+    vcpu_id =
+        (cpu_index / smp_threads) * spapr->vsmt + cpu_index % smp_threads;
+
+    if (kvm_enabled() && !kvm_vcpu_id_is_valid(vcpu_id)) {
+        error_setg(errp, "Can't create CPU with id %d in KVM", vcpu_id);
+        error_append_hint(errp, "Adjust the number of cpus to %d "
+                          "or try to raise the number of threads per core\n",
+                          vcpu_id * smp_threads / spapr->vsmt);
+        return;
+    }
+
+    cpu->vcpu_id = vcpu_id;
+}
+
 PowerPCCPU *spapr_find_cpu(int vcpu_id)
 {
     CPUState *cs;
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 590d167b04c8..94afeb399e99 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -172,13 +172,8 @@ static void spapr_cpu_core_realize(DeviceState *dev, Error **errp)
         cs = CPU(obj);
         cpu = sc->threads[i] = POWERPC_CPU(obj);
         cs->cpu_index = cc->core_id + i;
-        cpu->vcpu_id = (cc->core_id * spapr->vsmt / smp_threads) + i;
-        if (kvm_enabled() && !kvm_vcpu_id_is_valid(cpu->vcpu_id)) {
-            error_setg(&local_err, "Can't create CPU with id %d in KVM",
-                       cpu->vcpu_id);
-            error_append_hint(&local_err, "Adjust the number of cpus to %d "
-                              "or try to raise the number of threads per core\n",
-                              cpu->vcpu_id * smp_threads / spapr->vsmt);
+        spapr_set_vcpu_id(cpu, cs->cpu_index, &local_err);
+        if (local_err) {
             goto err;
         }
 
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 62c077ac2037..af19320d2f8a 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -767,6 +767,7 @@ void spapr_do_system_reset_on_cpu(CPUState *cs, run_on_cpu_data arg);
 #define HTAB_SIZE(spapr)        (1ULL << ((spapr)->htab_shift))
 
 int spapr_vcpu_id(PowerPCCPU *cpu);
+void spapr_set_vcpu_id(PowerPCCPU *cpu, int cpu_index, Error **errp);
 PowerPCCPU *spapr_find_cpu(int vcpu_id);
 
 int spapr_irq_alloc(sPAPRMachineState *spapr, int irq_hint, bool lsi,

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 3/5] spapr: rename spapr_vcpu_id() to spapr_get_vcpu_id()
  2018-02-14 19:40 [Qemu-devel] [PATCH 0/5] spapr: fix VCPU ids miscalculation Greg Kurz
  2018-02-14 19:40 ` [Qemu-devel] [PATCH 1/5] spapr: use spapr->vsmt to compute VCPU ids Greg Kurz
  2018-02-14 19:40 ` [Qemu-devel] [PATCH 2/5] spapr: move VCPU calculation to core machine code Greg Kurz
@ 2018-02-14 19:40 ` Greg Kurz
  2018-02-15  3:50   ` David Gibson
  2018-02-14 19:40 ` [Qemu-devel] [PATCH 4/5] spapr: consolidate the VCPU id numbering logic in a single place Greg Kurz
  2018-02-14 19:41 ` [Qemu-devel] [PATCH 5/5] spapr: drop DIV_ROUND_UP() from xics_max_server_number() Greg Kurz
  4 siblings, 1 reply; 13+ messages in thread
From: Greg Kurz @ 2018-02-14 19:40 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-ppc, David Gibson, Sam Bobroff, Laurent Vivier

The spapr_vcpu_id() function is an accessor actually. Let's rename it
for symmetry with the recently added spapr_set_vcpu_id() helper.

The motivation behind this is that a later patch will consolidate
the VCPU id formula in a function and spapr_vcpu_id looks like an
appropriate name.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/ppc/spapr.c         |   16 ++++++++--------
 include/hw/ppc/spapr.h |    2 +-
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 30cc48fd5264..18ebc058acdd 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -209,7 +209,7 @@ static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, PowerPCCPU *cpu,
     int i, ret = 0;
     uint32_t servers_prop[smt_threads];
     uint32_t gservers_prop[smt_threads * 2];
-    int index = spapr_vcpu_id(cpu);
+    int index = spapr_get_vcpu_id(cpu);
 
     if (cpu->compat_pvr) {
         ret = fdt_setprop_cell(fdt, offset, "cpu-version", cpu->compat_pvr);
@@ -238,7 +238,7 @@ static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, PowerPCCPU *cpu,
 
 static int spapr_fixup_cpu_numa_dt(void *fdt, int offset, PowerPCCPU *cpu)
 {
-    int index = spapr_vcpu_id(cpu);
+    int index = spapr_get_vcpu_id(cpu);
     uint32_t associativity[] = {cpu_to_be32(0x5),
                                 cpu_to_be32(0x0),
                                 cpu_to_be32(0x0),
@@ -342,7 +342,7 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPRMachineState *spapr)
     CPU_FOREACH(cs) {
         PowerPCCPU *cpu = POWERPC_CPU(cs);
         DeviceClass *dc = DEVICE_GET_CLASS(cs);
-        int index = spapr_vcpu_id(cpu);
+        int index = spapr_get_vcpu_id(cpu);
         int compat_smt = MIN(smp_threads, ppc_compat_max_vthreads(cpu));
 
         if (index % spapr->vsmt != 0) {
@@ -492,7 +492,7 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
     PowerPCCPU *cpu = POWERPC_CPU(cs);
     CPUPPCState *env = &cpu->env;
     PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs);
-    int index = spapr_vcpu_id(cpu);
+    int index = spapr_get_vcpu_id(cpu);
     uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
                        0xffffffff, 0xffffffff};
     uint32_t tbfreq = kvm_enabled() ? kvmppc_get_tbfreq()
@@ -626,7 +626,7 @@ static void spapr_populate_cpus_dt_node(void *fdt, sPAPRMachineState *spapr)
      */
     CPU_FOREACH_REVERSE(cs) {
         PowerPCCPU *cpu = POWERPC_CPU(cs);
-        int index = spapr_vcpu_id(cpu);
+        int index = spapr_get_vcpu_id(cpu);
         DeviceClass *dc = DEVICE_GET_CLASS(cs);
         int offset;
 
@@ -3234,7 +3234,7 @@ static void *spapr_populate_hotplug_cpu_dt(CPUState *cs, int *fdt_offset,
 {
     PowerPCCPU *cpu = POWERPC_CPU(cs);
     DeviceClass *dc = DEVICE_GET_CLASS(cs);
-    int id = spapr_vcpu_id(cpu);
+    int id = spapr_get_vcpu_id(cpu);
     void *fdt;
     int offset, fdt_size;
     char *nodename;
@@ -3791,7 +3791,7 @@ static void spapr_pic_print_info(InterruptStatsProvider *obj,
     ics_pic_print_info(spapr->ics, mon);
 }
 
-int spapr_vcpu_id(PowerPCCPU *cpu)
+int spapr_get_vcpu_id(PowerPCCPU *cpu)
 {
     CPUState *cs = CPU(cpu);
 
@@ -3828,7 +3828,7 @@ PowerPCCPU *spapr_find_cpu(int vcpu_id)
     CPU_FOREACH(cs) {
         PowerPCCPU *cpu = POWERPC_CPU(cs);
 
-        if (spapr_vcpu_id(cpu) == vcpu_id) {
+        if (spapr_get_vcpu_id(cpu) == vcpu_id) {
             return cpu;
         }
     }
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index af19320d2f8a..36942b378daa 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -766,7 +766,7 @@ void spapr_do_system_reset_on_cpu(CPUState *cs, run_on_cpu_data arg);
 
 #define HTAB_SIZE(spapr)        (1ULL << ((spapr)->htab_shift))
 
-int spapr_vcpu_id(PowerPCCPU *cpu);
+int spapr_get_vcpu_id(PowerPCCPU *cpu);
 void spapr_set_vcpu_id(PowerPCCPU *cpu, int cpu_index, Error **errp);
 PowerPCCPU *spapr_find_cpu(int vcpu_id);
 

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 4/5] spapr: consolidate the VCPU id numbering logic in a single place
  2018-02-14 19:40 [Qemu-devel] [PATCH 0/5] spapr: fix VCPU ids miscalculation Greg Kurz
                   ` (2 preceding siblings ...)
  2018-02-14 19:40 ` [Qemu-devel] [PATCH 3/5] spapr: rename spapr_vcpu_id() to spapr_get_vcpu_id() Greg Kurz
@ 2018-02-14 19:40 ` Greg Kurz
  2018-02-15  4:05   ` David Gibson
  2018-02-14 19:41 ` [Qemu-devel] [PATCH 5/5] spapr: drop DIV_ROUND_UP() from xics_max_server_number() Greg Kurz
  4 siblings, 1 reply; 13+ messages in thread
From: Greg Kurz @ 2018-02-14 19:40 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-ppc, David Gibson, Sam Bobroff, Laurent Vivier

Several places in the code need to calculate a VCPU id:

    (cpu_index / smp_threads) * spapr->vsmt + cpu_index % smp_threads
    (core_id / smp_threads) * spapr->vsmt (1 user)
    index * spapr->vsmt (2 users)

or guess that the VCPU id of a given VCPU is the first thread of a virtual
core:

    index % spapr->vsmt != 0

Even if the numbering logic isn't that complex, it is rather fragile to
have these assumptions open-coded in several places. FWIW this was
proved with recent issues related to VSMT.

This patch moves the VCPU id formula to a single function to be called
everywhere the code needs to compute one. It also adds an helper to
guess if a VCPU is the first thread of a VCORE.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/ppc/spapr.c |   29 ++++++++++++++++++++++-------
 1 file changed, 22 insertions(+), 7 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 18ebc058acdd..800d3f001253 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -99,6 +99,20 @@
 
 #define PHANDLE_XICP            0x00001111
 
+/* These two functions implement the VCPU id numbering: one to compute them
+ * all and one to identify thread 0 of a VCORE. Any change to the first one
+ * is likely to have an impact on the second one, so let's keep them close.
+ */
+static int spapr_vcpu_id(sPAPRMachineState *spapr, int cpu_index)
+{
+    return
+        (cpu_index / smp_threads) * spapr->vsmt + cpu_index % smp_threads;
+}
+static bool spapr_is_vcore(sPAPRMachineState *spapr, PowerPCCPU *cpu)
+{
+    return spapr_get_vcpu_id(cpu) % spapr->vsmt == 0;
+}
+
 static ICSState *spapr_ics_create(sPAPRMachineState *spapr,
                                   const char *type_ics,
                                   int nr_irqs, Error **errp)
@@ -345,7 +359,7 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPRMachineState *spapr)
         int index = spapr_get_vcpu_id(cpu);
         int compat_smt = MIN(smp_threads, ppc_compat_max_vthreads(cpu));
 
-        if (index % spapr->vsmt != 0) {
+        if (!spapr_is_vcore(spapr, cpu)) {
             continue;
         }
 
@@ -630,7 +644,7 @@ static void spapr_populate_cpus_dt_node(void *fdt, sPAPRMachineState *spapr)
         DeviceClass *dc = DEVICE_GET_CLASS(cs);
         int offset;
 
-        if (index % spapr->vsmt != 0) {
+        if (!spapr_is_vcore(spapr, cpu)) {
             continue;
         }
 
@@ -2251,7 +2265,7 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
 
         if (mc->has_hotpluggable_cpus) {
             spapr_dr_connector_new(OBJECT(spapr), TYPE_SPAPR_DRC_CPU,
-                                   (core_id / smp_threads) * spapr->vsmt);
+                                   spapr_vcpu_id(spapr, core_id));
         }
 
         if (i < boot_cores_nr) {
@@ -3293,7 +3307,8 @@ void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
         return;
     }
 
-    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_CPU, index * spapr->vsmt);
+    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_CPU,
+                          spapr_vcpu_id(spapr, cc->core_id));
     g_assert(drc);
 
     spapr_drc_detach(drc);
@@ -3322,7 +3337,8 @@ static void spapr_core_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
                    cc->core_id);
         return;
     }
-    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_CPU, index * spapr->vsmt);
+    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_CPU,
+                          spapr_vcpu_id(spapr, cc->core_id));
 
     g_assert(drc || !mc->has_hotpluggable_cpus);
 
@@ -3807,8 +3823,7 @@ void spapr_set_vcpu_id(PowerPCCPU *cpu, int cpu_index, Error **errp)
     sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
     int vcpu_id;
 
-    vcpu_id =
-        (cpu_index / smp_threads) * spapr->vsmt + cpu_index % smp_threads;
+    vcpu_id = spapr_vcpu_id(spapr, cpu_index);
 
     if (kvm_enabled() && !kvm_vcpu_id_is_valid(vcpu_id)) {
         error_setg(errp, "Can't create CPU with id %d in KVM", vcpu_id);

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 5/5] spapr: drop DIV_ROUND_UP() from xics_max_server_number()
  2018-02-14 19:40 [Qemu-devel] [PATCH 0/5] spapr: fix VCPU ids miscalculation Greg Kurz
                   ` (3 preceding siblings ...)
  2018-02-14 19:40 ` [Qemu-devel] [PATCH 4/5] spapr: consolidate the VCPU id numbering logic in a single place Greg Kurz
@ 2018-02-14 19:41 ` Greg Kurz
  2018-02-15  4:08   ` David Gibson
  4 siblings, 1 reply; 13+ messages in thread
From: Greg Kurz @ 2018-02-14 19:41 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-ppc, David Gibson, Sam Bobroff, Laurent Vivier

XICS needs to know the highest VCPU id that may be presented to the
guest plus 1. Commit f303f117fec3 "spapr: ensure we have at least one
XICS server" changed how the maximum is computed from:

        smp_cpus * kvmppc_smt_threads() / smp_threads

to:

        DIV_ROUND_UP(smp_cpus * kvmppc_smt_threads(), smp_threads)

This was done because at the time we could pass broken CPU topologies
to the -smp command line options, such as threads=9,cpus=1. On a POWER8
host this would give:

        1 * 8 / 9 == 0 servers

and cause QEMU to crash later during XICS setup.

The formulat evolved a bit to accomodate CPU hot-plug and VSMT, but
most important, stricter checks are performed on the CPU topology.

With -smp threads=9,cpus=1:

qemu-system-ppc64:
 cpu topology: sockets (1) * cores (1) * threads (9) > maxcpus (1)

With -smp threads=9,maxcpus=1:

qemu-system-ppc64: maxcpus must be equal to or greater than smp

More generally, machine types with hotplug support (2.7 and up), no
longer allow to set maxcpus or smp_cpus to a value that isnt't a
multiple of smp_threads.

With -smp threads=4,cpus=6:

qemu-system-ppc64: smp_cpus (6) must be multiple of threads (4)

With -smp threads=4,maxcpus=6:

qemu-system-ppc64: max_cpus (6) must be multiple of threads (4)

This means that the division is perfect and we don't need DIV_ROUND_UP(),
and we could do a regular division:

        max_cpus * spapr->vsmt / smp_threads

So this patch changes xics_max_server_number() to use the spapr_vcpu_id(),
which works too since max_cpus is a multiple of smp_threads:

    (max_cpus / smp_threads ) * spapr->vsmt + max_cpus % smp_threads

It breaks migration of pre-2.7 machine types with unusual CPU topologies,
but I guess this is an acceptable trade-off.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/ppc/spapr.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 800d3f001253..f1722214cc74 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -176,7 +176,7 @@ static void pre_2_10_vmstate_unregister_dummy_icp(int i)
 
 static int xics_max_server_number(sPAPRMachineState *spapr)
 {
-    return DIV_ROUND_UP(max_cpus * spapr->vsmt, smp_threads);
+    return spapr_vcpu_id(spapr, max_cpus)
 }
 
 static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 1/5] spapr: use spapr->vsmt to compute VCPU ids
  2018-02-14 19:40 ` [Qemu-devel] [PATCH 1/5] spapr: use spapr->vsmt to compute VCPU ids Greg Kurz
@ 2018-02-15  3:42   ` David Gibson
  0 siblings, 0 replies; 13+ messages in thread
From: David Gibson @ 2018-02-15  3:42 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-devel, qemu-ppc, Sam Bobroff, Laurent Vivier

[-- Attachment #1: Type: text/plain, Size: 9267 bytes --]

On Wed, Feb 14, 2018 at 08:40:26PM +0100, Greg Kurz wrote:
> Since the introduction of VSMT in 2.11, the spacing of VCPU ids
> between cores is controllable through a machine property instead
> of being only dictated by the SMT mode of the host:
> 
>     cpu->vcpu_id = (cc->core_id * spapr->vsmt / smp_threads) + i
> 
> Until recently, the machine code would try to change the SMT mode
> of the host to be equal to VSMT or exit. This allowed the rest of
> the code to assume that kvmppc_smt_threads() == spapr->vsmt is
> always true.
> 
> Recent commit "8904e5a75005 spapr: Adjust default VSMT value for
> better migration compatibility" relaxed the rule. If the VSMT
> mode cannot be set in KVM for some reasons, but the requested
> CPU topology is compatible with the current SMT mode, then we
> let the guest run with  kvmppc_smt_threads() != spapr->vsmt.
> 
> This breaks quite a few places in the code, in particular when
> calculating DRC indexes.
> 
> This is what happens on a POWER host with subcores-per-core=2 (ie,
> supports up to SMT4) when passing the following topology:
> 
>     -smp threads=4,maxcpus=16 \
>     -device host-spapr-cpu-core,core-id=4,id=core1 \
>     -device host-spapr-cpu-core,core-id=8,id=core2
> 
> qemu-system-ppc64: warning: Failed to set KVM's VSMT mode to 8 (errno -22)
> 
> This is expected since KVM is limited to SMT4, but the guest is started
> anyway because this topology can run on SMT4 even with a VSMT8 spacing.
> 
> But when we look at the DT, things get nastier:
> 
> cpus {
>         ...
>         ibm,drc-indexes = <0x4 0x10000000 0x10000004 0x10000008 0x1000000c>;
> 
> This means that we have the following association:
> 
>  CPU core device |     DRC    | VCPU id
> -----------------+------------+---------
>    boot core     | 0x10000000 | 0
>    core1         | 0x10000004 | 4
>    core2         | 0x10000008 | 8
>    core3         | 0x1000000c | 12
> 
> But since the spacing of VCPU ids is 8, the DRC for core1 points to a
> VCPU that doesn't exist, the DRC for core2 points to the first VCPU of
> core1 and and so on...
> 
>         ...
> 
>         PowerPC,POWER8@0 {
>                 ...
>                 ibm,my-drc-index = <0x10000000>;
>                 ...
>         };
> 
>         PowerPC,POWER8@8 {
>                 ...
>                 ibm,my-drc-index = <0x10000008>;
>                 ...
>         };
> 
>         PowerPC,POWER8@10 {
>                 ...
> 
> No ibm,my-drc-index property for this core since 0x10000010 doesn't
> exist in ibm,drc-indexes above.
> 
>                 ...
>         };
> };
> 
> ...
> 
> interrupt-controller {
>         ...
>         ibm,interrupt-server-ranges = <0x0 0x10>;
> 
> With a spacing of 8, the highest VCPU id for the given topology should be:
>         16 * 8 / 4 = 32 and not 16
> 
>         ...
>         linux,phandle = <0x7e7323b8>;
>         interrupt-controller;
> };
> 
> And CPU hot-plug/unplug is broken:
> 
> (qemu) device_del core1
> pseries-hotplug-cpu: Cannot find CPU (drc index 10000004) to remove
> 
> (qemu) device_del core2
> cpu 4 (hwid 8) Ready to die...
> cpu 5 (hwid 9) Ready to die...
> cpu 6 (hwid 10) Ready to die...
> cpu 7 (hwid 11) Ready to die...
> 
> These are the VCPU ids of core1 actually
> 
> (qemu) device_add host-spapr-cpu-core,core-id=12,id=core3
> (qemu) device_del core3
> pseries-hotplug-cpu: Cannot find CPU (drc index 1000000c) to remove
> 
> This patches all the code in hw/ppc/spapr.c to assume the VSMT
> spacing when manipulating VCPU ids.
> 
> Fixes: 8904e5a75005
> Signed-off-by: Greg Kurz <groug@kaod.org>

Ouch, good catch.  That's a lot of nasty bugs I hadn't realised were
there.  Applied, thanks.

> ---
>  hw/ppc/spapr.c |   24 ++++++++++--------------
>  1 file changed, 10 insertions(+), 14 deletions(-)



> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 9f29434819bd..ea7429c92a97 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -160,9 +160,9 @@ static void pre_2_10_vmstate_unregister_dummy_icp(int i)
>                         (void *)(uintptr_t) i);
>  }
>  
> -static inline int xics_max_server_number(void)
> +static int xics_max_server_number(sPAPRMachineState *spapr)
>  {
> -    return DIV_ROUND_UP(max_cpus * kvmppc_smt_threads(), smp_threads);
> +    return DIV_ROUND_UP(max_cpus * spapr->vsmt, smp_threads);
>  }
>  
>  static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
> @@ -194,7 +194,7 @@ static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
>      if (smc->pre_2_10_has_unused_icps) {
>          int i;
>  
> -        for (i = 0; i < xics_max_server_number(); i++) {
> +        for (i = 0; i < xics_max_server_number(spapr); i++) {
>              /* Dummy entries get deregistered when real ICPState objects
>               * are registered during CPU core hotplug.
>               */
> @@ -337,7 +337,6 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPRMachineState *spapr)
>      int ret = 0, offset, cpus_offset;
>      CPUState *cs;
>      char cpu_model[32];
> -    int smt = kvmppc_smt_threads();
>      uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)};
>  
>      CPU_FOREACH(cs) {
> @@ -346,7 +345,7 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPRMachineState *spapr)
>          int index = spapr_vcpu_id(cpu);
>          int compat_smt = MIN(smp_threads, ppc_compat_max_vthreads(cpu));
>  
> -        if ((index % smt) != 0) {
> +        if (index % spapr->vsmt != 0) {
>              continue;
>          }
>  
> @@ -614,7 +613,6 @@ static void spapr_populate_cpus_dt_node(void *fdt, sPAPRMachineState *spapr)
>      CPUState *cs;
>      int cpus_offset;
>      char *nodename;
> -    int smt = kvmppc_smt_threads();
>  
>      cpus_offset = fdt_add_subnode(fdt, 0, "cpus");
>      _FDT(cpus_offset);
> @@ -632,7 +630,7 @@ static void spapr_populate_cpus_dt_node(void *fdt, sPAPRMachineState *spapr)
>          DeviceClass *dc = DEVICE_GET_CLASS(cs);
>          int offset;
>  
> -        if ((index % smt) != 0) {
> +        if (index % spapr->vsmt != 0) {
>              continue;
>          }
>  
> @@ -1131,7 +1129,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
>      _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
>  
>      /* /interrupt controller */
> -    spapr_dt_xics(xics_max_server_number(), fdt, PHANDLE_XICP);
> +    spapr_dt_xics(xics_max_server_number(spapr), fdt, PHANDLE_XICP);
>  
>      ret = spapr_populate_memory(spapr, fdt);
>      if (ret < 0) {
> @@ -2224,7 +2222,6 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
>      MachineState *machine = MACHINE(spapr);
>      MachineClass *mc = MACHINE_GET_CLASS(machine);
>      const char *type = spapr_get_cpu_core_type(machine->cpu_type);
> -    int smt = kvmppc_smt_threads();
>      const CPUArchIdList *possible_cpus;
>      int boot_cores_nr = smp_cpus / smp_threads;
>      int i;
> @@ -2254,7 +2251,7 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
>  
>          if (mc->has_hotpluggable_cpus) {
>              spapr_dr_connector_new(OBJECT(spapr), TYPE_SPAPR_DRC_CPU,
> -                                   (core_id / smp_threads) * smt);
> +                                   (core_id / smp_threads) * spapr->vsmt);
>          }
>  
>          if (i < boot_cores_nr) {
> @@ -3281,10 +3278,10 @@ static
>  void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
>                                 Error **errp)
>  {
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(OBJECT(hotplug_dev));
>      int index;
>      sPAPRDRConnector *drc;
>      CPUCore *cc = CPU_CORE(dev);
> -    int smt = kvmppc_smt_threads();
>  
>      if (!spapr_find_cpu_slot(MACHINE(hotplug_dev), cc->core_id, &index)) {
>          error_setg(errp, "Unable to find CPU core with core-id: %d",
> @@ -3296,7 +3293,7 @@ void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
>          return;
>      }
>  
> -    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_CPU, index * smt);
> +    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_CPU, index * spapr->vsmt);
>      g_assert(drc);
>  
>      spapr_drc_detach(drc);
> @@ -3315,7 +3312,6 @@ static void spapr_core_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>      CPUState *cs = CPU(core->threads[0]);
>      sPAPRDRConnector *drc;
>      Error *local_err = NULL;
> -    int smt = kvmppc_smt_threads();
>      CPUArchId *core_slot;
>      int index;
>      bool hotplugged = spapr_drc_hotplugged(dev);
> @@ -3326,7 +3322,7 @@ static void spapr_core_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>                     cc->core_id);
>          return;
>      }
> -    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_CPU, index * smt);
> +    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_CPU, index * spapr->vsmt);
>  
>      g_assert(drc || !mc->has_hotpluggable_cpus);
>  
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 3/5] spapr: rename spapr_vcpu_id() to spapr_get_vcpu_id()
  2018-02-14 19:40 ` [Qemu-devel] [PATCH 3/5] spapr: rename spapr_vcpu_id() to spapr_get_vcpu_id() Greg Kurz
@ 2018-02-15  3:50   ` David Gibson
  0 siblings, 0 replies; 13+ messages in thread
From: David Gibson @ 2018-02-15  3:50 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-devel, qemu-ppc, Sam Bobroff, Laurent Vivier

[-- Attachment #1: Type: text/plain, Size: 4682 bytes --]

On Wed, Feb 14, 2018 at 08:40:44PM +0100, Greg Kurz wrote:
> The spapr_vcpu_id() function is an accessor actually. Let's rename it
> for symmetry with the recently added spapr_set_vcpu_id() helper.
> 
> The motivation behind this is that a later patch will consolidate
> the VCPU id formula in a function and spapr_vcpu_id looks like an
> appropriate name.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>

There's some minor details I'm not sure I like about this patch and
the previous one, but nothing important enough to delay the later
parts of the series.  So, applied.

> ---
>  hw/ppc/spapr.c         |   16 ++++++++--------
>  include/hw/ppc/spapr.h |    2 +-
>  2 files changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 30cc48fd5264..18ebc058acdd 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -209,7 +209,7 @@ static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, PowerPCCPU *cpu,
>      int i, ret = 0;
>      uint32_t servers_prop[smt_threads];
>      uint32_t gservers_prop[smt_threads * 2];
> -    int index = spapr_vcpu_id(cpu);
> +    int index = spapr_get_vcpu_id(cpu);
>  
>      if (cpu->compat_pvr) {
>          ret = fdt_setprop_cell(fdt, offset, "cpu-version", cpu->compat_pvr);
> @@ -238,7 +238,7 @@ static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, PowerPCCPU *cpu,
>  
>  static int spapr_fixup_cpu_numa_dt(void *fdt, int offset, PowerPCCPU *cpu)
>  {
> -    int index = spapr_vcpu_id(cpu);
> +    int index = spapr_get_vcpu_id(cpu);
>      uint32_t associativity[] = {cpu_to_be32(0x5),
>                                  cpu_to_be32(0x0),
>                                  cpu_to_be32(0x0),
> @@ -342,7 +342,7 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPRMachineState *spapr)
>      CPU_FOREACH(cs) {
>          PowerPCCPU *cpu = POWERPC_CPU(cs);
>          DeviceClass *dc = DEVICE_GET_CLASS(cs);
> -        int index = spapr_vcpu_id(cpu);
> +        int index = spapr_get_vcpu_id(cpu);
>          int compat_smt = MIN(smp_threads, ppc_compat_max_vthreads(cpu));
>  
>          if (index % spapr->vsmt != 0) {
> @@ -492,7 +492,7 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset,
>      PowerPCCPU *cpu = POWERPC_CPU(cs);
>      CPUPPCState *env = &cpu->env;
>      PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs);
> -    int index = spapr_vcpu_id(cpu);
> +    int index = spapr_get_vcpu_id(cpu);
>      uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
>                         0xffffffff, 0xffffffff};
>      uint32_t tbfreq = kvm_enabled() ? kvmppc_get_tbfreq()
> @@ -626,7 +626,7 @@ static void spapr_populate_cpus_dt_node(void *fdt, sPAPRMachineState *spapr)
>       */
>      CPU_FOREACH_REVERSE(cs) {
>          PowerPCCPU *cpu = POWERPC_CPU(cs);
> -        int index = spapr_vcpu_id(cpu);
> +        int index = spapr_get_vcpu_id(cpu);
>          DeviceClass *dc = DEVICE_GET_CLASS(cs);
>          int offset;
>  
> @@ -3234,7 +3234,7 @@ static void *spapr_populate_hotplug_cpu_dt(CPUState *cs, int *fdt_offset,
>  {
>      PowerPCCPU *cpu = POWERPC_CPU(cs);
>      DeviceClass *dc = DEVICE_GET_CLASS(cs);
> -    int id = spapr_vcpu_id(cpu);
> +    int id = spapr_get_vcpu_id(cpu);
>      void *fdt;
>      int offset, fdt_size;
>      char *nodename;
> @@ -3791,7 +3791,7 @@ static void spapr_pic_print_info(InterruptStatsProvider *obj,
>      ics_pic_print_info(spapr->ics, mon);
>  }
>  
> -int spapr_vcpu_id(PowerPCCPU *cpu)
> +int spapr_get_vcpu_id(PowerPCCPU *cpu)
>  {
>      CPUState *cs = CPU(cpu);
>  
> @@ -3828,7 +3828,7 @@ PowerPCCPU *spapr_find_cpu(int vcpu_id)
>      CPU_FOREACH(cs) {
>          PowerPCCPU *cpu = POWERPC_CPU(cs);
>  
> -        if (spapr_vcpu_id(cpu) == vcpu_id) {
> +        if (spapr_get_vcpu_id(cpu) == vcpu_id) {
>              return cpu;
>          }
>      }
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index af19320d2f8a..36942b378daa 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -766,7 +766,7 @@ void spapr_do_system_reset_on_cpu(CPUState *cs, run_on_cpu_data arg);
>  
>  #define HTAB_SIZE(spapr)        (1ULL << ((spapr)->htab_shift))
>  
> -int spapr_vcpu_id(PowerPCCPU *cpu);
> +int spapr_get_vcpu_id(PowerPCCPU *cpu);
>  void spapr_set_vcpu_id(PowerPCCPU *cpu, int cpu_index, Error **errp);
>  PowerPCCPU *spapr_find_cpu(int vcpu_id);
>  
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] spapr: consolidate the VCPU id numbering logic in a single place
  2018-02-14 19:40 ` [Qemu-devel] [PATCH 4/5] spapr: consolidate the VCPU id numbering logic in a single place Greg Kurz
@ 2018-02-15  4:05   ` David Gibson
  0 siblings, 0 replies; 13+ messages in thread
From: David Gibson @ 2018-02-15  4:05 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-devel, qemu-ppc, Sam Bobroff, Laurent Vivier

[-- Attachment #1: Type: text/plain, Size: 4730 bytes --]

On Wed, Feb 14, 2018 at 08:40:53PM +0100, Greg Kurz wrote:
> Several places in the code need to calculate a VCPU id:
> 
>     (cpu_index / smp_threads) * spapr->vsmt + cpu_index % smp_threads
>     (core_id / smp_threads) * spapr->vsmt (1 user)
>     index * spapr->vsmt (2 users)
> 
> or guess that the VCPU id of a given VCPU is the first thread of a virtual
> core:
> 
>     index % spapr->vsmt != 0
> 
> Even if the numbering logic isn't that complex, it is rather fragile to
> have these assumptions open-coded in several places. FWIW this was
> proved with recent issues related to VSMT.
> 
> This patch moves the VCPU id formula to a single function to be called
> everywhere the code needs to compute one. It also adds an helper to
> guess if a VCPU is the first thread of a VCORE.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>

Good change.  I don't like the name 'spapr_is_vcore' though - cores
are a logically different thing from thread0 of the core.  So I've
renamed it to spapr_is_thread0_of_vcore() as I've applied it.

> ---
>  hw/ppc/spapr.c |   29 ++++++++++++++++++++++-------
>  1 file changed, 22 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 18ebc058acdd..800d3f001253 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -99,6 +99,20 @@
>  
>  #define PHANDLE_XICP            0x00001111
>  
> +/* These two functions implement the VCPU id numbering: one to compute them
> + * all and one to identify thread 0 of a VCORE. Any change to the first one
> + * is likely to have an impact on the second one, so let's keep them close.
> + */
> +static int spapr_vcpu_id(sPAPRMachineState *spapr, int cpu_index)
> +{
> +    return
> +        (cpu_index / smp_threads) * spapr->vsmt + cpu_index % smp_threads;
> +}
> +static bool spapr_is_vcore(sPAPRMachineState *spapr, PowerPCCPU *cpu)
> +{
> +    return spapr_get_vcpu_id(cpu) % spapr->vsmt == 0;
> +}
> +
>  static ICSState *spapr_ics_create(sPAPRMachineState *spapr,
>                                    const char *type_ics,
>                                    int nr_irqs, Error **errp)
> @@ -345,7 +359,7 @@ static int spapr_fixup_cpu_dt(void *fdt, sPAPRMachineState *spapr)
>          int index = spapr_get_vcpu_id(cpu);
>          int compat_smt = MIN(smp_threads, ppc_compat_max_vthreads(cpu));
>  
> -        if (index % spapr->vsmt != 0) {
> +        if (!spapr_is_vcore(spapr, cpu)) {
>              continue;
>          }
>  
> @@ -630,7 +644,7 @@ static void spapr_populate_cpus_dt_node(void *fdt, sPAPRMachineState *spapr)
>          DeviceClass *dc = DEVICE_GET_CLASS(cs);
>          int offset;
>  
> -        if (index % spapr->vsmt != 0) {
> +        if (!spapr_is_vcore(spapr, cpu)) {
>              continue;
>          }
>  
> @@ -2251,7 +2265,7 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
>  
>          if (mc->has_hotpluggable_cpus) {
>              spapr_dr_connector_new(OBJECT(spapr), TYPE_SPAPR_DRC_CPU,
> -                                   (core_id / smp_threads) * spapr->vsmt);
> +                                   spapr_vcpu_id(spapr, core_id));
>          }
>  
>          if (i < boot_cores_nr) {
> @@ -3293,7 +3307,8 @@ void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
>          return;
>      }
>  
> -    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_CPU, index * spapr->vsmt);
> +    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_CPU,
> +                          spapr_vcpu_id(spapr, cc->core_id));
>      g_assert(drc);
>  
>      spapr_drc_detach(drc);
> @@ -3322,7 +3337,8 @@ static void spapr_core_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>                     cc->core_id);
>          return;
>      }
> -    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_CPU, index * spapr->vsmt);
> +    drc = spapr_drc_by_id(TYPE_SPAPR_DRC_CPU,
> +                          spapr_vcpu_id(spapr, cc->core_id));
>  
>      g_assert(drc || !mc->has_hotpluggable_cpus);
>  
> @@ -3807,8 +3823,7 @@ void spapr_set_vcpu_id(PowerPCCPU *cpu, int cpu_index, Error **errp)
>      sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
>      int vcpu_id;
>  
> -    vcpu_id =
> -        (cpu_index / smp_threads) * spapr->vsmt + cpu_index % smp_threads;
> +    vcpu_id = spapr_vcpu_id(spapr, cpu_index);
>  
>      if (kvm_enabled() && !kvm_vcpu_id_is_valid(vcpu_id)) {
>          error_setg(errp, "Can't create CPU with id %d in KVM", vcpu_id);
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 5/5] spapr: drop DIV_ROUND_UP() from xics_max_server_number()
  2018-02-14 19:41 ` [Qemu-devel] [PATCH 5/5] spapr: drop DIV_ROUND_UP() from xics_max_server_number() Greg Kurz
@ 2018-02-15  4:08   ` David Gibson
  2018-02-15 16:08     ` Greg Kurz
  0 siblings, 1 reply; 13+ messages in thread
From: David Gibson @ 2018-02-15  4:08 UTC (permalink / raw)
  To: Greg Kurz; +Cc: qemu-devel, qemu-ppc, Sam Bobroff, Laurent Vivier

[-- Attachment #1: Type: text/plain, Size: 2464 bytes --]

On Wed, Feb 14, 2018 at 08:41:03PM +0100, Greg Kurz wrote:
> XICS needs to know the highest VCPU id that may be presented to the
> guest plus 1. Commit f303f117fec3 "spapr: ensure we have at least one
> XICS server" changed how the maximum is computed from:
> 
>         smp_cpus * kvmppc_smt_threads() / smp_threads
> 
> to:
> 
>         DIV_ROUND_UP(smp_cpus * kvmppc_smt_threads(), smp_threads)
> 
> This was done because at the time we could pass broken CPU topologies
> to the -smp command line options, such as threads=9,cpus=1. On a POWER8
> host this would give:
> 
>         1 * 8 / 9 == 0 servers
> 
> and cause QEMU to crash later during XICS setup.
> 
> The formulat evolved a bit to accomodate CPU hot-plug and VSMT, but
> most important, stricter checks are performed on the CPU topology.
> 
> With -smp threads=9,cpus=1:
> 
> qemu-system-ppc64:
>  cpu topology: sockets (1) * cores (1) * threads (9) > maxcpus (1)
> 
> With -smp threads=9,maxcpus=1:
> 
> qemu-system-ppc64: maxcpus must be equal to or greater than smp
> 
> More generally, machine types with hotplug support (2.7 and up), no
> longer allow to set maxcpus or smp_cpus to a value that isnt't a
> multiple of smp_threads.
> 
> With -smp threads=4,cpus=6:
> 
> qemu-system-ppc64: smp_cpus (6) must be multiple of threads (4)
> 
> With -smp threads=4,maxcpus=6:
> 
> qemu-system-ppc64: max_cpus (6) must be multiple of threads (4)
> 
> This means that the division is perfect and we don't need DIV_ROUND_UP(),
> and we could do a regular division:
> 
>         max_cpus * spapr->vsmt / smp_threads
> 
> So this patch changes xics_max_server_number() to use the spapr_vcpu_id(),
> which works too since max_cpus is a multiple of smp_threads:
> 
>     (max_cpus / smp_threads ) * spapr->vsmt + max_cpus % smp_threads
> 
> It breaks migration of pre-2.7 machine types with unusual CPU topologies,
> but I guess this is an acceptable trade-off.

No, not really.  Weird topologies are still allowed on old machine
types for backwards compatibility, and we shouldn't break that.  I
like the idea of consolidating this calculation, but we can't do it by
just breaking the older machines (at least not until they're formally
deprecated).

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 5/5] spapr: drop DIV_ROUND_UP() from xics_max_server_number()
  2018-02-15  4:08   ` David Gibson
@ 2018-02-15 16:08     ` Greg Kurz
  2018-02-15 16:54       ` Daniel P. Berrangé
  0 siblings, 1 reply; 13+ messages in thread
From: Greg Kurz @ 2018-02-15 16:08 UTC (permalink / raw)
  To: David Gibson; +Cc: qemu-devel, qemu-ppc, Sam Bobroff, Laurent Vivier

On Thu, 15 Feb 2018 15:08:18 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Wed, Feb 14, 2018 at 08:41:03PM +0100, Greg Kurz wrote:

<snip>

> > 
> > It breaks migration of pre-2.7 machine types with unusual CPU topologies,
> > but I guess this is an acceptable trade-off.  
> 
> No, not really.  Weird topologies are still allowed on old machine
> types for backwards compatibility, and we shouldn't break that.  I
> like the idea of consolidating this calculation, but we can't do it by
> just breaking the older machines (at least not until they're formally
> deprecated).
> 

Heh, I had put this patch at the end because I was expecting you might
nack it :)

Per curiosity, when/how do we decide that an older machine type may be formally
deprecated ?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 5/5] spapr: drop DIV_ROUND_UP() from xics_max_server_number()
  2018-02-15 16:08     ` Greg Kurz
@ 2018-02-15 16:54       ` Daniel P. Berrangé
  2018-02-15 17:09         ` Greg Kurz
  0 siblings, 1 reply; 13+ messages in thread
From: Daniel P. Berrangé @ 2018-02-15 16:54 UTC (permalink / raw)
  To: Greg Kurz; +Cc: David Gibson, Laurent Vivier, qemu-ppc, qemu-devel, Sam Bobroff

On Thu, Feb 15, 2018 at 05:08:57PM +0100, Greg Kurz wrote:
> On Thu, 15 Feb 2018 15:08:18 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Wed, Feb 14, 2018 at 08:41:03PM +0100, Greg Kurz wrote:
> 
> <snip>
> 
> > > 
> > > It breaks migration of pre-2.7 machine types with unusual CPU topologies,
> > > but I guess this is an acceptable trade-off.  
> > 
> > No, not really.  Weird topologies are still allowed on old machine
> > types for backwards compatibility, and we shouldn't break that.  I
> > like the idea of consolidating this calculation, but we can't do it by
> > just breaking the older machines (at least not until they're formally
> > deprecated).
> > 
> 
> Heh, I had put this patch at the end because I was expecting you might
> nack it :)
> 
> Per curiosity, when/how do we decide that an older machine type may be
> formally deprecated ?

For versioned machine types we decided that we'd keep them around upstream
for as long as they were needed by a downstream vendor, *provided* that
downstream vendor is contributing to QEMU in order to mitigate the maint
burden it would entail. 

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 5/5] spapr: drop DIV_ROUND_UP() from xics_max_server_number()
  2018-02-15 16:54       ` Daniel P. Berrangé
@ 2018-02-15 17:09         ` Greg Kurz
  0 siblings, 0 replies; 13+ messages in thread
From: Greg Kurz @ 2018-02-15 17:09 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: David Gibson, Laurent Vivier, qemu-ppc, qemu-devel, Sam Bobroff

On Thu, 15 Feb 2018 16:54:18 +0000
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Thu, Feb 15, 2018 at 05:08:57PM +0100, Greg Kurz wrote:
> > On Thu, 15 Feb 2018 15:08:18 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >   
> > > On Wed, Feb 14, 2018 at 08:41:03PM +0100, Greg Kurz wrote:  
> > 
> > <snip>
> >   
> > > > 
> > > > It breaks migration of pre-2.7 machine types with unusual CPU topologies,
> > > > but I guess this is an acceptable trade-off.    
> > > 
> > > No, not really.  Weird topologies are still allowed on old machine
> > > types for backwards compatibility, and we shouldn't break that.  I
> > > like the idea of consolidating this calculation, but we can't do it by
> > > just breaking the older machines (at least not until they're formally
> > > deprecated).
> > >   
> > 
> > Heh, I had put this patch at the end because I was expecting you might
> > nack it :)
> > 
> > Per curiosity, when/how do we decide that an older machine type may be
> > formally deprecated ?  
> 
> For versioned machine types we decided that we'd keep them around upstream
> for as long as they were needed by a downstream vendor, *provided* that
> downstream vendor is contributing to QEMU in order to mitigate the maint
> burden it would entail. 
> 

Indeed I now remember having heard something like that in the past. Thanks
for the details anyway. :)

And, this is probably a dumb question, but do we have an up-to-date list
of QEMU versions still needed by a contributing vendor ?

> Regards,
> Daniel

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2018-02-15 17:09 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-02-14 19:40 [Qemu-devel] [PATCH 0/5] spapr: fix VCPU ids miscalculation Greg Kurz
2018-02-14 19:40 ` [Qemu-devel] [PATCH 1/5] spapr: use spapr->vsmt to compute VCPU ids Greg Kurz
2018-02-15  3:42   ` David Gibson
2018-02-14 19:40 ` [Qemu-devel] [PATCH 2/5] spapr: move VCPU calculation to core machine code Greg Kurz
2018-02-14 19:40 ` [Qemu-devel] [PATCH 3/5] spapr: rename spapr_vcpu_id() to spapr_get_vcpu_id() Greg Kurz
2018-02-15  3:50   ` David Gibson
2018-02-14 19:40 ` [Qemu-devel] [PATCH 4/5] spapr: consolidate the VCPU id numbering logic in a single place Greg Kurz
2018-02-15  4:05   ` David Gibson
2018-02-14 19:41 ` [Qemu-devel] [PATCH 5/5] spapr: drop DIV_ROUND_UP() from xics_max_server_number() Greg Kurz
2018-02-15  4:08   ` David Gibson
2018-02-15 16:08     ` Greg Kurz
2018-02-15 16:54       ` Daniel P. Berrangé
2018-02-15 17:09         ` Greg Kurz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).