qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v2 0/4] spapr/xics: fix migration of older machine types
@ 2017-05-19 10:31 Greg Kurz
  2017-05-19 10:32 ` [Qemu-devel] [PATCH v2 1/4] spapr_cpu_core: drop reference on ICP object during CPU realization Greg Kurz
                   ` (3 more replies)
  0 siblings, 4 replies; 25+ messages in thread
From: Greg Kurz @ 2017-05-19 10:31 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel
  Cc: Laurent Vivier, Bharata B Rao, Cedric Le Goater, David Gibson

v2: - some patches from v1 are already merged in ppc-for-2.10
    - added a new fix to a potential memory leak (patch 1)
    - consolidate dt_id computation (patch 3)
    - see individual changelogs for patch 2 and 4

This series is based on:

https://github.com/dgibson/qemu.git ppc-for-2.10

I could successfully migrate from QEMU 2.9 and back, including when the guest
has different threads per core than the host and hotplugging cores between
each migration attempt.

--
Greg

---

Greg Kurz (4):
      spapr_cpu_core: drop reference on ICP object during CPU realization
      spapr: fix error reporting in xics_system_init()
      target/ppc: consolidate CPU device-tree id computation in helper
      spapr: fix migration of ICP objects from/to older QEMU


 hw/ppc/spapr.c              |   57 +++++++++++++++++++++++++++++++++++--------
 hw/ppc/spapr_cpu_core.c     |   28 +++++++++++++++------
 include/hw/ppc/spapr.h      |    2 ++
 target/ppc/cpu.h            |   17 +++++++++++++
 target/ppc/translate_init.c |    3 +-
 5 files changed, 86 insertions(+), 21 deletions(-)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH v2 1/4] spapr_cpu_core: drop reference on ICP object during CPU realization
  2017-05-19 10:31 [Qemu-devel] [PATCH v2 0/4] spapr/xics: fix migration of older machine types Greg Kurz
@ 2017-05-19 10:32 ` Greg Kurz
  2017-05-20  6:40   ` David Gibson
  2017-05-19 10:32 ` [Qemu-devel] [PATCH v2 2/4] spapr: fix error reporting in xics_system_init() Greg Kurz
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 25+ messages in thread
From: Greg Kurz @ 2017-05-19 10:32 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel
  Cc: Laurent Vivier, Bharata B Rao, Cedric Le Goater, David Gibson

When a piece of code allocates an object, it implicitely gets a reference
on it. If it then makes that object a child property of another object, it
should drop its own reference at some point otherwise the child object can
never be finalized. The current code hence leaks one ICP object per CPU
when hot-removing a core.

Failing to add a newly allocated ICP object to the CPU is a bug. While here,
let's ensure QEMU aborts if this ever happens.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/ppc/spapr_cpu_core.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 1df1404ea52d..ff7058ecc00e 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -143,7 +143,8 @@ static void spapr_cpu_core_realize_child(Object *child, Error **errp)
     Object *obj;
 
     obj = object_new(spapr->icp_type);
-    object_property_add_child(OBJECT(cpu), "icp", obj, NULL);
+    object_property_add_child(OBJECT(cpu), "icp", obj, &error_abort);
+    object_unref(obj);
     object_property_add_const_link(obj, "xics", OBJECT(spapr), &error_abort);
     object_property_set_bool(obj, true, "realized", &local_err);
     if (local_err) {

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH v2 2/4] spapr: fix error reporting in xics_system_init()
  2017-05-19 10:31 [Qemu-devel] [PATCH v2 0/4] spapr/xics: fix migration of older machine types Greg Kurz
  2017-05-19 10:32 ` [Qemu-devel] [PATCH v2 1/4] spapr_cpu_core: drop reference on ICP object during CPU realization Greg Kurz
@ 2017-05-19 10:32 ` Greg Kurz
  2017-05-20  6:45   ` David Gibson
  2017-05-19 10:32 ` [Qemu-devel] [PATCH v2 3/4] target/ppc: consolidate CPU device-tree id computation in helper Greg Kurz
  2017-05-19 10:32 ` [Qemu-devel] [PATCH v2 4/4] spapr: fix migration of ICP objects from/to older QEMU Greg Kurz
  3 siblings, 1 reply; 25+ messages in thread
From: Greg Kurz @ 2017-05-19 10:32 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel
  Cc: Laurent Vivier, Bharata B Rao, Cedric Le Goater, David Gibson

If the user explicitely asked for kernel-irqchip support and "xics-kvm"
initialization fails, we shouldn't fallback to emulated "xics" as we
do now. It is also awkward to print an error message when we have an
errp pointer argument.

Let's use the errp argument to report the error and let the caller decide.
This simplifies the code as we don't need a local Error * here.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
v2: - total rewrite
---
 hw/ppc/spapr.c |   13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 91f7434861a8..75e298b4c6be 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -128,18 +128,14 @@ static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
     sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
 
     if (kvm_enabled()) {
-        Error *err = NULL;
-
         if (machine_kernel_irqchip_allowed(machine) &&
             !xics_kvm_init(spapr, errp)) {
             spapr->icp_type = TYPE_KVM_ICP;
-            spapr->ics = spapr_ics_create(spapr, TYPE_ICS_KVM, nr_irqs, &err);
+            spapr->ics = spapr_ics_create(spapr, TYPE_ICS_KVM, nr_irqs, errp);
         }
         if (machine_kernel_irqchip_required(machine) && !spapr->ics) {
-            error_reportf_err(err,
-                              "kernel_irqchip requested but unavailable: ");
-        } else {
-            error_free(err);
+            error_prepend(errp, "kernel_irqchip requested but unavailable: ");
+            return;
         }
     }
 
@@ -147,6 +143,9 @@ static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
         xics_spapr_init(spapr);
         spapr->icp_type = TYPE_ICP;
         spapr->ics = spapr_ics_create(spapr, TYPE_ICS_SIMPLE, nr_irqs, errp);
+        if (!spapr->ics) {
+            return;
+        }
     }
 }
 

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH v2 3/4] target/ppc: consolidate CPU device-tree id computation in helper
  2017-05-19 10:31 [Qemu-devel] [PATCH v2 0/4] spapr/xics: fix migration of older machine types Greg Kurz
  2017-05-19 10:32 ` [Qemu-devel] [PATCH v2 1/4] spapr_cpu_core: drop reference on ICP object during CPU realization Greg Kurz
  2017-05-19 10:32 ` [Qemu-devel] [PATCH v2 2/4] spapr: fix error reporting in xics_system_init() Greg Kurz
@ 2017-05-19 10:32 ` Greg Kurz
  2017-05-22  2:04   ` David Gibson
  2017-05-19 10:32 ` [Qemu-devel] [PATCH v2 4/4] spapr: fix migration of ICP objects from/to older QEMU Greg Kurz
  3 siblings, 1 reply; 25+ messages in thread
From: Greg Kurz @ 2017-05-19 10:32 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel
  Cc: Laurent Vivier, Bharata B Rao, Cedric Le Goater, David Gibson

For historical reasons, we compute CPU device-tree ids with a non-trivial
logic. This patch consolidate the logic in a single helper to be used
in various places where it is currently open-coded.

It is okay to get rid of DIV_ROUND_UP() because we're sure that the number
of threads per core in the guest cannot exceed the number of threads per
core in the host.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
 hw/ppc/spapr.c              |    6 ++----
 target/ppc/cpu.h            |   17 +++++++++++++++++
 target/ppc/translate_init.c |    3 +--
 3 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 75e298b4c6be..1bb05a9a6b07 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -981,7 +981,6 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
     void *fdt;
     sPAPRPHBState *phb;
     char *buf;
-    int smt = kvmppc_smt_threads();
 
     fdt = g_malloc0(FDT_MAX_SIZE);
     _FDT((fdt_create_empty_tree(fdt, FDT_MAX_SIZE)));
@@ -1021,7 +1020,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
     _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
 
     /* /interrupt controller */
-    spapr_dt_xics(DIV_ROUND_UP(max_cpus * smt, smp_threads), fdt, PHANDLE_XICP);
+    spapr_dt_xics(ppc_cpu_dt_id_from_index(max_cpus), fdt, PHANDLE_XICP);
 
     ret = spapr_populate_memory(spapr, fdt);
     if (ret < 0) {
@@ -1977,7 +1976,6 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
     MachineState *machine = MACHINE(spapr);
     MachineClass *mc = MACHINE_GET_CLASS(machine);
     char *type = spapr_get_cpu_core_type(machine->cpu_model);
-    int smt = kvmppc_smt_threads();
     const CPUArchIdList *possible_cpus;
     int boot_cores_nr = smp_cpus / smp_threads;
     int i;
@@ -2014,7 +2012,7 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
             sPAPRDRConnector *drc =
                 spapr_dr_connector_new(OBJECT(spapr),
                                        SPAPR_DR_CONNECTOR_TYPE_CPU,
-                                       (core_id / smp_threads) * smt);
+                                       ppc_cpu_dt_id_from_index(core_id));
 
             qemu_register_reset(spapr_drc_reset, drc);
         }
diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 401e10e7dad8..47fe6c64698f 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -2529,4 +2529,21 @@ int ppc_get_vcpu_dt_id(PowerPCCPU *cpu);
 PowerPCCPU *ppc_get_vcpu_by_dt_id(int cpu_dt_id);
 
 void ppc_maybe_bswap_register(CPUPPCState *env, uint8_t *mem_buf, int len);
+
+#if !defined(CONFIG_USER_ONLY)
+#include "sysemu/cpus.h"
+#include "target/ppc/kvm_ppc.h"
+
+static inline int ppc_cpu_dt_id_from_index(int cpu_index)
+{
+    /* POWER HV support has an historical limitation that different threads
+     * on a single core cannot be in different guests at the same time. In
+     * order to allow KVM to assign guest threads to host cores accordingly,
+     * CPU device tree ids are spaced by the number of threads per host cores.
+     */
+    return (cpu_index / smp_threads) * kvmppc_smt_threads()
+        + (cpu_index % smp_threads);
+}
+#endif
+
 #endif /* PPC_CPU_H */
diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
index 56a0ab22cfbe..837a9a496a65 100644
--- a/target/ppc/translate_init.c
+++ b/target/ppc/translate_init.c
@@ -9851,8 +9851,7 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
     }
 
 #if !defined(CONFIG_USER_ONLY)
-    cpu->cpu_dt_id = (cs->cpu_index / smp_threads) * max_smt
-        + (cs->cpu_index % smp_threads);
+    cpu->cpu_dt_id = ppc_cpu_dt_id_from_index(cs->cpu_index);
 
     if (kvm_enabled() && !kvm_vcpu_id_is_valid(cpu->cpu_dt_id)) {
         error_setg(errp, "Can't create CPU with id %d in KVM", cpu->cpu_dt_id);

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [Qemu-devel] [PATCH v2 4/4] spapr: fix migration of ICP objects from/to older QEMU
  2017-05-19 10:31 [Qemu-devel] [PATCH v2 0/4] spapr/xics: fix migration of older machine types Greg Kurz
                   ` (2 preceding siblings ...)
  2017-05-19 10:32 ` [Qemu-devel] [PATCH v2 3/4] target/ppc: consolidate CPU device-tree id computation in helper Greg Kurz
@ 2017-05-19 10:32 ` Greg Kurz
  2017-05-22  2:30   ` David Gibson
  3 siblings, 1 reply; 25+ messages in thread
From: Greg Kurz @ 2017-05-19 10:32 UTC (permalink / raw)
  To: qemu-ppc, qemu-devel
  Cc: Laurent Vivier, Bharata B Rao, Cedric Le Goater, David Gibson

Commit 5bc8d26de20c ("spapr: allocate the ICPState object from under
sPAPRCPUCore") moved ICP objects from the machine to CPU cores. This
is an improvement since we no longer allocate ICP objects that will
never be used. But it has the side-effect of breaking migration of
older machine types from older QEMU versions.

This patch introduces a compat flag in the sPAPR machine class so
that all pseries machine up to 2.9 go on with the previous behavior
of pre-allocating ICP objects.

Signed-off-by: Greg Kurz <groug@kaod.org>
---
v2: - s/void* /void * in xics_system_init()
    - don't use "[*]" in the ICP object name
    - use pre_2_10_ prefix in field names
    - added xics_nr_servers() helper
---
 hw/ppc/spapr.c          |   40 +++++++++++++++++++++++++++++++++++++++-
 hw/ppc/spapr_cpu_core.c |   29 ++++++++++++++++++++---------
 include/hw/ppc/spapr.h  |    2 ++
 3 files changed, 61 insertions(+), 10 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 1bb05a9a6b07..182262257c60 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -123,9 +123,15 @@ error:
     return NULL;
 }
 
+static inline int xics_nr_servers(void)
+{
+    return ppc_cpu_dt_id_from_index(max_cpus);
+}
+
 static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
 {
     sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
+    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
 
     if (kvm_enabled()) {
         if (machine_kernel_irqchip_allowed(machine) &&
@@ -147,6 +153,35 @@ static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
             return;
         }
     }
+
+    if (smc->pre_2_10_icp_allocation) {
+        int nr_servers = xics_nr_servers();
+        Error *local_err = NULL;
+        int i;
+
+        spapr->pre_2_10_icps = g_malloc0(nr_servers * sizeof(ICPState));
+
+        for (i = 0; i < nr_servers; i++) {
+            void *obj = &spapr->pre_2_10_icps[i];
+            char *name = g_strdup_printf("icp[%d]", i);
+
+            object_initialize(obj, sizeof(ICPState), spapr->icp_type);
+            object_property_add_child(OBJECT(spapr), name, obj, &error_abort);
+            g_free(name);
+            object_unref(obj);
+            object_property_add_const_link(obj, "xics", OBJECT(spapr),
+                                           &error_abort);
+            object_property_set_bool(obj, true, "realized", &local_err);
+            if (local_err) {
+                while (i--) {
+                    object_unparent(obj);
+                }
+                g_free(spapr->pre_2_10_icps);
+                error_propagate(errp, local_err);
+                break;
+            }
+        }
+    }
 }
 
 static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, PowerPCCPU *cpu,
@@ -1020,7 +1055,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
     _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
 
     /* /interrupt controller */
-    spapr_dt_xics(ppc_cpu_dt_id_from_index(max_cpus), fdt, PHANDLE_XICP);
+    spapr_dt_xics(xics_nr_servers(), fdt, PHANDLE_XICP);
 
     ret = spapr_populate_memory(spapr, fdt);
     if (ret < 0) {
@@ -3286,9 +3321,12 @@ static void spapr_machine_2_9_instance_options(MachineState *machine)
 
 static void spapr_machine_2_9_class_options(MachineClass *mc)
 {
+    sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
+
     spapr_machine_2_10_class_options(mc);
     SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_9);
     mc->numa_auto_assign_ram = numa_legacy_auto_assign_ram;
+    smc->pre_2_10_icp_allocation = true;
 }
 
 DEFINE_SPAPR_MACHINE(2_9, "2.9", false);
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index ff7058ecc00e..13c4916aa5e6 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -119,6 +119,7 @@ static void spapr_cpu_core_unrealizefn(DeviceState *dev, Error **errp)
     size_t size = object_type_get_instance_size(typename);
     CPUCore *cc = CPU_CORE(dev);
     int i;
+    sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
 
     for (i = 0; i < cc->nr_threads; i++) {
         void *obj = sc->threads + i * size;
@@ -127,7 +128,9 @@ static void spapr_cpu_core_unrealizefn(DeviceState *dev, Error **errp)
         PowerPCCPU *cpu = POWERPC_CPU(cs);
 
         spapr_cpu_destroy(cpu);
-        object_unparent(cpu->intc);
+        if (!spapr->pre_2_10_icps) {
+            object_unparent(cpu->intc);
+        }
         cpu_remove_sync(cs);
         object_unparent(obj);
     }
@@ -142,13 +145,19 @@ static void spapr_cpu_core_realize_child(Object *child, Error **errp)
     PowerPCCPU *cpu = POWERPC_CPU(cs);
     Object *obj;
 
-    obj = object_new(spapr->icp_type);
-    object_property_add_child(OBJECT(cpu), "icp", obj, &error_abort);
-    object_unref(obj);
-    object_property_add_const_link(obj, "xics", OBJECT(spapr), &error_abort);
-    object_property_set_bool(obj, true, "realized", &local_err);
-    if (local_err) {
-        goto error;
+    if (spapr->pre_2_10_icps) {
+        int index = cpu->parent_obj.cpu_index;
+
+        obj = OBJECT(&spapr->pre_2_10_icps[index]);
+    } else {
+        obj = object_new(spapr->icp_type);
+        object_property_add_child(OBJECT(cpu), "icp", obj, &error_abort);
+        object_property_add_const_link(obj, "xics", OBJECT(spapr),
+                                       &error_abort);
+        object_property_set_bool(obj, true, "realized", &local_err);
+        if (local_err) {
+            goto error;
+        }
     }
 
     object_property_set_bool(child, true, "realized", &local_err);
@@ -165,7 +174,9 @@ static void spapr_cpu_core_realize_child(Object *child, Error **errp)
     return;
 
 error:
-    object_unparent(obj);
+    if (!spapr->pre_2_10_icps) {
+        object_unparent(obj);
+    }
     error_propagate(errp, local_err);
 }
 
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index f875dc41d811..d1dcf0c8bddf 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -54,6 +54,7 @@ struct sPAPRMachineClass {
     bool use_ohci_by_default;  /* use USB-OHCI instead of XHCI */
     bool pre_2_9_cas_pvr;      /* Use old logic for PVR compat negotiation */
     const char *tcg_default_cpu; /* which (TCG) CPU to simulate by default */
+    bool pre_2_10_icp_allocation;
     void (*phb_placement)(sPAPRMachineState *spapr, uint32_t index,
                           uint64_t *buid, hwaddr *pio, 
                           hwaddr *mmio32, hwaddr *mmio64,
@@ -110,6 +111,7 @@ struct sPAPRMachineState {
     MemoryHotplugState hotplug_memory;
 
     const char *icp_type;
+    ICPState *pre_2_10_icps;
 };
 
 #define H_SUCCESS         0

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 1/4] spapr_cpu_core: drop reference on ICP object during CPU realization
  2017-05-19 10:32 ` [Qemu-devel] [PATCH v2 1/4] spapr_cpu_core: drop reference on ICP object during CPU realization Greg Kurz
@ 2017-05-20  6:40   ` David Gibson
  0 siblings, 0 replies; 25+ messages in thread
From: David Gibson @ 2017-05-20  6:40 UTC (permalink / raw)
  To: Greg Kurz
  Cc: qemu-ppc, qemu-devel, Laurent Vivier, Bharata B Rao,
	Cedric Le Goater

[-- Attachment #1: Type: text/plain, Size: 1591 bytes --]

On Fri, May 19, 2017 at 12:32:04PM +0200, Greg Kurz wrote:
> When a piece of code allocates an object, it implicitely gets a reference
> on it. If it then makes that object a child property of another object, it
> should drop its own reference at some point otherwise the child object can
> never be finalized. The current code hence leaks one ICP object per CPU
> when hot-removing a core.
> 
> Failing to add a newly allocated ICP object to the CPU is a bug. While here,
> let's ensure QEMU aborts if this ever happens.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>

Applied to ppc-for-2.10.

> ---
>  hw/ppc/spapr_cpu_core.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> index 1df1404ea52d..ff7058ecc00e 100644
> --- a/hw/ppc/spapr_cpu_core.c
> +++ b/hw/ppc/spapr_cpu_core.c
> @@ -143,7 +143,8 @@ static void spapr_cpu_core_realize_child(Object *child, Error **errp)
>      Object *obj;
>  
>      obj = object_new(spapr->icp_type);
> -    object_property_add_child(OBJECT(cpu), "icp", obj, NULL);
> +    object_property_add_child(OBJECT(cpu), "icp", obj, &error_abort);
> +    object_unref(obj);
>      object_property_add_const_link(obj, "xics", OBJECT(spapr), &error_abort);
>      object_property_set_bool(obj, true, "realized", &local_err);
>      if (local_err) {
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/4] spapr: fix error reporting in xics_system_init()
  2017-05-19 10:32 ` [Qemu-devel] [PATCH v2 2/4] spapr: fix error reporting in xics_system_init() Greg Kurz
@ 2017-05-20  6:45   ` David Gibson
  2017-05-21 17:03     ` Greg Kurz
  0 siblings, 1 reply; 25+ messages in thread
From: David Gibson @ 2017-05-20  6:45 UTC (permalink / raw)
  To: Greg Kurz
  Cc: qemu-ppc, qemu-devel, Laurent Vivier, Bharata B Rao,
	Cedric Le Goater

[-- Attachment #1: Type: text/plain, Size: 2591 bytes --]

On Fri, May 19, 2017 at 12:32:12PM +0200, Greg Kurz wrote:
> If the user explicitely asked for kernel-irqchip support and "xics-kvm"
> initialization fails, we shouldn't fallback to emulated "xics" as we
> do now. It is also awkward to print an error message when we have an
> errp pointer argument.
> 
> Let's use the errp argument to report the error and let the caller decide.
> This simplifies the code as we don't need a local Error * here.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>

Concept looks good, but..

> ---
> v2: - total rewrite
> ---
>  hw/ppc/spapr.c |   13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 91f7434861a8..75e298b4c6be 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -128,18 +128,14 @@ static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
>      sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
>  
>      if (kvm_enabled()) {
> -        Error *err = NULL;
> -
>          if (machine_kernel_irqchip_allowed(machine) &&
>              !xics_kvm_init(spapr, errp)) {
>              spapr->icp_type = TYPE_KVM_ICP;
> -            spapr->ics = spapr_ics_create(spapr, TYPE_ICS_KVM, nr_irqs, &err);
> +            spapr->ics = spapr_ics_create(spapr, TYPE_ICS_KVM, nr_irqs, errp);

I believe there are reasons you're not supposed to just pass an errp
through to a subordinate function.  Instead you're supposed to have a
local Error * and use error_propagate().

>          }
>          if (machine_kernel_irqchip_required(machine) && !spapr->ics) {
> -            error_reportf_err(err,
> -                              "kernel_irqchip requested but unavailable: ");
> -        } else {
> -            error_free(err);
> +            error_prepend(errp, "kernel_irqchip requested but unavailable: ");
> +            return;
>          }
>      }
>  
> @@ -147,6 +143,9 @@ static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
>          xics_spapr_init(spapr);
>          spapr->icp_type = TYPE_ICP;
>          spapr->ics = spapr_ics_create(spapr, TYPE_ICS_SIMPLE, nr_irqs, errp);
> +        if (!spapr->ics) {

It would also be more standard to check the returned error, rather
than the other result.

> +            return;
> +        }
>      }
>  }
>  
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/4] spapr: fix error reporting in xics_system_init()
  2017-05-20  6:45   ` David Gibson
@ 2017-05-21 17:03     ` Greg Kurz
  2017-05-22  1:26       ` David Gibson
  2017-05-22  7:41       ` Markus Armbruster
  0 siblings, 2 replies; 25+ messages in thread
From: Greg Kurz @ 2017-05-21 17:03 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-ppc, qemu-devel, Laurent Vivier, Bharata B Rao,
	Cedric Le Goater

[-- Attachment #1: Type: text/plain, Size: 2914 bytes --]

On Sat, 20 May 2017 16:45:09 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Fri, May 19, 2017 at 12:32:12PM +0200, Greg Kurz wrote:
> > If the user explicitely asked for kernel-irqchip support and "xics-kvm"
> > initialization fails, we shouldn't fallback to emulated "xics" as we
> > do now. It is also awkward to print an error message when we have an
> > errp pointer argument.
> > 
> > Let's use the errp argument to report the error and let the caller decide.
> > This simplifies the code as we don't need a local Error * here.
> > 
> > Signed-off-by: Greg Kurz <groug@kaod.org>  
> 
> Concept looks good, but..
> 
> > ---
> > v2: - total rewrite
> > ---
> >  hw/ppc/spapr.c |   13 ++++++-------
> >  1 file changed, 6 insertions(+), 7 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 91f7434861a8..75e298b4c6be 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -128,18 +128,14 @@ static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
> >      sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
> >  
> >      if (kvm_enabled()) {
> > -        Error *err = NULL;
> > -
> >          if (machine_kernel_irqchip_allowed(machine) &&
> >              !xics_kvm_init(spapr, errp)) {
> >              spapr->icp_type = TYPE_KVM_ICP;
> > -            spapr->ics = spapr_ics_create(spapr, TYPE_ICS_KVM, nr_irqs, &err);
> > +            spapr->ics = spapr_ics_create(spapr, TYPE_ICS_KVM, nr_irqs, errp);  
> 
> I believe there are reasons you're not supposed to just pass an errp
> through to a subordinate function.  Instead you're supposed to have a
> local Error * and use error_propagate().
> 

You only need to have a local Error * if it is used to check the return status
of the function (ie, you cannot check *errp because errp could be NULL) as
described in error.h. This isn't the case here but...

> >          }
> >          if (machine_kernel_irqchip_required(machine) && !spapr->ics) {
> > -            error_reportf_err(err,
> > -                              "kernel_irqchip requested but unavailable: ");
> > -        } else {
> > -            error_free(err);
> > +            error_prepend(errp, "kernel_irqchip requested but unavailable: ");
> > +            return;
> >          }
> >      }
> >  
> > @@ -147,6 +143,9 @@ static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
> >          xics_spapr_init(spapr);
> >          spapr->icp_type = TYPE_ICP;
> >          spapr->ics = spapr_ics_create(spapr, TYPE_ICS_SIMPLE, nr_irqs, errp);
> > +        if (!spapr->ics) {  
> 
> It would also be more standard to check the returned error, rather
> than the other result.
> 

... if you prefer to use a local Error *, I'll gladly do that. :)

> > +            return;
> > +        }
> >      }
> >  }
> >  
> >   
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/4] spapr: fix error reporting in xics_system_init()
  2017-05-21 17:03     ` Greg Kurz
@ 2017-05-22  1:26       ` David Gibson
  2017-05-22  7:41       ` Markus Armbruster
  1 sibling, 0 replies; 25+ messages in thread
From: David Gibson @ 2017-05-22  1:26 UTC (permalink / raw)
  To: Greg Kurz
  Cc: qemu-ppc, qemu-devel, Laurent Vivier, Bharata B Rao,
	Cedric Le Goater

[-- Attachment #1: Type: text/plain, Size: 3378 bytes --]

On Sun, May 21, 2017 at 07:03:33PM +0200, Greg Kurz wrote:
> On Sat, 20 May 2017 16:45:09 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Fri, May 19, 2017 at 12:32:12PM +0200, Greg Kurz wrote:
> > > If the user explicitely asked for kernel-irqchip support and "xics-kvm"
> > > initialization fails, we shouldn't fallback to emulated "xics" as we
> > > do now. It is also awkward to print an error message when we have an
> > > errp pointer argument.
> > > 
> > > Let's use the errp argument to report the error and let the caller decide.
> > > This simplifies the code as we don't need a local Error * here.
> > > 
> > > Signed-off-by: Greg Kurz <groug@kaod.org>  
> > 
> > Concept looks good, but..
> > 
> > > ---
> > > v2: - total rewrite
> > > ---
> > >  hw/ppc/spapr.c |   13 ++++++-------
> > >  1 file changed, 6 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index 91f7434861a8..75e298b4c6be 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -128,18 +128,14 @@ static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
> > >      sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
> > >  
> > >      if (kvm_enabled()) {
> > > -        Error *err = NULL;
> > > -
> > >          if (machine_kernel_irqchip_allowed(machine) &&
> > >              !xics_kvm_init(spapr, errp)) {
> > >              spapr->icp_type = TYPE_KVM_ICP;
> > > -            spapr->ics = spapr_ics_create(spapr, TYPE_ICS_KVM, nr_irqs, &err);
> > > +            spapr->ics = spapr_ics_create(spapr, TYPE_ICS_KVM, nr_irqs, errp);  
> > 
> > I believe there are reasons you're not supposed to just pass an errp
> > through to a subordinate function.  Instead you're supposed to have a
> > local Error * and use error_propagate().
> > 
> 
> You only need to have a local Error * if it is used to check the return status
> of the function (ie, you cannot check *errp because errp could be NULL) as
> described in error.h. This isn't the case here but...

Fair point; patch applied to ppc-for-2.10.

> 
> > >          }
> > >          if (machine_kernel_irqchip_required(machine) && !spapr->ics) {
> > > -            error_reportf_err(err,
> > > -                              "kernel_irqchip requested but unavailable: ");
> > > -        } else {
> > > -            error_free(err);
> > > +            error_prepend(errp, "kernel_irqchip requested but unavailable: ");
> > > +            return;
> > >          }
> > >      }
> > >  
> > > @@ -147,6 +143,9 @@ static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
> > >          xics_spapr_init(spapr);
> > >          spapr->icp_type = TYPE_ICP;
> > >          spapr->ics = spapr_ics_create(spapr, TYPE_ICS_SIMPLE, nr_irqs, errp);
> > > +        if (!spapr->ics) {  
> > 
> > It would also be more standard to check the returned error, rather
> > than the other result.
> > 
> 
> ... if you prefer to use a local Error *, I'll gladly do that. :)
> 
> > > +            return;
> > > +        }
> > >      }
> > >  }
> > >  
> > >   
> > 
> 



-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/4] target/ppc: consolidate CPU device-tree id computation in helper
  2017-05-19 10:32 ` [Qemu-devel] [PATCH v2 3/4] target/ppc: consolidate CPU device-tree id computation in helper Greg Kurz
@ 2017-05-22  2:04   ` David Gibson
  2017-05-22  2:12     ` David Gibson
  2017-05-22  8:59     ` Greg Kurz
  0 siblings, 2 replies; 25+ messages in thread
From: David Gibson @ 2017-05-22  2:04 UTC (permalink / raw)
  To: Greg Kurz
  Cc: qemu-ppc, qemu-devel, Laurent Vivier, Bharata B Rao,
	Cedric Le Goater

[-- Attachment #1: Type: text/plain, Size: 5026 bytes --]

On Fri, May 19, 2017 at 12:32:20PM +0200, Greg Kurz wrote:
> For historical reasons, we compute CPU device-tree ids with a non-trivial
> logic. This patch consolidate the logic in a single helper to be used
> in various places where it is currently open-coded.
> 
> It is okay to get rid of DIV_ROUND_UP() because we're sure that the number
> of threads per core in the guest cannot exceed the number of threads per
> core in the host.

However, your new logic still gives different answers in some cases.
In particular when max_cpus is not a multiple of smp_threads.  Which
is generally a bad idea, but allowed for older machine types for
compatibility.   e.g. smp_threads=4, max_cpus=6 smt=8

Old logic:
	         DIV_ROUND_UP(6 * 8, 4)
	       = ⌈48 / 4⌉ = 12

New logic gives: ⌊6 / 4⌋ * 8 + (6 % 4)
               = 1 * 8 + 2
	       = 10

In any case the DIV_ROUND_UP() isn't to handle the case where guest
threads-per-core is bigger than host threads-per-core, it's (IIRC) for
the case where guest threads-per-core is not a factor of host
threads-per-core.  Again, a bad idea, but I think allowed in some old
cases.

> 
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
>  hw/ppc/spapr.c              |    6 ++----
>  target/ppc/cpu.h            |   17 +++++++++++++++++
>  target/ppc/translate_init.c |    3 +--
>  3 files changed, 20 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 75e298b4c6be..1bb05a9a6b07 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -981,7 +981,6 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
>      void *fdt;
>      sPAPRPHBState *phb;
>      char *buf;
> -    int smt = kvmppc_smt_threads();
>  
>      fdt = g_malloc0(FDT_MAX_SIZE);
>      _FDT((fdt_create_empty_tree(fdt, FDT_MAX_SIZE)));
> @@ -1021,7 +1020,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
>      _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
>  
>      /* /interrupt controller */
> -    spapr_dt_xics(DIV_ROUND_UP(max_cpus * smt, smp_threads), fdt, PHANDLE_XICP);
> +    spapr_dt_xics(ppc_cpu_dt_id_from_index(max_cpus), fdt, PHANDLE_XICP);
>  
>      ret = spapr_populate_memory(spapr, fdt);
>      if (ret < 0) {
> @@ -1977,7 +1976,6 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
>      MachineState *machine = MACHINE(spapr);
>      MachineClass *mc = MACHINE_GET_CLASS(machine);
>      char *type = spapr_get_cpu_core_type(machine->cpu_model);
> -    int smt = kvmppc_smt_threads();
>      const CPUArchIdList *possible_cpus;
>      int boot_cores_nr = smp_cpus / smp_threads;
>      int i;
> @@ -2014,7 +2012,7 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
>              sPAPRDRConnector *drc =
>                  spapr_dr_connector_new(OBJECT(spapr),
>                                         SPAPR_DR_CONNECTOR_TYPE_CPU,
> -                                       (core_id / smp_threads) * smt);
> +                                       ppc_cpu_dt_id_from_index(core_id));
>  
>              qemu_register_reset(spapr_drc_reset, drc);
>          }
> diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> index 401e10e7dad8..47fe6c64698f 100644
> --- a/target/ppc/cpu.h
> +++ b/target/ppc/cpu.h
> @@ -2529,4 +2529,21 @@ int ppc_get_vcpu_dt_id(PowerPCCPU *cpu);
>  PowerPCCPU *ppc_get_vcpu_by_dt_id(int cpu_dt_id);
>  
>  void ppc_maybe_bswap_register(CPUPPCState *env, uint8_t *mem_buf, int len);
> +
> +#if !defined(CONFIG_USER_ONLY)
> +#include "sysemu/cpus.h"
> +#include "target/ppc/kvm_ppc.h"
> +
> +static inline int ppc_cpu_dt_id_from_index(int cpu_index)
> +{
> +    /* POWER HV support has an historical limitation that different threads
> +     * on a single core cannot be in different guests at the same time. In
> +     * order to allow KVM to assign guest threads to host cores accordingly,
> +     * CPU device tree ids are spaced by the number of threads per host cores.
> +     */
> +    return (cpu_index / smp_threads) * kvmppc_smt_threads()
> +        + (cpu_index % smp_threads);
> +}
> +#endif
> +
>  #endif /* PPC_CPU_H */
> diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
> index 56a0ab22cfbe..837a9a496a65 100644
> --- a/target/ppc/translate_init.c
> +++ b/target/ppc/translate_init.c
> @@ -9851,8 +9851,7 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
>      }
>  
>  #if !defined(CONFIG_USER_ONLY)
> -    cpu->cpu_dt_id = (cs->cpu_index / smp_threads) * max_smt
> -        + (cs->cpu_index % smp_threads);
> +    cpu->cpu_dt_id = ppc_cpu_dt_id_from_index(cs->cpu_index);
>  
>      if (kvm_enabled() && !kvm_vcpu_id_is_valid(cpu->cpu_dt_id)) {
>          error_setg(errp, "Can't create CPU with id %d in KVM", cpu->cpu_dt_id);
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/4] target/ppc: consolidate CPU device-tree id computation in helper
  2017-05-22  2:04   ` David Gibson
@ 2017-05-22  2:12     ` David Gibson
  2017-05-22  9:09       ` Greg Kurz
  2017-05-22 14:33       ` Greg Kurz
  2017-05-22  8:59     ` Greg Kurz
  1 sibling, 2 replies; 25+ messages in thread
From: David Gibson @ 2017-05-22  2:12 UTC (permalink / raw)
  To: Greg Kurz
  Cc: qemu-ppc, qemu-devel, Laurent Vivier, Bharata B Rao,
	Cedric Le Goater

[-- Attachment #1: Type: text/plain, Size: 5675 bytes --]

On Mon, May 22, 2017 at 12:04:13PM +1000, David Gibson wrote:
> On Fri, May 19, 2017 at 12:32:20PM +0200, Greg Kurz wrote:
> > For historical reasons, we compute CPU device-tree ids with a non-trivial
> > logic. This patch consolidate the logic in a single helper to be used
> > in various places where it is currently open-coded.
> > 
> > It is okay to get rid of DIV_ROUND_UP() because we're sure that the number
> > of threads per core in the guest cannot exceed the number of threads per
> > core in the host.
> 
> However, your new logic still gives different answers in some cases.
> In particular when max_cpus is not a multiple of smp_threads.  Which
> is generally a bad idea, but allowed for older machine types for
> compatibility.   e.g. smp_threads=4, max_cpus=6 smt=8
> 
> Old logic:
> 	         DIV_ROUND_UP(6 * 8, 4)
> 	       = ⌈48 / 4⌉ = 12
> 
> New logic gives: ⌊6 / 4⌋ * 8 + (6 % 4)
>                = 1 * 8 + 2
> 	       = 10
> 
> In any case the DIV_ROUND_UP() isn't to handle the case where guest
> threads-per-core is bigger than host threads-per-core, it's (IIRC) for
> the case where guest threads-per-core is not a factor of host
> threads-per-core.  Again, a bad idea, but I think allowed in some old
> cases.

Oh, so, the other more general point here is that I actually want to
get rid of dt_id from the cpu structure.  It's basically an abuse of
the cpu stuff to include what's really an spapr concept - dt IDs for
powernv are based on the PIR and not allocate the same way.

That said, I'm still ok with a fixed version of this patch as an
interim step.

> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> >  hw/ppc/spapr.c              |    6 ++----
> >  target/ppc/cpu.h            |   17 +++++++++++++++++
> >  target/ppc/translate_init.c |    3 +--
> >  3 files changed, 20 insertions(+), 6 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 75e298b4c6be..1bb05a9a6b07 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -981,7 +981,6 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
> >      void *fdt;
> >      sPAPRPHBState *phb;
> >      char *buf;
> > -    int smt = kvmppc_smt_threads();
> >  
> >      fdt = g_malloc0(FDT_MAX_SIZE);
> >      _FDT((fdt_create_empty_tree(fdt, FDT_MAX_SIZE)));
> > @@ -1021,7 +1020,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
> >      _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
> >  
> >      /* /interrupt controller */
> > -    spapr_dt_xics(DIV_ROUND_UP(max_cpus * smt, smp_threads), fdt, PHANDLE_XICP);
> > +    spapr_dt_xics(ppc_cpu_dt_id_from_index(max_cpus), fdt, PHANDLE_XICP);
> >  
> >      ret = spapr_populate_memory(spapr, fdt);
> >      if (ret < 0) {
> > @@ -1977,7 +1976,6 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
> >      MachineState *machine = MACHINE(spapr);
> >      MachineClass *mc = MACHINE_GET_CLASS(machine);
> >      char *type = spapr_get_cpu_core_type(machine->cpu_model);
> > -    int smt = kvmppc_smt_threads();
> >      const CPUArchIdList *possible_cpus;
> >      int boot_cores_nr = smp_cpus / smp_threads;
> >      int i;
> > @@ -2014,7 +2012,7 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
> >              sPAPRDRConnector *drc =
> >                  spapr_dr_connector_new(OBJECT(spapr),
> >                                         SPAPR_DR_CONNECTOR_TYPE_CPU,
> > -                                       (core_id / smp_threads) * smt);
> > +                                       ppc_cpu_dt_id_from_index(core_id));
> >  
> >              qemu_register_reset(spapr_drc_reset, drc);
> >          }
> > diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> > index 401e10e7dad8..47fe6c64698f 100644
> > --- a/target/ppc/cpu.h
> > +++ b/target/ppc/cpu.h
> > @@ -2529,4 +2529,21 @@ int ppc_get_vcpu_dt_id(PowerPCCPU *cpu);
> >  PowerPCCPU *ppc_get_vcpu_by_dt_id(int cpu_dt_id);
> >  
> >  void ppc_maybe_bswap_register(CPUPPCState *env, uint8_t *mem_buf, int len);
> > +
> > +#if !defined(CONFIG_USER_ONLY)
> > +#include "sysemu/cpus.h"
> > +#include "target/ppc/kvm_ppc.h"
> > +
> > +static inline int ppc_cpu_dt_id_from_index(int cpu_index)
> > +{
> > +    /* POWER HV support has an historical limitation that different threads
> > +     * on a single core cannot be in different guests at the same time. In
> > +     * order to allow KVM to assign guest threads to host cores accordingly,
> > +     * CPU device tree ids are spaced by the number of threads per host cores.
> > +     */
> > +    return (cpu_index / smp_threads) * kvmppc_smt_threads()
> > +        + (cpu_index % smp_threads);
> > +}
> > +#endif
> > +
> >  #endif /* PPC_CPU_H */
> > diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
> > index 56a0ab22cfbe..837a9a496a65 100644
> > --- a/target/ppc/translate_init.c
> > +++ b/target/ppc/translate_init.c
> > @@ -9851,8 +9851,7 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
> >      }
> >  
> >  #if !defined(CONFIG_USER_ONLY)
> > -    cpu->cpu_dt_id = (cs->cpu_index / smp_threads) * max_smt
> > -        + (cs->cpu_index % smp_threads);
> > +    cpu->cpu_dt_id = ppc_cpu_dt_id_from_index(cs->cpu_index);
> >  
> >      if (kvm_enabled() && !kvm_vcpu_id_is_valid(cpu->cpu_dt_id)) {
> >          error_setg(errp, "Can't create CPU with id %d in KVM", cpu->cpu_dt_id);
> > 
> 



-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 4/4] spapr: fix migration of ICP objects from/to older QEMU
  2017-05-19 10:32 ` [Qemu-devel] [PATCH v2 4/4] spapr: fix migration of ICP objects from/to older QEMU Greg Kurz
@ 2017-05-22  2:30   ` David Gibson
  2017-05-22  7:20     ` Cédric Le Goater
  0 siblings, 1 reply; 25+ messages in thread
From: David Gibson @ 2017-05-22  2:30 UTC (permalink / raw)
  To: Greg Kurz
  Cc: qemu-ppc, qemu-devel, Laurent Vivier, Bharata B Rao,
	Cedric Le Goater

[-- Attachment #1: Type: text/plain, Size: 8423 bytes --]

On Fri, May 19, 2017 at 12:32:27PM +0200, Greg Kurz wrote:
> Commit 5bc8d26de20c ("spapr: allocate the ICPState object from under
> sPAPRCPUCore") moved ICP objects from the machine to CPU cores. This
> is an improvement since we no longer allocate ICP objects that will
> never be used. But it has the side-effect of breaking migration of
> older machine types from older QEMU versions.
> 
> This patch introduces a compat flag in the sPAPR machine class so
> that all pseries machine up to 2.9 go on with the previous behavior
> of pre-allocating ICP objects.
> 
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
> v2: - s/void* /void * in xics_system_init()
>     - don't use "[*]" in the ICP object name
>     - use pre_2_10_ prefix in field names
>     - added xics_nr_servers() helper
> ---
>  hw/ppc/spapr.c          |   40 +++++++++++++++++++++++++++++++++++++++-
>  hw/ppc/spapr_cpu_core.c |   29 ++++++++++++++++++++---------
>  include/hw/ppc/spapr.h  |    2 ++
>  3 files changed, 61 insertions(+), 10 deletions(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 1bb05a9a6b07..182262257c60 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -123,9 +123,15 @@ error:
>      return NULL;
>  }
>  
> +static inline int xics_nr_servers(void)
> +{
> +    return ppc_cpu_dt_id_from_index(max_cpus);
> +}
> +
>  static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
>  {
>      sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
> +    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
>  
>      if (kvm_enabled()) {
>          if (machine_kernel_irqchip_allowed(machine) &&
> @@ -147,6 +153,35 @@ static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
>              return;
>          }
>      }
> +
> +    if (smc->pre_2_10_icp_allocation) {
> +        int nr_servers = xics_nr_servers();
> +        Error *local_err = NULL;
> +        int i;
> +
> +        spapr->pre_2_10_icps = g_malloc0(nr_servers * sizeof(ICPState));
> +
> +        for (i = 0; i < nr_servers; i++) {
> +            void *obj = &spapr->pre_2_10_icps[i];
> +            char *name = g_strdup_printf("icp[%d]", i);
> +
> +            object_initialize(obj, sizeof(ICPState), spapr->icp_type);
> +            object_property_add_child(OBJECT(spapr), name, obj, &error_abort);
> +            g_free(name);
> +            object_unref(obj);
> +            object_property_add_const_link(obj, "xics", OBJECT(spapr),
> +                                           &error_abort);
> +            object_property_set_bool(obj, true, "realized", &local_err);
> +            if (local_err) {
> +                while (i--) {
> +                    object_unparent(obj);
> +                }
> +                g_free(spapr->pre_2_10_icps);
> +                error_propagate(errp, local_err);
> +                break;
> +            }
> +        }
> +    }
>  }
>  
>  static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, PowerPCCPU *cpu,
> @@ -1020,7 +1055,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
>      _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
>  
>      /* /interrupt controller */
> -    spapr_dt_xics(ppc_cpu_dt_id_from_index(max_cpus), fdt, PHANDLE_XICP);
> +    spapr_dt_xics(xics_nr_servers(), fdt, PHANDLE_XICP);
>  
>      ret = spapr_populate_memory(spapr, fdt);
>      if (ret < 0) {
> @@ -3286,9 +3321,12 @@ static void spapr_machine_2_9_instance_options(MachineState *machine)
>  
>  static void spapr_machine_2_9_class_options(MachineClass *mc)
>  {
> +    sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
> +
>      spapr_machine_2_10_class_options(mc);
>      SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_9);
>      mc->numa_auto_assign_ram = numa_legacy_auto_assign_ram;
> +    smc->pre_2_10_icp_allocation = true;
>  }
>  
>  DEFINE_SPAPR_MACHINE(2_9, "2.9", false);
> diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> index ff7058ecc00e..13c4916aa5e6 100644
> --- a/hw/ppc/spapr_cpu_core.c
> +++ b/hw/ppc/spapr_cpu_core.c
> @@ -119,6 +119,7 @@ static void spapr_cpu_core_unrealizefn(DeviceState *dev, Error **errp)
>      size_t size = object_type_get_instance_size(typename);
>      CPUCore *cc = CPU_CORE(dev);
>      int i;
> +    sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
>  
>      for (i = 0; i < cc->nr_threads; i++) {
>          void *obj = sc->threads + i * size;
> @@ -127,7 +128,9 @@ static void spapr_cpu_core_unrealizefn(DeviceState *dev, Error **errp)
>          PowerPCCPU *cpu = POWERPC_CPU(cs);
>  
>          spapr_cpu_destroy(cpu);
> -        object_unparent(cpu->intc);
> +        if (!spapr->pre_2_10_icps) {

Hrm.  I dislike code for the core object directly reaching into the
machine to check the compat flag here (and a bunch of other places
below).  I can think of a few possible ways of avoiding this:

1) The most direct is to make another compat flag in the cpu core
object, set by the machine.  Straightforward, but ugly.

2) Use a property to optionally pass a reference to the ICP array into
the core object.  If set it will give the cpu objects ICPs from that
array (compat mode), if not it will allocate them (new style mode).

3) (Preferred, if it works)  Always have the core allocate ICPs for
each CPU.  For compat mode instead of directly allocating ICPs, the
machine sets up an array of pointers to the existing ICPs for each
CPU.  The "extra" slots that don't have ICPs in new-style allocation
get references to dummy ICP objects (maybe even all the same one).
Only the dummy ICP(s) are allocated by the machine, the rest remain
owned by the cpu.


> +            object_unparent(cpu->intc);
> +        }
>          cpu_remove_sync(cs);
>          object_unparent(obj);
>      }
> @@ -142,13 +145,19 @@ static void spapr_cpu_core_realize_child(Object *child, Error **errp)
>      PowerPCCPU *cpu = POWERPC_CPU(cs);
>      Object *obj;
>  
> -    obj = object_new(spapr->icp_type);
> -    object_property_add_child(OBJECT(cpu), "icp", obj, &error_abort);
> -    object_unref(obj);
> -    object_property_add_const_link(obj, "xics", OBJECT(spapr), &error_abort);
> -    object_property_set_bool(obj, true, "realized", &local_err);
> -    if (local_err) {
> -        goto error;
> +    if (spapr->pre_2_10_icps) {
> +        int index = cpu->parent_obj.cpu_index;
> +
> +        obj = OBJECT(&spapr->pre_2_10_icps[index]);
> +    } else {
> +        obj = object_new(spapr->icp_type);
> +        object_property_add_child(OBJECT(cpu), "icp", obj, &error_abort);
> +        object_property_add_const_link(obj, "xics", OBJECT(spapr),
> +                                       &error_abort);
> +        object_property_set_bool(obj, true, "realized", &local_err);
> +        if (local_err) {
> +            goto error;
> +        }
>      }
>  
>      object_property_set_bool(child, true, "realized", &local_err);
> @@ -165,7 +174,9 @@ static void spapr_cpu_core_realize_child(Object *child, Error **errp)
>      return;
>  
>  error:
> -    object_unparent(obj);
> +    if (!spapr->pre_2_10_icps) {
> +        object_unparent(obj);
> +    }
>      error_propagate(errp, local_err);
>  }
>  
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index f875dc41d811..d1dcf0c8bddf 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -54,6 +54,7 @@ struct sPAPRMachineClass {
>      bool use_ohci_by_default;  /* use USB-OHCI instead of XHCI */
>      bool pre_2_9_cas_pvr;      /* Use old logic for PVR compat negotiation */
>      const char *tcg_default_cpu; /* which (TCG) CPU to simulate by default */
> +    bool pre_2_10_icp_allocation;
>      void (*phb_placement)(sPAPRMachineState *spapr, uint32_t index,
>                            uint64_t *buid, hwaddr *pio, 
>                            hwaddr *mmio32, hwaddr *mmio64,
> @@ -110,6 +111,7 @@ struct sPAPRMachineState {
>      MemoryHotplugState hotplug_memory;
>  
>      const char *icp_type;
> +    ICPState *pre_2_10_icps;
>  };
>  
>  #define H_SUCCESS         0
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 4/4] spapr: fix migration of ICP objects from/to older QEMU
  2017-05-22  2:30   ` David Gibson
@ 2017-05-22  7:20     ` Cédric Le Goater
  2017-05-22  9:15       ` David Gibson
  0 siblings, 1 reply; 25+ messages in thread
From: Cédric Le Goater @ 2017-05-22  7:20 UTC (permalink / raw)
  To: David Gibson, Greg Kurz
  Cc: qemu-ppc, qemu-devel, Laurent Vivier, Bharata B Rao

On 05/22/2017 04:30 AM, David Gibson wrote:
> On Fri, May 19, 2017 at 12:32:27PM +0200, Greg Kurz wrote:
>> Commit 5bc8d26de20c ("spapr: allocate the ICPState object from under
>> sPAPRCPUCore") moved ICP objects from the machine to CPU cores. This
>> is an improvement since we no longer allocate ICP objects that will
>> never be used. But it has the side-effect of breaking migration of
>> older machine types from older QEMU versions.
>>
>> This patch introduces a compat flag in the sPAPR machine class so
>> that all pseries machine up to 2.9 go on with the previous behavior
>> of pre-allocating ICP objects.
>>
>> Signed-off-by: Greg Kurz <groug@kaod.org>
>> ---
>> v2: - s/void* /void * in xics_system_init()
>>     - don't use "[*]" in the ICP object name
>>     - use pre_2_10_ prefix in field names
>>     - added xics_nr_servers() helper
>> ---
>>  hw/ppc/spapr.c          |   40 +++++++++++++++++++++++++++++++++++++++-
>>  hw/ppc/spapr_cpu_core.c |   29 ++++++++++++++++++++---------
>>  include/hw/ppc/spapr.h  |    2 ++
>>  3 files changed, 61 insertions(+), 10 deletions(-)
>>
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 1bb05a9a6b07..182262257c60 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -123,9 +123,15 @@ error:
>>      return NULL;
>>  }
>>  
>> +static inline int xics_nr_servers(void)
>> +{
>> +    return ppc_cpu_dt_id_from_index(max_cpus);
>> +}
>> +
>>  static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
>>  {
>>      sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
>> +    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
>>  
>>      if (kvm_enabled()) {
>>          if (machine_kernel_irqchip_allowed(machine) &&
>> @@ -147,6 +153,35 @@ static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
>>              return;
>>          }
>>      }
>> +
>> +    if (smc->pre_2_10_icp_allocation) {
>> +        int nr_servers = xics_nr_servers();
>> +        Error *local_err = NULL;
>> +        int i;
>> +
>> +        spapr->pre_2_10_icps = g_malloc0(nr_servers * sizeof(ICPState));
>> +
>> +        for (i = 0; i < nr_servers; i++) {
>> +            void *obj = &spapr->pre_2_10_icps[i];
>> +            char *name = g_strdup_printf("icp[%d]", i);
>> +
>> +            object_initialize(obj, sizeof(ICPState), spapr->icp_type);
>> +            object_property_add_child(OBJECT(spapr), name, obj, &error_abort);
>> +            g_free(name);
>> +            object_unref(obj);
>> +            object_property_add_const_link(obj, "xics", OBJECT(spapr),
>> +                                           &error_abort);
>> +            object_property_set_bool(obj, true, "realized", &local_err);
>> +            if (local_err) {
>> +                while (i--) {
>> +                    object_unparent(obj);
>> +                }
>> +                g_free(spapr->pre_2_10_icps);
>> +                error_propagate(errp, local_err);
>> +                break;
>> +            }
>> +        }
>> +    }
>>  }
>>  
>>  static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, PowerPCCPU *cpu,
>> @@ -1020,7 +1055,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
>>      _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
>>  
>>      /* /interrupt controller */
>> -    spapr_dt_xics(ppc_cpu_dt_id_from_index(max_cpus), fdt, PHANDLE_XICP);
>> +    spapr_dt_xics(xics_nr_servers(), fdt, PHANDLE_XICP);
>>  
>>      ret = spapr_populate_memory(spapr, fdt);
>>      if (ret < 0) {
>> @@ -3286,9 +3321,12 @@ static void spapr_machine_2_9_instance_options(MachineState *machine)
>>  
>>  static void spapr_machine_2_9_class_options(MachineClass *mc)
>>  {
>> +    sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
>> +
>>      spapr_machine_2_10_class_options(mc);
>>      SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_9);
>>      mc->numa_auto_assign_ram = numa_legacy_auto_assign_ram;
>> +    smc->pre_2_10_icp_allocation = true;
>>  }
>>  
>>  DEFINE_SPAPR_MACHINE(2_9, "2.9", false);
>> diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
>> index ff7058ecc00e..13c4916aa5e6 100644
>> --- a/hw/ppc/spapr_cpu_core.c
>> +++ b/hw/ppc/spapr_cpu_core.c
>> @@ -119,6 +119,7 @@ static void spapr_cpu_core_unrealizefn(DeviceState *dev, Error **errp)
>>      size_t size = object_type_get_instance_size(typename);
>>      CPUCore *cc = CPU_CORE(dev);
>>      int i;
>> +    sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
>>  
>>      for (i = 0; i < cc->nr_threads; i++) {
>>          void *obj = sc->threads + i * size;
>> @@ -127,7 +128,9 @@ static void spapr_cpu_core_unrealizefn(DeviceState *dev, Error **errp)
>>          PowerPCCPU *cpu = POWERPC_CPU(cs);
>>  
>>          spapr_cpu_destroy(cpu);
>> -        object_unparent(cpu->intc);
>> +        if (!spapr->pre_2_10_icps) {
> 
> Hrm.  I dislike code for the core object directly reaching into the
> machine to check the compat flag here (and a bunch of other places
> below).  I can think of a few possible ways of avoiding this:
> 
> 1) The most direct is to make another compat flag in the cpu core
> object, set by the machine.  Straightforward, but ugly.
> 
> 2) Use a property to optionally pass a reference to the ICP array into
> the core object.  If set it will give the cpu objects ICPs from that
> array (compat mode), if not it will allocate them (new style mode).

for 2) we can just use a object_property_add_const_link() like
we do to pass the 'xics' object which is needed by the ICSes.

> 3) (Preferred, if it works)  Always have the core allocate ICPs for
> each CPU.  For compat mode instead of directly allocating ICPs, the
> machine sets up an array of pointers to the existing ICPs for each
> CPU.  The "extra" slots that don't have ICPs in new-style allocation
> get references to dummy ICP objects (maybe even all the same one).
> Only the dummy ICP(s) are allocated by the machine, the rest remain
> owned by the cpu.

I like this solution too as it should isolate the compat handling under
the machine, maybe even in a single routine.

Thanks,

C. 

> 
> 
>> +            object_unparent(cpu->intc);
>> +        }
>>          cpu_remove_sync(cs);
>>          object_unparent(obj);
>>      }
>> @@ -142,13 +145,19 @@ static void spapr_cpu_core_realize_child(Object *child, Error **errp)
>>      PowerPCCPU *cpu = POWERPC_CPU(cs);
>>      Object *obj;
>>  
>> -    obj = object_new(spapr->icp_type);
>> -    object_property_add_child(OBJECT(cpu), "icp", obj, &error_abort);
>> -    object_unref(obj);
>> -    object_property_add_const_link(obj, "xics", OBJECT(spapr), &error_abort);
>> -    object_property_set_bool(obj, true, "realized", &local_err);
>> -    if (local_err) {
>> -        goto error;
>> +    if (spapr->pre_2_10_icps) {
>> +        int index = cpu->parent_obj.cpu_index;
>> +
>> +        obj = OBJECT(&spapr->pre_2_10_icps[index]);
>> +    } else {
>> +        obj = object_new(spapr->icp_type);
>> +        object_property_add_child(OBJECT(cpu), "icp", obj, &error_abort);
>> +        object_property_add_const_link(obj, "xics", OBJECT(spapr),
>> +                                       &error_abort);
>> +        object_property_set_bool(obj, true, "realized", &local_err);
>> +        if (local_err) {
>> +            goto error;
>> +        }
>>      }
>>  
>>      object_property_set_bool(child, true, "realized", &local_err);
>> @@ -165,7 +174,9 @@ static void spapr_cpu_core_realize_child(Object *child, Error **errp)
>>      return;
>>  
>>  error:
>> -    object_unparent(obj);
>> +    if (!spapr->pre_2_10_icps) {
>> +        object_unparent(obj);
>> +    }
>>      error_propagate(errp, local_err);
>>  }
>>  
>> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
>> index f875dc41d811..d1dcf0c8bddf 100644
>> --- a/include/hw/ppc/spapr.h
>> +++ b/include/hw/ppc/spapr.h
>> @@ -54,6 +54,7 @@ struct sPAPRMachineClass {
>>      bool use_ohci_by_default;  /* use USB-OHCI instead of XHCI */
>>      bool pre_2_9_cas_pvr;      /* Use old logic for PVR compat negotiation */
>>      const char *tcg_default_cpu; /* which (TCG) CPU to simulate by default */
>> +    bool pre_2_10_icp_allocation;
>>      void (*phb_placement)(sPAPRMachineState *spapr, uint32_t index,
>>                            uint64_t *buid, hwaddr *pio, 
>>                            hwaddr *mmio32, hwaddr *mmio64,
>> @@ -110,6 +111,7 @@ struct sPAPRMachineState {
>>      MemoryHotplugState hotplug_memory;
>>  
>>      const char *icp_type;
>> +    ICPState *pre_2_10_icps;
>>  };
>>  
>>  #define H_SUCCESS         0
>>
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/4] spapr: fix error reporting in xics_system_init()
  2017-05-21 17:03     ` Greg Kurz
  2017-05-22  1:26       ` David Gibson
@ 2017-05-22  7:41       ` Markus Armbruster
  2017-05-22  9:00         ` David Gibson
  1 sibling, 1 reply; 25+ messages in thread
From: Markus Armbruster @ 2017-05-22  7:41 UTC (permalink / raw)
  To: Greg Kurz
  Cc: David Gibson, Laurent Vivier, Cedric Le Goater, qemu-ppc,
	qemu-devel, Bharata B Rao

Greg Kurz <groug@kaod.org> writes:

> On Sat, 20 May 2017 16:45:09 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
>> On Fri, May 19, 2017 at 12:32:12PM +0200, Greg Kurz wrote:
>> > If the user explicitely asked for kernel-irqchip support and "xics-kvm"
>> > initialization fails, we shouldn't fallback to emulated "xics" as we
>> > do now. It is also awkward to print an error message when we have an
>> > errp pointer argument.
>> > 
>> > Let's use the errp argument to report the error and let the caller decide.
>> > This simplifies the code as we don't need a local Error * here.
>> > 
>> > Signed-off-by: Greg Kurz <groug@kaod.org>  
>> 
>> Concept looks good, but..
>> 
>> > ---
>> > v2: - total rewrite
>> > ---
>> >  hw/ppc/spapr.c |   13 ++++++-------
>> >  1 file changed, 6 insertions(+), 7 deletions(-)
>> > 
>> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> > index 91f7434861a8..75e298b4c6be 100644
>> > --- a/hw/ppc/spapr.c
>> > +++ b/hw/ppc/spapr.c
>> > @@ -128,18 +128,14 @@ static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
>> >      sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
>> >  
>> >      if (kvm_enabled()) {
>> > -        Error *err = NULL;
>> > -
>> >          if (machine_kernel_irqchip_allowed(machine) &&
>> >              !xics_kvm_init(spapr, errp)) {
>> >              spapr->icp_type = TYPE_KVM_ICP;
>> > -            spapr->ics = spapr_ics_create(spapr, TYPE_ICS_KVM, nr_irqs, &err);
>> > +            spapr->ics = spapr_ics_create(spapr, TYPE_ICS_KVM, nr_irqs, errp);  
>> 
>> I believe there are reasons you're not supposed to just pass an errp
>> through to a subordinate function.  Instead you're supposed to have a
>> local Error * and use error_propagate().
>> 
>
> You only need to have a local Error * if it is used to check the return status
> of the function (ie, you cannot check *errp because errp could be NULL) as
> described in error.h.

Correct.  Quote:

 * Receive an error and pass it on to the caller:
 *     Error *err = NULL;
 *     foo(arg, &err);
 *     if (err) {
 *         handle the error...
 *         error_propagate(errp, err);
 *     }
 * where Error **errp is a parameter, by convention the last one.
 *
 * Do *not* "optimize" this to
 *     foo(arg, errp);
 *     if (*errp) { // WRONG!
 *         handle the error...
 *     }
 * because errp may be NULL!
 *
 * But when all you do with the error is pass it on, please use
 *     foo(arg, errp);
 * for readability.

>                       This isn't the case here but...
>
>> >          }
>> >          if (machine_kernel_irqchip_required(machine) && !spapr->ics) {
>> > -            error_reportf_err(err,
>> > -                              "kernel_irqchip requested but unavailable: ");
>> > -        } else {
>> > -            error_free(err);
>> > +            error_prepend(errp, "kernel_irqchip requested but unavailable: ");
>> > +            return;
>> >          }
>> >      }
>> >  
>> > @@ -147,6 +143,9 @@ static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
>> >          xics_spapr_init(spapr);
>> >          spapr->icp_type = TYPE_ICP;
>> >          spapr->ics = spapr_ics_create(spapr, TYPE_ICS_SIMPLE, nr_irqs, errp);
>> > +        if (!spapr->ics) {  
>> 
>> It would also be more standard to check the returned error, rather
>> than the other result.
>> 
>
> ... if you prefer to use a local Error *, I'll gladly do that. :)

Opinions and practice vary on this one.

I prefer checking the return value because it lets me avoid the
error_propagate() boiler-plate more often.

Having both an Error parameter and an error return value begs the
question whether the two agree.

You can assert they do, but it's distracting.  We generally don't.

When there's no success value to transmit, you avoid the problem by
making the function return void.  We used to favor that, but it has
turned out not to be a success, because it leads to cumbersome code.
For what it's worth, GLib wants you to transmit success / failure in the
return value, too:

https://developer.gnome.org/glib/unstable/glib-Error-Reporting.html#gerror-rules

>> > +            return;
>> > +        }
>> >      }
>> >  }
>> >  
>> >   
>> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/4] target/ppc: consolidate CPU device-tree id computation in helper
  2017-05-22  2:04   ` David Gibson
  2017-05-22  2:12     ` David Gibson
@ 2017-05-22  8:59     ` Greg Kurz
  2017-05-22 13:13       ` [Qemu-devel] [Qemu-ppc] " Greg Kurz
  2017-05-23  2:35       ` [Qemu-devel] " David Gibson
  1 sibling, 2 replies; 25+ messages in thread
From: Greg Kurz @ 2017-05-22  8:59 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-ppc, qemu-devel, Laurent Vivier, Bharata B Rao,
	Cedric Le Goater

[-- Attachment #1: Type: text/plain, Size: 5519 bytes --]

On Mon, 22 May 2017 12:04:13 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Fri, May 19, 2017 at 12:32:20PM +0200, Greg Kurz wrote:
> > For historical reasons, we compute CPU device-tree ids with a non-trivial
> > logic. This patch consolidate the logic in a single helper to be used
> > in various places where it is currently open-coded.
> > 
> > It is okay to get rid of DIV_ROUND_UP() because we're sure that the number
> > of threads per core in the guest cannot exceed the number of threads per
> > core in the host.  
> 
> However, your new logic still gives different answers in some cases.
> In particular when max_cpus is not a multiple of smp_threads.  Which
> is generally a bad idea, but allowed for older machine types for
> compatibility.   e.g. smp_threads=4, max_cpus=6 smt=8
> 
> Old logic:
> 	         DIV_ROUND_UP(6 * 8, 4)
> 	       = ⌈48 / 4⌉ = 12
> 
> New logic gives: ⌊6 / 4⌋ * 8 + (6 % 4)
>                = 1 * 8 + 2
> 	       = 10
> 

I now realize this two formulas are hardly reconcilable... this
probably means that this patch shouldn't try to consolidate the
logic in hw/ppc/spapr.c with the one in target/ppc/translate_init.c.

> In any case the DIV_ROUND_UP() isn't to handle the case where guest
> threads-per-core is bigger than host threads-per-core, it's (IIRC) for
> the case where guest threads-per-core is not a factor of host
> threads-per-core.  Again, a bad idea, but I think allowed in some old
> cases.
> 

FWIW, DIV_ROUND_UP() comes from this commit:

f303f117fec3 spapr: ensure we have at least one XICS server

but I agree that this was a bad idea...

> > 
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> >  hw/ppc/spapr.c              |    6 ++----
> >  target/ppc/cpu.h            |   17 +++++++++++++++++
> >  target/ppc/translate_init.c |    3 +--
> >  3 files changed, 20 insertions(+), 6 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 75e298b4c6be..1bb05a9a6b07 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -981,7 +981,6 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
> >      void *fdt;
> >      sPAPRPHBState *phb;
> >      char *buf;
> > -    int smt = kvmppc_smt_threads();
> >  
> >      fdt = g_malloc0(FDT_MAX_SIZE);
> >      _FDT((fdt_create_empty_tree(fdt, FDT_MAX_SIZE)));
> > @@ -1021,7 +1020,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
> >      _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
> >  
> >      /* /interrupt controller */
> > -    spapr_dt_xics(DIV_ROUND_UP(max_cpus * smt, smp_threads), fdt, PHANDLE_XICP);
> > +    spapr_dt_xics(ppc_cpu_dt_id_from_index(max_cpus), fdt, PHANDLE_XICP);
> >  
> >      ret = spapr_populate_memory(spapr, fdt);
> >      if (ret < 0) {
> > @@ -1977,7 +1976,6 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
> >      MachineState *machine = MACHINE(spapr);
> >      MachineClass *mc = MACHINE_GET_CLASS(machine);
> >      char *type = spapr_get_cpu_core_type(machine->cpu_model);
> > -    int smt = kvmppc_smt_threads();
> >      const CPUArchIdList *possible_cpus;
> >      int boot_cores_nr = smp_cpus / smp_threads;
> >      int i;
> > @@ -2014,7 +2012,7 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
> >              sPAPRDRConnector *drc =
> >                  spapr_dr_connector_new(OBJECT(spapr),
> >                                         SPAPR_DR_CONNECTOR_TYPE_CPU,
> > -                                       (core_id / smp_threads) * smt);
> > +                                       ppc_cpu_dt_id_from_index(core_id));
> >  
> >              qemu_register_reset(spapr_drc_reset, drc);
> >          }
> > diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> > index 401e10e7dad8..47fe6c64698f 100644
> > --- a/target/ppc/cpu.h
> > +++ b/target/ppc/cpu.h
> > @@ -2529,4 +2529,21 @@ int ppc_get_vcpu_dt_id(PowerPCCPU *cpu);
> >  PowerPCCPU *ppc_get_vcpu_by_dt_id(int cpu_dt_id);
> >  
> >  void ppc_maybe_bswap_register(CPUPPCState *env, uint8_t *mem_buf, int len);
> > +
> > +#if !defined(CONFIG_USER_ONLY)
> > +#include "sysemu/cpus.h"
> > +#include "target/ppc/kvm_ppc.h"
> > +
> > +static inline int ppc_cpu_dt_id_from_index(int cpu_index)
> > +{
> > +    /* POWER HV support has an historical limitation that different threads
> > +     * on a single core cannot be in different guests at the same time. In
> > +     * order to allow KVM to assign guest threads to host cores accordingly,
> > +     * CPU device tree ids are spaced by the number of threads per host cores.
> > +     */
> > +    return (cpu_index / smp_threads) * kvmppc_smt_threads()
> > +        + (cpu_index % smp_threads);
> > +}
> > +#endif
> > +
> >  #endif /* PPC_CPU_H */
> > diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
> > index 56a0ab22cfbe..837a9a496a65 100644
> > --- a/target/ppc/translate_init.c
> > +++ b/target/ppc/translate_init.c
> > @@ -9851,8 +9851,7 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
> >      }
> >  
> >  #if !defined(CONFIG_USER_ONLY)
> > -    cpu->cpu_dt_id = (cs->cpu_index / smp_threads) * max_smt
> > -        + (cs->cpu_index % smp_threads);
> > +    cpu->cpu_dt_id = ppc_cpu_dt_id_from_index(cs->cpu_index);
> >  
> >      if (kvm_enabled() && !kvm_vcpu_id_is_valid(cpu->cpu_dt_id)) {
> >          error_setg(errp, "Can't create CPU with id %d in KVM", cpu->cpu_dt_id);
> >   
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 2/4] spapr: fix error reporting in xics_system_init()
  2017-05-22  7:41       ` Markus Armbruster
@ 2017-05-22  9:00         ` David Gibson
  0 siblings, 0 replies; 25+ messages in thread
From: David Gibson @ 2017-05-22  9:00 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Greg Kurz, Laurent Vivier, Cedric Le Goater, qemu-ppc, qemu-devel,
	Bharata B Rao

[-- Attachment #1: Type: text/plain, Size: 5182 bytes --]

On Mon, May 22, 2017 at 09:41:48AM +0200, Markus Armbruster wrote:
> Greg Kurz <groug@kaod.org> writes:
> 
> > On Sat, 20 May 2017 16:45:09 +1000
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> >> On Fri, May 19, 2017 at 12:32:12PM +0200, Greg Kurz wrote:
> >> > If the user explicitely asked for kernel-irqchip support and "xics-kvm"
> >> > initialization fails, we shouldn't fallback to emulated "xics" as we
> >> > do now. It is also awkward to print an error message when we have an
> >> > errp pointer argument.
> >> > 
> >> > Let's use the errp argument to report the error and let the caller decide.
> >> > This simplifies the code as we don't need a local Error * here.
> >> > 
> >> > Signed-off-by: Greg Kurz <groug@kaod.org>  
> >> 
> >> Concept looks good, but..
> >> 
> >> > ---
> >> > v2: - total rewrite
> >> > ---
> >> >  hw/ppc/spapr.c |   13 ++++++-------
> >> >  1 file changed, 6 insertions(+), 7 deletions(-)
> >> > 
> >> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >> > index 91f7434861a8..75e298b4c6be 100644
> >> > --- a/hw/ppc/spapr.c
> >> > +++ b/hw/ppc/spapr.c
> >> > @@ -128,18 +128,14 @@ static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
> >> >      sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
> >> >  
> >> >      if (kvm_enabled()) {
> >> > -        Error *err = NULL;
> >> > -
> >> >          if (machine_kernel_irqchip_allowed(machine) &&
> >> >              !xics_kvm_init(spapr, errp)) {
> >> >              spapr->icp_type = TYPE_KVM_ICP;
> >> > -            spapr->ics = spapr_ics_create(spapr, TYPE_ICS_KVM, nr_irqs, &err);
> >> > +            spapr->ics = spapr_ics_create(spapr, TYPE_ICS_KVM, nr_irqs, errp);  
> >> 
> >> I believe there are reasons you're not supposed to just pass an errp
> >> through to a subordinate function.  Instead you're supposed to have a
> >> local Error * and use error_propagate().
> >> 
> >
> > You only need to have a local Error * if it is used to check the return status
> > of the function (ie, you cannot check *errp because errp could be NULL) as
> > described in error.h.
> 
> Correct.  Quote:
> 
>  * Receive an error and pass it on to the caller:
>  *     Error *err = NULL;
>  *     foo(arg, &err);
>  *     if (err) {
>  *         handle the error...
>  *         error_propagate(errp, err);
>  *     }
>  * where Error **errp is a parameter, by convention the last one.
>  *
>  * Do *not* "optimize" this to
>  *     foo(arg, errp);
>  *     if (*errp) { // WRONG!
>  *         handle the error...
>  *     }
>  * because errp may be NULL!
>  *
>  * But when all you do with the error is pass it on, please use
>  *     foo(arg, errp);
>  * for readability.

So, I already merged based on Greg's comment, but it's nice to have
confirmation; thanks Markus.


> >                       This isn't the case here but...
> >
> >> >          }
> >> >          if (machine_kernel_irqchip_required(machine) && !spapr->ics) {
> >> > -            error_reportf_err(err,
> >> > -                              "kernel_irqchip requested but unavailable: ");
> >> > -        } else {
> >> > -            error_free(err);
> >> > +            error_prepend(errp, "kernel_irqchip requested but unavailable: ");
> >> > +            return;
> >> >          }
> >> >      }
> >> >  
> >> > @@ -147,6 +143,9 @@ static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
> >> >          xics_spapr_init(spapr);
> >> >          spapr->icp_type = TYPE_ICP;
> >> >          spapr->ics = spapr_ics_create(spapr, TYPE_ICS_SIMPLE, nr_irqs, errp);
> >> > +        if (!spapr->ics) {  
> >> 
> >> It would also be more standard to check the returned error, rather
> >> than the other result.
> >> 
> >
> > ... if you prefer to use a local Error *, I'll gladly do that. :)
> 
> Opinions and practice vary on this one.
> 
> I prefer checking the return value because it lets me avoid the
> error_propagate() boiler-plate more often.

Noted for future reference.

> Having both an Error parameter and an error return value begs the
> question whether the two agree.

[Irrelevant aside: this is not what "begging the question" means.  Or
 at least, it's not what it used to mean; it's probably a lost cause
 at this point, even with those who don't get a free pass for being
 non-native speakers.  https://en.wikipedia.org/wiki/Begging_the_question]
 
> You can assert they do, but it's distracting.  We generally don't.
> 
> When there's no success value to transmit, you avoid the problem by
> making the function return void.  We used to favor that, but it has
> turned out not to be a success, because it leads to cumbersome code.
> For what it's worth, GLib wants you to transmit success / failure in the
> return value, too:
> 
> https://developer.gnome.org/glib/unstable/glib-Error-Reporting.html#gerror-rules

Also noted, thanks.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/4] target/ppc: consolidate CPU device-tree id computation in helper
  2017-05-22  2:12     ` David Gibson
@ 2017-05-22  9:09       ` Greg Kurz
  2017-05-22 14:33       ` Greg Kurz
  1 sibling, 0 replies; 25+ messages in thread
From: Greg Kurz @ 2017-05-22  9:09 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-ppc, qemu-devel, Laurent Vivier, Bharata B Rao,
	Cedric Le Goater

[-- Attachment #1: Type: text/plain, Size: 5933 bytes --]

On Mon, 22 May 2017 12:12:46 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Mon, May 22, 2017 at 12:04:13PM +1000, David Gibson wrote:
> > On Fri, May 19, 2017 at 12:32:20PM +0200, Greg Kurz wrote:  
> > > For historical reasons, we compute CPU device-tree ids with a non-trivial
> > > logic. This patch consolidate the logic in a single helper to be used
> > > in various places where it is currently open-coded.
> > > 
> > > It is okay to get rid of DIV_ROUND_UP() because we're sure that the number
> > > of threads per core in the guest cannot exceed the number of threads per
> > > core in the host.  
> > 
> > However, your new logic still gives different answers in some cases.
> > In particular when max_cpus is not a multiple of smp_threads.  Which
> > is generally a bad idea, but allowed for older machine types for
> > compatibility.   e.g. smp_threads=4, max_cpus=6 smt=8
> > 
> > Old logic:
> > 	         DIV_ROUND_UP(6 * 8, 4)
> > 	       = ⌈48 / 4⌉ = 12
> > 
> > New logic gives: ⌊6 / 4⌋ * 8 + (6 % 4)
> >                = 1 * 8 + 2
> > 	       = 10
> > 
> > In any case the DIV_ROUND_UP() isn't to handle the case where guest
> > threads-per-core is bigger than host threads-per-core, it's (IIRC) for
> > the case where guest threads-per-core is not a factor of host
> > threads-per-core.  Again, a bad idea, but I think allowed in some old
> > cases.  
> 
> Oh, so, the other more general point here is that I actually want to
> get rid of dt_id from the cpu structure.  It's basically an abuse of
> the cpu stuff to include what's really an spapr concept - dt IDs for
> powernv are based on the PIR and not allocate the same way.
> 

Agreed.

> That said, I'm still ok with a fixed version of this patch as an
> interim step.
> 

Well... I'm not sure anymore I need this patch to fix the migration
breakage.

> > > Signed-off-by: Greg Kurz <groug@kaod.org>
> > > ---
> > >  hw/ppc/spapr.c              |    6 ++----
> > >  target/ppc/cpu.h            |   17 +++++++++++++++++
> > >  target/ppc/translate_init.c |    3 +--
> > >  3 files changed, 20 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index 75e298b4c6be..1bb05a9a6b07 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -981,7 +981,6 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
> > >      void *fdt;
> > >      sPAPRPHBState *phb;
> > >      char *buf;
> > > -    int smt = kvmppc_smt_threads();
> > >  
> > >      fdt = g_malloc0(FDT_MAX_SIZE);
> > >      _FDT((fdt_create_empty_tree(fdt, FDT_MAX_SIZE)));
> > > @@ -1021,7 +1020,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
> > >      _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
> > >  
> > >      /* /interrupt controller */
> > > -    spapr_dt_xics(DIV_ROUND_UP(max_cpus * smt, smp_threads), fdt, PHANDLE_XICP);
> > > +    spapr_dt_xics(ppc_cpu_dt_id_from_index(max_cpus), fdt, PHANDLE_XICP);
> > >  
> > >      ret = spapr_populate_memory(spapr, fdt);
> > >      if (ret < 0) {
> > > @@ -1977,7 +1976,6 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
> > >      MachineState *machine = MACHINE(spapr);
> > >      MachineClass *mc = MACHINE_GET_CLASS(machine);
> > >      char *type = spapr_get_cpu_core_type(machine->cpu_model);
> > > -    int smt = kvmppc_smt_threads();
> > >      const CPUArchIdList *possible_cpus;
> > >      int boot_cores_nr = smp_cpus / smp_threads;
> > >      int i;
> > > @@ -2014,7 +2012,7 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
> > >              sPAPRDRConnector *drc =
> > >                  spapr_dr_connector_new(OBJECT(spapr),
> > >                                         SPAPR_DR_CONNECTOR_TYPE_CPU,
> > > -                                       (core_id / smp_threads) * smt);
> > > +                                       ppc_cpu_dt_id_from_index(core_id));
> > >  
> > >              qemu_register_reset(spapr_drc_reset, drc);
> > >          }
> > > diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> > > index 401e10e7dad8..47fe6c64698f 100644
> > > --- a/target/ppc/cpu.h
> > > +++ b/target/ppc/cpu.h
> > > @@ -2529,4 +2529,21 @@ int ppc_get_vcpu_dt_id(PowerPCCPU *cpu);
> > >  PowerPCCPU *ppc_get_vcpu_by_dt_id(int cpu_dt_id);
> > >  
> > >  void ppc_maybe_bswap_register(CPUPPCState *env, uint8_t *mem_buf, int len);
> > > +
> > > +#if !defined(CONFIG_USER_ONLY)
> > > +#include "sysemu/cpus.h"
> > > +#include "target/ppc/kvm_ppc.h"
> > > +
> > > +static inline int ppc_cpu_dt_id_from_index(int cpu_index)
> > > +{
> > > +    /* POWER HV support has an historical limitation that different threads
> > > +     * on a single core cannot be in different guests at the same time. In
> > > +     * order to allow KVM to assign guest threads to host cores accordingly,
> > > +     * CPU device tree ids are spaced by the number of threads per host cores.
> > > +     */
> > > +    return (cpu_index / smp_threads) * kvmppc_smt_threads()
> > > +        + (cpu_index % smp_threads);
> > > +}
> > > +#endif
> > > +
> > >  #endif /* PPC_CPU_H */
> > > diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
> > > index 56a0ab22cfbe..837a9a496a65 100644
> > > --- a/target/ppc/translate_init.c
> > > +++ b/target/ppc/translate_init.c
> > > @@ -9851,8 +9851,7 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
> > >      }
> > >  
> > >  #if !defined(CONFIG_USER_ONLY)
> > > -    cpu->cpu_dt_id = (cs->cpu_index / smp_threads) * max_smt
> > > -        + (cs->cpu_index % smp_threads);
> > > +    cpu->cpu_dt_id = ppc_cpu_dt_id_from_index(cs->cpu_index);
> > >  
> > >      if (kvm_enabled() && !kvm_vcpu_id_is_valid(cpu->cpu_dt_id)) {
> > >          error_setg(errp, "Can't create CPU with id %d in KVM", cpu->cpu_dt_id);
> > >   
> >   
> 
> 
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 4/4] spapr: fix migration of ICP objects from/to older QEMU
  2017-05-22  7:20     ` Cédric Le Goater
@ 2017-05-22  9:15       ` David Gibson
  2017-05-22 15:04         ` Greg Kurz
  0 siblings, 1 reply; 25+ messages in thread
From: David Gibson @ 2017-05-22  9:15 UTC (permalink / raw)
  To: Cédric Le Goater
  Cc: Greg Kurz, qemu-ppc, qemu-devel, Laurent Vivier, Bharata B Rao

[-- Attachment #1: Type: text/plain, Size: 6997 bytes --]

On Mon, May 22, 2017 at 09:20:42AM +0200, Cédric Le Goater wrote:
> On 05/22/2017 04:30 AM, David Gibson wrote:
> > On Fri, May 19, 2017 at 12:32:27PM +0200, Greg Kurz wrote:
> >> Commit 5bc8d26de20c ("spapr: allocate the ICPState object from under
> >> sPAPRCPUCore") moved ICP objects from the machine to CPU cores. This
> >> is an improvement since we no longer allocate ICP objects that will
> >> never be used. But it has the side-effect of breaking migration of
> >> older machine types from older QEMU versions.
> >>
> >> This patch introduces a compat flag in the sPAPR machine class so
> >> that all pseries machine up to 2.9 go on with the previous behavior
> >> of pre-allocating ICP objects.
> >>
> >> Signed-off-by: Greg Kurz <groug@kaod.org>
> >> ---
> >> v2: - s/void* /void * in xics_system_init()
> >>     - don't use "[*]" in the ICP object name
> >>     - use pre_2_10_ prefix in field names
> >>     - added xics_nr_servers() helper
> >> ---
> >>  hw/ppc/spapr.c          |   40 +++++++++++++++++++++++++++++++++++++++-
> >>  hw/ppc/spapr_cpu_core.c |   29 ++++++++++++++++++++---------
> >>  include/hw/ppc/spapr.h  |    2 ++
> >>  3 files changed, 61 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >> index 1bb05a9a6b07..182262257c60 100644
> >> --- a/hw/ppc/spapr.c
> >> +++ b/hw/ppc/spapr.c
> >> @@ -123,9 +123,15 @@ error:
> >>      return NULL;
> >>  }
> >>  
> >> +static inline int xics_nr_servers(void)
> >> +{
> >> +    return ppc_cpu_dt_id_from_index(max_cpus);
> >> +}
> >> +
> >>  static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
> >>  {
> >>      sPAPRMachineState *spapr = SPAPR_MACHINE(machine);
> >> +    sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> >>  
> >>      if (kvm_enabled()) {
> >>          if (machine_kernel_irqchip_allowed(machine) &&
> >> @@ -147,6 +153,35 @@ static void xics_system_init(MachineState *machine, int nr_irqs, Error **errp)
> >>              return;
> >>          }
> >>      }
> >> +
> >> +    if (smc->pre_2_10_icp_allocation) {
> >> +        int nr_servers = xics_nr_servers();
> >> +        Error *local_err = NULL;
> >> +        int i;
> >> +
> >> +        spapr->pre_2_10_icps = g_malloc0(nr_servers * sizeof(ICPState));
> >> +
> >> +        for (i = 0; i < nr_servers; i++) {
> >> +            void *obj = &spapr->pre_2_10_icps[i];
> >> +            char *name = g_strdup_printf("icp[%d]", i);
> >> +
> >> +            object_initialize(obj, sizeof(ICPState), spapr->icp_type);
> >> +            object_property_add_child(OBJECT(spapr), name, obj, &error_abort);
> >> +            g_free(name);
> >> +            object_unref(obj);
> >> +            object_property_add_const_link(obj, "xics", OBJECT(spapr),
> >> +                                           &error_abort);
> >> +            object_property_set_bool(obj, true, "realized", &local_err);
> >> +            if (local_err) {
> >> +                while (i--) {
> >> +                    object_unparent(obj);
> >> +                }
> >> +                g_free(spapr->pre_2_10_icps);
> >> +                error_propagate(errp, local_err);
> >> +                break;
> >> +            }
> >> +        }
> >> +    }
> >>  }
> >>  
> >>  static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, PowerPCCPU *cpu,
> >> @@ -1020,7 +1055,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
> >>      _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
> >>  
> >>      /* /interrupt controller */
> >> -    spapr_dt_xics(ppc_cpu_dt_id_from_index(max_cpus), fdt, PHANDLE_XICP);
> >> +    spapr_dt_xics(xics_nr_servers(), fdt, PHANDLE_XICP);
> >>  
> >>      ret = spapr_populate_memory(spapr, fdt);
> >>      if (ret < 0) {
> >> @@ -3286,9 +3321,12 @@ static void spapr_machine_2_9_instance_options(MachineState *machine)
> >>  
> >>  static void spapr_machine_2_9_class_options(MachineClass *mc)
> >>  {
> >> +    sPAPRMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
> >> +
> >>      spapr_machine_2_10_class_options(mc);
> >>      SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_9);
> >>      mc->numa_auto_assign_ram = numa_legacy_auto_assign_ram;
> >> +    smc->pre_2_10_icp_allocation = true;
> >>  }
> >>  
> >>  DEFINE_SPAPR_MACHINE(2_9, "2.9", false);
> >> diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
> >> index ff7058ecc00e..13c4916aa5e6 100644
> >> --- a/hw/ppc/spapr_cpu_core.c
> >> +++ b/hw/ppc/spapr_cpu_core.c
> >> @@ -119,6 +119,7 @@ static void spapr_cpu_core_unrealizefn(DeviceState *dev, Error **errp)
> >>      size_t size = object_type_get_instance_size(typename);
> >>      CPUCore *cc = CPU_CORE(dev);
> >>      int i;
> >> +    sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
> >>  
> >>      for (i = 0; i < cc->nr_threads; i++) {
> >>          void *obj = sc->threads + i * size;
> >> @@ -127,7 +128,9 @@ static void spapr_cpu_core_unrealizefn(DeviceState *dev, Error **errp)
> >>          PowerPCCPU *cpu = POWERPC_CPU(cs);
> >>  
> >>          spapr_cpu_destroy(cpu);
> >> -        object_unparent(cpu->intc);
> >> +        if (!spapr->pre_2_10_icps) {
> > 
> > Hrm.  I dislike code for the core object directly reaching into the
> > machine to check the compat flag here (and a bunch of other places
> > below).  I can think of a few possible ways of avoiding this:
> > 
> > 1) The most direct is to make another compat flag in the cpu core
> > object, set by the machine.  Straightforward, but ugly.
> > 
> > 2) Use a property to optionally pass a reference to the ICP array into
> > the core object.  If set it will give the cpu objects ICPs from that
> > array (compat mode), if not it will allocate them (new style mode).
> 
> for 2) we can just use a object_property_add_const_link() like
> we do to pass the 'xics' object which is needed by the ICSes.

Yes.  Though you'll need to pass one for each thread down to the core
object, which will be fiddly.

> > 3) (Preferred, if it works)  Always have the core allocate ICPs for
> > each CPU.  For compat mode instead of directly allocating ICPs, the
> > machine sets up an array of pointers to the existing ICPs for each
> > CPU.  The "extra" slots that don't have ICPs in new-style allocation
> > get references to dummy ICP objects (maybe even all the same one).
> > Only the dummy ICP(s) are allocated by the machine, the rest remain
> > owned by the cpu.
> 
> I like this solution too as it should isolate the compat handling under
> the machine, maybe even in a single routine.

Right, that's the hope.  Plus it means we reduce the difference in
runtime QOM structure between the two modes, which is best when
possible.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH v2 3/4] target/ppc: consolidate CPU device-tree id computation in helper
  2017-05-22  8:59     ` Greg Kurz
@ 2017-05-22 13:13       ` Greg Kurz
  2017-05-23  6:37         ` David Gibson
  2017-05-23  2:35       ` [Qemu-devel] " David Gibson
  1 sibling, 1 reply; 25+ messages in thread
From: Greg Kurz @ 2017-05-22 13:13 UTC (permalink / raw)
  To: David Gibson
  Cc: Laurent Vivier, Cedric Le Goater, qemu-ppc, qemu-devel,
	Bharata B Rao

[-- Attachment #1: Type: text/plain, Size: 6967 bytes --]

On Mon, 22 May 2017 10:59:50 +0200
Greg Kurz <groug@kaod.org> wrote:

> On Mon, 22 May 2017 12:04:13 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Fri, May 19, 2017 at 12:32:20PM +0200, Greg Kurz wrote:  
> > > For historical reasons, we compute CPU device-tree ids with a non-trivial
> > > logic. This patch consolidate the logic in a single helper to be used
> > > in various places where it is currently open-coded.
> > > 
> > > It is okay to get rid of DIV_ROUND_UP() because we're sure that the number
> > > of threads per core in the guest cannot exceed the number of threads per
> > > core in the host.    
> > 
> > However, your new logic still gives different answers in some cases.
> > In particular when max_cpus is not a multiple of smp_threads.  Which
> > is generally a bad idea, but allowed for older machine types for
> > compatibility.   e.g. smp_threads=4, max_cpus=6 smt=8
> > 

FWIW, this topology was never supported for pseries >= 2.7 since commit
94a94e4c4919 ("spapr: convert boot CPUs into CPU core devices", QEMU 2.7):

qemu-system-ppc64: max_cpus (6) must be multiple of threads (4)

The same happens goes with smp_cpus.

If we care for compat with pre-2.7 machine types (ie, no support for CPU
hotplug), this topology isn't valid anymore since QEMU 2.9, with these
commits:

0c86d0fd92aa ("pseries: Always use core objects for CPU construction") which
causes the following error if we only set max_cpus:

qemu-system-ppc64: This machine version does not support CPU hotplug

8149e2992f78 ("pseries: Enforce homogeneous threads-per-core") which
causes the following error if we set smp_cpus (or smp_cpus == max_cpus):

qemu-system-ppc64: invalid nr-threads 2, must be 4

So in the end, we already enforce max_cpus and smp_cpus to be multiple
of smp_threads for all machine types. In this case...

> > Old logic:
> > 	         DIV_ROUND_UP(6 * 8, 4)
> > 	       = ⌈48 / 4⌉ = 12
> > 
> > New logic gives: ⌊6 / 4⌋ * 8 + (6 % 4)
> >                = 1 * 8 + 2
> > 	       = 10
> >   
> 
> I now realize this two formulas are hardly reconcilable... this
> probably means that this patch shouldn't try to consolidate the
> logic in hw/ppc/spapr.c with the one in target/ppc/translate_init.c.
> 

... both formulas are equivalent, unless I'm missing something. Or
do we really want to re-allow the funky topologies for older machines ?

> > In any case the DIV_ROUND_UP() isn't to handle the case where guest
> > threads-per-core is bigger than host threads-per-core, it's (IIRC) for
> > the case where guest threads-per-core is not a factor of host
> > threads-per-core.  Again, a bad idea, but I think allowed in some old
> > cases.
> >   
> 
> FWIW, DIV_ROUND_UP() comes from this commit:
> 
> f303f117fec3 spapr: ensure we have at least one XICS server
> 
> but I agree that this was a bad idea...
> 
> > > 
> > > Signed-off-by: Greg Kurz <groug@kaod.org>
> > > ---
> > >  hw/ppc/spapr.c              |    6 ++----
> > >  target/ppc/cpu.h            |   17 +++++++++++++++++
> > >  target/ppc/translate_init.c |    3 +--
> > >  3 files changed, 20 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index 75e298b4c6be..1bb05a9a6b07 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -981,7 +981,6 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
> > >      void *fdt;
> > >      sPAPRPHBState *phb;
> > >      char *buf;
> > > -    int smt = kvmppc_smt_threads();
> > >  
> > >      fdt = g_malloc0(FDT_MAX_SIZE);
> > >      _FDT((fdt_create_empty_tree(fdt, FDT_MAX_SIZE)));
> > > @@ -1021,7 +1020,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
> > >      _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
> > >  
> > >      /* /interrupt controller */
> > > -    spapr_dt_xics(DIV_ROUND_UP(max_cpus * smt, smp_threads), fdt, PHANDLE_XICP);
> > > +    spapr_dt_xics(ppc_cpu_dt_id_from_index(max_cpus), fdt, PHANDLE_XICP);
> > >  
> > >      ret = spapr_populate_memory(spapr, fdt);
> > >      if (ret < 0) {
> > > @@ -1977,7 +1976,6 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
> > >      MachineState *machine = MACHINE(spapr);
> > >      MachineClass *mc = MACHINE_GET_CLASS(machine);
> > >      char *type = spapr_get_cpu_core_type(machine->cpu_model);
> > > -    int smt = kvmppc_smt_threads();
> > >      const CPUArchIdList *possible_cpus;
> > >      int boot_cores_nr = smp_cpus / smp_threads;
> > >      int i;
> > > @@ -2014,7 +2012,7 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
> > >              sPAPRDRConnector *drc =
> > >                  spapr_dr_connector_new(OBJECT(spapr),
> > >                                         SPAPR_DR_CONNECTOR_TYPE_CPU,
> > > -                                       (core_id / smp_threads) * smt);
> > > +                                       ppc_cpu_dt_id_from_index(core_id));
> > >  
> > >              qemu_register_reset(spapr_drc_reset, drc);
> > >          }
> > > diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> > > index 401e10e7dad8..47fe6c64698f 100644
> > > --- a/target/ppc/cpu.h
> > > +++ b/target/ppc/cpu.h
> > > @@ -2529,4 +2529,21 @@ int ppc_get_vcpu_dt_id(PowerPCCPU *cpu);
> > >  PowerPCCPU *ppc_get_vcpu_by_dt_id(int cpu_dt_id);
> > >  
> > >  void ppc_maybe_bswap_register(CPUPPCState *env, uint8_t *mem_buf, int len);
> > > +
> > > +#if !defined(CONFIG_USER_ONLY)
> > > +#include "sysemu/cpus.h"
> > > +#include "target/ppc/kvm_ppc.h"
> > > +
> > > +static inline int ppc_cpu_dt_id_from_index(int cpu_index)
> > > +{
> > > +    /* POWER HV support has an historical limitation that different threads
> > > +     * on a single core cannot be in different guests at the same time. In
> > > +     * order to allow KVM to assign guest threads to host cores accordingly,
> > > +     * CPU device tree ids are spaced by the number of threads per host cores.
> > > +     */
> > > +    return (cpu_index / smp_threads) * kvmppc_smt_threads()
> > > +        + (cpu_index % smp_threads);
> > > +}
> > > +#endif
> > > +
> > >  #endif /* PPC_CPU_H */
> > > diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
> > > index 56a0ab22cfbe..837a9a496a65 100644
> > > --- a/target/ppc/translate_init.c
> > > +++ b/target/ppc/translate_init.c
> > > @@ -9851,8 +9851,7 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
> > >      }
> > >  
> > >  #if !defined(CONFIG_USER_ONLY)
> > > -    cpu->cpu_dt_id = (cs->cpu_index / smp_threads) * max_smt
> > > -        + (cs->cpu_index % smp_threads);
> > > +    cpu->cpu_dt_id = ppc_cpu_dt_id_from_index(cs->cpu_index);
> > >  
> > >      if (kvm_enabled() && !kvm_vcpu_id_is_valid(cpu->cpu_dt_id)) {
> > >          error_setg(errp, "Can't create CPU with id %d in KVM", cpu->cpu_dt_id);
> > >     
> >   
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/4] target/ppc: consolidate CPU device-tree id computation in helper
  2017-05-22  2:12     ` David Gibson
  2017-05-22  9:09       ` Greg Kurz
@ 2017-05-22 14:33       ` Greg Kurz
  2017-05-23  2:37         ` David Gibson
  1 sibling, 1 reply; 25+ messages in thread
From: Greg Kurz @ 2017-05-22 14:33 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-ppc, qemu-devel, Laurent Vivier, Bharata B Rao,
	Cedric Le Goater

[-- Attachment #1: Type: text/plain, Size: 6123 bytes --]

On Mon, 22 May 2017 12:12:46 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Mon, May 22, 2017 at 12:04:13PM +1000, David Gibson wrote:
> > On Fri, May 19, 2017 at 12:32:20PM +0200, Greg Kurz wrote:  
> > > For historical reasons, we compute CPU device-tree ids with a non-trivial
> > > logic. This patch consolidate the logic in a single helper to be used
> > > in various places where it is currently open-coded.
> > > 
> > > It is okay to get rid of DIV_ROUND_UP() because we're sure that the number
> > > of threads per core in the guest cannot exceed the number of threads per
> > > core in the host.  
> > 
> > However, your new logic still gives different answers in some cases.
> > In particular when max_cpus is not a multiple of smp_threads.  Which
> > is generally a bad idea, but allowed for older machine types for
> > compatibility.   e.g. smp_threads=4, max_cpus=6 smt=8
> > 
> > Old logic:
> > 	         DIV_ROUND_UP(6 * 8, 4)
> > 	       = ⌈48 / 4⌉ = 12
> > 
> > New logic gives: ⌊6 / 4⌋ * 8 + (6 % 4)
> >                = 1 * 8 + 2
> > 	       = 10
> > 
> > In any case the DIV_ROUND_UP() isn't to handle the case where guest
> > threads-per-core is bigger than host threads-per-core, it's (IIRC) for
> > the case where guest threads-per-core is not a factor of host
> > threads-per-core.  Again, a bad idea, but I think allowed in some old
> > cases.  
> 
> Oh, so, the other more general point here is that I actually want to
> get rid of dt_id from the cpu structure.  It's basically an abuse of
> the cpu stuff to include what's really an spapr concept - dt IDs for
> powernv are based on the PIR and not allocate the same way.
> 

Yeah, I agree. I guess this calls for the introduction of a sPAPRCPUThread
object type derived from PowerPCCPU. It probably deserves to be addressed
in a separate patchset.

> That said, I'm still ok with a fixed version of this patch as an
> interim step.
> 

Given that the logic change doesn't break anything in the end (see my other
mail), then we're good ?

> > > Signed-off-by: Greg Kurz <groug@kaod.org>
> > > ---
> > >  hw/ppc/spapr.c              |    6 ++----
> > >  target/ppc/cpu.h            |   17 +++++++++++++++++
> > >  target/ppc/translate_init.c |    3 +--
> > >  3 files changed, 20 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index 75e298b4c6be..1bb05a9a6b07 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -981,7 +981,6 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
> > >      void *fdt;
> > >      sPAPRPHBState *phb;
> > >      char *buf;
> > > -    int smt = kvmppc_smt_threads();
> > >  
> > >      fdt = g_malloc0(FDT_MAX_SIZE);
> > >      _FDT((fdt_create_empty_tree(fdt, FDT_MAX_SIZE)));
> > > @@ -1021,7 +1020,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
> > >      _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
> > >  
> > >      /* /interrupt controller */
> > > -    spapr_dt_xics(DIV_ROUND_UP(max_cpus * smt, smp_threads), fdt, PHANDLE_XICP);
> > > +    spapr_dt_xics(ppc_cpu_dt_id_from_index(max_cpus), fdt, PHANDLE_XICP);
> > >  
> > >      ret = spapr_populate_memory(spapr, fdt);
> > >      if (ret < 0) {
> > > @@ -1977,7 +1976,6 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
> > >      MachineState *machine = MACHINE(spapr);
> > >      MachineClass *mc = MACHINE_GET_CLASS(machine);
> > >      char *type = spapr_get_cpu_core_type(machine->cpu_model);
> > > -    int smt = kvmppc_smt_threads();
> > >      const CPUArchIdList *possible_cpus;
> > >      int boot_cores_nr = smp_cpus / smp_threads;
> > >      int i;
> > > @@ -2014,7 +2012,7 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
> > >              sPAPRDRConnector *drc =
> > >                  spapr_dr_connector_new(OBJECT(spapr),
> > >                                         SPAPR_DR_CONNECTOR_TYPE_CPU,
> > > -                                       (core_id / smp_threads) * smt);
> > > +                                       ppc_cpu_dt_id_from_index(core_id));
> > >  
> > >              qemu_register_reset(spapr_drc_reset, drc);
> > >          }
> > > diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> > > index 401e10e7dad8..47fe6c64698f 100644
> > > --- a/target/ppc/cpu.h
> > > +++ b/target/ppc/cpu.h
> > > @@ -2529,4 +2529,21 @@ int ppc_get_vcpu_dt_id(PowerPCCPU *cpu);
> > >  PowerPCCPU *ppc_get_vcpu_by_dt_id(int cpu_dt_id);
> > >  
> > >  void ppc_maybe_bswap_register(CPUPPCState *env, uint8_t *mem_buf, int len);
> > > +
> > > +#if !defined(CONFIG_USER_ONLY)
> > > +#include "sysemu/cpus.h"
> > > +#include "target/ppc/kvm_ppc.h"
> > > +
> > > +static inline int ppc_cpu_dt_id_from_index(int cpu_index)
> > > +{
> > > +    /* POWER HV support has an historical limitation that different threads
> > > +     * on a single core cannot be in different guests at the same time. In
> > > +     * order to allow KVM to assign guest threads to host cores accordingly,
> > > +     * CPU device tree ids are spaced by the number of threads per host cores.
> > > +     */
> > > +    return (cpu_index / smp_threads) * kvmppc_smt_threads()
> > > +        + (cpu_index % smp_threads);
> > > +}
> > > +#endif
> > > +
> > >  #endif /* PPC_CPU_H */
> > > diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
> > > index 56a0ab22cfbe..837a9a496a65 100644
> > > --- a/target/ppc/translate_init.c
> > > +++ b/target/ppc/translate_init.c
> > > @@ -9851,8 +9851,7 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
> > >      }
> > >  
> > >  #if !defined(CONFIG_USER_ONLY)
> > > -    cpu->cpu_dt_id = (cs->cpu_index / smp_threads) * max_smt
> > > -        + (cs->cpu_index % smp_threads);
> > > +    cpu->cpu_dt_id = ppc_cpu_dt_id_from_index(cs->cpu_index);
> > >  
> > >      if (kvm_enabled() && !kvm_vcpu_id_is_valid(cpu->cpu_dt_id)) {
> > >          error_setg(errp, "Can't create CPU with id %d in KVM", cpu->cpu_dt_id);
> > >   
> >   
> 
> 
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 4/4] spapr: fix migration of ICP objects from/to older QEMU
  2017-05-22  9:15       ` David Gibson
@ 2017-05-22 15:04         ` Greg Kurz
  0 siblings, 0 replies; 25+ messages in thread
From: Greg Kurz @ 2017-05-22 15:04 UTC (permalink / raw)
  To: David Gibson
  Cc: Cédric Le Goater, qemu-ppc, qemu-devel, Laurent Vivier,
	Bharata B Rao

[-- Attachment #1: Type: text/plain, Size: 1750 bytes --]

On Mon, 22 May 2017 19:15:36 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:
[...]
> > > 
> > > Hrm.  I dislike code for the core object directly reaching into the
> > > machine to check the compat flag here (and a bunch of other places
> > > below).  I can think of a few possible ways of avoiding this:
> > > 
> > > 1) The most direct is to make another compat flag in the cpu core
> > > object, set by the machine.  Straightforward, but ugly.
> > > 
> > > 2) Use a property to optionally pass a reference to the ICP array into
> > > the core object.  If set it will give the cpu objects ICPs from that
> > > array (compat mode), if not it will allocate them (new style mode).  
> > 
> > for 2) we can just use a object_property_add_const_link() like
> > we do to pass the 'xics' object which is needed by the ICSes.  
> 
> Yes.  Though you'll need to pass one for each thread down to the core
> object, which will be fiddly.
> 
> > > 3) (Preferred, if it works)  Always have the core allocate ICPs for
> > > each CPU.  For compat mode instead of directly allocating ICPs, the
> > > machine sets up an array of pointers to the existing ICPs for each
> > > CPU.  The "extra" slots that don't have ICPs in new-style allocation
> > > get references to dummy ICP objects (maybe even all the same one).
> > > Only the dummy ICP(s) are allocated by the machine, the rest remain
> > > owned by the cpu.  
> > 
> > I like this solution too as it should isolate the compat handling under
> > the machine, maybe even in a single routine.  
> 
> Right, that's the hope.  Plus it means we reduce the difference in
> runtime QOM structure between the two modes, which is best when
> possible.
> 

I'll try to do 3)

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/4] target/ppc: consolidate CPU device-tree id computation in helper
  2017-05-22  8:59     ` Greg Kurz
  2017-05-22 13:13       ` [Qemu-devel] [Qemu-ppc] " Greg Kurz
@ 2017-05-23  2:35       ` David Gibson
  2017-05-23  6:57         ` Greg Kurz
  1 sibling, 1 reply; 25+ messages in thread
From: David Gibson @ 2017-05-23  2:35 UTC (permalink / raw)
  To: Greg Kurz
  Cc: qemu-ppc, qemu-devel, Laurent Vivier, Bharata B Rao,
	Cedric Le Goater

[-- Attachment #1: Type: text/plain, Size: 6424 bytes --]

On Mon, May 22, 2017 at 10:59:50AM +0200, Greg Kurz wrote:
> On Mon, 22 May 2017 12:04:13 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Fri, May 19, 2017 at 12:32:20PM +0200, Greg Kurz wrote:
> > > For historical reasons, we compute CPU device-tree ids with a non-trivial
> > > logic. This patch consolidate the logic in a single helper to be used
> > > in various places where it is currently open-coded.
> > > 
> > > It is okay to get rid of DIV_ROUND_UP() because we're sure that the number
> > > of threads per core in the guest cannot exceed the number of threads per
> > > core in the host.  
> > 
> > However, your new logic still gives different answers in some cases.
> > In particular when max_cpus is not a multiple of smp_threads.  Which
> > is generally a bad idea, but allowed for older machine types for
> > compatibility.   e.g. smp_threads=4, max_cpus=6 smt=8
> > 
> > Old logic:
> > 	         DIV_ROUND_UP(6 * 8, 4)
> > 	       = ⌈48 / 4⌉ = 12
> > 
> > New logic gives: ⌊6 / 4⌋ * 8 + (6 % 4)
> >                = 1 * 8 + 2
> > 	       = 10
> > 
> 
> I now realize this two formulas are hardly reconcilable... this
> probably means that this patch shouldn't try to consolidate the
> logic in hw/ppc/spapr.c with the one in target/ppc/translate_init.c.

Ok.

> > In any case the DIV_ROUND_UP() isn't to handle the case where guest
> > threads-per-core is bigger than host threads-per-core, it's (IIRC) for
> > the case where guest threads-per-core is not a factor of host
> > threads-per-core.  Again, a bad idea, but I think allowed in some old
> > cases.
> > 
> 
> FWIW, DIV_ROUND_UP() comes from this commit:
> 
> f303f117fec3 spapr: ensure we have at least one XICS server

Ah, yes, I see your point.  Hrm.  I thought even then that guest
threads > host threads was definitely incorrect; but I'm wondering if
the change was just because the check for guest threads > host threads
came later during init and we didn't want to crash before we got to
it.

> but I agree that this was a bad idea...

But yeah, looks like we'll be taking a different approach so it's kind
of moot anyway.

> 
> > > 
> > > Signed-off-by: Greg Kurz <groug@kaod.org>
> > > ---
> > >  hw/ppc/spapr.c              |    6 ++----
> > >  target/ppc/cpu.h            |   17 +++++++++++++++++
> > >  target/ppc/translate_init.c |    3 +--
> > >  3 files changed, 20 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > index 75e298b4c6be..1bb05a9a6b07 100644
> > > --- a/hw/ppc/spapr.c
> > > +++ b/hw/ppc/spapr.c
> > > @@ -981,7 +981,6 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
> > >      void *fdt;
> > >      sPAPRPHBState *phb;
> > >      char *buf;
> > > -    int smt = kvmppc_smt_threads();
> > >  
> > >      fdt = g_malloc0(FDT_MAX_SIZE);
> > >      _FDT((fdt_create_empty_tree(fdt, FDT_MAX_SIZE)));
> > > @@ -1021,7 +1020,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
> > >      _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
> > >  
> > >      /* /interrupt controller */
> > > -    spapr_dt_xics(DIV_ROUND_UP(max_cpus * smt, smp_threads), fdt, PHANDLE_XICP);
> > > +    spapr_dt_xics(ppc_cpu_dt_id_from_index(max_cpus), fdt, PHANDLE_XICP);
> > >  
> > >      ret = spapr_populate_memory(spapr, fdt);
> > >      if (ret < 0) {
> > > @@ -1977,7 +1976,6 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
> > >      MachineState *machine = MACHINE(spapr);
> > >      MachineClass *mc = MACHINE_GET_CLASS(machine);
> > >      char *type = spapr_get_cpu_core_type(machine->cpu_model);
> > > -    int smt = kvmppc_smt_threads();
> > >      const CPUArchIdList *possible_cpus;
> > >      int boot_cores_nr = smp_cpus / smp_threads;
> > >      int i;
> > > @@ -2014,7 +2012,7 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
> > >              sPAPRDRConnector *drc =
> > >                  spapr_dr_connector_new(OBJECT(spapr),
> > >                                         SPAPR_DR_CONNECTOR_TYPE_CPU,
> > > -                                       (core_id / smp_threads) * smt);
> > > +                                       ppc_cpu_dt_id_from_index(core_id));
> > >  
> > >              qemu_register_reset(spapr_drc_reset, drc);
> > >          }
> > > diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> > > index 401e10e7dad8..47fe6c64698f 100644
> > > --- a/target/ppc/cpu.h
> > > +++ b/target/ppc/cpu.h
> > > @@ -2529,4 +2529,21 @@ int ppc_get_vcpu_dt_id(PowerPCCPU *cpu);
> > >  PowerPCCPU *ppc_get_vcpu_by_dt_id(int cpu_dt_id);
> > >  
> > >  void ppc_maybe_bswap_register(CPUPPCState *env, uint8_t *mem_buf, int len);
> > > +
> > > +#if !defined(CONFIG_USER_ONLY)
> > > +#include "sysemu/cpus.h"
> > > +#include "target/ppc/kvm_ppc.h"
> > > +
> > > +static inline int ppc_cpu_dt_id_from_index(int cpu_index)
> > > +{
> > > +    /* POWER HV support has an historical limitation that different threads
> > > +     * on a single core cannot be in different guests at the same time. In
> > > +     * order to allow KVM to assign guest threads to host cores accordingly,
> > > +     * CPU device tree ids are spaced by the number of threads per host cores.
> > > +     */
> > > +    return (cpu_index / smp_threads) * kvmppc_smt_threads()
> > > +        + (cpu_index % smp_threads);
> > > +}
> > > +#endif
> > > +
> > >  #endif /* PPC_CPU_H */
> > > diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
> > > index 56a0ab22cfbe..837a9a496a65 100644
> > > --- a/target/ppc/translate_init.c
> > > +++ b/target/ppc/translate_init.c
> > > @@ -9851,8 +9851,7 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
> > >      }
> > >  
> > >  #if !defined(CONFIG_USER_ONLY)
> > > -    cpu->cpu_dt_id = (cs->cpu_index / smp_threads) * max_smt
> > > -        + (cs->cpu_index % smp_threads);
> > > +    cpu->cpu_dt_id = ppc_cpu_dt_id_from_index(cs->cpu_index);
> > >  
> > >      if (kvm_enabled() && !kvm_vcpu_id_is_valid(cpu->cpu_dt_id)) {
> > >          error_setg(errp, "Can't create CPU with id %d in KVM", cpu->cpu_dt_id);
> > >   
> > 
> 



-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/4] target/ppc: consolidate CPU device-tree id computation in helper
  2017-05-22 14:33       ` Greg Kurz
@ 2017-05-23  2:37         ` David Gibson
  0 siblings, 0 replies; 25+ messages in thread
From: David Gibson @ 2017-05-23  2:37 UTC (permalink / raw)
  To: Greg Kurz
  Cc: qemu-ppc, qemu-devel, Laurent Vivier, Bharata B Rao,
	Cedric Le Goater

[-- Attachment #1: Type: text/plain, Size: 6878 bytes --]

On Mon, May 22, 2017 at 04:33:36PM +0200, Greg Kurz wrote:
> On Mon, 22 May 2017 12:12:46 +1000
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Mon, May 22, 2017 at 12:04:13PM +1000, David Gibson wrote:
> > > On Fri, May 19, 2017 at 12:32:20PM +0200, Greg Kurz wrote:  
> > > > For historical reasons, we compute CPU device-tree ids with a non-trivial
> > > > logic. This patch consolidate the logic in a single helper to be used
> > > > in various places where it is currently open-coded.
> > > > 
> > > > It is okay to get rid of DIV_ROUND_UP() because we're sure that the number
> > > > of threads per core in the guest cannot exceed the number of threads per
> > > > core in the host.  
> > > 
> > > However, your new logic still gives different answers in some cases.
> > > In particular when max_cpus is not a multiple of smp_threads.  Which
> > > is generally a bad idea, but allowed for older machine types for
> > > compatibility.   e.g. smp_threads=4, max_cpus=6 smt=8
> > > 
> > > Old logic:
> > > 	         DIV_ROUND_UP(6 * 8, 4)
> > > 	       = ⌈48 / 4⌉ = 12
> > > 
> > > New logic gives: ⌊6 / 4⌋ * 8 + (6 % 4)
> > >                = 1 * 8 + 2
> > > 	       = 10
> > > 
> > > In any case the DIV_ROUND_UP() isn't to handle the case where guest
> > > threads-per-core is bigger than host threads-per-core, it's (IIRC) for
> > > the case where guest threads-per-core is not a factor of host
> > > threads-per-core.  Again, a bad idea, but I think allowed in some old
> > > cases.  
> > 
> > Oh, so, the other more general point here is that I actually want to
> > get rid of dt_id from the cpu structure.  It's basically an abuse of
> > the cpu stuff to include what's really an spapr concept - dt IDs for
> > powernv are based on the PIR and not allocate the same way.
> > 
> 
> Yeah, I agree. I guess this calls for the introduction of a sPAPRCPUThread
> object type derived from PowerPCCPU. It probably deserves to be addressed
> in a separate patchset.

I don't think that will be necessary.  It would also be mucky, since
we'd need a whole slew of them for each CPU type.  I think just
converting between spapr dt id and cpu index at runtime should be
sufficient.

> > That said, I'm still ok with a fixed version of this patch as an
> > interim step.
> > 
> 
> Given that the logic change doesn't break anything in the end (see my other
> mail), then we're good ?
> 
> > > > Signed-off-by: Greg Kurz <groug@kaod.org>
> > > > ---
> > > >  hw/ppc/spapr.c              |    6 ++----
> > > >  target/ppc/cpu.h            |   17 +++++++++++++++++
> > > >  target/ppc/translate_init.c |    3 +--
> > > >  3 files changed, 20 insertions(+), 6 deletions(-)
> > > > 
> > > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > > index 75e298b4c6be..1bb05a9a6b07 100644
> > > > --- a/hw/ppc/spapr.c
> > > > +++ b/hw/ppc/spapr.c
> > > > @@ -981,7 +981,6 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
> > > >      void *fdt;
> > > >      sPAPRPHBState *phb;
> > > >      char *buf;
> > > > -    int smt = kvmppc_smt_threads();
> > > >  
> > > >      fdt = g_malloc0(FDT_MAX_SIZE);
> > > >      _FDT((fdt_create_empty_tree(fdt, FDT_MAX_SIZE)));
> > > > @@ -1021,7 +1020,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
> > > >      _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
> > > >  
> > > >      /* /interrupt controller */
> > > > -    spapr_dt_xics(DIV_ROUND_UP(max_cpus * smt, smp_threads), fdt, PHANDLE_XICP);
> > > > +    spapr_dt_xics(ppc_cpu_dt_id_from_index(max_cpus), fdt, PHANDLE_XICP);
> > > >  
> > > >      ret = spapr_populate_memory(spapr, fdt);
> > > >      if (ret < 0) {
> > > > @@ -1977,7 +1976,6 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
> > > >      MachineState *machine = MACHINE(spapr);
> > > >      MachineClass *mc = MACHINE_GET_CLASS(machine);
> > > >      char *type = spapr_get_cpu_core_type(machine->cpu_model);
> > > > -    int smt = kvmppc_smt_threads();
> > > >      const CPUArchIdList *possible_cpus;
> > > >      int boot_cores_nr = smp_cpus / smp_threads;
> > > >      int i;
> > > > @@ -2014,7 +2012,7 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
> > > >              sPAPRDRConnector *drc =
> > > >                  spapr_dr_connector_new(OBJECT(spapr),
> > > >                                         SPAPR_DR_CONNECTOR_TYPE_CPU,
> > > > -                                       (core_id / smp_threads) * smt);
> > > > +                                       ppc_cpu_dt_id_from_index(core_id));
> > > >  
> > > >              qemu_register_reset(spapr_drc_reset, drc);
> > > >          }
> > > > diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> > > > index 401e10e7dad8..47fe6c64698f 100644
> > > > --- a/target/ppc/cpu.h
> > > > +++ b/target/ppc/cpu.h
> > > > @@ -2529,4 +2529,21 @@ int ppc_get_vcpu_dt_id(PowerPCCPU *cpu);
> > > >  PowerPCCPU *ppc_get_vcpu_by_dt_id(int cpu_dt_id);
> > > >  
> > > >  void ppc_maybe_bswap_register(CPUPPCState *env, uint8_t *mem_buf, int len);
> > > > +
> > > > +#if !defined(CONFIG_USER_ONLY)
> > > > +#include "sysemu/cpus.h"
> > > > +#include "target/ppc/kvm_ppc.h"
> > > > +
> > > > +static inline int ppc_cpu_dt_id_from_index(int cpu_index)
> > > > +{
> > > > +    /* POWER HV support has an historical limitation that different threads
> > > > +     * on a single core cannot be in different guests at the same time. In
> > > > +     * order to allow KVM to assign guest threads to host cores accordingly,
> > > > +     * CPU device tree ids are spaced by the number of threads per host cores.
> > > > +     */
> > > > +    return (cpu_index / smp_threads) * kvmppc_smt_threads()
> > > > +        + (cpu_index % smp_threads);
> > > > +}
> > > > +#endif
> > > > +
> > > >  #endif /* PPC_CPU_H */
> > > > diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
> > > > index 56a0ab22cfbe..837a9a496a65 100644
> > > > --- a/target/ppc/translate_init.c
> > > > +++ b/target/ppc/translate_init.c
> > > > @@ -9851,8 +9851,7 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
> > > >      }
> > > >  
> > > >  #if !defined(CONFIG_USER_ONLY)
> > > > -    cpu->cpu_dt_id = (cs->cpu_index / smp_threads) * max_smt
> > > > -        + (cs->cpu_index % smp_threads);
> > > > +    cpu->cpu_dt_id = ppc_cpu_dt_id_from_index(cs->cpu_index);
> > > >  
> > > >      if (kvm_enabled() && !kvm_vcpu_id_is_valid(cpu->cpu_dt_id)) {
> > > >          error_setg(errp, "Can't create CPU with id %d in KVM", cpu->cpu_dt_id);
> > > >   
> > >   
> > 
> > 
> > 
> 



-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [Qemu-ppc] [PATCH v2 3/4] target/ppc: consolidate CPU device-tree id computation in helper
  2017-05-22 13:13       ` [Qemu-devel] [Qemu-ppc] " Greg Kurz
@ 2017-05-23  6:37         ` David Gibson
  0 siblings, 0 replies; 25+ messages in thread
From: David Gibson @ 2017-05-23  6:37 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Laurent Vivier, Cedric Le Goater, qemu-ppc, qemu-devel,
	Bharata B Rao

[-- Attachment #1: Type: text/plain, Size: 3272 bytes --]

On Mon, May 22, 2017 at 03:13:25PM +0200, Greg Kurz wrote:
> On Mon, 22 May 2017 10:59:50 +0200
> Greg Kurz <groug@kaod.org> wrote:
> 
> > On Mon, 22 May 2017 12:04:13 +1000
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> > 
> > > On Fri, May 19, 2017 at 12:32:20PM +0200, Greg Kurz wrote:  
> > > > For historical reasons, we compute CPU device-tree ids with a non-trivial
> > > > logic. This patch consolidate the logic in a single helper to be used
> > > > in various places where it is currently open-coded.
> > > > 
> > > > It is okay to get rid of DIV_ROUND_UP() because we're sure that the number
> > > > of threads per core in the guest cannot exceed the number of threads per
> > > > core in the host.    
> > > 
> > > However, your new logic still gives different answers in some cases.
> > > In particular when max_cpus is not a multiple of smp_threads.  Which
> > > is generally a bad idea, but allowed for older machine types for
> > > compatibility.   e.g. smp_threads=4, max_cpus=6 smt=8
> > > 
> 
> FWIW, this topology was never supported for pseries >= 2.7 since commit
> 94a94e4c4919 ("spapr: convert boot CPUs into CPU core devices", QEMU 2.7):
> 
> qemu-system-ppc64: max_cpus (6) must be multiple of threads (4)
> 
> The same happens goes with smp_cpus.

Yes, pre-2.7 is what I meant by older machine types.

> If we care for compat with pre-2.7 machine types (ie, no support for CPU
> hotplug),

We do.. RHEL 7.3 is still 2.6 based, for one thing.

> this topology isn't valid anymore since QEMU 2.9, with these
> commits:
> 
> 0c86d0fd92aa ("pseries: Always use core objects for CPU construction") which
> causes the following error if we only set max_cpus:
> 
> qemu-system-ppc64: This machine version does not support CPU hotplug

That patch has explicit provision for allowing the last core to have a
not-full complement of threads.

> 8149e2992f78 ("pseries: Enforce homogeneous threads-per-core") which
> causes the following error if we set smp_cpus (or smp_cpus == max_cpus):
> 
> qemu-system-ppc64: invalid nr-threads 2, must be 4

This one does indeed do what you say - but that's a bug :(.  It means
migration from older versions may be broken.

> So in the end, we already enforce max_cpus and smp_cpus to be multiple
> of smp_threads for all machine types. In this case...
> 
> > > Old logic:
> > > 	         DIV_ROUND_UP(6 * 8, 4)
> > > 	       = ⌈48 / 4⌉ = 12
> > > 
> > > New logic gives: ⌊6 / 4⌋ * 8 + (6 % 4)
> > >                = 1 * 8 + 2
> > > 	       = 10
> > >   
> > 
> > I now realize this two formulas are hardly reconcilable... this
> > probably means that this patch shouldn't try to consolidate the
> > logic in hw/ppc/spapr.c with the one in target/ppc/translate_init.c.
> 
> ... both formulas are equivalent, unless I'm missing something. Or
> do we really want to re-allow the funky topologies for older
> machines ?

Want to?  Not really.  Have to for compatibility?  Yes, absolutely.

I've just sent a patch to address this.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [Qemu-devel] [PATCH v2 3/4] target/ppc: consolidate CPU device-tree id computation in helper
  2017-05-23  2:35       ` [Qemu-devel] " David Gibson
@ 2017-05-23  6:57         ` Greg Kurz
  0 siblings, 0 replies; 25+ messages in thread
From: Greg Kurz @ 2017-05-23  6:57 UTC (permalink / raw)
  To: David Gibson
  Cc: qemu-ppc, qemu-devel, Laurent Vivier, Bharata B Rao,
	Cedric Le Goater

[-- Attachment #1: Type: text/plain, Size: 6866 bytes --]

On Tue, 23 May 2017 12:35:54 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Mon, May 22, 2017 at 10:59:50AM +0200, Greg Kurz wrote:
> > On Mon, 22 May 2017 12:04:13 +1000
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >   
> > > On Fri, May 19, 2017 at 12:32:20PM +0200, Greg Kurz wrote:  
> > > > For historical reasons, we compute CPU device-tree ids with a non-trivial
> > > > logic. This patch consolidate the logic in a single helper to be used
> > > > in various places where it is currently open-coded.
> > > > 
> > > > It is okay to get rid of DIV_ROUND_UP() because we're sure that the number
> > > > of threads per core in the guest cannot exceed the number of threads per
> > > > core in the host.    
> > > 
> > > However, your new logic still gives different answers in some cases.
> > > In particular when max_cpus is not a multiple of smp_threads.  Which
> > > is generally a bad idea, but allowed for older machine types for
> > > compatibility.   e.g. smp_threads=4, max_cpus=6 smt=8
> > > 
> > > Old logic:
> > > 	         DIV_ROUND_UP(6 * 8, 4)
> > > 	       = ⌈48 / 4⌉ = 12
> > > 
> > > New logic gives: ⌊6 / 4⌋ * 8 + (6 % 4)
> > >                = 1 * 8 + 2
> > > 	       = 10
> > >   
> > 
> > I now realize this two formulas are hardly reconcilable... this
> > probably means that this patch shouldn't try to consolidate the
> > logic in hw/ppc/spapr.c with the one in target/ppc/translate_init.c.  
> 
> Ok.
> 
> > > In any case the DIV_ROUND_UP() isn't to handle the case where guest
> > > threads-per-core is bigger than host threads-per-core, it's (IIRC) for
> > > the case where guest threads-per-core is not a factor of host
> > > threads-per-core.  Again, a bad idea, but I think allowed in some old
> > > cases.
> > >   
> > 
> > FWIW, DIV_ROUND_UP() comes from this commit:
> > 
> > f303f117fec3 spapr: ensure we have at least one XICS server  
> 
> Ah, yes, I see your point.  Hrm.  I thought even then that guest
> threads > host threads was definitely incorrect; but I'm wondering if
> the change was just because the check for guest threads > host threads
> came later during init and we didn't want to crash before we got to
> it.

AFAICR, this was the only motivation... but I hadn't realized the trickiness
of CPU id computations in ppc at the time. Otherwise I would have added
another smp_threads > kvmppc_smt_threads() sanity check instead. :-\

> 
> > but I agree that this was a bad idea...  
> 
> But yeah, looks like we'll be taking a different approach so it's kind
> of moot anyway.
> 
> >   
> > > > 
> > > > Signed-off-by: Greg Kurz <groug@kaod.org>
> > > > ---
> > > >  hw/ppc/spapr.c              |    6 ++----
> > > >  target/ppc/cpu.h            |   17 +++++++++++++++++
> > > >  target/ppc/translate_init.c |    3 +--
> > > >  3 files changed, 20 insertions(+), 6 deletions(-)
> > > > 
> > > > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > > > index 75e298b4c6be..1bb05a9a6b07 100644
> > > > --- a/hw/ppc/spapr.c
> > > > +++ b/hw/ppc/spapr.c
> > > > @@ -981,7 +981,6 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
> > > >      void *fdt;
> > > >      sPAPRPHBState *phb;
> > > >      char *buf;
> > > > -    int smt = kvmppc_smt_threads();
> > > >  
> > > >      fdt = g_malloc0(FDT_MAX_SIZE);
> > > >      _FDT((fdt_create_empty_tree(fdt, FDT_MAX_SIZE)));
> > > > @@ -1021,7 +1020,7 @@ static void *spapr_build_fdt(sPAPRMachineState *spapr,
> > > >      _FDT(fdt_setprop_cell(fdt, 0, "#size-cells", 2));
> > > >  
> > > >      /* /interrupt controller */
> > > > -    spapr_dt_xics(DIV_ROUND_UP(max_cpus * smt, smp_threads), fdt, PHANDLE_XICP);
> > > > +    spapr_dt_xics(ppc_cpu_dt_id_from_index(max_cpus), fdt, PHANDLE_XICP);
> > > >  
> > > >      ret = spapr_populate_memory(spapr, fdt);
> > > >      if (ret < 0) {
> > > > @@ -1977,7 +1976,6 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
> > > >      MachineState *machine = MACHINE(spapr);
> > > >      MachineClass *mc = MACHINE_GET_CLASS(machine);
> > > >      char *type = spapr_get_cpu_core_type(machine->cpu_model);
> > > > -    int smt = kvmppc_smt_threads();
> > > >      const CPUArchIdList *possible_cpus;
> > > >      int boot_cores_nr = smp_cpus / smp_threads;
> > > >      int i;
> > > > @@ -2014,7 +2012,7 @@ static void spapr_init_cpus(sPAPRMachineState *spapr)
> > > >              sPAPRDRConnector *drc =
> > > >                  spapr_dr_connector_new(OBJECT(spapr),
> > > >                                         SPAPR_DR_CONNECTOR_TYPE_CPU,
> > > > -                                       (core_id / smp_threads) * smt);
> > > > +                                       ppc_cpu_dt_id_from_index(core_id));
> > > >  
> > > >              qemu_register_reset(spapr_drc_reset, drc);
> > > >          }
> > > > diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
> > > > index 401e10e7dad8..47fe6c64698f 100644
> > > > --- a/target/ppc/cpu.h
> > > > +++ b/target/ppc/cpu.h
> > > > @@ -2529,4 +2529,21 @@ int ppc_get_vcpu_dt_id(PowerPCCPU *cpu);
> > > >  PowerPCCPU *ppc_get_vcpu_by_dt_id(int cpu_dt_id);
> > > >  
> > > >  void ppc_maybe_bswap_register(CPUPPCState *env, uint8_t *mem_buf, int len);
> > > > +
> > > > +#if !defined(CONFIG_USER_ONLY)
> > > > +#include "sysemu/cpus.h"
> > > > +#include "target/ppc/kvm_ppc.h"
> > > > +
> > > > +static inline int ppc_cpu_dt_id_from_index(int cpu_index)
> > > > +{
> > > > +    /* POWER HV support has an historical limitation that different threads
> > > > +     * on a single core cannot be in different guests at the same time. In
> > > > +     * order to allow KVM to assign guest threads to host cores accordingly,
> > > > +     * CPU device tree ids are spaced by the number of threads per host cores.
> > > > +     */
> > > > +    return (cpu_index / smp_threads) * kvmppc_smt_threads()
> > > > +        + (cpu_index % smp_threads);
> > > > +}
> > > > +#endif
> > > > +
> > > >  #endif /* PPC_CPU_H */
> > > > diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
> > > > index 56a0ab22cfbe..837a9a496a65 100644
> > > > --- a/target/ppc/translate_init.c
> > > > +++ b/target/ppc/translate_init.c
> > > > @@ -9851,8 +9851,7 @@ static void ppc_cpu_realizefn(DeviceState *dev, Error **errp)
> > > >      }
> > > >  
> > > >  #if !defined(CONFIG_USER_ONLY)
> > > > -    cpu->cpu_dt_id = (cs->cpu_index / smp_threads) * max_smt
> > > > -        + (cs->cpu_index % smp_threads);
> > > > +    cpu->cpu_dt_id = ppc_cpu_dt_id_from_index(cs->cpu_index);
> > > >  
> > > >      if (kvm_enabled() && !kvm_vcpu_id_is_valid(cpu->cpu_dt_id)) {
> > > >          error_setg(errp, "Can't create CPU with id %d in KVM", cpu->cpu_dt_id);
> > > >     
> > >   
> >   
> 
> 
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2017-05-23  6:58 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-05-19 10:31 [Qemu-devel] [PATCH v2 0/4] spapr/xics: fix migration of older machine types Greg Kurz
2017-05-19 10:32 ` [Qemu-devel] [PATCH v2 1/4] spapr_cpu_core: drop reference on ICP object during CPU realization Greg Kurz
2017-05-20  6:40   ` David Gibson
2017-05-19 10:32 ` [Qemu-devel] [PATCH v2 2/4] spapr: fix error reporting in xics_system_init() Greg Kurz
2017-05-20  6:45   ` David Gibson
2017-05-21 17:03     ` Greg Kurz
2017-05-22  1:26       ` David Gibson
2017-05-22  7:41       ` Markus Armbruster
2017-05-22  9:00         ` David Gibson
2017-05-19 10:32 ` [Qemu-devel] [PATCH v2 3/4] target/ppc: consolidate CPU device-tree id computation in helper Greg Kurz
2017-05-22  2:04   ` David Gibson
2017-05-22  2:12     ` David Gibson
2017-05-22  9:09       ` Greg Kurz
2017-05-22 14:33       ` Greg Kurz
2017-05-23  2:37         ` David Gibson
2017-05-22  8:59     ` Greg Kurz
2017-05-22 13:13       ` [Qemu-devel] [Qemu-ppc] " Greg Kurz
2017-05-23  6:37         ` David Gibson
2017-05-23  2:35       ` [Qemu-devel] " David Gibson
2017-05-23  6:57         ` Greg Kurz
2017-05-19 10:32 ` [Qemu-devel] [PATCH v2 4/4] spapr: fix migration of ICP objects from/to older QEMU Greg Kurz
2017-05-22  2:30   ` David Gibson
2017-05-22  7:20     ` Cédric Le Goater
2017-05-22  9:15       ` David Gibson
2017-05-22 15:04         ` Greg Kurz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).