qemu-devel.nongnu.org archive mirror
From: "Cédric Le Goater" <clg@kaod.org>
To: qemu-ppc@nongnu.org, qemu-devel@nongnu.org
Cc: "Peter Maydell" <peter.maydell@linaro.org>,
	"Daniel Henrique Barboza" <danielhb413@gmail.com>,
	"Cédric Le Goater" <clg@kaod.org>,
	"Shivaprasad G Bhat" <sbhat@linux.ibm.com>
Subject: [PULL 03/39] spapr: nvdimm: Introduce spapr-nvdimm device
Date: Fri, 18 Feb 2022 11:37:51 +0100	[thread overview]
Message-ID: <20220218103827.682032-4-clg@kaod.org> (raw)
In-Reply-To: <20220218103827.682032-1-clg@kaod.org>

From: Shivaprasad G Bhat <sbhat@linux.ibm.com>

If the nvdimm device's backend is not persistent memory, explicit IO
flushes on the backend are needed to ensure persistence.
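
To illustrate what such an explicit flush amounts to on the host side,
here is a minimal sketch (the helper and its arguments are made up for
this example; it is not the patch's actual flush worker): a regular
file backend needs an fdatasync(), while a DAX-mapped pmem backend can
be persisted directly with libpmem.

  #include <stdbool.h>
  #include <stddef.h>
  #include <unistd.h>
  #ifdef CONFIG_LIBPMEM
  #include <libpmem.h>
  #endif

  /*
   * Sketch only: make completed guest writes to the backend durable.
   * 'fd' and 'addr'/'len' describe the backing file and its mapping.
   */
  static int backend_flush(int fd, void *addr, size_t len, bool is_pmem)
  {
  #ifdef CONFIG_LIBPMEM
      if (is_pmem) {
          /* DAX mapping: flush CPU caches and drain; no syscall needed. */
          pmem_persist(addr, len);
          return 0;
      }
  #endif
      /* Regular file: write back dirty page cache pages for this file. */
      return fdatasync(fd);
  }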

On SPAPR, the issue is addressed by adding a new hcall with which the
guest requests an explicit flush when the backend is not pmem. The
approach here is to convey, through a device tree property, when such
an hcall flush is required. Once the guest knows the device backend is
not pmem, it makes the hcall whenever a flush is required.

To set the device tree property, a new PAPR-specific device type
inheriting from the nvdimm device is implemented. When the backend does
not have pmem=on, the device tree property "ibm,hcall-flush-required"
is set, and the guest issues the H_SCM_FLUSH hcall to request an
explicit flush. The new device has a boolean property pmem-override
which, when "on", advertises the device tree property even when the
backend has pmem=on. The flush function invokes fdatasync() or
pmem_persist() depending on the type of backend.
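
Both the device tree advertisement and the H_SCM_FLUSH handling added
below reduce to the same condition; as a sketch (the helper name is
hypothetical, the logic mirrors the hunks in this patch):

  #include <stdbool.h>

  /*
   * Should the guest be told (and allowed) to use H_SCM_FLUSH for this
   * spapr-nvdimm?  True when the backend is not pmem, or when
   * pmem-override=on forces the hcall path even for a pmem backend.
   */
  static bool hcall_flush_needed(bool backend_is_pmem, bool pmem_override)
  {
      return !backend_is_pmem || pmem_override;
  }

On the command line this maps to the "pmem" property of the
memory-backend-file object and the new "pmem-override" property of the
spapr-nvdimm device.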

The vmstate structures are made part of the spapr-nvdimm device
object. The patch keeps migration between compatible source and
destination configurations working while rejecting incompatible ones
with an error.
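
The compatibility rule enforced in the post_load hook below comes down
to the source and destination agreeing on whether the guest was told
to use H_SCM_FLUSH; as a sketch with a hypothetical helper name:

  #include <stdbool.h>

  /* Summary of the post_load checks added by this patch. */
  static bool migration_compatible(bool src_hcall_flush_required,
                                   bool dst_is_pmem, bool dst_pmem_override)
  {
      bool dst_hcall_flush_required = !dst_is_pmem || dst_pmem_override;

      /* Either direction of disagreement is rejected with -EINVAL. */
      return src_hcall_flush_required == dst_hcall_flush_required;
  }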

Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <164396256092.109112.17933240273840803354.stgit@ltczzess4.aus.stglabs.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 hw/ppc/spapr_nvdimm.c | 132 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 132 insertions(+)

diff --git a/hw/ppc/spapr_nvdimm.c b/hw/ppc/spapr_nvdimm.c
index ac44e0015317..c4c97da5de98 100644
--- a/hw/ppc/spapr_nvdimm.c
+++ b/hw/ppc/spapr_nvdimm.c
@@ -34,6 +34,7 @@
 #include "block/thread-pool.h"
 #include "migration/vmstate.h"
 #include "qemu/pmem.h"
+#include "hw/qdev-properties.h"
 
 /* DIMM health bitmap indicators. Taken from kernel's papr_scm.c */
 /* SCM device is unable to persist memory contents */
@@ -57,6 +58,10 @@ OBJECT_DECLARE_TYPE(SpaprNVDIMMDevice, SPAPRNVDIMMClass, SPAPR_NVDIMM)
 struct SPAPRNVDIMMClass {
     /* private */
     NVDIMMClass parent_class;
+
+    /* public */
+    void (*realize)(NVDIMMDevice *dimm, Error **errp);
+    void (*unrealize)(NVDIMMDevice *dimm, Error **errp);
 };
 
 bool spapr_nvdimm_validate(HotplugHandler *hotplug_dev, NVDIMMDevice *nvdimm,
@@ -64,6 +69,8 @@ bool spapr_nvdimm_validate(HotplugHandler *hotplug_dev, NVDIMMDevice *nvdimm,
 {
     const MachineClass *mc = MACHINE_GET_CLASS(hotplug_dev);
     const MachineState *ms = MACHINE(hotplug_dev);
+    PCDIMMDevice *dimm = PC_DIMM(nvdimm);
+    MemoryRegion *mr = host_memory_backend_get_memory(dimm->hostmem);
     g_autofree char *uuidstr = NULL;
     QemuUUID uuid;
     int ret;
@@ -101,6 +108,14 @@ bool spapr_nvdimm_validate(HotplugHandler *hotplug_dev, NVDIMMDevice *nvdimm,
         return false;
     }
 
+    if (object_dynamic_cast(OBJECT(nvdimm), TYPE_SPAPR_NVDIMM) &&
+        (memory_region_get_fd(mr) < 0)) {
+        error_setg(errp, "spapr-nvdimm device requires the "
+                   "memdev %s to be of memory-backend-file type",
+                   object_get_canonical_path_component(OBJECT(dimm->hostmem)));
+        return false;
+    }
+
     return true;
 }
 
@@ -172,6 +187,20 @@ static int spapr_dt_nvdimm(SpaprMachineState *spapr, void *fdt,
                              "operating-system")));
     _FDT(fdt_setprop(fdt, child_offset, "ibm,cache-flush-required", NULL, 0));
 
+    if (object_dynamic_cast(OBJECT(nvdimm), TYPE_SPAPR_NVDIMM)) {
+        bool is_pmem = false, pmem_override = false;
+        PCDIMMDevice *dimm = PC_DIMM(nvdimm);
+        HostMemoryBackend *hostmem = dimm->hostmem;
+
+        is_pmem = object_property_get_bool(OBJECT(hostmem), "pmem", NULL);
+        pmem_override = object_property_get_bool(OBJECT(nvdimm),
+                                                 "pmem-override", NULL);
+        if (!is_pmem || pmem_override) {
+            _FDT(fdt_setprop(fdt, child_offset, "ibm,hcall-flush-required",
+                             NULL, 0));
+        }
+    }
+
     return child_offset;
 }
 
@@ -397,11 +426,21 @@ typedef struct SpaprNVDIMMDeviceFlushState {
 
 typedef struct SpaprNVDIMMDevice SpaprNVDIMMDevice;
 struct SpaprNVDIMMDevice {
+    /* private */
     NVDIMMDevice parent_obj;
 
+    bool hcall_flush_required;
     uint64_t nvdimm_flush_token;
     QLIST_HEAD(, SpaprNVDIMMDeviceFlushState) pending_nvdimm_flush_states;
     QLIST_HEAD(, SpaprNVDIMMDeviceFlushState) completed_nvdimm_flush_states;
+
+    /* public */
+
+    /*
+     * The 'on' value for this property forces QEMU to enable the hcall
+     * flush for the nvdimm device even if the backend is pmem.
+     */
+    bool pmem_override;
 };
 
 static int flush_worker_cb(void *opaque)
@@ -448,6 +487,24 @@ static int spapr_nvdimm_flush_post_load(void *opaque, int version_id)
     SpaprNVDIMMDevice *s_nvdimm = (SpaprNVDIMMDevice *)opaque;
     SpaprNVDIMMDeviceFlushState *state;
     ThreadPool *pool = aio_get_thread_pool(qemu_get_aio_context());
+    HostMemoryBackend *backend = MEMORY_BACKEND(PC_DIMM(s_nvdimm)->hostmem);
+    bool is_pmem = object_property_get_bool(OBJECT(backend), "pmem", NULL);
+    bool pmem_override = object_property_get_bool(OBJECT(s_nvdimm),
+                                                  "pmem-override", NULL);
+    bool dest_hcall_flush_required = pmem_override || !is_pmem;
+
+    if (!s_nvdimm->hcall_flush_required && dest_hcall_flush_required) {
+        error_report("The file backend for the spapr-nvdimm device %s at "
+                     "source is a pmem, use pmem=on and pmem-override=off to "
+                     "continue.", DEVICE(s_nvdimm)->id);
+        return -EINVAL;
+    }
+    if (s_nvdimm->hcall_flush_required && !dest_hcall_flush_required) {
+        error_report("The guest expects hcall-flush support for the "
+                     "spapr-nvdimm device %s, use pmem-override=on to "
+                     "continue.", DEVICE(s_nvdimm)->id);
+        return -EINVAL;
+    }
 
     QLIST_FOREACH(state, &s_nvdimm->pending_nvdimm_flush_states, node) {
         thread_pool_submit_aio(pool, flush_worker_cb, state,
@@ -475,6 +532,7 @@ const VMStateDescription vmstate_spapr_nvdimm_states = {
     .minimum_version_id = 1,
     .post_load = spapr_nvdimm_flush_post_load,
     .fields = (VMStateField[]) {
+        VMSTATE_BOOL(hcall_flush_required, SpaprNVDIMMDevice),
         VMSTATE_UINT64(nvdimm_flush_token, SpaprNVDIMMDevice),
         VMSTATE_QLIST_V(completed_nvdimm_flush_states, SpaprNVDIMMDevice, 1,
                         vmstate_spapr_nvdimm_flush_state,
@@ -605,7 +663,11 @@ static target_ulong h_scm_flush(PowerPCCPU *cpu, SpaprMachineState *spapr,
     }
 
     dimm = PC_DIMM(drc->dev);
+    if (!object_dynamic_cast(OBJECT(dimm), TYPE_SPAPR_NVDIMM)) {
+        return H_PARAMETER;
+    }
     if (continue_token == 0) {
+        bool is_pmem = false, pmem_override = false;
         backend = MEMORY_BACKEND(dimm->hostmem);
         fd = memory_region_get_fd(&backend->mr);
 
@@ -613,6 +675,13 @@ static target_ulong h_scm_flush(PowerPCCPU *cpu, SpaprMachineState *spapr,
             return H_UNSUPPORTED;
         }
 
+        is_pmem = object_property_get_bool(OBJECT(backend), "pmem", NULL);
+        pmem_override = object_property_get_bool(OBJECT(dimm),
+                                                "pmem-override", NULL);
+        if (is_pmem && !pmem_override) {
+            return H_UNSUPPORTED;
+        }
+
         state = spapr_nvdimm_init_new_flush_state(SPAPR_NVDIMM(dimm));
         if (!state) {
             return H_HARDWARE;
@@ -786,3 +855,66 @@ static void spapr_scm_register_types(void)
 }
 
 type_init(spapr_scm_register_types)
+
+static void spapr_nvdimm_realize(NVDIMMDevice *dimm, Error **errp)
+{
+    SpaprNVDIMMDevice *s_nvdimm = SPAPR_NVDIMM(dimm);
+    HostMemoryBackend *backend = MEMORY_BACKEND(PC_DIMM(dimm)->hostmem);
+    bool is_pmem = object_property_get_bool(OBJECT(backend),  "pmem", NULL);
+    bool pmem_override = object_property_get_bool(OBJECT(dimm), "pmem-override",
+                                             NULL);
+    if (!is_pmem || pmem_override) {
+        s_nvdimm->hcall_flush_required = true;
+    }
+
+    vmstate_register(NULL, VMSTATE_INSTANCE_ID_ANY,
+                     &vmstate_spapr_nvdimm_states, dimm);
+}
+
+static void spapr_nvdimm_unrealize(NVDIMMDevice *dimm)
+{
+    vmstate_unregister(NULL, &vmstate_spapr_nvdimm_states, dimm);
+}
+
+static Property spapr_nvdimm_properties[] = {
+#ifdef CONFIG_LIBPMEM
+    DEFINE_PROP_BOOL("pmem-override", SpaprNVDIMMDevice, pmem_override, false),
+#endif
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void spapr_nvdimm_class_init(ObjectClass *oc, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(oc);
+    NVDIMMClass *nvc = NVDIMM_CLASS(oc);
+
+    nvc->realize = spapr_nvdimm_realize;
+    nvc->unrealize = spapr_nvdimm_unrealize;
+
+    device_class_set_props(dc, spapr_nvdimm_properties);
+}
+
+static void spapr_nvdimm_init(Object *obj)
+{
+    SpaprNVDIMMDevice *s_nvdimm = SPAPR_NVDIMM(obj);
+
+    s_nvdimm->hcall_flush_required = false;
+    QLIST_INIT(&s_nvdimm->pending_nvdimm_flush_states);
+    QLIST_INIT(&s_nvdimm->completed_nvdimm_flush_states);
+}
+
+static TypeInfo spapr_nvdimm_info = {
+    .name          = TYPE_SPAPR_NVDIMM,
+    .parent        = TYPE_NVDIMM,
+    .class_init    = spapr_nvdimm_class_init,
+    .class_size    = sizeof(SPAPRNVDIMMClass),
+    .instance_size = sizeof(SpaprNVDIMMDevice),
+    .instance_init = spapr_nvdimm_init,
+};
+
+static void spapr_nvdimm_register_types(void)
+{
+    type_register_static(&spapr_nvdimm_info);
+}
+
+type_init(spapr_nvdimm_register_types)
-- 
2.34.1




Thread overview: 41+ messages
2022-02-18 10:37 [PULL 00/39] ppc queue Cédric Le Goater
2022-02-18 10:37 ` [PULL 01/39] nvdimm: Add realize, unrealize callbacks to NVDIMMDevice class Cédric Le Goater
2022-02-18 10:37 ` [PULL 02/39] spapr: nvdimm: Implement H_SCM_FLUSH hcall Cédric Le Goater
2022-02-18 10:37 ` Cédric Le Goater [this message]
2022-02-18 10:37 ` [PULL 04/39] target/ppc: raise HV interrupts for partition table entry problems Cédric Le Goater
2022-02-18 10:37 ` [PULL 05/39] spapr: prevent hdec timer being set up under virtual hypervisor Cédric Le Goater
2022-02-18 10:37 ` [PULL 06/39] ppc: allow the hdecr timer to be created/destroyed Cédric Le Goater
2022-02-18 10:37 ` [PULL 07/39] target/ppc: add vhyp addressing mode helper for radix MMU Cédric Le Goater
2022-02-18 10:37 ` [PULL 08/39] target/ppc: make vhyp get_pate method take lpid and return success Cédric Le Goater
2022-02-18 10:37 ` [PULL 09/39] target/ppc: add helper for books vhyp hypercall handler Cédric Le Goater
2022-02-18 10:37 ` [PULL 10/39] target/ppc: Add powerpc_reset_excp_state helper Cédric Le Goater
2022-02-18 10:37 ` [PULL 11/39] target/ppc: Introduce a vhyp framework for nested HV support Cédric Le Goater
2022-02-18 10:38 ` [PULL 12/39] spapr: implement nested-hv capability for the virtual hypervisor Cédric Le Goater
2022-02-18 10:38 ` [PULL 13/39] target/ppc: cpu_init: Remove not implemented comments Cédric Le Goater
2022-02-18 10:38 ` [PULL 14/39] target/ppc: cpu_init: Remove G2LE init code Cédric Le Goater
2022-02-18 10:38 ` [PULL 15/39] target/ppc: cpu_init: Group registration of generic SPRs Cédric Le Goater
2022-02-18 10:38 ` [PULL 16/39] target/ppc: cpu_init: Move Timebase registration into the common function Cédric Le Goater
2022-02-18 10:38 ` [PULL 17/39] target/ppc: cpu_init: Avoid nested SPR register functions Cédric Le Goater
2022-02-18 10:38 ` [PULL 18/39] target/ppc: cpu_init: Move 405 SPRs into register_405_sprs Cédric Le Goater
2022-02-18 10:38 ` [PULL 19/39] target/ppc: cpu_init: Move G2 SPRs into register_G2_sprs Cédric Le Goater
2022-02-18 10:38 ` [PULL 20/39] target/ppc: cpu_init: Decouple G2 SPR registration from 755 Cédric Le Goater
2022-02-18 10:38 ` [PULL 21/39] target/ppc: cpu_init: Decouple 74xx SPR registration from 7xx Cédric Le Goater
2022-02-18 10:38 ` [PULL 22/39] target/ppc: cpu_init: Deduplicate 440 SPR registration Cédric Le Goater
2022-02-18 10:38 ` [PULL 23/39] target/ppc: cpu_init: Deduplicate 603 " Cédric Le Goater
2022-02-18 10:38 ` [PULL 24/39] target/ppc: cpu_init: Deduplicate 604 " Cédric Le Goater
2022-02-18 10:38 ` [PULL 25/39] target/ppc: cpu_init: Deduplicate 745/755 " Cédric Le Goater
2022-02-18 10:38 ` [PULL 26/39] target/ppc: cpu_init: Deduplicate 7xx " Cédric Le Goater
2022-02-18 10:38 ` [PULL 27/39] target/ppc: cpu_init: Move 755 L2 cache SPRs into a function Cédric Le Goater
2022-02-18 10:38 ` [PULL 28/39] target/ppc: cpu_init: Move e300 SPR registration " Cédric Le Goater
2022-02-18 10:38 ` [PULL 29/39] target/ppc: cpu_init: Move 604e " Cédric Le Goater
2022-02-18 10:38 ` [PULL 30/39] target/ppc: cpu_init: Reuse init_proc_603 for the e300 Cédric Le Goater
2022-02-18 10:38 ` [PULL 31/39] target/ppc: cpu_init: Reuse init_proc_604 for the 604e Cédric Le Goater
2022-02-18 10:38 ` [PULL 32/39] target/ppc: cpu_init: Reuse init_proc_745 for the 755 Cédric Le Goater
2022-02-18 10:38 ` [PULL 33/39] target/ppc: cpu_init: Rename register_ne_601_sprs Cédric Le Goater
2022-02-18 10:38 ` [PULL 34/39] target/ppc: cpu_init: Remove register_usprg3_sprs Cédric Le Goater
2022-02-18 10:38 ` [PULL 35/39] target/ppc: Rename spr_tcg.h to spr_common.h Cédric Le Goater
2022-02-18 10:38 ` [PULL 36/39] target/ppc: cpu_init: Expose some SPR registration helpers Cédric Le Goater
2022-02-18 10:38 ` [PULL 37/39] target/ppc: cpu_init: Move SPR registration macros to a header Cédric Le Goater
2022-02-18 10:38 ` [PULL 38/39] target/ppc: cpu_init: Move check_pow and QOM " Cédric Le Goater
2022-02-18 10:38 ` [PULL 39/39] target/ppc: Move common SPR functions out of cpu_init Cédric Le Goater
2022-02-20 19:28 ` [PULL 00/39] ppc queue Peter Maydell
