From: Eduardo Habkost <ehabkost@redhat.com>
To: Peter Maydell <peter.maydell@linaro.org>, qemu-devel@nongnu.org
Cc: Marcel Apfelbaum <marcel@redhat.com>,
Haozhong Zhang <haozhong.zhang@intel.com>
Subject: [Qemu-devel] [PULL 17/19] nvdimm: add 'unarmed' option
Date: Thu, 18 Jan 2018 00:09:58 -0200
Message-ID: <20180118021000.27203-18-ehabkost@redhat.com>
In-Reply-To: <20180118021000.27203-1-ehabkost@redhat.com>
From: Haozhong Zhang <haozhong.zhang@intel.com>
Currently the only vNVDIMM backend that can guarantee guest write
persistence is device DAX on Linux, because no host-side kernel cache
is involved in guest accesses to it. Detecting whether the backend is
device DAX requires access to sysfs, which may not work with
SELinux.
Instead, we add the 'unarmed' option to the 'nvdimm' device, so that
users or management utilities that have enough knowledge about the
backend can control the unarmed flag in the guest ACPI NFIT via this
option. The guest Linux NVDIMM driver, for example, will mark the
corresponding vNVDIMM device read-only if the unarmed flag in the guest
NFIT is set.
The default value of the 'unarmed' option is 'off' in order to keep
backwards compatibility.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Message-Id: <20171211072806.2812-4-haozhong.zhang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
---
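Note for readers (not part of the commit): with this patch, a backend
that cannot guarantee persistence would be configured with something
like "-device nvdimm,id=nvdimm1,memdev=mem1,unarmed=on". The effect on
the NFIT can be illustrated with a minimal standalone sketch. Only
ACPI_NFIT_MEM_NOT_ARMED and its value come from the patch; DemoMemDev
and the demo_* helpers are hypothetical stand-ins for NvdimmNfitMemDev
and cpu_to_le16(), written so the example compiles outside QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 3 of the NFIT memory device flags, as defined by this patch. */
#define ACPI_NFIT_MEM_NOT_ARMED (1 << 3)

/*
 * Hypothetical stand-in for the 16-bit flags field of NvdimmNfitMemDev.
 * ACPI tables are little-endian, so the bytes are stored explicitly.
 */
typedef struct {
    uint8_t flags_le[2];
} DemoMemDev;

/* Store a 16-bit value in little-endian byte order (assumed helper). */
static void demo_store_le16(uint8_t out[2], uint16_t v)
{
    out[0] = (uint8_t)(v & 0xff);
    out[1] = (uint8_t)(v >> 8);
}

/* Load a 16-bit little-endian value (assumed helper). */
static uint16_t demo_load_le16(const uint8_t in[2])
{
    return (uint16_t)(in[0] | (in[1] << 8));
}

/* Mirror the patch logic: OR in NOT_ARMED only when 'unarmed' is on. */
static void demo_build_flags(DemoMemDev *memdev, bool unarmed)
{
    uint16_t flags;

    assert(memdev != NULL);
    flags = demo_load_le16(memdev->flags_le);
    if (unarmed) {
        flags |= ACPI_NFIT_MEM_NOT_ARMED;
    }
    demo_store_le16(memdev->flags_le, flags);
}

/* What guest software would check before allowing persistent writes. */
static bool demo_guest_sees_unarmed(const DemoMemDev *memdev)
{
    return (demo_load_le16(memdev->flags_le) & ACPI_NFIT_MEM_NOT_ARMED) != 0;
}
```

Because the flag is only ever ORed in, leaving 'unarmed' at its default
of 'off' produces the same table bytes as before the patch, which is how
backwards compatibility is kept.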
docs/nvdimm.txt | 15 +++++++++++++++
include/hw/mem/nvdimm.h | 9 +++++++++
hw/acpi/nvdimm.c | 7 +++++++
hw/mem/nvdimm.c | 26 ++++++++++++++++++++++++++
4 files changed, 57 insertions(+)
diff --git a/docs/nvdimm.txt b/docs/nvdimm.txt
index 21249dd062..e903d8bb09 100644
--- a/docs/nvdimm.txt
+++ b/docs/nvdimm.txt
@@ -138,3 +138,18 @@ backend of vNVDIMM:
-object memory-backend-file,id=mem1,share=on,mem-path=/dev/dax0.0,size=4G,align=2M
-device nvdimm,id=nvdimm1,memdev=mem1
+
+Guest Data Persistence
+----------------------
+
+Though QEMU supports multiple types of vNVDIMM backends on Linux,
+currently the only one that can guarantee guest write persistence
+is device DAX on a real NVDIMM device (e.g., /dev/dax0.0), because
+guest accesses to it do not involve any host-side kernel cache.
+
+When using other types of backends, it is suggested to set the 'unarmed'
+option of '-device nvdimm' to 'on', which sets the unarmed flag of the
+guest NVDIMM region mapping structure. This unarmed flag indicates to
+guest software that this vNVDIMM device contains a region that cannot
+accept persistent writes. As a result, for example, the guest Linux
+NVDIMM driver marks such a vNVDIMM device as read-only.
diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
index 28e68ddf59..7fd87c4e1c 100644
--- a/include/hw/mem/nvdimm.h
+++ b/include/hw/mem/nvdimm.h
@@ -49,6 +49,7 @@
TYPE_NVDIMM)
#define NVDIMM_LABLE_SIZE_PROP "label-size"
+#define NVDIMM_UNARMED_PROP "unarmed"
struct NVDIMMDevice {
/* private */
@@ -74,6 +75,14 @@ struct NVDIMMDevice {
* guest via ACPI NFIT and _FIT method if NVDIMM hotplug is supported.
*/
MemoryRegion nvdimm_mr;
+
+ /*
+ * The 'on' value results in the unarmed flag being set in ACPI NFIT,
+ * which can be used to notify the guest implicitly that the host
+ * backend (e.g., files on HDD, /dev/pmemX, etc.) cannot guarantee
+ * guest write persistence.
+ */
+ bool unarmed;
};
typedef struct NVDIMMDevice NVDIMMDevice;
diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 6ceea196e7..59d6e4254c 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -138,6 +138,8 @@ struct NvdimmNfitMemDev {
} QEMU_PACKED;
typedef struct NvdimmNfitMemDev NvdimmNfitMemDev;
+#define ACPI_NFIT_MEM_NOT_ARMED (1 << 3)
+
/*
* NVDIMM Control Region Structure
*
@@ -284,6 +286,7 @@ static void
nvdimm_build_structure_memdev(GArray *structures, DeviceState *dev)
{
NvdimmNfitMemDev *nfit_memdev;
+ NVDIMMDevice *nvdimm = NVDIMM(OBJECT(dev));
uint64_t size = object_property_get_uint(OBJECT(dev), PC_DIMM_SIZE_PROP,
NULL);
int slot = object_property_get_int(OBJECT(dev), PC_DIMM_SLOT_PROP,
@@ -312,6 +315,10 @@ nvdimm_build_structure_memdev(GArray *structures, DeviceState *dev)
/* Only one interleave for PMEM. */
nfit_memdev->interleave_ways = cpu_to_le16(1);
+
+ if (nvdimm->unarmed) {
+ nfit_memdev->flags |= cpu_to_le16(ACPI_NFIT_MEM_NOT_ARMED);
+ }
}
/*
diff --git a/hw/mem/nvdimm.c b/hw/mem/nvdimm.c
index 618c3d677b..61e677f92f 100644
--- a/hw/mem/nvdimm.c
+++ b/hw/mem/nvdimm.c
@@ -25,6 +25,7 @@
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qapi/visitor.h"
+#include "qapi-visit.h"
#include "hw/mem/nvdimm.h"
static void nvdimm_get_label_size(Object *obj, Visitor *v, const char *name,
@@ -64,11 +65,36 @@ out:
error_propagate(errp, local_err);
}
+static bool nvdimm_get_unarmed(Object *obj, Error **errp)
+{
+ NVDIMMDevice *nvdimm = NVDIMM(obj);
+
+ return nvdimm->unarmed;
+}
+
+static void nvdimm_set_unarmed(Object *obj, bool value, Error **errp)
+{
+ NVDIMMDevice *nvdimm = NVDIMM(obj);
+ Error *local_err = NULL;
+
+ if (memory_region_size(&nvdimm->nvdimm_mr)) {
+ error_setg(&local_err, "cannot change property value");
+ goto out;
+ }
+
+ nvdimm->unarmed = value;
+
+ out:
+ error_propagate(errp, local_err);
+}
+
static void nvdimm_init(Object *obj)
{
object_property_add(obj, NVDIMM_LABLE_SIZE_PROP, "int",
nvdimm_get_label_size, nvdimm_set_label_size, NULL,
NULL, NULL);
+ object_property_add_bool(obj, NVDIMM_UNARMED_PROP,
+ nvdimm_get_unarmed, nvdimm_set_unarmed, NULL);
}
static MemoryRegion *nvdimm_get_memory_region(PCDIMMDevice *dimm, Error **errp)
--
2.14.3
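Note for readers: the setter added to hw/mem/nvdimm.c above rejects any
change once memory_region_size(&nvdimm->nvdimm_mr) is non-zero, which
presumably means once the device's memory region has been set up. A
minimal standalone sketch of that guard follows. DemoNvdimm and
demo_set_unarmed() are hypothetical names, and returning an error string
replaces QEMU's Error ** convention so the example needs no QEMU headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the two NVDIMMDevice fields the setter touches. */
typedef struct {
    uint64_t region_size; /* 0 until the memory region has been set up */
    bool unarmed;
} DemoNvdimm;

/*
 * Returns NULL on success, or a static error message if the property
 * can no longer be changed. On failure the stored value is left as is.
 */
static const char *demo_set_unarmed(DemoNvdimm *d, bool value)
{
    assert(d != NULL);

    if (d->region_size != 0) {
        return "cannot change property value";
    }

    d->unarmed = value;
    return NULL;
}
```

Rejecting late changes keeps the flag consistent with the NFIT the guest
may already have read.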