From: Haozhong Zhang <haozhong.zhang@intel.com>
To: qemu-devel@nongnu.org
Cc: Eduardo Habkost <ehabkost@redhat.com>,
Igor Mammedov <imammedo@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
mst@redhat.com, dgilbert@redhat.com,
Xiao Guangrong <xiaoguangrong.eric@gmail.com>,
Stefan Hajnoczi <stefanha@redhat.com>,
Dan Williams <dan.j.williams@intel.com>,
Haozhong Zhang <haozhong.zhang@intel.com>
Subject: [Qemu-devel] [PATCH v4 4/6] util/mmap-alloc: support MAP_SYNC in qemu_ram_mmap()
Date: Wed, 31 Jan 2018 14:02:27 +0800
Message-ID: <20180131060229.9294-5-haozhong.zhang@intel.com>
In-Reply-To: <20180131060229.9294-1-haozhong.zhang@intel.com>

When a file supporting DAX is used as a vNVDIMM backend, mmap'ing it
with the MAP_SYNC flag guarantees the persistence of guest writes to
the backend file without further action by QEMU (e.g., periodic
fsync()).

A set of QEMU_RAM_SYNC_{AUTO,ON,OFF} flags is added to qemu_ram_mmap():

- If QEMU_RAM_SYNC_ON is present, qemu_ram_mmap() will try to pass
  MAP_SYNC to mmap(). It will then fail if the host OS or the backend
  file does not support MAP_SYNC, or if MAP_SYNC conflicts with other
  flags.

- If QEMU_RAM_SYNC_OFF is present, qemu_ram_mmap() will never pass
  MAP_SYNC to mmap().

- If QEMU_RAM_SYNC_AUTO is present, and
  * if the host OS and the backend file support MAP_SYNC, and MAP_SYNC
    does not conflict with other flags, qemu_ram_mmap() will work as if
    QEMU_RAM_SYNC_ON is present;
  * otherwise, qemu_ram_mmap() will work as if QEMU_RAM_SYNC_OFF is
    present.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
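Note (not part of the commit): the ON/OFF/AUTO behaviour described in
the commit message can be sketched as a standalone Linux fragment. This
is only an illustration under assumptions: map_with_sync() and
enum sync_mode are hypothetical names, the alignment and
address-reservation logic of qemu_ram_mmap() is left out, and the
fallback flag values mirror the ones this patch defines for older
headers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/* Fallback values for hosts whose libc headers predate Linux 4.15. */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x3
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x80000
#endif

/* Hypothetical stand-in for the QEMU_RAM_SYNC_{OFF,ON,AUTO} modes. */
enum sync_mode { SYNC_OFF, SYNC_ON, SYNC_AUTO };

/*
 * Hypothetical helper, not part of the patch: map 'size' bytes of 'fd'
 * as a shared mapping, applying the three sync modes. '*synced' reports
 * whether the returned mapping was created with MAP_SYNC.
 */
static void *map_with_sync(int fd, size_t size, enum sync_mode mode,
                           bool *synced)
{
    void *p;

    *synced = false;

    if (mode != SYNC_OFF) {
        /* MAP_SYNC is only honoured together with MAP_SHARED_VALIDATE. */
        p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                 MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
        if (p != MAP_FAILED) {
            *synced = true;
            return p;
        }
        if (mode == SYNC_ON) {
            /* ON: report the failure instead of silently falling back. */
            return MAP_FAILED;
        }
    }

    /* OFF, or AUTO after a failed MAP_SYNC attempt: plain shared mapping. */
    return mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}
```

On a file that does not live on a DAX-capable filesystem, SYNC_ON is
expected to fail, while SYNC_AUTO should quietly fall back to a plain
MAP_SHARED mapping.
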
include/exec/memory.h | 26 ++++++++++++++++++++++
include/exec/ram_addr.h | 4 ++++
include/qemu/mmap-alloc.h | 4 ++++
include/standard-headers/linux/mman.h | 42 +++++++++++++++++++++++++++++++++++
util/mmap-alloc.c | 23 ++++++++++++++++++-
5 files changed, 98 insertions(+), 1 deletion(-)
create mode 100644 include/standard-headers/linux/mman.h
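
Note (not part of the commit): the QEMU_RAM_SYNC_* values added to
include/exec/memory.h occupy bits 1-2 of the flags word, next to
QEMU_RAM_SHARE in bit 0. The fragment below copies the macros and the
two helpers from this patch so that the encoding can be checked in
isolation. The local OnOffAuto enum is a stand-in for QEMU's generated
type, and its numeric values are an assumption; sync_roundtrip_ok() is
a hypothetical helper added only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for QEMU's OnOffAuto; the numeric values are an assumption. */
typedef enum { ON_OFF_AUTO_AUTO, ON_OFF_AUTO_ON, ON_OFF_AUTO_OFF } OnOffAuto;

#define QEMU_RAM_SHARE      (1UL << 0)

#define QEMU_RAM_SYNC_SHIFT 1
#define QEMU_RAM_SYNC_MASK  0x6
#define QEMU_RAM_SYNC_OFF   ((0UL << QEMU_RAM_SYNC_SHIFT) & QEMU_RAM_SYNC_MASK)
#define QEMU_RAM_SYNC_ON    ((1UL << QEMU_RAM_SYNC_SHIFT) & QEMU_RAM_SYNC_MASK)
#define QEMU_RAM_SYNC_AUTO  ((2UL << QEMU_RAM_SYNC_SHIFT) & QEMU_RAM_SYNC_MASK)

/* Encode an OnOffAuto value into the two-bit sync field. */
static inline uint64_t qemu_ram_sync_flags(OnOffAuto v)
{
    return v == ON_OFF_AUTO_OFF ? QEMU_RAM_SYNC_OFF :
           v == ON_OFF_AUTO_ON ? QEMU_RAM_SYNC_ON : QEMU_RAM_SYNC_AUTO;
}

/* Decode the two-bit sync field; the field value 3 is unused. */
static inline OnOffAuto qemu_ram_sync_val(uint64_t flags)
{
    unsigned int v = (flags & QEMU_RAM_SYNC_MASK) >> QEMU_RAM_SYNC_SHIFT;

    assert(v < 3);

    return v == 0 ? ON_OFF_AUTO_OFF :
           v == 1 ? ON_OFF_AUTO_ON : ON_OFF_AUTO_AUTO;
}

/*
 * Hypothetical check: encoding and then decoding 'v' yields 'v' again,
 * whether or not QEMU_RAM_SHARE is set in the same flags word.
 */
static bool sync_roundtrip_ok(OnOffAuto v)
{
    uint64_t flags = qemu_ram_sync_flags(v);

    return qemu_ram_sync_val(flags) == v &&
           qemu_ram_sync_val(flags | QEMU_RAM_SHARE) == v;
}
```

With these definitions a zero flags word decodes as "off", so a caller
that does not set any sync bits keeps the behaviour it had before this
patch.
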
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 6b547da6a3..96a60e9c1d 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -458,6 +458,28 @@ void memory_region_init_resizeable_ram(MemoryRegion *mr,
#define QEMU_RAM_SHARE (1UL << 0)
+#define QEMU_RAM_SYNC_SHIFT 1
+#define QEMU_RAM_SYNC_MASK 0x6
+#define QEMU_RAM_SYNC_OFF ((0UL << QEMU_RAM_SYNC_SHIFT) & QEMU_RAM_SYNC_MASK)
+#define QEMU_RAM_SYNC_ON ((1UL << QEMU_RAM_SYNC_SHIFT) & QEMU_RAM_SYNC_MASK)
+#define QEMU_RAM_SYNC_AUTO ((2UL << QEMU_RAM_SYNC_SHIFT) & QEMU_RAM_SYNC_MASK)
+
+static inline uint64_t qemu_ram_sync_flags(OnOffAuto v)
+{
+    return v == ON_OFF_AUTO_OFF ? QEMU_RAM_SYNC_OFF :
+           v == ON_OFF_AUTO_ON ? QEMU_RAM_SYNC_ON : QEMU_RAM_SYNC_AUTO;
+}
+
+static inline OnOffAuto qemu_ram_sync_val(uint64_t flags)
+{
+    unsigned int v = (flags & QEMU_RAM_SYNC_MASK) >> QEMU_RAM_SYNC_SHIFT;
+
+    assert(v < 3);
+
+    return v == 0 ? ON_OFF_AUTO_OFF :
+           v == 1 ? ON_OFF_AUTO_ON : ON_OFF_AUTO_AUTO;
+}
+
#ifdef __linux__
/**
* memory_region_init_ram_from_file: Initialize RAM memory region with a
@@ -473,6 +495,10 @@ void memory_region_init_resizeable_ram(MemoryRegion *mr,
* @flags: specify properties of this memory region, which can be one or bit-or
* of following values:
* - QEMU_RAM_SHARE: memory must be mmaped with the MAP_SHARED flag
+ * - One of
+ *     QEMU_RAM_SYNC_ON:   mmap with MAP_SYNC flag
+ *     QEMU_RAM_SYNC_OFF:  do not mmap with MAP_SYNC flag
+ *     QEMU_RAM_SYNC_AUTO: automatically decide the use of MAP_SYNC flag
* Other bits are ignored.
* @path: the path in which to allocate the RAM.
* @errp: pointer to Error*, to store an error if it happens.
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index e24aae75a2..a2cc5a9f60 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -84,6 +84,10 @@ unsigned long last_ram_page(void);
* @flags: specify the properties of the ram block, which can be one
* or bit-or of following values
* - QEMU_RAM_SHARE: mmap the back file or device with MAP_SHARED
+ * - One of
+ *     QEMU_RAM_SYNC_ON:   mmap with MAP_SYNC flag
+ *     QEMU_RAM_SYNC_OFF:  do not mmap with MAP_SYNC flag
+ *     QEMU_RAM_SYNC_AUTO: automatically decide the use of MAP_SYNC flag
* Other bits are ignored.
* @mem_path or @fd: specify the back file or device
* @errp: pointer to Error*, to store an error if it happens
diff --git a/include/qemu/mmap-alloc.h b/include/qemu/mmap-alloc.h
index dc5e8b5efb..74346bdd3a 100644
--- a/include/qemu/mmap-alloc.h
+++ b/include/qemu/mmap-alloc.h
@@ -18,6 +18,10 @@ size_t qemu_mempath_getpagesize(const char *mem_path);
* @flags: specifies additional properties of the mapping, which can be one or
* bit-or of following values
* - QEMU_RAM_SHARE: mmap with MAP_SHARED flag
+ * - One of
+ *     QEMU_RAM_SYNC_ON:   mmap with MAP_SYNC flag
+ *     QEMU_RAM_SYNC_OFF:  do not mmap with MAP_SYNC flag
+ *     QEMU_RAM_SYNC_AUTO: automatically decide the use of MAP_SYNC flag
* Other bits are ignored.
*
* Return:
diff --git a/include/standard-headers/linux/mman.h b/include/standard-headers/linux/mman.h
new file mode 100644
index 0000000000..033332ad4f
--- /dev/null
+++ b/include/standard-headers/linux/mman.h
@@ -0,0 +1,42 @@
+/*
+ * Definitions of Linux-specific mmap flags.
+ *
+ * Copyright Intel Corporation, 2018
+ *
+ * Author: Haozhong Zhang <haozhong.zhang@intel.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later. See the COPYING file in the top-level directory.
+ */
+
+#ifndef _LINUX_MMAN_H
+#define _LINUX_MMAN_H
+
+/*
+ * MAP_SHARED_VALIDATE and MAP_SYNC are introduced in Linux kernel
+ * 4.15, so they may not be defined when compiling on older kernels.
+ */
+#ifdef CONFIG_LINUX
+
+#include <sys/mman.h>
+
+#ifndef MAP_SHARED_VALIDATE
+#define MAP_SHARED_VALIDATE 0x3
+#endif
+
+#ifndef MAP_SYNC
+#define MAP_SYNC 0x80000
+#endif
+
+#define QEMU_HAS_MAP_SYNC true
+
+#else /* !CONFIG_LINUX */
+
+#define MAP_SHARED_VALIDATE 0x0
+#define MAP_SYNC 0x0
+
+#define QEMU_HAS_MAP_SYNC false
+
+#endif /* CONFIG_LINUX */
+
+#endif /* !_LINUX_MMAN_H */
diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
index cd95566800..6df2f6d2c4 100644
--- a/util/mmap-alloc.c
+++ b/util/mmap-alloc.c
@@ -14,6 +14,7 @@
#include "qemu/mmap-alloc.h"
#include "qemu/host-utils.h"
#include "exec/memory.h"
+#include "standard-headers/linux/mman.h"
#define HUGETLBFS_MAGIC 0x958458f6
@@ -97,6 +98,8 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, uint64_t flags)
     void *ptr = mmap(0, total, PROT_NONE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
 #endif
     bool shared = flags & QEMU_RAM_SHARE;
+    OnOffAuto sync = qemu_ram_sync_val(flags);
+    int mmap_xflags = 0;
     size_t offset;
     void *ptr1;
@@ -108,13 +111,31 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, uint64_t flags)
     /* Always align to host page size */
     assert(align >= getpagesize());
+    if (!QEMU_HAS_MAP_SYNC || !shared) {
+        if (sync == ON_OFF_AUTO_ON) {
+            return MAP_FAILED;
+        }
+        sync = ON_OFF_AUTO_OFF;
+    }
+    if (sync != ON_OFF_AUTO_OFF) {
+        /* MAP_SYNC is only available with MAP_SHARED_VALIDATE. */
+        mmap_xflags |= MAP_SYNC | MAP_SHARED_VALIDATE;
+    }
+
     offset = QEMU_ALIGN_UP((uintptr_t)ptr, align) - (uintptr_t)ptr;
+ retry_mmap_fd:
     ptr1 = mmap(ptr + offset, size, PROT_READ | PROT_WRITE,
                 MAP_FIXED |
                 (fd == -1 ? MAP_ANONYMOUS : 0) |
-                (shared ? MAP_SHARED : MAP_PRIVATE),
+                (shared ? MAP_SHARED : MAP_PRIVATE) | mmap_xflags,
                 fd, 0);
     if (ptr1 == MAP_FAILED) {
+        if (sync == ON_OFF_AUTO_AUTO) {
+            mmap_xflags &= ~(MAP_SYNC | MAP_SHARED_VALIDATE);
+            sync = ON_OFF_AUTO_OFF;
+            goto retry_mmap_fd;
+        }
+
         munmap(ptr, total);
         return MAP_FAILED;
     }
--
2.14.1