qemu-devel.nongnu.org archive mirror
From: Haozhong Zhang <haozhong.zhang@intel.com>
To: qemu-devel@nongnu.org
Cc: Xiao Guangrong <xiaoguangrong.eric@gmail.com>,
	mst@redhat.com, Eduardo Habkost <ehabkost@redhat.com>,
	Igor Mammedov <imammedo@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Haozhong Zhang <haozhong.zhang@intel.com>
Subject: [Qemu-devel] [PATCH] util/mmap-alloc: support MAP_SYNC in qemu_ram_mmap()
Date: Wed, 27 Dec 2017 14:56:20 +0800
Message-ID: <20171227065620.20889-1-haozhong.zhang@intel.com>

When a file supporting DAX is used as a vNVDIMM backend, additionally
mapping it with the MAP_SYNC flag guarantees the persistence of guest
writes to the backend file without other QEMU actions (e.g., a periodic
fsync() by QEMU).

Passing the MAP_SHARED_VALIDATE flag together with MAP_SYNC ensures
that mmap() fails if MAP_SYNC is not supported by the kernel or by the
backend file. On such a failure, QEMU retries mmap() without MAP_SYNC
and MAP_SHARED_VALIDATE.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
 util/mmap-alloc.c | 36 +++++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)
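
A note on the flag arithmetic in the diff below, which ORs
MAP_SHARED_VALIDATE | MAP_SYNC into a flags word that already contains
MAP_SHARED. The following is a minimal sketch, not part of the patch,
that checks why this composes cleanly on the common Linux ABI: the
mapping-type bits of MAP_SHARED (0x1) are a subset of
MAP_SHARED_VALIDATE (0x3), so the kernel sees the validating type. The
helper names patched_shared_flags() and check_flag_layout() are
hypothetical, and the fallback values (including the 0x0f MAP_TYPE
mask) are assumptions for headers that lack them.

```c
#include <assert.h>
#include <sys/mman.h>

/* Fallbacks for older headers; flag values match the patch. */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x3
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x80000
#endif
#ifndef MAP_TYPE
#define MAP_TYPE 0x0f   /* assumed mapping-type mask (common Linux ABI) */
#endif

/*
 * Hypothetical helper: the flags word the patched mmap() call would
 * build for a shared, file-backed mapping on its first attempt.
 */
static int patched_shared_flags(void)
{
    return MAP_FIXED | MAP_SHARED | (MAP_SHARED_VALIDATE | MAP_SYNC);
}

/* Hypothetical helper: sanity-check the bit layout assumed above. */
static void check_flag_layout(void)
{
    /* MAP_SHARED's bit is contained in MAP_SHARED_VALIDATE. */
    assert((MAP_SHARED | MAP_SHARED_VALIDATE) == MAP_SHARED_VALIDATE);
    /* MAP_SYNC lives outside the mapping-type field. */
    assert((MAP_SYNC & MAP_TYPE) == 0);
}
```

Calling check_flag_layout() and then masking patched_shared_flags()
with MAP_TYPE should yield MAP_SHARED_VALIDATE under these assumptions.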

diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
index 2fd8cbcc6f..37b302f057 100644
--- a/util/mmap-alloc.c
+++ b/util/mmap-alloc.c
@@ -18,7 +18,18 @@
 
 #ifdef CONFIG_LINUX
 #include <sys/vfs.h>
+
+/*
+ * MAP_SHARED_VALIDATE and MAP_SYNC were introduced in Linux 4.15, so
+ * they may not be defined when compiling against older kernel headers.
+ */
+#ifndef MAP_SHARED_VALIDATE
+#define MAP_SHARED_VALIDATE   0x3
 #endif
+#ifndef MAP_SYNC
+#define MAP_SYNC              0x80000
+#endif
+#endif /* CONFIG_LINUX */
 
 size_t qemu_fd_getpagesize(int fd)
 {
@@ -97,6 +108,7 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
 #endif
     size_t offset;
     void *ptr1;
+    int xflags = 0;
 
     if (ptr == MAP_FAILED) {
         return MAP_FAILED;
@@ -107,12 +119,34 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared)
     assert(align >= getpagesize());
 
     offset = QEMU_ALIGN_UP((uintptr_t)ptr, align) - (uintptr_t)ptr;
+
+#if defined(__linux__)
+    /*
+     * If 'fd' refers to a file supporting DAX, mapping it with
+     * MAP_SYNC guarantees the persistence of guest writes without
+     * other actions in QEMU (e.g., fsync() in QEMU).
+     *
+     * MAP_SHARED_VALIDATE ensures that mmap() with MAP_SYNC fails if
+     * MAP_SYNC is not supported by the kernel or the file.
+     *
+     * If mmap() with xflags fails, QEMU retries the mmap() without
+     * xflags.
+     */
+    xflags = shared ? (MAP_SHARED_VALIDATE | MAP_SYNC) : 0;
+#endif
+
+ retry_mmap_fd:
     ptr1 = mmap(ptr + offset, size, PROT_READ | PROT_WRITE,
                 MAP_FIXED |
                 (fd == -1 ? MAP_ANONYMOUS : 0) |
-                (shared ? MAP_SHARED : MAP_PRIVATE),
+                (shared ? MAP_SHARED : MAP_PRIVATE) | xflags,
                 fd, 0);
     if (ptr1 == MAP_FAILED) {
+        if (xflags) {
+            xflags = 0;
+            goto retry_mmap_fd;
+        }
+
         munmap(ptr, total);
         return MAP_FAILED;
     }
-- 
2.14.1
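
For readers who want to try the fallback behaviour outside QEMU, the
try-then-retry logic of the patch can be sketched as a standalone
Linux-only helper. This is an illustration under stated assumptions,
not the QEMU code: map_shared_prefer_sync() and demo_map_tmpfile() are
hypothetical names, the mapping is not MAP_FIXED, and the demo uses an
ordinary temporary file under /tmp, which normally does not support
DAX, so the retry path is the one expected to run there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallbacks for older headers; values match the patch. */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x3
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x80000
#endif

/*
 * Hypothetical helper: map 'len' bytes of 'fd' shared and writable,
 * preferring MAP_SYNC. '*synced' reports whether the MAP_SYNC attempt
 * succeeded. Returns MAP_FAILED if both attempts fail.
 */
static void *map_shared_prefer_sync(int fd, size_t len, bool *synced)
{
    void *p;

    assert(fd >= 0 && len > 0 && synced != NULL);

    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
    *synced = (p != MAP_FAILED);
    if (p == MAP_FAILED) {
        /* Kernel or file rejected the flags: retry as plain MAP_SHARED. */
        p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    }
    return p;
}

/* Hypothetical demo: returns 0 if a temp file could be mapped and used. */
static int demo_map_tmpfile(void)
{
    char path[] = "/tmp/mapsync-demo-XXXXXX";
    const size_t len = 4096;
    bool synced = false;
    char *p;
    int fd, ok;

    fd = mkstemp(path);
    if (fd < 0) {
        return -1;
    }
    unlink(path);                       /* file lives until fd is closed */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    p = map_shared_prefer_sync(fd, len, &synced);
    close(fd);                          /* the mapping keeps the file alive */
    if (p == MAP_FAILED) {
        return -1;
    }

    memcpy(p, "hello", 6);
    ok = (memcmp(p, "hello", 6) == 0);
    printf("mapped %s MAP_SYNC\n", synced ? "with" : "without");
    munmap(p, len);
    return ok ? 0 : -1;
}
```

Calling demo_map_tmpfile() from a main() is enough to exercise both
attempts; pointing the helper at a file on a DAX-mounted filesystem
would be needed to see the MAP_SYNC attempt succeed.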

Thread overview: 6+ messages
2017-12-27  6:56 Haozhong Zhang [this message]
2018-01-02 16:02 ` [Qemu-devel] [PATCH] util/mmap-alloc: support MAP_SYNC in qemu_ram_mmap() Michael S. Tsirkin
2018-01-03  3:16   ` Haozhong Zhang
2018-01-03 13:45     ` Eduardo Habkost
2018-01-04  1:23       ` Haozhong Zhang
2018-01-04 11:57         ` Eduardo Habkost
