From: Eduardo Habkost <ehabkost@redhat.com>
To: "Zhang, Yi" <yi.z.zhang@linux.intel.com>
Cc: xiaoguangrong.eric@gmail.com, stefanha@redhat.com,
pbonzini@redhat.com, pagupta@redhat.com,
yu.c.zhang@linux.intel.com, richardw.yang@linux.intel.com,
mst@redhat.com, imammedo@redhat.com, dan.j.williams@intel.com,
qemu-devel@nongnu.org,
Murilo Opsfelder Araujo <muriloo@linux.ibm.com>,
Greg Kurz <groug@kaod.org>,
David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [Qemu-devel] [PATCH V13 4/5] util/mmap-alloc: support MAP_SYNC in qemu_ram_mmap()
Date: Thu, 18 Apr 2019 19:05:16 -0300
Message-ID: <20190418220516.GP25134@habkost.net>
In-Reply-To: <5d07bf7e9a3e576f5a87e81d786e8886fb2bb551.1549555521.git.yi.z.zhang@linux.intel.com>

Hi,

I found out that this series missed QEMU 4.0 and I was going to
queue it for 4.1, but unfortunately this patch conflicts with:

  commit 2044c3e7116eeac0449dcb4a4130cc8f8b9310da
  Author: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
  Date:   Wed Jan 30 21:36:04 2019 -0200

      mmap-alloc: unfold qemu_ram_mmap()

      Unfold parts of qemu_ram_mmap() for the sake of understanding, moving
      declarations to the top, and keeping architecture-specifics in the
      ifdef-else blocks. No changes in the function behaviour.

      Give ptr and ptr1 meaningful names:

        ptr  -> guardptr : pointer to the PROT_NONE guard region
        ptr1 -> ptr      : pointer to the mapped memory returned to caller

      Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
      Reviewed-by: Greg Kurz <groug@kaod.org>
      Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

I'm queueing patches 1-3 into machine-next[1], so only patches 4
and 5 need to be refreshed and resubmitted.

[1] https://github.com/ehabkost/qemu.git machine-next

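For orientation while refreshing patch 4, here is a minimal standalone
sketch of the "try MAP_SYNC, then fall back" flow, written with the
guardptr/ptr naming from the commit above. It is not the rebased patch:
ram_mmap_sketch() is a hypothetical helper, the fallback macro values are
assumed to match the Linux UAPI headers on common architectures such as
x86-64, and the trimming of unused guard pages done by the real
qemu_ram_mmap() is left out.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Fallbacks for older libc headers (assumed values, see lead-in). */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x80000
#endif

/* Hypothetical helper for illustration; not QEMU's qemu_ram_mmap(). */
static void *ram_mmap_sketch(int fd, size_t size, size_t align,
                             bool shared, bool is_pmem)
{
    size_t total = size + align;
    int base_flags = MAP_FIXED | (fd == -1 ? MAP_ANONYMOUS : 0);
    int map_sync_flags = 0;
    void *guardptr, *ptr;
    size_t offset;

    assert(align && !(align & (align - 1)));          /* power of two */
    assert(align >= (size_t)sysconf(_SC_PAGESIZE));   /* at least a page */

    /* Reserve a PROT_NONE region big enough to align inside it. */
    guardptr = mmap(NULL, total, PROT_NONE,
                    MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
    if (guardptr == MAP_FAILED) {
        return MAP_FAILED;
    }

    /* Only a shared pmem mapping asks for MAP_SYNC. */
    if (shared && is_pmem) {
        map_sync_flags = MAP_SYNC | MAP_SHARED_VALIDATE;
    }

    offset = (((uintptr_t)guardptr + align - 1) & ~((uintptr_t)align - 1))
             - (uintptr_t)guardptr;

    ptr = mmap((char *)guardptr + offset, size, PROT_READ | PROT_WRITE,
               base_flags | (shared ? MAP_SHARED : MAP_PRIVATE) |
               map_sync_flags, fd, 0);

    if (ptr == MAP_FAILED && map_sync_flags) {
        if (errno == ENOTSUP) {
            fprintf(stderr, "warning: MAP_SYNC not supported for fd %d, "
                    "continuing without it\n", fd);
        }
        /* Retry without MAP_SYNC | MAP_SHARED_VALIDATE. */
        ptr = mmap((char *)guardptr + offset, size, PROT_READ | PROT_WRITE,
                   base_flags | MAP_SHARED, fd, 0);
    }

    if (ptr == MAP_FAILED) {
        munmap(guardptr, total);
        return MAP_FAILED;
    }
    return ptr;
}
```

A caller would pass the backend file descriptor (or -1 for anonymous
memory), e.g. ram_mmap_sketch(fd, size, align, true, true), and compare the
result against MAP_FAILED. On a backend without DAX support, the first
mmap() is expected to fail and the retry should give a plain MAP_SHARED
mapping.
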
On Fri, Feb 08, 2019 at 06:11:11PM +0800, Zhang, Yi wrote:
> From: Zhang Yi <yi.z.zhang@linux.intel.com>
>
> When a file supporting DAX is used as the vNVDIMM backend, also mmap it
> with the MAP_SYNC flag, which ensures that file system metadata is synced
> on each guest write to the backend file, without other QEMU actions
> (e.g., periodic fsync() by QEMU).
>
> Currently, we have the following possible use cases:
>
> 1. pmem=on is set, shared=on is set, MAP_SYNC supported:
>    a: backend is a DAX-supporting file.
>       - MAP_SYNC will be active.
>    b: backend is not a DAX-supporting file.
>       - mmap with MAP_SYNC fails; a warning is printed and the mapping
>         is retried without MAP_SYNC.
>
> 2. The remaining cases:
>    - we will never pass MAP_SYNC to mmap2
>
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> Signed-off-by: Zhang Yi <yi.z.zhang@linux.intel.com>
> ---
> util/mmap-alloc.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 44 insertions(+), 1 deletion(-)
>
> diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
> index 97bbeed..2f21efd 100644
> --- a/util/mmap-alloc.c
> +++ b/util/mmap-alloc.c
> @@ -10,6 +10,13 @@
> * later. See the COPYING file in the top-level directory.
> */
>
> +#ifdef CONFIG_LINUX
> +#include <linux/mman.h>
> +#else /* !CONFIG_LINUX */
> +#define MAP_SYNC 0x0
> +#define MAP_SHARED_VALIDATE 0x0
> +#endif /* CONFIG_LINUX */
> +
> #include "qemu/osdep.h"
> #include "qemu/mmap-alloc.h"
> #include "qemu/host-utils.h"
> @@ -101,6 +108,8 @@ void *qemu_ram_mmap(int fd,
> #else
> void *ptr = mmap(0, total, PROT_NONE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
> #endif
> + int mmap_flags;
> + int map_sync_flags = 0;
> size_t offset;
> void *ptr1;
>
> @@ -111,13 +120,47 @@ void *qemu_ram_mmap(int fd,
> assert(is_power_of_2(align));
> /* Always align to host page size */
> assert(align >= getpagesize());
> + mmap_flags = shared ? MAP_SHARED : MAP_PRIVATE;
> + if (shared && is_pmem) {
> + map_sync_flags = MAP_SYNC | MAP_SHARED_VALIDATE;
> + mmap_flags |= map_sync_flags;
> + }
>
> offset = QEMU_ALIGN_UP((uintptr_t)ptr, align) - (uintptr_t)ptr;
> ptr1 = mmap(ptr + offset, size, PROT_READ | PROT_WRITE,
> MAP_FIXED |
> (fd == -1 ? MAP_ANONYMOUS : 0) |
> - (shared ? MAP_SHARED : MAP_PRIVATE),
> + mmap_flags,
> fd, 0);
> +
> +
> + if (ptr1 == MAP_FAILED && map_sync_flags) {
> + if (errno == ENOTSUP) {
> + char *proc_link, *file_name;
> + int len;
> + proc_link = g_strdup_printf("/proc/self/fd/%d", fd);
> + file_name = g_malloc0(PATH_MAX);
> + len = readlink(proc_link, file_name, PATH_MAX - 1);
> + if (len < 0) {
> + len = 0;
> + }
> + file_name[len] = '\0';
> + fprintf(stderr, "Warning: requesting persistence across crashes "
> + "for backend file %s failed. Proceeding without "
> + "persistence, data might become corrupted in case of host "
> + "crash.\n", file_name);
> + g_free(proc_link);
> + g_free(file_name);
> + }
> + /* if map failed with MAP_SHARED_VALIDATE | MAP_SYNC,
> + * we will remove these flags to handle compatibility.
> + */
> + ptr1 = mmap(ptr + offset, size, PROT_READ | PROT_WRITE,
> + MAP_FIXED |
> + (fd == -1 ? MAP_ANONYMOUS : 0) |
> + MAP_SHARED,
> + fd, 0);
> + }
> if (ptr1 == MAP_FAILED) {
> munmap(ptr, total);
> return MAP_FAILED;
> --
> 2.7.4
>
>
--
Eduardo