From: Eduardo Habkost <ehabkost@redhat.com>
To: "Zhang, Yi" <yi.z.zhang@linux.intel.com>
Cc: xiaoguangrong.eric@gmail.com, stefanha@redhat.com,
	pbonzini@redhat.com, pagupta@redhat.com,
	yu.c.zhang@linux.intel.com, richardw.yang@linux.intel.com,
	mst@redhat.com, imammedo@redhat.com, dan.j.williams@intel.com,
	qemu-devel@nongnu.org,
	Murilo Opsfelder Araujo <muriloo@linux.ibm.com>,
	Greg Kurz <groug@kaod.org>,
	David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [Qemu-devel] [PATCH V13 4/5] util/mmap-alloc: support MAP_SYNC in qemu_ram_mmap()
Date: Thu, 18 Apr 2019 19:05:16 -0300
Message-ID: <20190418220516.GP25134@habkost.net>
In-Reply-To: <5d07bf7e9a3e576f5a87e81d786e8886fb2bb551.1549555521.git.yi.z.zhang@linux.intel.com>

Hi,

I found out that this series missed QEMU 4.0 and I was going to
queue it for 4.1, but unfortunately this patch conflicts with:

commit 2044c3e7116eeac0449dcb4a4130cc8f8b9310da
Author: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Date:   Wed Jan 30 21:36:04 2019 -0200

    mmap-alloc: unfold qemu_ram_mmap()
    
    Unfold parts of qemu_ram_mmap() for the sake of understanding, moving
    declarations to the top, and keeping architecture-specifics in the
    ifdef-else blocks.  No changes in the function behaviour.
    
    Give ptr and ptr1 meaningful names:
      ptr  -> guardptr : pointer to the PROT_NONE guard region
      ptr1 -> ptr      : pointer to the mapped memory returned to caller
    
    Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
    Reviewed-by: Greg Kurz <groug@kaod.org>
    Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

I'm queueing patches 1-3 into machine-next[1], so only patches 4
and 5 need to be refreshed and resubmitted.

[1] https://github.com/ehabkost/qemu.git machine-next
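
For anyone refreshing patch 4, here is a minimal sketch of the
guard-region pattern using the names from the commit above (guardptr
for the PROT_NONE reservation, ptr for the mapping returned to the
caller). It is not the QEMU code: aligned_anon_map(),
aligned_anon_unmap() and aligned_anon_map_demo() are hypothetical
names, the mapping is anonymous only, and the trimming of the unused
parts of the reservation is simplified.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper (not QEMU code): reserve a PROT_NONE guard region,
 * then place an aligned read/write mapping inside it. */
static void *aligned_anon_map(size_t size, size_t align)
{
    size_t pagesize = (size_t)sysconf(_SC_PAGESIZE);
    size_t total = size + align;
    size_t offset;
    void *guardptr;
    void *ptr;

    assert(align >= pagesize && (align & (align - 1)) == 0);
    assert(size > 0 && size % pagesize == 0);

    /* guardptr: inaccessible reservation, large enough to align inside. */
    guardptr = mmap(NULL, total, PROT_NONE,
                    MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
    if (guardptr == MAP_FAILED) {
        return MAP_FAILED;
    }

    /* Distance from guardptr up to the next multiple of align. */
    offset = (align - ((uintptr_t)guardptr & (align - 1))) & (align - 1);

    /* ptr: the usable mapping, placed over part of the reservation. */
    ptr = mmap((char *)guardptr + offset, size, PROT_READ | PROT_WRITE,
               MAP_FIXED | MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
    if (ptr == MAP_FAILED) {
        munmap(guardptr, total);
        return MAP_FAILED;
    }

    /* Drop the unused head, and keep one PROT_NONE page after the end. */
    if (offset > 0) {
        munmap(guardptr, offset);
    }
    total -= offset;
    if (total > size + pagesize) {
        munmap((char *)ptr + size + pagesize, total - size - pagesize);
    }
    return ptr;
}

/* Release the mapping together with its trailing guard page. */
static void aligned_anon_unmap(void *ptr, size_t size)
{
    munmap(ptr, size + (size_t)sysconf(_SC_PAGESIZE));
}

/* Usage: map four pages aligned to 2 MiB, touch them, unmap. */
static int aligned_anon_map_demo(void)
{
    size_t size = 4 * (size_t)sysconf(_SC_PAGESIZE);
    size_t align = (size_t)2 << 20;
    unsigned char *ptr = aligned_anon_map(size, align);
    int ok;

    if (ptr == MAP_FAILED) {
        return -1;
    }
    if ((uintptr_t)ptr % align != 0) {
        aligned_anon_unmap(ptr, size);
        return -1;
    }
    memset(ptr, 0xab, size);
    ok = (ptr[size - 1] == 0xab);
    aligned_anon_unmap(ptr, size);
    return ok ? 0 : -1;
}
```

In the refreshed patch the MAP_SYNC flags would presumably apply to
the second mmap() call (the one producing ptr), not to the guard
reservation.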


On Fri, Feb 08, 2019 at 06:11:11PM +0800, Zhang, Yi wrote:
> From: Zhang Yi <yi.z.zhang@linux.intel.com>
> 
> When a file supporting DAX is used as a vNVDIMM backend, also mmap it
> with the MAP_SYNC flag. This ensures that file system metadata is
> synced on each guest write to the backend file, without other QEMU
> actions (e.g., periodic fsync() by QEMU).
> 
> Currently, we have the following possible use cases:
> 
> 1. pmem=on is set, shared=on is set, MAP_SYNC supported:
>    a: backend is a file that supports DAX.
>     - MAP_SYNC will be active.
>    b: backend is a file that does not support DAX.
>     - mmap will trigger a warning, then the MAP_SYNC flag will be ignored.
> 
> 2. All other cases:
>    - we will never pass MAP_SYNC to mmap2
> 
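
The cases above can be sketched in isolation as "try MAP_SYNC, fall
back to plain MAP_SHARED". This is only an illustration, not the
patch itself: map_shared_try_sync() and map_shared_try_sync_demo()
are hypothetical names, there is no guard region or MAP_FIXED, and the
fallback flag values are assumed to be the generic Linux UAPI ones for
libc headers that lack them.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#if defined(__linux__)
/* Assumed generic Linux UAPI values, used only if libc lacks them. */
# ifndef MAP_SHARED_VALIDATE
#  define MAP_SHARED_VALIDATE 0x03
# endif
# ifndef MAP_SYNC
#  define MAP_SYNC 0x80000
# endif
#else
/* Same idea as the patch: make the flags no-ops on other hosts. */
# ifndef MAP_SHARED_VALIDATE
#  define MAP_SHARED_VALIDATE 0x0
# endif
# ifndef MAP_SYNC
#  define MAP_SYNC 0x0
# endif
#endif

/* Hypothetical helper: map fd shared, with MAP_SYNC when possible.
 * *got_sync is set to 1 only if the MAP_SYNC mapping succeeded. */
static void *map_shared_try_sync(int fd, size_t size, int want_sync,
                                 int *got_sync)
{
    void *ptr;

    assert(size > 0);
    *got_sync = 0;

    if (want_sync && MAP_SYNC != 0) {
        /* MAP_SHARED_VALIDATE makes the kernel reject unknown flags. */
        ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
        if (ptr != MAP_FAILED) {
            *got_sync = 1;
            return ptr;
        }
        /* EOPNOTSUPP has the same value as ENOTSUP on Linux. */
        if (errno == EOPNOTSUPP) {
            fprintf(stderr, "warning: MAP_SYNC not supported for fd %d, "
                    "continuing without it\n", fd);
        }
    }

    /* Case 2, or fallback for case 1b: plain shared mapping. */
    return mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}

/* Usage: map a one-page temporary file and write to it. */
static int map_shared_try_sync_demo(void)
{
    char path[] = "/tmp/map-sync-demo-XXXXXX";
    size_t size = (size_t)sysconf(_SC_PAGESIZE);
    int got_sync = 0;
    int rc = -1;
    int fd = mkstemp(path);

    if (fd < 0) {
        return -1;
    }
    unlink(path);
    if (ftruncate(fd, (off_t)size) == 0) {
        char *ptr = map_shared_try_sync(fd, size, 1, &got_sync);
        if (ptr != MAP_FAILED) {
            ptr[0] = 'x';
            rc = (ptr[0] == 'x') ? 0 : -1;
            munmap(ptr, size);
        }
    }
    close(fd);
    return rc;
}
```

On a file system without DAX the first mmap() is expected to fail and
the helper returns the plain MAP_SHARED mapping with *got_sync left
at 0.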
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> Signed-off-by: Zhang Yi <yi.z.zhang@linux.intel.com>
> ---
>  util/mmap-alloc.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 44 insertions(+), 1 deletion(-)
> 
> diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
> index 97bbeed..2f21efd 100644
> --- a/util/mmap-alloc.c
> +++ b/util/mmap-alloc.c
> @@ -10,6 +10,13 @@
>   * later.  See the COPYING file in the top-level directory.
>   */
>  
> +#ifdef CONFIG_LINUX
> +#include <linux/mman.h>
> +#else  /* !CONFIG_LINUX */
> +#define MAP_SYNC              0x0
> +#define MAP_SHARED_VALIDATE   0x0
> +#endif /* CONFIG_LINUX */
> +
>  #include "qemu/osdep.h"
>  #include "qemu/mmap-alloc.h"
>  #include "qemu/host-utils.h"
> @@ -101,6 +108,8 @@ void *qemu_ram_mmap(int fd,
>  #else
>      void *ptr = mmap(0, total, PROT_NONE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
>  #endif
> +    int mmap_flags;
> +    int map_sync_flags = 0;
>      size_t offset;
>      void *ptr1;
>  
> @@ -111,13 +120,47 @@ void *qemu_ram_mmap(int fd,
>      assert(is_power_of_2(align));
>      /* Always align to host page size */
>      assert(align >= getpagesize());
> +    mmap_flags = shared ? MAP_SHARED : MAP_PRIVATE;
> +    if (shared && is_pmem) {
> +        map_sync_flags = MAP_SYNC | MAP_SHARED_VALIDATE;
> +        mmap_flags |= map_sync_flags;
> +    }
>  
>      offset = QEMU_ALIGN_UP((uintptr_t)ptr, align) - (uintptr_t)ptr;
>      ptr1 = mmap(ptr + offset, size, PROT_READ | PROT_WRITE,
>                  MAP_FIXED |
>                  (fd == -1 ? MAP_ANONYMOUS : 0) |
> -                (shared ? MAP_SHARED : MAP_PRIVATE),
> +                mmap_flags,
>                  fd, 0);
> +
> +
> +    if (ptr1 == MAP_FAILED && map_sync_flags) {
> +        if (errno == ENOTSUP) {
> +            char *proc_link, *file_name;
> +            int len;
> +            proc_link = g_strdup_printf("/proc/self/fd/%d", fd);
> +            file_name = g_malloc0(PATH_MAX);
> +            len = readlink(proc_link, file_name, PATH_MAX - 1);
> +            if (len < 0) {
> +                len = 0;
> +            }
> +            file_name[len] = '\0';
> +            fprintf(stderr, "Warning: requesting persistence across crashes "
> +                    "for backend file %s failed. Proceeding without "
> +                    "persistence, data might become corrupted in case of host "
> +                    "crash.\n", file_name);
> +            g_free(proc_link);
> +            g_free(file_name);
> +        }
> +        /* if map failed with MAP_SHARED_VALIDATE | MAP_SYNC,
> +         * we will remove these flags to handle compatibility.
> +         */
> +        ptr1 = mmap(ptr + offset, size, PROT_READ | PROT_WRITE,
> +                    MAP_FIXED |
> +                    (fd == -1 ? MAP_ANONYMOUS : 0) |
> +                    MAP_SHARED,
> +                    fd, 0);
> +    }
>      if (ptr1 == MAP_FAILED) {
>          munmap(ptr, total);
>          return MAP_FAILED;
> -- 
> 2.7.4
> 
> 

-- 
Eduardo
