From: Haozhong Zhang <haozhong.zhang@intel.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: qemu-devel@nongnu.org, Eduardo Habkost <ehabkost@redhat.com>,
Igor Mammedov <imammedo@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
dgilbert@redhat.com,
Xiao Guangrong <xiaoguangrong.eric@gmail.com>,
Stefan Hajnoczi <stefanha@redhat.com>,
Dan Williams <dan.j.williams@intel.com>
Subject: Re: [Qemu-devel] [PATCH v3 3/3] hostmem-file: add 'sync' option
Date: Thu, 25 Jan 2018 08:24:35 +0800
Message-ID: <20180125002435.r3gjfo765obx3bv2@hz-desktop>
In-Reply-To: <20180124222235-mutt-send-email-mst@kernel.org>
On 01/24/18 22:23 +0200, Michael S. Tsirkin wrote:
> On Wed, Jan 17, 2018 at 04:13:25PM +0800, Haozhong Zhang wrote:
> > This option controls whether QEMU mmap(2) the memory backend file with
> > MAP_SYNC flag, which can fully guarantee the guest write persistence
> > to the backend, if MAP_SYNC flag is supported by the host kernel
> > (Linux kernel 4.15 and later) and the backend is a file supporting
> > DAX (e.g., file on ext4/xfs file system mounted with '-o dax').
> >
> > It can take one of following values:
> > - on: try to pass MAP_SYNC to mmap(2); if MAP_SYNC is not supported or
> > 'share=off', QEMU will abort
> > - off: never pass MAP_SYNC to mmap(2)
> > - auto (default): if MAP_SYNC is supported and 'share=on', work as if
> > 'sync=on'; otherwise, work as if 'sync=off'
> >
> > Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> > Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
[..]
> >
> > @table @option
> >
> > -@item -object memory-backend-file,id=@var{id},size=@var{size},mem-path=@var{dir},share=@var{on|off},discard-data=@var{on|off},merge=@var{on|off},dump=@var{on|off},prealloc=@var{on|off},host-nodes=@var{host-nodes},policy=@var{default|preferred|bind|interleave},align=@var{align}
> > +@item -object memory-backend-file,id=@var{id},size=@var{size},mem-path=@var{dir},share=@var{on|off},discard-data=@var{on|off},merge=@var{on|off},dump=@var{on|off},prealloc=@var{on|off},host-nodes=@var{host-nodes},policy=@var{default|preferred|bind|interleave},align=@var{align},sync=@var{on|off|auto}
> >
> > Creates a memory file backend object, which can be used to back
> > the guest RAM with huge pages.
> > @@ -4034,6 +4034,25 @@ requires an alignment different than the default one used by QEMU, eg
> > the device DAX /dev/dax0.0 requires 2M alignment rather than 4K. In
> > such cases, users can specify the required alignment via this option.
> >
> > +The @option{sync} option specifies whether QEMU mmap(2) @option{mem-path}
> > +with MAP_SYNC flag, which can fully guarantee the guest write
> > +persistence to @option{mem-path}.
>
> I would add ... even in case of a host power loss.
> Here and wherever you say "fully".
Without MAP_SYNC, QEMU can only guarantee that the guest data is written to
the host NVDIMM after, for example, guest clwb+sfence. However, if some
host file system metadata of the mapped file has not been written back to
the host NVDIMM when a host power failure happens, the mapped file may be
broken even though all of its data may still be there.

Anyway, I'll remove the confusing word "fully" and add your suggestion.
Thanks,
Haozhong
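
For readers less familiar with the flag, here is a minimal sketch of a MAP_SYNC mapping attempt at the mmap(2) level. It is not the code from this series: the helper name map_shared_try_sync() and its fallback policy are assumptions made for illustration. It relies on two documented properties of mmap(2): MAP_SYNC is only honoured together with MAP_SHARED_VALIDATE, and the call fails (rather than silently dropping the flag) when the file does not support it.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Fallback definitions for older libc headers. The values match the Linux
 * UAPI on x86; other architectures should be checked before reuse. */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x80000
#endif

/*
 * Hypothetical helper (not QEMU's qemu_ram_mmap()): map `size` bytes of `fd`
 * as a shared mapping, asking for MAP_SYNC first. On return, *synced is 1 if
 * the MAP_SYNC mapping succeeded and 0 if the plain MAP_SHARED fallback was
 * attempted instead. Returns MAP_FAILED if both attempts fail.
 */
static void *map_shared_try_sync(int fd, size_t size, int *synced)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
    if (p != MAP_FAILED) {
        *synced = 1;
        return p;
    }

    /* Typical failures: EOPNOTSUPP (file is not on DAX) or EINVAL
     * (kernel predates MAP_SHARED_VALIDATE). Retry without MAP_SYNC. */
    *synced = 0;
    return mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}
```

On a file that is not DAX-capable, the first mmap() is expected to fail and the helper returns an ordinary shared mapping with *synced set to 0, which corresponds to the situation described above where file system metadata may lag behind the data.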
>
> > MAP_SYNC requires supports from both
> > +the host kernel (since Linux kernel 4.15) and @option{mem-path} (only
> > +files supporting DAX). It can take one of following values:
> > +
> > +@table @option
> > +@item @var{on}
> > +try to pass MAP_SYNC to mmap(2); if MAP_SYNC is not supported or
> > +@option{share}=@var{off}, QEMU will abort
> > +
> > +@item @var{off}
> > +never pass MAP_SYNC to mmap(2)
> > +
> > +@item @var{auto} (default)
> > +if MAP_SYNC is supported and @option{share}=@var{on}, work as if
> > +@option{sync}=@var{on}; otherwise, work as if @option{sync}=@var{off}
> > +@end table
> > +
> > @item -object memory-backend-ram,id=@var{id},merge=@var{on|off},dump=@var{on|off},prealloc=@var{on|off},size=@var{size},host-nodes=@var{host-nodes},policy=@var{default|preferred|bind|interleave}
> >
> > Creates a memory backend object, which can be used to back the guest RAM.
> > --
> > 2.14.1
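
The on/off/auto semantics documented in the quoted hunk can be sketched as a small decision function. This is only an illustration of the documented behaviour, not code from the patch: the enum SyncMode, the function resolve_sync() and its return convention are hypothetical names chosen here.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not taken from the patch. */
typedef enum { SYNC_OFF, SYNC_ON, SYNC_AUTO } SyncMode;

/*
 * Returns 1 if MAP_SYNC should be passed to mmap(2), 0 if it should not,
 * and -1 for the case in which the documentation says QEMU will abort.
 */
static int resolve_sync(SyncMode mode, bool share, bool map_sync_supported)
{
    /* MAP_SYNC is only usable on a shared mapping with host support. */
    bool usable = share && map_sync_supported;

    switch (mode) {
    case SYNC_ON:
        return usable ? 1 : -1;   /* explicit request that cannot be met */
    case SYNC_OFF:
        return 0;                 /* never pass MAP_SYNC */
    case SYNC_AUTO:
        return usable ? 1 : 0;    /* best effort, silent fallback */
    }
    return -1;                    /* unreachable for valid enum values */
}
```

For example, resolve_sync(SYNC_AUTO, false, true) yields 0 because 'share=off' rules out MAP_SYNC, while resolve_sync(SYNC_ON, false, true) yields -1 for the same reason.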