From: Eduardo Habkost <ehabkost@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>,
qemu-devel@nongnu.org,
Xiao Guangrong <xiaoguangrong.eric@gmail.com>,
Igor Mammedov <imammedo@redhat.com>,
Stefan Hajnoczi <stefanha@redhat.com>,
Dan Williams <dan.j.williams@intel.com>
Subject: Re: [Qemu-devel] [PATCH] util/mmap-alloc: support MAP_SYNC in qemu_ram_mmap()
Date: Thu, 4 Jan 2018 09:57:07 -0200
Message-ID: <20180104115707.GC3143@localhost.localdomain>
In-Reply-To: <20180104012328.j4n2vuiq27rix3ss@hz-desktop>

On Thu, Jan 04, 2018 at 09:23:28AM +0800, Haozhong Zhang wrote:
> On 01/03/18 11:45 -0200, Eduardo Habkost wrote:
> > On Wed, Jan 03, 2018 at 11:16:39AM +0800, Haozhong Zhang wrote:
> > > On 01/02/18 18:02 +0200, Michael S. Tsirkin wrote:
> > > > On Wed, Dec 27, 2017 at 02:56:20PM +0800, Haozhong Zhang wrote:
> > > > > When a file supporting DAX is used as a vNVDIMM backend, also mmap'ing
> > > > > it with the MAP_SYNC flag guarantees the persistence of guest writes
> > > > > to the backend file without other QEMU actions (e.g., periodic fsync()
> > > > > by QEMU).
> > > > >
> > > > > By using MAP_SHARED_VALIDATE flag with MAP_SYNC, we can ensure mmap
> > > > > with MAP_SYNC fails if MAP_SYNC is not supported by the kernel or the
> > > > > backend file. On such failures, QEMU retries mmap without MAP_SYNC and
> > > > > MAP_SHARED_VALIDATE.
> > > > >
> > > > > Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> > > >
> > > > If users rely on MAP_SYNC then don't you need to fail allocation
> > > > if you can't use it?
> > >
> > > MAP_SYNC is supported since Linux kernel 4.15 and only needed for mmap
> > > files on nvdimm. qemu_ram_mmap() has no way to check whether its
> > > parameter 'fd' points to files on nvdimm, except by looking up
> > > sysfs. However, accessing sysfs may be denied by certain SELinux
> > > policies.
> > >
> > > The absence of MAP_SYNC should not affect the primary functionality of
> > > vNVDIMM when using files on host nvdimm as backend, except for the
> > > guarantee of write persistence in case of a qemu/host crash.
> > >
> > > We may check the kernel support of MAP_SYNC and the type of vNVDIMM
> > > backend in some management utility (e.g., libvirt?), and refuse to
> > > launch QEMU if MAP_SYNC is not supported while files on host NVDIMM
> > > are in use.
> >
> > Instead of making libvirt check if MAP_SYNC is supported and just
> > hope it won't fail, it would be safer to let libvirt tell QEMU
> > that MAP_SYNC must never fail.
>
> For example, add an option "sync" to memory-backend-file, and pass
> it to qemu_ram_mmap()?
Yes. It could be an OnOffAuto option: "auto" would make QEMU try
to use MAP_SYNC but not fail if it's unavailable, while "on" would
make QEMU fail unless MAP_SYNC is really enabled.
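
A minimal sketch of those semantics, only to make the three cases
concrete. The names SyncMode, SyncDecision and sync_decide() are
hypothetical and not existing QEMU code; "off" is assumed to mean
that MAP_SYNC is never requested.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tri-state for the proposed "sync" option. */
typedef enum { SYNC_OFF, SYNC_ON, SYNC_AUTO } SyncMode;

/* What the caller should end up doing with the mapping. */
typedef enum { DECIDE_PLAIN, DECIDE_SYNC, DECIDE_FAIL } SyncDecision;

/*
 * mode:        value of the (hypothetical) "sync" option
 * map_sync_ok: whether an mmap() attempt with MAP_SYNC succeeded
 */
static SyncDecision sync_decide(SyncMode mode, bool map_sync_ok)
{
    if (mode == SYNC_OFF) {
        return DECIDE_PLAIN;          /* MAP_SYNC is never requested */
    }
    assert(mode == SYNC_ON || mode == SYNC_AUTO);
    if (map_sync_ok) {
        return DECIDE_SYNC;           /* both "on" and "auto" use it */
    }
    /* "on" must not silently degrade; "auto" falls back quietly. */
    return mode == SYNC_ON ? DECIDE_FAIL : DECIDE_PLAIN;
}
```

With this split, a management layer that needs the persistence
guarantee would pass "on" and get a hard error instead of a silent
fallback.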
>
> >
> > However, it looks like kernel 4.14 won't even fail if MAP_SYNC is
> > specified. How exactly can userspace detect if MAP_SYNC is
> > really supported?
>
> Use MAP_SYNC with MAP_SHARED_VALIDATE (both introduced in the 4.15
> kernel). Linux kernel 4.15 and later validate whether MAP_SYNC is
> supported. Because MAP_SHARED_VALIDATE is defined as
> (MAP_SHARED | MAP_PRIVATE), such an mmap also always fails on older
> kernels, which do not support MAP_SYNC.
Nice.
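
For reference, a rough sketch of that probe-and-fallback as I
understand it. mmap_try_sync() is a hypothetical helper, not the
actual qemu_ram_mmap() change, and the fallback #define values are an
assumption taken from the generic Linux UAPI headers (MAP_SYNC may
differ on some architectures).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Fallbacks for libc headers that predate Linux 4.15 (assumed values). */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03    /* MAP_SHARED | MAP_PRIVATE */
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x80000
#endif

/*
 * Try a MAP_SYNC mapping first. MAP_SHARED_VALIDATE makes the kernel
 * reject flags it cannot honor, so success means MAP_SYNC is in effect.
 * On failure, retry as a plain MAP_SHARED mapping.
 * *used_sync is set to 1 if MAP_SYNC is active, 0 otherwise.
 */
static void *mmap_try_sync(int fd, size_t len, int *used_sync)
{
    void *p;

    assert(used_sync != NULL);

    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
    if (p != MAP_FAILED) {
        *used_sync = 1;
        return p;
    }

    /* Old kernel or non-DAX file: fall back without MAP_SYNC. */
    *used_sync = 0;
    return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}
```

A caller implementing "on" would treat used_sync == 0 as an error
rather than keep the fallback mapping.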
>
> If we agree to introduce an option "sync" or the like, we can do the
> above check in qemu_ram_mmap().
Sounds good to me. Thanks!
--
Eduardo