From: Pankaj Gupta <pagupta@redhat.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
KVM list <kvm@vger.kernel.org>,
Qemu Developers <qemu-devel@nongnu.org>,
linux-nvdimm <linux-nvdimm@ml01.01.org>,
Linux MM <linux-mm@kvack.org>, Jan Kara <jack@suse.cz>,
Stefan Hajnoczi <stefanha@redhat.com>,
Rik van Riel <riel@surriel.com>,
haozhong zhang <haozhong.zhang@intel.com>,
Nitesh Narayan Lal <nilal@redhat.com>,
Kevin Wolf <kwolf@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
ross zwisler <ross.zwisler@intel.com>,
David Hildenbrand <david@redhat.com>,
xiaoguangrong eric <xiaoguangrong.eric@gmail.com>,
Christoph Hellwig <hch@infradead.org>,
Marcel Apfelbaum <marcel@redhat.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
niteshnarayanlal@hotmail.com, Igor Mammedov <imammedo@redhat.com>,
lcapitulino@redhat.com
Subject: Re: [Qemu-devel] [RFC v2 2/2] pmem: device flush over VIRTIO
Date: Thu, 26 Apr 2018 13:13:44 -0400 (EDT)
Message-ID: <1302242642.23016855.1524762824836.JavaMail.zimbra@redhat.com>
In-Reply-To: <CAPcyv4jv-hJNKJxak98T7aCnWztVEDTE8o=8fjvOrVmrTfyjdA@mail.gmail.com>
> >
> >>
> >> On Wed, Apr 25, 2018 at 04:54:14PM +0530, Pankaj Gupta wrote:
> >> > This patch adds functionality to perform
> >> > flush from guest to host over VIRTIO
> >> > when 'ND_REGION_VIRTIO' flag is set on
> >> > nd_region. Flag is set by 'virtio-pmem'
> >> > driver.
> >> >
> >> > Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
> >> > ---
> >> > drivers/nvdimm/region_devs.c | 7 +++++++
> >> > 1 file changed, 7 insertions(+)
> >> >
> >> > diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
> >> > index a612be6..6c6454e 100644
> >> > --- a/drivers/nvdimm/region_devs.c
> >> > +++ b/drivers/nvdimm/region_devs.c
> >> > @@ -20,6 +20,7 @@
> >> > #include <linux/nd.h>
> >> > #include "nd-core.h"
> >> > #include "nd.h"
> >> > +#include <linux/virtio_pmem.h>
> >> >
> >> > /*
> >> > * For readq() and writeq() on 32-bit builds, the hi-lo, lo-hi order is
> >> > @@ -1074,6 +1075,12 @@ void nvdimm_flush(struct nd_region *nd_region)
> >> >  	struct nd_region_data *ndrd = dev_get_drvdata(&nd_region->dev);
> >> >  	int i, idx;
> >> >
> >> > +	/* call PV device flush */
> >> > +	if (test_bit(ND_REGION_VIRTIO, &nd_region->flags)) {
> >> > +		virtio_pmem_flush(&nd_region->dev);
> >> > +		return;
> >> > +	}
> >>
> >> How does libnvdimm know when flush has completed?
> >>
> >> Callers expect the flush to be finished when nvdimm_flush() returns but
> >> the virtio driver has only queued the request, it hasn't waited for
> >> completion!
> >
> > I tried to implement what nvdimm does right now. It just writes to
> > the flush hint address to make sure the data persists.
>
> nvdimm_flush() is currently expected to be synchronous. Currently it
> is sfence(); write to special address; sfence(). By the time the
> second sfence returns the data is flushed. So you would need to make
> this virtio flush interface synchronous as well, but stopping the
> guest for unbounded amounts of time appears problematic. Instead,
> you need to rework nvdimm_flush() and the pmem driver to make these
> flush requests asynchronous and add the plumbing for completion
> callbacks via bio_endio().
o.k.
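For reference, the synchronous pattern described above can be modelled in user space roughly as follows. This is only a sketch under stated assumptions: fake_flush_hint, store_fence() and model_nvdimm_flush() are made-up names, a C11 fence stands in for sfence(), and an ordinary variable stands in for the flush hint address. It is not the libnvdimm code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped flush hint address. */
static volatile uint64_t fake_flush_hint;

/* Counts how many model flushes have fully returned. */
static unsigned int flush_count;

/*
 * Stand-in for sfence(): a full C11 fence is used here as an
 * assumption, it is stronger than a store fence.
 */
void store_fence(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

/*
 * Models the synchronous sequence: fence; write to the special
 * address; fence. The caller may treat the flush as finished once
 * this function returns.
 */
void model_nvdimm_flush(void)
{
	store_fence();
	fake_flush_hint = 1;
	store_fence();
	flush_count++;
}
```

The point of the model is only that nothing is left pending when model_nvdimm_flush() returns, which is the property a queued-but-not-completed virtio request does not have.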
>
> > I just did not want to block guest write requests till host side
> > fsync completes.
>
> You must complete the flush before bio_endio(), otherwise you're
> violating the expectations of the guest filesystem/block-layer.
sure!
>
> >
> > be worse for operations on different guest files because all these
> > operations would ultimately happen on the same file on the host.
> >
> > I think that with the current approach we get an asynchronous queuing
> > mechanism, at the cost of not being 100% sure when the fsync will
> > complete, though it is assured that it will happen. Also, it is an
> > entire block flush.
>
> No, again, that's broken. We need to add the plumbing for
> communicating the fsync() completion relative to the WRITE_{FLUSH,FUA}
> bio in the guest.
Sure. Thanks Dan & Stefan for the explanation and review.
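A minimal user-space sketch of the ordering requirement discussed above, assuming a flush request that carries a completion callback. All names here (struct model_bio, struct flush_req, submit_flush(), host_flush_complete(), flush_done()) are hypothetical and only model the idea; the real plumbing would go through the block layer and bio_endio().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a guest bio that carries a flush. */
struct model_bio {
	bool ended;	/* set once the bio has been completed */
	int status;	/* 0 on success, negative value on error */
};

/* Hypothetical flush request queued to the host. */
struct flush_req {
	struct model_bio *bio;
	void (*done)(struct flush_req *req, int status);
	bool queued;
};

/* Completion callback: the only place where the bio is ended. */
void flush_done(struct flush_req *req, int status)
{
	req->bio->status = status;
	req->bio->ended = true;
	req->queued = false;
}

/* Queue the flush and return immediately; the bio stays pending. */
void submit_flush(struct flush_req *req, struct model_bio *bio)
{
	bio->ended = false;
	bio->status = 0;
	req->bio = bio;
	req->done = flush_done;
	req->queued = true;
}

/* Called when the host reports the result of its fsync(). */
void host_flush_complete(struct flush_req *req, int status)
{
	if (req->queued)
		req->done(req, status);
}
```

In this model the guest is not blocked while the host works, yet the bio is never reported as ended before host_flush_complete() runs, so the filesystem and block layer see completion only after the data is durable.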
Best regards,
Pankaj