qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Pankaj Gupta <pagupta@redhat.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: cohuck@redhat.com, Jan Kara <jack@suse.cz>,
	KVM list <kvm@vger.kernel.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Jason Wang <jasowang@redhat.com>, david <david@fromorbit.com>,
	Qemu Developers <qemu-devel@nongnu.org>,
	virtualization@lists.linux-foundation.org,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Ross Zwisler <zwisler@kernel.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Dave Jiang <dave.jiang@intel.com>,
	linux-nvdimm <linux-nvdimm@lists.01.org>,
	Vishal L Verma <vishal.l.verma@intel.com>,
	David Hildenbrand <david@redhat.com>,
	Matthew Wilcox <willy@infradead.org>,
	Christoph Hellwig <hch@infradead.org>,
	Linux ACPI <linux-acpi@vger.kernel.org>,
	jmoyer <jmoyer@redhat.com>,
	linux-ext4 <linux-ext4@vger.kernel.org>,
	Len Brown <lenb@kernel.org>,
	kilobyte@angband.pl, Rik van Riel <riel@surriel.com>,
	yuval shaia <yuval.shaia@oracle.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	lcapitulino@redhat.com, Kevin Wolf <kwolf@redhat.com>,
	Nitesh Narayan Lal <nilal@redhat.com>,
	Theodore Ts'o <tytso@mit.edu>,
	Xiao Guangrong <xiaoguangrong.eric@gmail.com>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	"Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-xfs <linux-xfs@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Igor Mammedov <imammedo@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v5 1/6] libnvdimm: nd_region flush callback support
Date: Thu, 11 Apr 2019 11:57:38 -0400 (EDT)	[thread overview]
Message-ID: <1463291806.21158433.1554998258746.JavaMail.zimbra@redhat.com> (raw)
In-Reply-To: <CAPcyv4i1FjBzfcWtzeCvj-giHa9xaX0dMQ==q4LZC_L8VEqkHg@mail.gmail.com>



> >
> > This patch adds functionality to perform flush from guest
> > to host over VIRTIO. We are registering a callback based
> > on 'nd_region' type. virtio_pmem driver requires this special
> > flush function. For rest of the region types we are registering
> > existing flush function. Report error returned by host fsync
> > failure to userspace.
> >
> > This also handles asynchronous flush requests from the block layer
> > by creating a child bio and chaining it with parent bio.
> >
> > Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
> > ---bio_chain Dan williams
> [..]
> > diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
> > index b4ef7d9ff22e..fb1041ab32a6 100644
> > --- a/drivers/nvdimm/region_devs.c
> > +++ b/drivers/nvdimm/region_devs.c
> > @@ -295,7 +295,9 @@ static ssize_t deep_flush_store(struct device *dev,
> > struct device_attribute *att
> >                 return rc;
> >         if (!flush)
> >                 return -EINVAL;
> > -       nvdimm_flush(nd_region);
> > +       rc = nvdimm_flush(nd_region, NULL, false);
> > +       if (rc)
> > +               return rc;
> >
> >         return len;
> >  }
> > @@ -1085,6 +1087,11 @@ static struct nd_region *nd_region_create(struct
> > nvdimm_bus *nvdimm_bus,
> >         dev->of_node = ndr_desc->of_node;
> >         nd_region->ndr_size = resource_size(ndr_desc->res);
> >         nd_region->ndr_start = ndr_desc->res->start;
> > +       if (ndr_desc->flush)
> > +               nd_region->flush = ndr_desc->flush;
> > +       else
> > +               nd_region->flush = generic_nvdimm_flush;
> > +
> >         nd_device_register(dev);
> >
> >         return nd_region;
> > @@ -1125,11 +1132,36 @@ struct nd_region
> > *nvdimm_volatile_region_create(struct nvdimm_bus *nvdimm_bus,
> >  }
> >  EXPORT_SYMBOL_GPL(nvdimm_volatile_region_create);
> >
> > +int nvdimm_flush(struct nd_region *nd_region, struct bio *bio, bool async)
> > +{
> 
> I don't quite see the point of the 'async' argument. All the usages of
> this routine are either
> 
> nvdimm_flush(nd_region, bio, true)
> ...or:
> nvdimm_flush(nd_region, NULL, false)

Agree.

> 
> ...so why not gate async behavior on the presence of the 'bio' argument?

Sure.

> 
> 
> > +       int rc = 0;
> > +
> > +       /* Create child bio for asynchronous flush and chain with
> > +        * parent bio. Otherwise directly call nd_region flush.
> > +        */
> > +       if (async && bio->bi_iter.bi_sector != -1) {
> > +
> > +               struct bio *child = bio_alloc(GFP_ATOMIC, 0);
> > +
> > +               if (!child)
> > +                       return -ENOMEM;
> > +               bio_copy_dev(child, bio);
> > +               child->bi_opf = REQ_PREFLUSH;
> > +               child->bi_iter.bi_sector = -1;
> > +               bio_chain(child, bio);
> > +               submit_bio(child);
> 
> I understand how this works, but it's a bit too "magical" for my
> taste. I would prefer that all flush implementations take an optional
> 'bio' argument rather than rely on the make_request implementation to
> stash the bio away on a driver specific list.

I did this to make use of "bio_chain" for chaining child bio for async flush
suggested [1]. Are you saying to remove this and just call "flush" based on 
bio argument? Or I implemented the 'bio_chain' request entirely wrong?

[1] https://lkml.org/lkml/2018/9/27/1028

> 
> > +       } else {
> > +               if (nd_region->flush(nd_region))
> > +                       rc = -EIO;
> 
> Given the common case wants to be fast and synchronous I think we
> should try to avoid retpoline overhead by default. So something like
> this:
> 
> if (nd_region->flush == generic_nvdimm_flush)
>     rc = generic_nvdimm_flush(...);

Sure.

Thanks,
Pankaj
> 
> 

WARNING: multiple messages have this Message-ID (diff)
From: Pankaj Gupta <pagupta@redhat.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Jan Kara <jack@suse.cz>, KVM list <kvm@vger.kernel.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Jason Wang <jasowang@redhat.com>, david <david@fromorbit.com>,
	Qemu Developers <qemu-devel@nongnu.org>,
	virtualization@lists.linux-foundation.org,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Ross Zwisler <zwisler@kernel.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Dave Jiang <dave.jiang@intel.com>,
	linux-nvdimm <linux-nvdimm@lists.01.org>,
	Vishal L Verma <vishal.l.verma@intel.com>,
	David Hildenbrand <david@redhat.com>,
	Matthew Wilcox <willy@infradead.org>,
	Christoph Hellwig <hch@infradead.org>,
	Linux ACPI <linux-acpi@vger.kernel.org>,
	jmoyer <jmoyer@redhat.com>,
	linux-ext4 <linux-ext4@vger.kernel.org>,
	Len Brown <lenb@kernel.org>,
	kilobyte@angband.pl, Rik van Riel <riel@surriel.com>,
	yuval shaia <yuval.shaia@oracle.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Igor Mammedov <imammedo@redhat.com>,
	lcapitulino@redhat.com, Kevin Wolf <kwolf@redhat.com>,
	Nitesh Narayan Lal <nilal@redhat.com>,
	Theodore Ts'o <tytso@mit.edu>,
	Xiao Guangrong <xiaoguangrong.eric@gmail.com>,
	cohuck@redhat.com, "Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-xfs <linux-xfs@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	"Darrick J. Wong" <darrick.wong@oracle.com>
Subject: Re: [Qemu-devel] [PATCH v5 1/6] libnvdimm: nd_region flush callback support
Date: Thu, 11 Apr 2019 11:57:38 -0400 (EDT)	[thread overview]
Message-ID: <1463291806.21158433.1554998258746.JavaMail.zimbra@redhat.com> (raw)
Message-ID: <20190411155738.AUjmT0UO9phwxpAj2x4ik2Uw5kkRzX__hR53apRI3zw@z> (raw)
In-Reply-To: <CAPcyv4i1FjBzfcWtzeCvj-giHa9xaX0dMQ==q4LZC_L8VEqkHg@mail.gmail.com>



> >
> > This patch adds functionality to perform flush from guest
> > to host over VIRTIO. We are registering a callback based
> > on 'nd_region' type. virtio_pmem driver requires this special
> > flush function. For rest of the region types we are registering
> > existing flush function. Report error returned by host fsync
> > failure to userspace.
> >
> > This also handles asynchronous flush requests from the block layer
> > by creating a child bio and chaining it with parent bio.
> >
> > Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
> > ---bio_chain Dan williams
> [..]
> > diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
> > index b4ef7d9ff22e..fb1041ab32a6 100644
> > --- a/drivers/nvdimm/region_devs.c
> > +++ b/drivers/nvdimm/region_devs.c
> > @@ -295,7 +295,9 @@ static ssize_t deep_flush_store(struct device *dev,
> > struct device_attribute *att
> >                 return rc;
> >         if (!flush)
> >                 return -EINVAL;
> > -       nvdimm_flush(nd_region);
> > +       rc = nvdimm_flush(nd_region, NULL, false);
> > +       if (rc)
> > +               return rc;
> >
> >         return len;
> >  }
> > @@ -1085,6 +1087,11 @@ static struct nd_region *nd_region_create(struct
> > nvdimm_bus *nvdimm_bus,
> >         dev->of_node = ndr_desc->of_node;
> >         nd_region->ndr_size = resource_size(ndr_desc->res);
> >         nd_region->ndr_start = ndr_desc->res->start;
> > +       if (ndr_desc->flush)
> > +               nd_region->flush = ndr_desc->flush;
> > +       else
> > +               nd_region->flush = generic_nvdimm_flush;
> > +
> >         nd_device_register(dev);
> >
> >         return nd_region;
> > @@ -1125,11 +1132,36 @@ struct nd_region
> > *nvdimm_volatile_region_create(struct nvdimm_bus *nvdimm_bus,
> >  }
> >  EXPORT_SYMBOL_GPL(nvdimm_volatile_region_create);
> >
> > +int nvdimm_flush(struct nd_region *nd_region, struct bio *bio, bool async)
> > +{
> 
> I don't quite see the point of the 'async' argument. All the usages of
> this routine are either
> 
> nvdimm_flush(nd_region, bio, true)
> ...or:
> nvdimm_flush(nd_region, NULL, false)

Agree.

> 
> ...so why not gate async behavior on the presence of the 'bio' argument?

Sure.

> 
> 
> > +       int rc = 0;
> > +
> > +       /* Create child bio for asynchronous flush and chain with
> > +        * parent bio. Otherwise directly call nd_region flush.
> > +        */
> > +       if (async && bio->bi_iter.bi_sector != -1) {
> > +
> > +               struct bio *child = bio_alloc(GFP_ATOMIC, 0);
> > +
> > +               if (!child)
> > +                       return -ENOMEM;
> > +               bio_copy_dev(child, bio);
> > +               child->bi_opf = REQ_PREFLUSH;
> > +               child->bi_iter.bi_sector = -1;
> > +               bio_chain(child, bio);
> > +               submit_bio(child);
> 
> I understand how this works, but it's a bit too "magical" for my
> taste. I would prefer that all flush implementations take an optional
> 'bio' argument rather than rely on the make_request implementation to
> stash the bio away on a driver specific list.

I did this to make use of "bio_chain" for chaining child bio for async flush
suggested [1]. Are you saying to remove this and just call "flush" based on 
bio argument? Or I implemented the 'bio_chain' request entirely wrong?

[1] https://lkml.org/lkml/2018/9/27/1028

> 
> > +       } else {
> > +               if (nd_region->flush(nd_region))
> > +                       rc = -EIO;
> 
> Given the common case wants to be fast and synchronous I think we
> should try to avoid retpoline overhead by default. So something like
> this:
> 
> if (nd_region->flush == generic_nvdimm_flush)
>     rc = generic_nvdimm_flush(...);

Sure.

Thanks,
Pankaj
> 
> 


  parent reply	other threads:[~2019-04-11 15:57 UTC|newest]

Thread overview: 78+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-10  4:08 [Qemu-devel] [PATCH v5 0/6] virtio pmem driver Pankaj Gupta
2019-04-10  4:08 ` Pankaj Gupta
2019-04-10  4:08 ` [Qemu-devel] [PATCH v5 1/6] libnvdimm: nd_region flush callback support Pankaj Gupta
2019-04-10  4:08   ` Pankaj Gupta
2019-04-11 14:51   ` Dan Williams
2019-04-11 14:51     ` Dan Williams
2019-04-11 15:57     ` Pankaj Gupta [this message]
2019-04-11 15:57       ` Pankaj Gupta
2019-04-11 16:09       ` Dan Williams
2019-04-11 16:09         ` Dan Williams
2019-04-11 16:23         ` Pankaj Gupta
2019-04-11 16:23           ` Pankaj Gupta
2019-04-12  8:32     ` Jan Kara
2019-04-12  8:32       ` Jan Kara
2019-04-12 13:12       ` Jeff Moyer
2019-04-12 13:12         ` Jeff Moyer
2019-04-18  6:27         ` Pankaj Gupta
2019-04-18  6:27           ` Pankaj Gupta
2019-04-18 16:05         ` Dan Williams
2019-04-18 16:05           ` Dan Williams
2019-04-18 16:10           ` Jeff Moyer
2019-04-18 16:10             ` Jeff Moyer
2019-04-18 16:18           ` Christoph Hellwig
2019-04-18 16:18             ` Christoph Hellwig
2019-04-18 18:14             ` Dan Williams
2019-04-18 18:14               ` Dan Williams
2019-04-22 15:51               ` Jeff Moyer
2019-04-22 15:51                 ` Jeff Moyer
2019-04-22 19:44                 ` Dan Williams
2019-04-22 19:44                   ` Dan Williams
2019-04-22 21:03                   ` Jeff Moyer
2019-04-22 21:03                     ` Jeff Moyer
2019-04-23  4:07                     ` Pankaj Gupta
2019-04-23  4:07                       ` Pankaj Gupta
2019-04-10  4:08 ` [Qemu-devel] [PATCH v5 2/5] virtio-pmem: Add virtio pmem driver Pankaj Gupta
2019-04-10  4:08   ` Pankaj Gupta
2019-04-10 12:24   ` Cornelia Huck
2019-04-10 12:24     ` Cornelia Huck
2019-04-10 14:38     ` Yuval Shaia
2019-04-10 14:38       ` Yuval Shaia
2019-04-10 15:44       ` Pankaj Gupta
2019-04-10 15:44         ` Pankaj Gupta
2019-04-10 15:38     ` Pankaj Gupta
2019-04-10 15:38       ` Pankaj Gupta
2019-04-10 13:12   ` Michael S. Tsirkin
2019-04-10 13:12     ` Michael S. Tsirkin
2019-04-10 14:03     ` Pankaj Gupta
2019-04-10 14:03       ` Pankaj Gupta
2019-04-10 14:31       ` Cornelia Huck
2019-04-10 14:31         ` Cornelia Huck
2019-04-10 16:46         ` Michael S. Tsirkin
2019-04-10 16:46           ` Michael S. Tsirkin
2019-04-10 16:52           ` Cornelia Huck
2019-04-10 16:52             ` Cornelia Huck
2019-04-10 14:41   ` Yuval Shaia
2019-04-10 14:41     ` Yuval Shaia
2019-04-10  4:08 ` [Qemu-devel] [PATCH v5 3/6] libnvdimm: add dax_dev sync flag Pankaj Gupta
2019-04-10  4:08   ` Pankaj Gupta
2019-04-10  8:28   ` Jan Kara
2019-04-10  8:28     ` Jan Kara
2019-04-10  8:38     ` Pankaj Gupta
2019-04-10  8:38       ` Pankaj Gupta
2019-04-11 14:56   ` Dan Williams
2019-04-11 14:56     ` Dan Williams
2019-04-11 15:39     ` Pankaj Gupta
2019-04-11 15:39       ` Pankaj Gupta
2019-04-10  4:08 ` [Qemu-devel] [PATCH v5 4/6] dax: check synchronous mapping is supported Pankaj Gupta
2019-04-10  4:08   ` Pankaj Gupta
2019-04-10  8:25   ` Jan Kara
2019-04-10  8:25     ` Jan Kara
2019-04-10  8:31     ` Pankaj Gupta
2019-04-10  8:31       ` Pankaj Gupta
2019-04-10  4:08 ` [Qemu-devel] [PATCH v5 5/6] ext4: disable map_sync for async flush Pankaj Gupta
2019-04-10  4:08   ` Pankaj Gupta
2019-04-10  4:08 ` [Qemu-devel] [PATCH v5 6/6] xfs: " Pankaj Gupta
2019-04-10  4:08   ` Pankaj Gupta
2019-04-10  8:08 ` [Qemu-devel] [PATCH v5 0/6] virtio pmem driver Arkadiusz Miśkiewicz
2019-04-10  8:08   ` Arkadiusz Miśkiewicz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1463291806.21158433.1554998258746.JavaMail.zimbra@redhat.com \
    --to=pagupta@redhat.com \
    --cc=aarcange@redhat.com \
    --cc=adilger.kernel@dilger.ca \
    --cc=cohuck@redhat.com \
    --cc=dan.j.williams@intel.com \
    --cc=darrick.wong@oracle.com \
    --cc=dave.jiang@intel.com \
    --cc=david@fromorbit.com \
    --cc=david@redhat.com \
    --cc=hch@infradead.org \
    --cc=imammedo@redhat.com \
    --cc=jack@suse.cz \
    --cc=jasowang@redhat.com \
    --cc=jmoyer@redhat.com \
    --cc=kilobyte@angband.pl \
    --cc=kvm@vger.kernel.org \
    --cc=kwolf@redhat.com \
    --cc=lcapitulino@redhat.com \
    --cc=lenb@kernel.org \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=nilal@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=riel@surriel.com \
    --cc=rjw@rjwysocki.net \
    --cc=stefanha@redhat.com \
    --cc=tytso@mit.edu \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=vishal.l.verma@intel.com \
    --cc=willy@infradead.org \
    --cc=xiaoguangrong.eric@gmail.com \
    --cc=yuval.shaia@oracle.com \
    --cc=zwisler@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).