From: Stefan Hajnoczi <stefanha@redhat.com>
To: Yongji Xie <xieyongji@bytedance.com>
Cc: "qemu devel list" <qemu-devel@nongnu.org>,
	"Peter Lieven" <pl@kamp.de>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Daniel P. Berrangé" <berrange@redhat.com>,
	"Juan Quintela" <quintela@redhat.com>,
	qemu-block@nongnu.org, "Eduardo Habkost" <eduardo@habkost.net>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"David Woodhouse" <dwmw2@infradead.org>,
	"Stefan Weil" <sw@weilnetz.de>, "Fam Zheng" <fam@euphon.net>,
	"Julia Suvorova" <jusual@redhat.com>,
	"Ronnie Sahlberg" <ronniesahlberg@gmail.com>,
	xen-devel@lists.xenproject.org, "Hanna Reitz" <hreitz@redhat.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	eesposit@redhat.com, "Kevin Wolf" <kwolf@redhat.com>,
	"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
	"Stefano Stabellini" <sstabellini@kernel.org>,
	"Paul Durrant" <paul@xen.org>,
	"Aarushi Mehta" <mehta.aaru20@gmail.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Anthony Perard" <anthony.perard@citrix.com>,
	"Richard W.M. Jones" <rjones@redhat.com>,
	"Coiby Xu" <Coiby.Xu@gmail.com>,
	"Stefano Garzarella" <sgarzare@redhat.com>
Subject: Re: [PATCH v3 13/20] block/export: rewrite vduse-blk drain code
Date: Tue, 25 Apr 2023 12:42:41 -0400
Message-ID: <20230425164241.GC725672@fedora>
In-Reply-To: <CACycT3suSR+nYhe4z2zuocYsBBVSDBCE+614zT0jfDZCBRveaA@mail.gmail.com>

On Fri, Apr 21, 2023 at 11:36:02AM +0800, Yongji Xie wrote:
> Hi Stefan,
> 
> On Thu, Apr 20, 2023 at 7:39 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> >
> > vduse_blk_detach_ctx() waits for in-flight requests using
> > AIO_WAIT_WHILE(). This is not allowed according to a comment in
> > bdrv_set_aio_context_commit():
> >
> >   /*
> >    * Take the old AioContex when detaching it from bs.
> >    * At this point, new_context lock is already acquired, and we are now
> >    * also taking old_context. This is safe as long as bdrv_detach_aio_context
> >    * does not call AIO_POLL_WHILE().
> >    */
> >
> > Use this opportunity to rewrite the drain code in vduse-blk:
> >
> > - Use the BlockExport refcount so that vduse_blk_exp_delete() is only
> >   called when there are no more requests in flight.
> >
> > - Implement .drained_poll() so in-flight request coroutines are stopped
> >   by the time .bdrv_detach_aio_context() is called.
> >
> > - Remove AIO_WAIT_WHILE() from vduse_blk_detach_ctx() to solve the
> >   .bdrv_detach_aio_context() constraint violation. It's no longer
> >   needed due to the previous changes.
> >
> > - Always handle the VDUSE file descriptor, even in drained sections. The
> >   VDUSE file descriptor doesn't submit I/O, so it's safe to handle it in
> >   drained sections. This ensures that the VDUSE kernel code gets a fast
> >   response.
> >
> > - Suspend virtqueue fd handlers in .drained_begin() and resume them in
> >   .drained_end(). This eliminates the need for the
> >   aio_set_fd_handler(is_external=true) flag, which is being removed from
> >   QEMU.
> >
> > This is a long list but splitting it into individual commits would
> > probably lead to git bisect failures - the changes are all related.
> >
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> >  block/export/vduse-blk.c | 132 +++++++++++++++++++++++++++------------
> >  1 file changed, 93 insertions(+), 39 deletions(-)
> >
> > diff --git a/block/export/vduse-blk.c b/block/export/vduse-blk.c
> > index f7ae44e3ce..35dc8fcf45 100644
> > --- a/block/export/vduse-blk.c
> > +++ b/block/export/vduse-blk.c
> > @@ -31,7 +31,8 @@ typedef struct VduseBlkExport {
> >      VduseDev *dev;
> >      uint16_t num_queues;
> >      char *recon_file;
> > -    unsigned int inflight;
> > +    unsigned int inflight; /* atomic */
> > +    bool vqs_started;
> >  } VduseBlkExport;
> >
> >  typedef struct VduseBlkReq {
> > @@ -41,13 +42,24 @@ typedef struct VduseBlkReq {
> >
> >  static void vduse_blk_inflight_inc(VduseBlkExport *vblk_exp)
> >  {
> > -    vblk_exp->inflight++;
> > +    if (qatomic_fetch_inc(&vblk_exp->inflight) == 0) {
> 
> I wonder why we need to use atomic operations here.

The inflight counter is only modified by the vhost-user export thread,
but it may be read by another thread here:

  static bool vduse_blk_drained_poll(void *opaque)
  {
      BlockExport *exp = opaque;
      VduseBlkExport *vblk_exp = container_of(exp, VduseBlkExport, export);

      return qatomic_read(&vblk_exp->inflight) > 0;
  }

BlockDevOps->drained_poll() calls are invoked when BlockDriverStates are
drained (e.g. blk_drain_all() and related APIs).
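
To make the single-writer, cross-thread-reader pattern concrete, here is a
minimal standalone sketch. It is not the patch code: ToyCounter and the
toy_*() helpers are hypothetical names, and C11 <stdatomic.h> stands in for
QEMU's qatomic_*() helpers, whose memory-ordering details may differ.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for VduseBlkExport; only the counter is modeled. */
typedef struct {
    atomic_uint inflight;
} ToyCounter;

/* Called by the thread that processes requests. True on the 0 -> 1 edge. */
static bool toy_inflight_inc(ToyCounter *c)
{
    return atomic_fetch_add(&c->inflight, 1) == 0;
}

/* Called by the same thread on completion. True on the 1 -> 0 edge. */
static bool toy_inflight_dec(ToyCounter *c)
{
    unsigned int prev = atomic_fetch_sub(&c->inflight, 1);

    assert(prev > 0); /* a decrement without a matching increment is a bug */
    return prev == 1;
}

/* May be called from a different thread, hence the atomic load. */
static bool toy_drained_poll(ToyCounter *c)
{
    return atomic_load(&c->inflight) > 0;
}
```

The edge-detecting return values mirror the "== 0" check in the hunk above,
where the first in-flight request triggers extra work.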

> > @@ -355,13 +410,12 @@ static void vduse_blk_exp_delete(BlockExport *exp)
> >      g_free(vblk_exp->handler.serial);
> >  }
> >
> > +/* Called with exp->ctx acquired */
> >  static void vduse_blk_exp_request_shutdown(BlockExport *exp)
> >  {
> >      VduseBlkExport *vblk_exp = container_of(exp, VduseBlkExport, export);
> >
> > -    aio_context_acquire(vblk_exp->export.ctx);
> > -    vduse_blk_detach_ctx(vblk_exp);
> > -    aio_context_acquire(vblk_exp->export.ctx);
> > +    vduse_blk_stop_virtqueues(vblk_exp);
> 
> Can we add a AIO_WAIT_WHILE() here? Then we don't need to
> increase/decrease the BlockExport refcount during I/O processing.

I don't think so because vduse_blk_exp_request_shutdown() is not the
only place where we wait for requests to complete. We would still need
a way to wait for requests to finish (without calling AIO_WAIT_WHILE())
in vduse_blk_drained_poll().
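
As a rough illustration of why a non-blocking poll callback is needed, here
is a single-threaded toy model. All names (ToyDrainExport, toy_drain(),
toy_event_loop_step()) are hypothetical and the loop is a simplification,
not how QEMU's drain code is actually structured. The point is only that
the drainer owns the waiting loop and the export just reports its state.

```c
#include <stdbool.h>

/* Hypothetical export state: the number of requests not yet completed. */
typedef struct {
    unsigned int inflight;
} ToyDrainExport;

/* Non-blocking: only reports whether requests are still pending. */
static bool toy_export_drained_poll(void *opaque)
{
    ToyDrainExport *e = opaque;

    return e->inflight > 0;
}

/* Hypothetical event loop step: completes at most one pending request. */
static void toy_event_loop_step(ToyDrainExport *e)
{
    if (e->inflight > 0) {
        e->inflight--;
    }
}

/*
 * The drainer keeps running the event loop until the poll callback
 * reports that nothing is pending. Returns the number of iterations.
 */
static unsigned int toy_drain(ToyDrainExport *e, bool (*poll)(void *opaque))
{
    unsigned int steps = 0;

    while (poll(e)) {
        toy_event_loop_step(e);
        steps++;
    }
    return steps;
}
```

In this model the callback never waits on its own, so any code path that
needs to wait for in-flight requests can reuse the same check.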

Stefan

