All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@redhat.com>
To: Stefano Garzarella <sgarzare@redhat.com>
Cc: qemu-devel@nongnu.org, xen-devel@lists.xenproject.org,
	"Julia Suvorova" <jusual@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Stefano Stabellini" <sstabellini@kernel.org>,
	"Kevin Wolf" <kwolf@redhat.com>, "Fam Zheng" <fam@euphon.net>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	qemu-block@nongnu.org, "Paul Durrant" <paul@xen.org>,
	"Anthony Perard" <anthony.perard@citrix.com>,
	"Aarushi Mehta" <mehta.aaru20@gmail.com>,
	"Hanna Reitz" <hreitz@redhat.com>
Subject: Re: [PATCH 1/6] block: add blk_io_plug_call() API
Date: Tue, 23 May 2023 11:47:08 -0400	[thread overview]
Message-ID: <20230523154708.GB96478@fedora> (raw)
In-Reply-To: <mzxjz4d3ab3sq6grwsle6wlacysh2uffz42ojpdze3hmqimbr5@fxgkad47nnim>

[-- Attachment #1: Type: text/plain, Size: 7534 bytes --]

On Fri, May 19, 2023 at 10:45:57AM +0200, Stefano Garzarella wrote:
> On Wed, May 17, 2023 at 06:10:17PM -0400, Stefan Hajnoczi wrote:
> > Introduce a new API for thread-local blk_io_plug() that does not
> > traverse the block graph. The goal is to make blk_io_plug() multi-queue
> > friendly.
> > 
> > Instead of having block drivers track whether or not we're in a plugged
> > section, provide an API that allows them to defer a function call until
> > we're unplugged: blk_io_plug_call(fn, opaque). If blk_io_plug_call() is
> > called multiple times with the same fn/opaque pair, then fn() is only
> > called once at the end of the function - resulting in batching.
> > 
> > This patch introduces the API and changes blk_io_plug()/blk_io_unplug().
> > blk_io_plug()/blk_io_unplug() no longer require a BlockBackend argument
> > because the plug state is now thread-local.
> > 
> > Later patches convert block drivers to blk_io_plug_call() and then we
> > can finally remove .bdrv_co_io_plug() once all block drivers have been
> > converted.
> > 
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> > MAINTAINERS                       |   1 +
> > include/sysemu/block-backend-io.h |  13 +--
> > block/block-backend.c             |  22 -----
> > block/plug.c                      | 159 ++++++++++++++++++++++++++++++
> > hw/block/dataplane/xen-block.c    |   8 +-
> > hw/block/virtio-blk.c             |   4 +-
> > hw/scsi/virtio-scsi.c             |   6 +-
> > block/meson.build                 |   1 +
> > 8 files changed, 173 insertions(+), 41 deletions(-)
> > create mode 100644 block/plug.c
> > 
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 50585117a0..574202295c 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -2644,6 +2644,7 @@ F: util/aio-*.c
> > F: util/aio-*.h
> > F: util/fdmon-*.c
> > F: block/io.c
> > +F: block/plug.c
> > F: migration/block*
> > F: include/block/aio.h
> > F: include/block/aio-wait.h
> > diff --git a/include/sysemu/block-backend-io.h b/include/sysemu/block-backend-io.h
> > index d62a7ee773..be4dcef59d 100644
> > --- a/include/sysemu/block-backend-io.h
> > +++ b/include/sysemu/block-backend-io.h
> > @@ -100,16 +100,9 @@ void blk_iostatus_set_err(BlockBackend *blk, int error);
> > int blk_get_max_iov(BlockBackend *blk);
> > int blk_get_max_hw_iov(BlockBackend *blk);
> > 
> > -/*
> > - * blk_io_plug/unplug are thread-local operations. This means that multiple
> > - * IOThreads can simultaneously call plug/unplug, but the caller must ensure
> > - * that each unplug() is called in the same IOThread of the matching plug().
> > - */
> > -void coroutine_fn blk_co_io_plug(BlockBackend *blk);
> > -void co_wrapper blk_io_plug(BlockBackend *blk);
> > -
> > -void coroutine_fn blk_co_io_unplug(BlockBackend *blk);
> > -void co_wrapper blk_io_unplug(BlockBackend *blk);
> > +void blk_io_plug(void);
> > +void blk_io_unplug(void);
> > +void blk_io_plug_call(void (*fn)(void *), void *opaque);
> > 
> > AioContext *blk_get_aio_context(BlockBackend *blk);
> > BlockAcctStats *blk_get_stats(BlockBackend *blk);
> > diff --git a/block/block-backend.c b/block/block-backend.c
> > index ca537cd0ad..1f1d226ba6 100644
> > --- a/block/block-backend.c
> > +++ b/block/block-backend.c
> > @@ -2568,28 +2568,6 @@ void blk_add_insert_bs_notifier(BlockBackend *blk, Notifier *notify)
> >     notifier_list_add(&blk->insert_bs_notifiers, notify);
> > }
> > 
> > -void coroutine_fn blk_co_io_plug(BlockBackend *blk)
> > -{
> > -    BlockDriverState *bs = blk_bs(blk);
> > -    IO_CODE();
> > -    GRAPH_RDLOCK_GUARD();
> > -
> > -    if (bs) {
> > -        bdrv_co_io_plug(bs);
> > -    }
> > -}
> > -
> > -void coroutine_fn blk_co_io_unplug(BlockBackend *blk)
> > -{
> > -    BlockDriverState *bs = blk_bs(blk);
> > -    IO_CODE();
> > -    GRAPH_RDLOCK_GUARD();
> > -
> > -    if (bs) {
> > -        bdrv_co_io_unplug(bs);
> > -    }
> > -}
> > -
> > BlockAcctStats *blk_get_stats(BlockBackend *blk)
> > {
> >     IO_CODE();
> > diff --git a/block/plug.c b/block/plug.c
> > new file mode 100644
> > index 0000000000..6738a568ba
> > --- /dev/null
> > +++ b/block/plug.c
> > @@ -0,0 +1,159 @@
> > +/* SPDX-License-Identifier: GPL-2.0-or-later */
> > +/*
> > + * Block I/O plugging
> > + *
> > + * Copyright Red Hat.
> > + *
> > + * This API defers a function call within a blk_io_plug()/blk_io_unplug()
> > + * section, allowing multiple calls to batch up. This is a performance
> > + * optimization that is used in the block layer to submit several I/O requests
> > + * at once instead of individually:
> > + *
> > + *   blk_io_plug(); <-- start of plugged region
> > + *   ...
> > + *   blk_io_plug_call(my_func, my_obj); <-- deferred my_func(my_obj) call
> > + *   blk_io_plug_call(my_func, my_obj); <-- another
> > + *   blk_io_plug_call(my_func, my_obj); <-- another
> > + *   ...
> > + *   blk_io_unplug(); <-- end of plugged region, my_func(my_obj) is called once
> > + *
> > + * This code is actually generic and not tied to the block layer. If another
> > + * subsystem needs this functionality, it could be renamed.
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "qemu/coroutine-tls.h"
> > +#include "qemu/notify.h"
> > +#include "qemu/thread.h"
> > +#include "sysemu/block-backend.h"
> > +
> > +/* A function call that has been deferred until unplug() */
> > +typedef struct {
> > +    void (*fn)(void *);
> > +    void *opaque;
> > +} UnplugFn;
> > +
> > +/* Per-thread state */
> > +typedef struct {
> > +    unsigned count;       /* how many times has plug() been called? */
> > +    GArray *unplug_fns;   /* functions to call at unplug time */
> > +} Plug;
> > +
> > +/* Use get_ptr_plug() to fetch this thread-local value */
> > +QEMU_DEFINE_STATIC_CO_TLS(Plug, plug);
> > +
> > +/* Called at thread cleanup time */
> > +static void blk_io_plug_atexit(Notifier *n, void *value)
> > +{
> > +    Plug *plug = get_ptr_plug();
> > +    g_array_free(plug->unplug_fns, TRUE);
> > +}
> > +
> > +/* This won't involve coroutines, so use __thread */
> > +static __thread Notifier blk_io_plug_atexit_notifier;
> > +
> > +/**
> > + * blk_io_plug_call:
> > + * @fn: a function pointer to be invoked
> > + * @opaque: a user-defined argument to @fn()
> > + *
> > + * Call @fn(@opaque) immediately if not within a blk_io_plug()/blk_io_unplug()
> > + * section.
> 
> Just to understand better, what if two BlockDrivers share the same
> iothread but one calls blk_io_plug()/blk_io_unplug(), while the other
> calls this function not in a blk_io_plug()/blk_io_unplug() section?
> 
> If the call is in the middle of the other BlockDriver's section, it is
> deferred, right?

Yes, the call is deferred until blk_io_unplug().

> Is this situation possible?

One scenario I can think of is when aio_poll() is called between
plug/unplug. In that case, some I/O associated with device B might
happen while device A is with plug/unplug.

> Or should we prevent blk_io_plug_call() from being called out of a
> blk_io_plug()/blk_io_unplug() section?

blk_io_plug_call() is called outside blk_io_plug()/blk_io_unplug() when
device emulation doesn't use plug/unplug. For example, IDE doesn't use
it but still calls down into the same Linux AIO or io_uring code that
invokes blk_io_plug_call(). This is why blk_io_plug_call() calls fn()
immediately when invoked outside plug/unplug.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

  reply	other threads:[~2023-05-23 15:47 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-17 22:10 [PATCH 0/6] block: add blk_io_plug_call() API Stefan Hajnoczi
2023-05-17 22:10 ` [PATCH 1/6] " Stefan Hajnoczi
2023-05-19  0:04   ` Eric Blake
2023-05-23 15:47     ` Stefan Hajnoczi
2023-05-19  8:45   ` Stefano Garzarella
2023-05-23 15:47     ` Stefan Hajnoczi [this message]
2023-05-24  8:05       ` Stefano Garzarella
2023-05-17 22:10 ` [PATCH 2/6] block/nvme: convert to " Stefan Hajnoczi
2023-05-19  0:06   ` Eric Blake
2023-05-19  8:46   ` Stefano Garzarella
2023-05-23 15:47     ` Stefan Hajnoczi
2023-05-17 22:10 ` [PATCH 3/6] block/blkio: " Stefan Hajnoczi
2023-05-19  0:12   ` Eric Blake
2023-05-19  8:47   ` Stefano Garzarella
2023-05-23 15:48     ` Stefan Hajnoczi
2023-05-17 22:10 ` [PATCH 4/6] block/io_uring: " Stefan Hajnoczi
2023-05-19  0:18   ` Eric Blake
2023-05-23 15:48     ` Stefan Hajnoczi
2023-05-17 22:10 ` [PATCH 5/6] block/linux-aio: " Stefan Hajnoczi
2023-05-19  0:28   ` Eric Blake
2023-05-17 22:10 ` [PATCH 6/6] block: remove bdrv_co_io_plug() API Stefan Hajnoczi
2023-05-19  0:29   ` Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230523154708.GB96478@fedora \
    --to=stefanha@redhat.com \
    --cc=anthony.perard@citrix.com \
    --cc=fam@euphon.net \
    --cc=hreitz@redhat.com \
    --cc=jusual@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mehta.aaru20@gmail.com \
    --cc=mst@redhat.com \
    --cc=paul@xen.org \
    --cc=pbonzini@redhat.com \
    --cc=philmd@linaro.org \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=sgarzare@redhat.com \
    --cc=sstabellini@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.