qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Cc: "Fam Zheng" <fam@euphon.net>,
	"Vladimir Sementsov-Ogievskiy" <vsementsov@virtuozzo.com>,
	"Daniel P. Berrangé" <berrange@redhat.com>,
	"Eduardo Habkost" <ehabkost@redhat.com>,
	qemu-block@nongnu.org, "Juan Quintela" <quintela@redhat.com>,
	qemu-devel@nongnu.org, "John Snow" <jsnow@redhat.com>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"Markus Armbruster" <armbru@redhat.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	"Hanna Reitz" <hreitz@redhat.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Eric Blake" <eblake@redhat.com>
Subject: Re: [PATCH v5 28/31] block.c: assert BQL lock held in bdrv_co_invalidate_cache
Date: Wed, 19 Jan 2022 19:34:42 +0100	[thread overview]
Message-ID: <YehZwkGVPK6phrc2@redhat.com> (raw)
In-Reply-To: <b1e190b3-7466-0d4e-554b-bd9d4ce5a43e@redhat.com>

Am 20.12.2021 um 13:20 hat Emanuele Giuseppe Esposito geschrieben:
> 
> 
> On 17/12/2021 17:38, Emanuele Giuseppe Esposito wrote:
> > 
> > 
> > On 17/12/2021 12:04, Hanna Reitz wrote:
> > > On 24.11.21 07:44, Emanuele Giuseppe Esposito wrote:
> > > > bdrv_co_invalidate_cache is special: it is an I/O function,
> > > 
> > > I still don’t believe it is, but well.
> > > 
> > > (Yes, it is called by a test in an iothread, but I believe we’ve
> > > seen that the tests simply sometimes test things that shouldn’t be
> > > allowed.)
> > > 
> > > > but uses the block layer permission API, which is GS.
> > > > 
> > > > Because of this, we can assert that either the function is
> > > > being called with BQL held, and thus can use the permission API,
> > > > or make sure that the permission API is not used, by ensuring that
> > > > bs (and parents) .open_flags does not contain BDRV_O_INACTIVE.
> > > > 
> > > > Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> > > > ---
> > > >   block.c | 26 ++++++++++++++++++++++++++
> > > >   1 file changed, 26 insertions(+)
> > > > 
> > > > diff --git a/block.c b/block.c
> > > > index a0309f827d..805974676b 100644
> > > > --- a/block.c
> > > > +++ b/block.c
> > > > @@ -6574,6 +6574,26 @@ void bdrv_init_with_whitelist(void)
> > > >       bdrv_init();
> > > >   }
> > > > +static bool bdrv_is_active(BlockDriverState *bs)
> > > > +{
> > > > +    BdrvChild *parent;
> > > > +
> > > > +    if (bs->open_flags & BDRV_O_INACTIVE) {
> > > > +        return false;
> > > > +    }
> > > > +
> > > > +    QLIST_FOREACH(parent, &bs->parents, next_parent) {
> > > > +        if (parent->klass->parent_is_bds) {
> > > > +            BlockDriverState *parent_bs = parent->opaque;
> > > 
> > > This looks like a really bad hack to me.  We purposefully have made
> > > the parent link opaque so that a BDS cannot easily reach its
> > > parents.  All accesses should go through BdrvChildClass methods.
> > > 
> > > I also don’t understand why we need to query parents at all.  The
> > > only fact that determines whether the current BDS will have its
> > > permissions changed is whether the BDS itself is active or
> > > inactive.  Sure, we’ll invoke bdrv_co_invalidate_cache() on the
> > > parents, too, but then we could simply let the assertion fail there.
> > > 
> > > > +            if (!bdrv_is_active(parent_bs)) {
> > > > +                return false;
> > > > +            }
> > > > +        }
> > > > +    }
> > > > +
> > > > +   return true;
> > > > +}
> > > > +
> > > >   int coroutine_fn bdrv_co_invalidate_cache(BlockDriverState
> > > > *bs, Error **errp)
> > > >   {
> > > >       BdrvChild *child, *parent;
> > > > @@ -6585,6 +6605,12 @@ int coroutine_fn
> > > > bdrv_co_invalidate_cache(BlockDriverState *bs, Error **errp)
> > > >           return -ENOMEDIUM;
> > > >       }
> > > > +    /*
> > > > +     * No need to muck with permissions if bs is active.
> > > > +     * TODO: should activation be a separate function?
> > > > +     */
> > > > +    assert(qemu_in_main_thread() || bdrv_is_active(bs));
> > > > +
> > > 
> > > I don’t understand this, really.  It looks to me like “if you don’t
> > > call this in the main thread, this better be a no-op”, i.e., you
> > > must never call this function in an I/O thread if you really want to
> > > use it.  I.e. what I’d classify as a GS function.
> > > 
> > > It sounds like this is just a special case for said test, and
> > > special-casing code for tests sounds like a bad idea.
> > 
> > Ok, but trying to leave just the qemu_in_main_thread() assertion makes
> > test 307 (./check 307) fail.
> > I am actually not sure on why it fails, but I am sure it is because of
> > the assertion, since without it it passes.
> > 
> > I tried with gdb (./check -gdb 307 on one terminal and
> > gdb -iex "target remote localhost:12345"
> > in another) but it points me to this below, which I think is the ndb
> > server getting the socket closed (because on the other side it crashed),
> > and not the actual error.
> > 
> > 
> > Thread 1 "qemu-system-x86" received signal SIGPIPE, Broken pipe.
> > 0x00007ffff68af54d in sendmsg () from target:/lib64/libc.so.6
> > (gdb) bt
> > #0  0x00007ffff68af54d in sendmsg () from target:/lib64/libc.so.6
> > #1  0x0000555555c13cc9 in qio_channel_socket_writev (ioc=<optimized
> > out>, iov=0x5555569a4870, niov=1, fds=0x0, nfds=<optimized out>,
> > errp=0x0)
> >      at ../io/channel-socket.c:561
> > #2  0x0000555555c19b18 in qio_channel_writev_full_all
> > (ioc=0x55555763b800, iov=iov@entry=0x7fffe8dffd80, niov=niov@entry=1,
> > fds=fds@entry=0x0,
> >      nfds=nfds@entry=0, errp=errp@entry=0x0) at ../io/channel.c:240
> > #3  0x0000555555c19bd2 in qio_channel_writev_all (errp=0x0, niov=1,
> > iov=0x7fffe8dffd80, ioc=<optimized out>) at ../io/channel.c:220
> > #4  qio_channel_write_all (ioc=<optimized out>,
> > buf=buf@entry=0x7fffe8dffdd0 "", buflen=buflen@entry=20,
> > errp=errp@entry=0x0) at ../io/channel.c:330
> > #5  0x0000555555c27e75 in nbd_write (errp=0x0, size=20,
> > buffer=0x7fffe8dffdd0, ioc=<optimized out>) at ../nbd/nbd-internal.h:71
> > #6  nbd_negotiate_send_rep_len (client=client@entry=0x555556f60930,
> > type=type@entry=1, len=len@entry=0, errp=errp@entry=0x0) at
> > ../nbd/server.c:203
> > #7  0x0000555555c29db1 in nbd_negotiate_send_rep (errp=0x0, type=1,
> > client=0x555556f60930) at ../nbd/server.c:211
> > --Type <RET> for more, q to quit, c to continue without paging--
> > #8  nbd_negotiate_options (errp=0x7fffe8dffe88, client=<optimized out>)
> > at ../nbd/server.c:1224
> > #9  nbd_negotiate (errp=0x7fffe8dffe88, client=<optimized out>) at
> > ../nbd/server.c:1340
> > #10 nbd_co_client_start (opaque=<optimized out>) at ../nbd/server.c:2715
> > #11 0x0000555555d70423 in coroutine_trampoline (i0=<optimized out>,
> > i1=<optimized out>) at ../util/coroutine-ucontext.c:173
> > #12 0x00007ffff67f3820 in ?? () from target:/lib64/libc.so.6
> > #13 0x00007fffffffca80 in ?? ()
> > 
> 
> Ok the reason for this is line 89 of tests/qemu-iotests/307:
> 
> # Should ignore the iothread conflict, but then fail because of the
> # permission conflict (and not crash)
> vm.qmp_log('block-export-add', id='export1', type='nbd', node_name='null',
>                iothread='iothread1', fixed_iothread=False, writable=True)
> 
> This calls qmp_block_export_add() and then blk_exp_add(), that calls
> bdrv_invalidate_cache().
> Both functions are running in the main thread, but then
> bdrv_invalidate_cache invokes bdrv_co_invalidate_cache() as a coroutine in
> the AioContext of the given bs, so not the main loop.
> 
> So Hanna, what should we do here? This seems very similar to the discussion
> in patch 22, ie run blockdev-create (in this case block-export-add, which
> seems similar?) involving nodes in I/O threads.

My gut feeling is that the bug is that we're entering coroutine context
in the node's iothread too early. I think the only place where we really
need it is the bs->drv->bdrv_co_invalidate_cache() call.

In fact, not only permission code, but also parent->klass->activate()
must be run in the main thread. The only implementation of the callback
is blk_root_activate(), and it calls some permissions functions, too,
but also qemu_add_vm_change_state_handler(), which doesn't look thread
safe. Unlikely to ever cause problems and if it does, it won't be
reproducible, but still a bug.

So if we go back to a bdrv_invalidate_cache() that does all the graph
manipulations (and asserts that we're in the main loop) and then have a
much smaller bdrv_co_invalidate_cache() that basically just calls into
the driver, would that solve the problem?

Kevin



  parent reply	other threads:[~2022-01-19 18:37 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-24  6:43 [PATCH v5 00/31] block layer: split block APIs in global state and I/O Emanuele Giuseppe Esposito
2021-11-24  6:43 ` [PATCH v5 01/31] main-loop.h: introduce qemu_in_main_thread() Emanuele Giuseppe Esposito
2021-11-24  6:43 ` [PATCH v5 02/31] include/block/block: split header into I/O and global state API Emanuele Giuseppe Esposito
2021-11-24  6:43 ` [PATCH v5 03/31] assertions for block " Emanuele Giuseppe Esposito
2021-12-16 15:17   ` Hanna Reitz
2021-11-24  6:43 ` [PATCH v5 04/31] include/sysemu/block-backend: split header into I/O and global state (GS) API Emanuele Giuseppe Esposito
2021-12-10 14:21   ` Hanna Reitz
2021-11-24  6:43 ` [PATCH v5 05/31] block-backend: special comments for blk_set/get_perm due to fuse Emanuele Giuseppe Esposito
2021-12-10 14:38   ` Hanna Reitz
2021-12-15  8:57     ` Emanuele Giuseppe Esposito
2021-12-15 10:13       ` Emanuele Giuseppe Esposito
2021-12-16 14:00         ` Hanna Reitz
2021-11-24  6:43 ` [PATCH v5 06/31] block/block-backend.c: assertions for block-backend Emanuele Giuseppe Esposito
2021-12-10 15:37   ` Hanna Reitz
2021-11-24  6:43 ` [PATCH v5 07/31] include/block/block_int: split header into I/O and global state API Emanuele Giuseppe Esposito
2021-11-24  6:43 ` [PATCH v5 08/31] assertions for block_int " Emanuele Giuseppe Esposito
2021-11-24  6:43 ` [PATCH v5 09/31] block: introduce assert_bdrv_graph_writable Emanuele Giuseppe Esposito
2021-12-10 17:43   ` Hanna Reitz
2021-12-14 19:48     ` Emanuele Giuseppe Esposito
2021-12-16 14:01       ` Hanna Reitz
2021-11-24  6:43 ` [PATCH v5 10/31] block.c: modify .attach and .detach callbacks of child_of_bds Emanuele Giuseppe Esposito
2021-12-16 14:57   ` Hanna Reitz
2021-12-16 16:05     ` Emanuele Giuseppe Esposito
2021-12-16 16:12       ` Hanna Reitz
2021-11-24  6:43 ` [PATCH v5 11/31] include/block/blockjob_int.h: split header into I/O and GS API Emanuele Giuseppe Esposito
2021-11-24  6:43 ` [PATCH v5 12/31] assertions for blockjob_int.h Emanuele Giuseppe Esposito
2021-11-24  6:44 ` [PATCH v5 13/31] block.c: add assertions to static functions Emanuele Giuseppe Esposito
2021-12-16 16:08   ` Hanna Reitz
2021-12-16 16:39     ` Emanuele Giuseppe Esposito
2021-11-24  6:44 ` [PATCH v5 14/31] include/block/blockjob.h: global state API Emanuele Giuseppe Esposito
2021-11-24  6:44 ` [PATCH v5 15/31] assertions for blockjob.h " Emanuele Giuseppe Esposito
2021-11-24  6:44 ` [PATCH v5 16/31] include/sysemu/blockdev.h: " Emanuele Giuseppe Esposito
2021-11-24  6:44 ` [PATCH v5 17/31] assertions for blockdev.h " Emanuele Giuseppe Esposito
2021-11-24  6:44 ` [PATCH v5 18/31] include/block/snapshot: global state API + assertions Emanuele Giuseppe Esposito
2021-11-24  6:44 ` [PATCH v5 19/31] block/copy-before-write.h: " Emanuele Giuseppe Esposito
2021-11-24  6:44 ` [PATCH v5 20/31] block/coroutines: I/O API Emanuele Giuseppe Esposito
2021-11-24  6:44 ` [PATCH v5 21/31] block_int-common.h: split function pointers in BlockDriver Emanuele Giuseppe Esposito
2021-11-24  6:44 ` [PATCH v5 22/31] block_int-common.h: assertion in the callers of BlockDriver function pointers Emanuele Giuseppe Esposito
2021-12-16 18:43   ` Hanna Reitz
2021-12-17 15:53     ` Emanuele Giuseppe Esposito
2021-12-17 16:00       ` Hanna Reitz
2021-11-24  6:44 ` [PATCH v5 23/31] block_int-common.h: split function pointers in BdrvChildClass Emanuele Giuseppe Esposito
2021-11-24  6:44 ` [PATCH v5 24/31] block_int-common.h: assertions in the callers of BdrvChildClass function pointers Emanuele Giuseppe Esposito
2021-11-24  6:44 ` [PATCH v5 25/31] block-backend-common.h: split function pointers in BlockDevOps Emanuele Giuseppe Esposito
2021-11-24  6:44 ` [PATCH v5 26/31] job.h: split function pointers in JobDriver Emanuele Giuseppe Esposito
2021-11-24  6:44 ` [PATCH v5 27/31] job.h: assertions in the callers of JobDriver funcion pointers Emanuele Giuseppe Esposito
2021-11-24  6:44 ` [PATCH v5 28/31] block.c: assert BQL lock held in bdrv_co_invalidate_cache Emanuele Giuseppe Esposito
2021-12-17 11:04   ` Hanna Reitz
2021-12-17 16:38     ` Emanuele Giuseppe Esposito
2021-12-20 12:20       ` Emanuele Giuseppe Esposito
2021-12-23 17:11         ` Hanna Reitz
2022-01-19 15:57           ` Hanna Reitz
2022-01-19 17:44             ` Hanna Reitz
2022-01-19 18:34         ` Kevin Wolf [this message]
2022-01-20 13:22           ` Paolo Bonzini
2022-01-20 13:48             ` Kevin Wolf
2022-01-21 10:22               ` Emanuele Giuseppe Esposito
2021-11-24  6:44 ` [PATCH v5 29/31] jobs: introduce pre_run function in JobDriver Emanuele Giuseppe Esposito
2021-11-24  6:44 ` [PATCH v5 30/31] crypto: delegate permission functions to JobDriver .pre_run Emanuele Giuseppe Esposito
2021-12-17 12:29   ` Hanna Reitz
2021-12-17 14:32     ` Hanna Reitz
2021-12-20 15:47     ` Emanuele Giuseppe Esposito
2021-12-23 17:15       ` Hanna Reitz
2021-11-24  6:44 ` [PATCH v5 31/31] block.c: assertions to the block layer permissions API Emanuele Giuseppe Esposito
2021-11-29 13:32 ` [PATCH v5 00/31] block layer: split block APIs in global state and I/O Stefan Hajnoczi
2021-11-30  8:39   ` Emanuele Giuseppe Esposito

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YehZwkGVPK6phrc2@redhat.com \
    --to=kwolf@redhat.com \
    --cc=armbru@redhat.com \
    --cc=berrange@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=eblake@redhat.com \
    --cc=eesposit@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=fam@euphon.net \
    --cc=hreitz@redhat.com \
    --cc=jsnow@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=richard.henderson@linaro.org \
    --cc=stefanha@redhat.com \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).