qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Eric Blake <eblake@redhat.com>
To: qemu-devel@nongnu.org
Cc: Roman Kagan <rvkagan@yandex-team.ru>,
	Kevin Wolf <kwolf@redhat.com>,
	Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	"open list:Network Block Dev..." <qemu-block@nongnu.org>,
	Max Reitz <mreitz@redhat.com>
Subject: [PULL 18/20] block/nbd: only detach existing iochannel from aio_context
Date: Tue,  2 Feb 2021 16:45:27 -0600	[thread overview]
Message-ID: <20210202224529.642055-19-eblake@redhat.com> (raw)
In-Reply-To: <20210202224529.642055-1-eblake@redhat.com>

From: Roman Kagan <rvkagan@yandex-team.ru>

When the reconnect in NBD client is in progress, the iochannel used for
NBD connection doesn't exist.  Therefore an attempt to detach it from
the aio_context of the parent BlockDriverState results in a NULL pointer
dereference.

The problem is triggerable, in particular, when an outgoing migration is
about to finish, and stopping the dataplane tries to move the
BlockDriverState from the iothread aio_context to the main loop.  If the
NBD connection is lost before this point, and the NBD client has entered
the reconnect procedure, QEMU crashes:

  #0  qemu_aio_coroutine_enter (ctx=0x5618056c7580, co=0x0)
      at /build/qemu-6MF7tq/qemu-5.0.1/util/qemu-coroutine.c:109
  #1  0x00005618034b1b68 in nbd_client_attach_aio_context_bh (
      opaque=0x561805ed4c00) at /build/qemu-6MF7tq/qemu-5.0.1/block/nbd.c:164
  #2  0x000056180353116b in aio_wait_bh (opaque=0x7f60e1e63700)
      at /build/qemu-6MF7tq/qemu-5.0.1/util/aio-wait.c:55
  #3  0x0000561803530633 in aio_bh_call (bh=0x7f60d40a7e80)
      at /build/qemu-6MF7tq/qemu-5.0.1/util/async.c:136
  #4  aio_bh_poll (ctx=ctx@entry=0x5618056c7580)
      at /build/qemu-6MF7tq/qemu-5.0.1/util/async.c:164
  #5  0x0000561803533e5a in aio_poll (ctx=ctx@entry=0x5618056c7580,
      blocking=blocking@entry=true)
      at /build/qemu-6MF7tq/qemu-5.0.1/util/aio-posix.c:650
  #6  0x000056180353128d in aio_wait_bh_oneshot (ctx=0x5618056c7580,
      cb=<optimized out>, opaque=<optimized out>)
      at /build/qemu-6MF7tq/qemu-5.0.1/util/aio-wait.c:71
  #7  0x000056180345c50a in bdrv_attach_aio_context (new_context=0x5618056c7580,
      bs=0x561805ed4c00) at /build/qemu-6MF7tq/qemu-5.0.1/block.c:6172
  #8  bdrv_set_aio_context_ignore (bs=bs@entry=0x561805ed4c00,
      new_context=new_context@entry=0x5618056c7580,
      ignore=ignore@entry=0x7f60e1e63780)
      at /build/qemu-6MF7tq/qemu-5.0.1/block.c:6237
  #9  0x000056180345c969 in bdrv_child_try_set_aio_context (
      bs=bs@entry=0x561805ed4c00, ctx=0x5618056c7580,
      ignore_child=<optimized out>, errp=<optimized out>)
      at /build/qemu-6MF7tq/qemu-5.0.1/block.c:6332
  #10 0x00005618034957db in blk_do_set_aio_context (blk=0x56180695b3f0,
      new_context=0x5618056c7580, update_root_node=update_root_node@entry=true,
      errp=errp@entry=0x0)
      at /build/qemu-6MF7tq/qemu-5.0.1/block/block-backend.c:1989
  #11 0x00005618034980bd in blk_set_aio_context (blk=<optimized out>,
      new_context=<optimized out>, errp=errp@entry=0x0)
      at /build/qemu-6MF7tq/qemu-5.0.1/block/block-backend.c:2010
  #12 0x0000561803197953 in virtio_blk_data_plane_stop (vdev=<optimized out>)
      at /build/qemu-6MF7tq/qemu-5.0.1/hw/block/dataplane/virtio-blk.c:292
  #13 0x00005618033d67bf in virtio_bus_stop_ioeventfd (bus=0x5618056d9f08)
      at /build/qemu-6MF7tq/qemu-5.0.1/hw/virtio/virtio-bus.c:245
  #14 0x00005618031c9b2e in virtio_vmstate_change (opaque=0x5618056d9f90,
      running=0, state=<optimized out>)
      at /build/qemu-6MF7tq/qemu-5.0.1/hw/virtio/virtio.c:3220
  #15 0x0000561803208bfd in vm_state_notify (running=running@entry=0,
      state=state@entry=RUN_STATE_FINISH_MIGRATE)
      at /build/qemu-6MF7tq/qemu-5.0.1/softmmu/vl.c:1275
  #16 0x0000561803155c02 in do_vm_stop (state=RUN_STATE_FINISH_MIGRATE,
      send_stop=<optimized out>) at /build/qemu-6MF7tq/qemu-5.0.1/cpus.c:1032
  #17 0x00005618033e3765 in migration_completion (s=0x5618056e6960)
      at /build/qemu-6MF7tq/qemu-5.0.1/migration/migration.c:2914
  #18 migration_iteration_run (s=0x5618056e6960)
      at /build/qemu-6MF7tq/qemu-5.0.1/migration/migration.c:3275
  #19 migration_thread (opaque=opaque@entry=0x5618056e6960)
      at /build/qemu-6MF7tq/qemu-5.0.1/migration/migration.c:3439
  #20 0x0000561803536ad6 in qemu_thread_start (args=<optimized out>)
      at /build/qemu-6MF7tq/qemu-5.0.1/util/qemu-thread-posix.c:519
  #21 0x00007f61085d06ba in start_thread ()
     from /lib/x86_64-linux-gnu/libpthread.so.0
  #22 0x00007f610830641d in sysctl () from /lib/x86_64-linux-gnu/libc.so.6
  #23 0x0000000000000000 in ?? ()

Fix it by checking that the iochannel is non-null before trying to
detach it from the aio_context.  If it is null, no detaching is needed,
and it will get reattached in the proper aio_context once the connection
is reestablished.

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210129073859.683063-2-rvkagan@yandex-team.ru>
Signed-off-by: Eric Blake <eblake@redhat.com>
---
 block/nbd.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/block/nbd.c b/block/nbd.c
index 42e10c7c93f5..bcd6641e90f5 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -235,7 +235,14 @@ static void nbd_client_detach_aio_context(BlockDriverState *bs)

     /* Timer is deleted in nbd_client_co_drain_begin() */
     assert(!s->reconnect_delay_timer);
-    qio_channel_detach_aio_context(QIO_CHANNEL(s->ioc));
+    /*
+     * If reconnect is in progress we may have no ->ioc.  It will be
+     * re-instantiated in the proper aio context once the connection is
+     * reestablished.
+     */
+    if (s->ioc) {
+        qio_channel_detach_aio_context(QIO_CHANNEL(s->ioc));
+    }
 }

 static void nbd_client_attach_aio_context_bh(void *opaque)
-- 
2.30.0



  parent reply	other threads:[~2021-02-02 23:03 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-02 22:45 [PULL 00/20] NBD patches for 2021-02-02 Eric Blake
2021-02-02 22:45 ` [PULL 01/20] iotests: Fix expected whitespace for 185 Eric Blake
2021-02-02 22:45 ` [PULL 02/20] block: refactor bdrv_check_request: add errp Eric Blake
2021-02-02 22:45 ` [PULL 03/20] util/iov: make qemu_iovec_init_extended() honest Eric Blake
2021-02-02 22:45 ` [PULL 04/20] block: fix theoretical overflow in bdrv_init_padding() Eric Blake
2021-02-02 22:45 ` [PULL 05/20] block/io: refactor bdrv_pad_request(): move bdrv_pad_request() up Eric Blake
2021-02-02 22:45 ` [PULL 06/20] block/io: bdrv_pad_request(): support qemu_iovec_init_extended failure Eric Blake
2021-02-02 22:45 ` [PULL 07/20] block/throttle-groups: throttle_group_co_io_limits_intercept(): 64bit bytes Eric Blake
2021-02-02 22:45 ` [PULL 08/20] block/io: improve bdrv_check_request: check qiov too Eric Blake
2021-02-02 22:45 ` [PULL 09/20] block: use int64_t as bytes type in tracked requests Eric Blake
2021-02-02 22:45 ` [PULL 10/20] block/io: use int64_t bytes in driver wrappers Eric Blake
2021-02-02 22:45 ` [PULL 11/20] block/io: support int64_t bytes in bdrv_co_do_pwrite_zeroes() Eric Blake
2021-02-02 22:45 ` [PULL 12/20] block/io: support int64_t bytes in bdrv_aligned_pwritev() Eric Blake
2021-02-02 22:45 ` [PULL 13/20] block/io: support int64_t bytes in bdrv_co_do_copy_on_readv() Eric Blake
2021-02-02 22:45 ` [PULL 14/20] block/io: support int64_t bytes in bdrv_aligned_preadv() Eric Blake
2021-02-02 22:45 ` [PULL 15/20] block/io: support int64_t bytes in bdrv_co_p{read, write}v_part() Eric Blake
2021-02-02 22:45 ` [PULL 16/20] block/io: support int64_t bytes in read/write wrappers Eric Blake
2021-02-02 22:45 ` [PULL 17/20] block/io: use int64_t bytes in copy_range Eric Blake
2021-02-02 22:45 ` Eric Blake [this message]
2021-02-02 22:45 ` [PULL 19/20] block/nbd: only enter connection coroutine if it's present Eric Blake
2021-02-02 22:45 ` [PULL 20/20] nbd: make nbd_read* return -EIO on error Eric Blake
2021-02-02 23:12 ` [PULL 00/20] NBD patches for 2021-02-02 no-reply
2021-02-03 13:25   ` Eric Blake
2021-02-03 13:27 ` Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210202224529.642055-19-eblake@redhat.com \
    --to=eblake@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=rvkagan@yandex-team.ru \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).