* [PATCH v2 1/3] block/nbd: only detach existing iochannel from aio_context
2021-01-29 7:38 [PATCH v2 0/3] block/nbd: fix crashers in reconnect while migrating Roman Kagan
@ 2021-01-29 7:38 ` Roman Kagan
2021-01-29 7:38 ` [PATCH v2 2/3] block/nbd: only enter connection coroutine if it's present Roman Kagan
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Roman Kagan @ 2021-01-29 7:38 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-block, Max Reitz
When the reconnect in NBD client is in progress, the iochannel used for
NBD connection doesn't exist. Therefore an attempt to detach it from
the aio_context of the parent BlockDriverState results in a NULL pointer
dereference.
The problem is triggerable, in particular, when an outgoing migration is
about to finish, and stopping the dataplane tries to move the
BlockDriverState from the iothread aio_context to the main loop. If the
NBD connection is lost before this point, and the NBD client has entered
the reconnect procedure, QEMU crashes:
#0 qemu_aio_coroutine_enter (ctx=0x5618056c7580, co=0x0)
at /build/qemu-6MF7tq/qemu-5.0.1/util/qemu-coroutine.c:109
#1 0x00005618034b1b68 in nbd_client_attach_aio_context_bh (
opaque=0x561805ed4c00) at /build/qemu-6MF7tq/qemu-5.0.1/block/nbd.c:164
#2 0x000056180353116b in aio_wait_bh (opaque=0x7f60e1e63700)
at /build/qemu-6MF7tq/qemu-5.0.1/util/aio-wait.c:55
#3 0x0000561803530633 in aio_bh_call (bh=0x7f60d40a7e80)
at /build/qemu-6MF7tq/qemu-5.0.1/util/async.c:136
#4 aio_bh_poll (ctx=ctx@entry=0x5618056c7580)
at /build/qemu-6MF7tq/qemu-5.0.1/util/async.c:164
#5 0x0000561803533e5a in aio_poll (ctx=ctx@entry=0x5618056c7580,
blocking=blocking@entry=true)
at /build/qemu-6MF7tq/qemu-5.0.1/util/aio-posix.c:650
#6 0x000056180353128d in aio_wait_bh_oneshot (ctx=0x5618056c7580,
cb=<optimized out>, opaque=<optimized out>)
at /build/qemu-6MF7tq/qemu-5.0.1/util/aio-wait.c:71
#7 0x000056180345c50a in bdrv_attach_aio_context (new_context=0x5618056c7580,
bs=0x561805ed4c00) at /build/qemu-6MF7tq/qemu-5.0.1/block.c:6172
#8 bdrv_set_aio_context_ignore (bs=bs@entry=0x561805ed4c00,
new_context=new_context@entry=0x5618056c7580,
ignore=ignore@entry=0x7f60e1e63780)
at /build/qemu-6MF7tq/qemu-5.0.1/block.c:6237
#9 0x000056180345c969 in bdrv_child_try_set_aio_context (
bs=bs@entry=0x561805ed4c00, ctx=0x5618056c7580,
ignore_child=<optimized out>, errp=<optimized out>)
at /build/qemu-6MF7tq/qemu-5.0.1/block.c:6332
#10 0x00005618034957db in blk_do_set_aio_context (blk=0x56180695b3f0,
new_context=0x5618056c7580, update_root_node=update_root_node@entry=true,
errp=errp@entry=0x0)
at /build/qemu-6MF7tq/qemu-5.0.1/block/block-backend.c:1989
#11 0x00005618034980bd in blk_set_aio_context (blk=<optimized out>,
new_context=<optimized out>, errp=errp@entry=0x0)
at /build/qemu-6MF7tq/qemu-5.0.1/block/block-backend.c:2010
#12 0x0000561803197953 in virtio_blk_data_plane_stop (vdev=<optimized out>)
at /build/qemu-6MF7tq/qemu-5.0.1/hw/block/dataplane/virtio-blk.c:292
#13 0x00005618033d67bf in virtio_bus_stop_ioeventfd (bus=0x5618056d9f08)
at /build/qemu-6MF7tq/qemu-5.0.1/hw/virtio/virtio-bus.c:245
#14 0x00005618031c9b2e in virtio_vmstate_change (opaque=0x5618056d9f90,
running=0, state=<optimized out>)
at /build/qemu-6MF7tq/qemu-5.0.1/hw/virtio/virtio.c:3220
#15 0x0000561803208bfd in vm_state_notify (running=running@entry=0,
state=state@entry=RUN_STATE_FINISH_MIGRATE)
at /build/qemu-6MF7tq/qemu-5.0.1/softmmu/vl.c:1275
#16 0x0000561803155c02 in do_vm_stop (state=RUN_STATE_FINISH_MIGRATE,
send_stop=<optimized out>) at /build/qemu-6MF7tq/qemu-5.0.1/cpus.c:1032
#17 0x00005618033e3765 in migration_completion (s=0x5618056e6960)
at /build/qemu-6MF7tq/qemu-5.0.1/migration/migration.c:2914
#18 migration_iteration_run (s=0x5618056e6960)
at /build/qemu-6MF7tq/qemu-5.0.1/migration/migration.c:3275
#19 migration_thread (opaque=opaque@entry=0x5618056e6960)
at /build/qemu-6MF7tq/qemu-5.0.1/migration/migration.c:3439
#20 0x0000561803536ad6 in qemu_thread_start (args=<optimized out>)
at /build/qemu-6MF7tq/qemu-5.0.1/util/qemu-thread-posix.c:519
#21 0x00007f61085d06ba in start_thread ()
from /lib/x86_64-linux-gnu/libpthread.so.0
#22 0x00007f610830641d in sysctl () from /lib/x86_64-linux-gnu/libc.so.6
#23 0x0000000000000000 in ?? ()
Fix it by checking that the iochannel is non-null before trying to
detach it from the aio_context. If it is null, no detaching is needed,
and it will get reattached in the proper aio_context once the connection
is reestablished.
Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
block/nbd.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/block/nbd.c b/block/nbd.c
index 42e10c7c93..bcd6641e90 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -235,7 +235,14 @@ static void nbd_client_detach_aio_context(BlockDriverState *bs)
/* Timer is deleted in nbd_client_co_drain_begin() */
assert(!s->reconnect_delay_timer);
- qio_channel_detach_aio_context(QIO_CHANNEL(s->ioc));
+ /*
+ * If reconnect is in progress we may have no ->ioc. It will be
+ * re-instantiated in the proper aio context once the connection is
+ * reestablished.
+ */
+ if (s->ioc) {
+ qio_channel_detach_aio_context(QIO_CHANNEL(s->ioc));
+ }
}
static void nbd_client_attach_aio_context_bh(void *opaque)
--
2.29.2
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH v2 2/3] block/nbd: only enter connection coroutine if it's present
2021-01-29 7:38 [PATCH v2 0/3] block/nbd: fix crashers in reconnect while migrating Roman Kagan
2021-01-29 7:38 ` [PATCH v2 1/3] block/nbd: only detach existing iochannel from aio_context Roman Kagan
@ 2021-01-29 7:38 ` Roman Kagan
2021-01-29 7:38 ` [PATCH v2 3/3] nbd: make nbd_read* return -EIO on error Roman Kagan
2021-02-02 22:33 ` [PATCH v2 0/3] block/nbd: fix crashers in reconnect while migrating Eric Blake
3 siblings, 0 replies; 5+ messages in thread
From: Roman Kagan @ 2021-01-29 7:38 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-block, Max Reitz
When an NBD block driver state is moved from one aio_context to another
(e.g. when doing a drain in a migration thread),
nbd_client_attach_aio_context_bh is executed that enters the connection
coroutine.
However, the assumption that ->connection_co is always present here
appears incorrect: the connection may have encountered an error other
than -EIO in the underlying transport, and thus may have decided to quit
rather than keep trying to reconnect, and therefore it may have
terminated the connection coroutine. As a result an attempt to reassign
the client in this state (NBD_CLIENT_QUIT) to a different aio_context
leads to a null pointer dereference:
#0 qio_channel_detach_aio_context (ioc=0x0)
at /build/qemu-gYtjVn/qemu-5.0.1/io/channel.c:452
#1 0x0000562a242824b3 in bdrv_detach_aio_context (bs=0x562a268d6a00)
at /build/qemu-gYtjVn/qemu-5.0.1/block.c:6151
#2 bdrv_set_aio_context_ignore (bs=bs@entry=0x562a268d6a00,
new_context=new_context@entry=0x562a260c9580,
ignore=ignore@entry=0x7feeadc9b780)
at /build/qemu-gYtjVn/qemu-5.0.1/block.c:6230
#3 0x0000562a24282969 in bdrv_child_try_set_aio_context
(bs=bs@entry=0x562a268d6a00, ctx=0x562a260c9580,
ignore_child=<optimized out>, errp=<optimized out>)
at /build/qemu-gYtjVn/qemu-5.0.1/block.c:6332
#4 0x0000562a242bb7db in blk_do_set_aio_context (blk=0x562a2735d0d0,
new_context=0x562a260c9580,
update_root_node=update_root_node@entry=true, errp=errp@entry=0x0)
at /build/qemu-gYtjVn/qemu-5.0.1/block/block-backend.c:1989
#5 0x0000562a242be0bd in blk_set_aio_context (blk=<optimized out>,
new_context=<optimized out>, errp=errp@entry=0x0)
at /build/qemu-gYtjVn/qemu-5.0.1/block/block-backend.c:2010
#6 0x0000562a23fbd953 in virtio_blk_data_plane_stop (vdev=<optimized
out>)
at /build/qemu-gYtjVn/qemu-5.0.1/hw/block/dataplane/virtio-blk.c:292
#7 0x0000562a241fc7bf in virtio_bus_stop_ioeventfd (bus=0x562a260dbf08)
at /build/qemu-gYtjVn/qemu-5.0.1/hw/virtio/virtio-bus.c:245
#8 0x0000562a23fefb2e in virtio_vmstate_change (opaque=0x562a260dbf90,
running=0, state=<optimized out>)
at /build/qemu-gYtjVn/qemu-5.0.1/hw/virtio/virtio.c:3220
#9 0x0000562a2402ebfd in vm_state_notify (running=running@entry=0,
state=state@entry=RUN_STATE_FINISH_MIGRATE)
at /build/qemu-gYtjVn/qemu-5.0.1/softmmu/vl.c:1275
#10 0x0000562a23f7bc02 in do_vm_stop (state=RUN_STATE_FINISH_MIGRATE,
send_stop=<optimized out>)
at /build/qemu-gYtjVn/qemu-5.0.1/cpus.c:1032
#11 0x0000562a24209765 in migration_completion (s=0x562a260e83a0)
at /build/qemu-gYtjVn/qemu-5.0.1/migration/migration.c:2914
#12 migration_iteration_run (s=0x562a260e83a0)
at /build/qemu-gYtjVn/qemu-5.0.1/migration/migration.c:3275
#13 migration_thread (opaque=opaque@entry=0x562a260e83a0)
at /build/qemu-gYtjVn/qemu-5.0.1/migration/migration.c:3439
#14 0x0000562a2435ca96 in qemu_thread_start (args=<optimized out>)
at /build/qemu-gYtjVn/qemu-5.0.1/util/qemu-thread-posix.c:519
#15 0x00007feed31466ba in start_thread (arg=0x7feeadc9c700)
at pthread_create.c:333
#16 0x00007feed2e7c41d in __GI___sysctl (name=0x0, nlen=608471908,
oldval=0x562a2452b138, oldlenp=0x0, newval=0x562a2452c5e0
<__func__.28102>, newlen=0)
at ../sysdeps/unix/sysv/linux/sysctl.c:30
#17 0x0000000000000000 in ?? ()
Fix it by checking that the connection coroutine is non-null before
trying to enter it. If it is null, no entering is needed, as the
connection is probably going down anyway.
Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
block/nbd.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/block/nbd.c b/block/nbd.c
index bcd6641e90..b3cbbeb4b0 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -250,13 +250,15 @@ static void nbd_client_attach_aio_context_bh(void *opaque)
BlockDriverState *bs = opaque;
BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
- /*
- * The node is still drained, so we know the coroutine has yielded in
- * nbd_read_eof(), the only place where bs->in_flight can reach 0, or it is
- * entered for the first time. Both places are safe for entering the
- * coroutine.
- */
- qemu_aio_coroutine_enter(bs->aio_context, s->connection_co);
+ if (s->connection_co) {
+ /*
+ * The node is still drained, so we know the coroutine has yielded in
+ * nbd_read_eof(), the only place where bs->in_flight can reach 0, or
+ * it is entered for the first time. Both places are safe for entering
+ * the coroutine.
+ */
+ qemu_aio_coroutine_enter(bs->aio_context, s->connection_co);
+ }
bdrv_dec_in_flight(bs);
}
--
2.29.2
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH v2 3/3] nbd: make nbd_read* return -EIO on error
2021-01-29 7:38 [PATCH v2 0/3] block/nbd: fix crashers in reconnect while migrating Roman Kagan
2021-01-29 7:38 ` [PATCH v2 1/3] block/nbd: only detach existing iochannel from aio_context Roman Kagan
2021-01-29 7:38 ` [PATCH v2 2/3] block/nbd: only enter connection coroutine if it's present Roman Kagan
@ 2021-01-29 7:38 ` Roman Kagan
2021-02-02 22:33 ` [PATCH v2 0/3] block/nbd: fix crashers in reconnect while migrating Eric Blake
3 siblings, 0 replies; 5+ messages in thread
From: Roman Kagan @ 2021-01-29 7:38 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-block, Max Reitz
NBD reconnect logic considers the error code from the functions that
read NBD messages to tell if reconnect should be attempted or not: it is
attempted on -EIO, otherwise the client transitions to NBD_CLIENT_QUIT
state (see nbd_channel_error). This error code is propagated from the
primitives like nbd_read.
The problem, however, is that nbd_read itself turns every error into -1
rather than -EIO. As a result, if the NBD server happens to die while
sending the message, the client in QEMU receives less data than it
expects, considers it as a fatal error, and wouldn't attempt
reestablishing the connection.
Fix it by turning every negative return from qio_channel_read_all into
-EIO returned from nbd_read. Apparently that was the original behavior,
but got broken later. Also adjust nbd_readXX to follow.
Fixes: e6798f06a6 ("nbd: generalize usage of nbd_read")
Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
include/block/nbd.h | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/include/block/nbd.h b/include/block/nbd.h
index 4a52a43ef5..5f34d23bb0 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -364,7 +364,7 @@ static inline int nbd_read(QIOChannel *ioc, void *buffer, size_t size,
if (desc) {
error_prepend(errp, "Failed to read %s: ", desc);
}
- return -1;
+ return ret;
}
return 0;
@@ -375,8 +375,9 @@ static inline int nbd_read##bits(QIOChannel *ioc, \
uint##bits##_t *val, \
const char *desc, Error **errp) \
{ \
- if (nbd_read(ioc, val, sizeof(*val), desc, errp) < 0) { \
- return -1; \
+ int ret = nbd_read(ioc, val, sizeof(*val), desc, errp); \
+ if (ret < 0) { \
+ return ret; \
} \
*val = be##bits##_to_cpu(*val); \
return 0; \
--
2.29.2
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v2 0/3] block/nbd: fix crashers in reconnect while migrating
2021-01-29 7:38 [PATCH v2 0/3] block/nbd: fix crashers in reconnect while migrating Roman Kagan
` (2 preceding siblings ...)
2021-01-29 7:38 ` [PATCH v2 3/3] nbd: make nbd_read* return -EIO on error Roman Kagan
@ 2021-02-02 22:33 ` Eric Blake
3 siblings, 0 replies; 5+ messages in thread
From: Eric Blake @ 2021-02-02 22:33 UTC (permalink / raw)
To: Roman Kagan, qemu-devel
Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, qemu-block, Max Reitz
On 1/29/21 1:38 AM, Roman Kagan wrote:
> During the final phase of migration the NBD reconnection logic may
> encounter situations it doesn't expect during regular operation.
>
> This series addresses some of them that make qemu crash. They are
> reproducible when a vm with a secondary drive attached via nbd with
> non-zero "reconnect-delay" runs a stress load (fio with big queue depth)
> in the guest on that drive and is migrated (e.g. to a file), while the
> nbd server is SIGKILL-ed and restarted every second.
>
> See the individual patches for specific crash conditions and more
> detailed explanations.
>
> v1 -> v2:
> - fix corrupted backtraces in log messages
> - add r-b by Vladimir
>
Thanks, queuing through my NBD tree.
> Roman Kagan (3):
> block/nbd: only detach existing iochannel from aio_context
> block/nbd: only enter connection coroutine if it's present
> nbd: make nbd_read* return -EIO on error
>
> include/block/nbd.h | 7 ++++---
> block/nbd.c | 25 +++++++++++++++++--------
> 2 files changed, 21 insertions(+), 11 deletions(-)
>
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org
^ permalink raw reply [flat|nested] 5+ messages in thread