qemu-devel.nongnu.org archive mirror
From: Fabian Ebner <f.ebner@proxmox.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	"open list:Network Block Dev..." <qemu-block@nongnu.org>,
	qemu-devel@nongnu.org, Hanna Reitz <hreitz@redhat.com>,
	Eric Blake <eblake@redhat.com>,
	Thomas Lamprecht <t.lamprecht@proxmox.com>
Subject: Re: [PULL 18/20] block/nbd: drop connection_co
Date: Wed, 2 Feb 2022 12:49:36 +0100	[thread overview]
Message-ID: <8e8b69e4-a178-aff1-4de3-e697b942f3b3@proxmox.com> (raw)
In-Reply-To: <20210927215545.3930309-19-eblake@redhat.com>

Am 27.09.21 um 23:55 schrieb Eric Blake:
> From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 
> OK, that's a big rewrite of the logic.
> 
> Pre-patch we have an always-running coroutine - connection_co. It
> handles reply receiving and reconnecting, and it leads to a lot of
> difficult and unobvious code around drained sections and context
> switches. We also abuse the bs->in_flight counter, which is increased
> for connection_co and temporarily decreased at points where we want to
> allow a drained section to begin. One of these places is in another
> file: in nbd_read_eof() in nbd/client.c.
> 
> We also cancel reconnect, and requests waiting for reconnect, on
> drained begin, which is not correct. This patch fixes that.
> 
> Let's finally drop this always running coroutine and go another way:
> do both reconnect and receiving in request coroutines.
>

Hi,

While updating our stack to 6.2, one of our live-migration tests stopped
working (backtrace below), and bisecting led me to this patch.

The VM has a single qcow2 disk (converting to raw doesn't make a
difference) and the issue only appears when using an iothread (with both
virtio-scsi-pci and virtio-blk-pci).

Reverting 1af7737871fb3b66036f5e520acb0a98fc2605f7 (which lives on top)
and 4ddb5d2fde6f22b2cf65f314107e890a7ca14fcf (the commit corresponding
to this patch) in v6.2.0 makes the migration work again.

Backtrace:

Thread 1 (Thread 0x7f9d93458fc0 (LWP 56711) "kvm"):
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f9d9d6bc537 in __GI_abort () at abort.c:79
#2  0x00007f9d9d6bc40f in __assert_fail_base (fmt=0x7f9d9d825128 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x5579153763f8 "qemu_get_current_aio_context() == qemu_coroutine_get_aio_context(co)", file=0x5579153764f9 "../io/channel.c", line=483, function=<optimized out>) at assert.c:92
#3  0x00007f9d9d6cb662 in __GI___assert_fail (assertion=assertion@entry=0x5579153763f8 "qemu_get_current_aio_context() == qemu_coroutine_get_aio_context(co)", file=file@entry=0x5579153764f9 "../io/channel.c", line=line@entry=483, function=function@entry=0x557915376570 <__PRETTY_FUNCTION__.2> "qio_channel_restart_read") at assert.c:101
#4  0x00005579150c351c in qio_channel_restart_read (opaque=<optimized out>) at ../io/channel.c:483
#5  qio_channel_restart_read (opaque=<optimized out>) at ../io/channel.c:477
#6  0x000055791520182a in aio_dispatch_handler (ctx=ctx@entry=0x557916908c60, node=0x7f9d8400f800) at ../util/aio-posix.c:329
#7  0x0000557915201f62 in aio_dispatch_handlers (ctx=0x557916908c60) at ../util/aio-posix.c:372
#8  aio_dispatch (ctx=0x557916908c60) at ../util/aio-posix.c:382
#9  0x00005579151ea74e in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at ../util/async.c:311
#10 0x00007f9d9e647e6b in g_main_context_dispatch () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#11 0x0000557915203030 in glib_pollfds_poll () at ../util/main-loop.c:232
#12 os_host_main_loop_wait (timeout=992816) at ../util/main-loop.c:255
#13 main_loop_wait (nonblocking=nonblocking@entry=0) at ../util/main-loop.c:531
#14 0x00005579150539c1 in qemu_main_loop () at ../softmmu/runstate.c:726
#15 0x0000557914ce8ebe in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ../softmmu/main.c:50



