From: Fam Zheng <famz@redhat.com>
To: Ed Swierk <eswierk@skyportsystems.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] Assertion failure taking external snapshot with virtio drive + iothread
Date: Wed, 22 Mar 2017 17:19:24 +0800 [thread overview]
Message-ID: <20170322091000.GA25375@lemon.lan> (raw)
In-Reply-To: <CAO_EM_kLdLfQQjEZ4pzptStGearFNzV0gSYkER4PwKAdge=oCA@mail.gmail.com>
On Tue, 03/21 06:05, Ed Swierk wrote:
> On Tue, Mar 21, 2017 at 5:50 AM, Fam Zheng <famz@redhat.com> wrote:
> > On Tue, 03/21 05:20, Ed Swierk wrote:
> >> On Mon, Mar 20, 2017 at 10:26 PM, Fam Zheng <famz@redhat.com> wrote:
> >> > On Fri, 03/17 09:55, Ed Swierk wrote:
> >> >> I'm running into the same problem taking an external snapshot with a
> >> >> virtio-blk drive with iothread, so it's not specific to virtio-scsi.
> >> >> Run a Linux guest on qemu master
> >> >>
> >> >> qemu-system-x86_64 -nographic -enable-kvm -monitor
> >> >> telnet:0.0.0.0:1234,server,nowait -m 1024 -object
> >> >> iothread,id=iothread1 -drive file=/x/drive.qcow2,if=none,id=drive0
> >> >> -device virtio-blk-pci,iothread=iothread1,drive=drive0
> >> >>
> >> >> Then in the monitor
> >> >>
> >> >> snapshot_blkdev drive0 /x/snap1.qcow2
> >> >>
> >> >> qemu bombs with
> >> >>
> >> >> qemu-system-x86_64: /x/qemu/include/block/aio.h:457:
> >> >> aio_enable_external: Assertion `ctx->external_disable_cnt > 0' failed.
> >> >>
> >> >> whereas without the iothread the assertion failure does not occur.
> >> >
> >> >
> >> > Can you test this one?
> >> >
> >> > ---
> >> >
> >> >
> >> > diff --git a/blockdev.c b/blockdev.c
> >> > index c5b2c2c..4c217d5 100644
> >> > --- a/blockdev.c
> >> > +++ b/blockdev.c
> >> > @@ -1772,6 +1772,8 @@ static void external_snapshot_prepare(BlkActionState *common,
> >> > return;
> >> > }
> >> >
> >> > + bdrv_set_aio_context(state->new_bs, state->aio_context);
> >> > +
> >> > /* This removes our old bs and adds the new bs. This is an operation that
> >> > * can fail, so we need to do it in .prepare; undoing it for abort is
> >> > * always possible. */
> >> > @@ -1789,8 +1791,6 @@ static void external_snapshot_commit(BlkActionState *common)
> >> > ExternalSnapshotState *state =
> >> > DO_UPCAST(ExternalSnapshotState, common, common);
> >> >
> >> > - bdrv_set_aio_context(state->new_bs, state->aio_context);
> >> > -
> >> > /* We don't need (or want) to use the transactional
> >> > * bdrv_reopen_multiple() across all the entries at once, because we
> >> > * don't want to abort all of them if one of them fails the reopen */
> >>
> >> With this change, a different assertion fails on running snapshot_blkdev:
> >>
> >> qemu-system-x86_64: /x/qemu/block/io.c:164: bdrv_drain_recurse:
> >> Assertion `qemu_get_current_aio_context() == qemu_get_aio_context()'
> >> failed.
>
> Actually running snapshot_blkdev command in the text monitor doesn't
> trigger this assertion (I mixed up my notes). Instead it's triggered
> by the following sequence in qmp-shell:
>
> (QEMU) blockdev-snapshot-sync device=drive0 format=qcow2
> snapshot-file=/x/snap1.qcow2
> {"return": {}}
> (QEMU) block-commit device=drive0
> {"return": {}}
> (QEMU) block-job-complete device=drive0
> {"return": {}}
>
> > Is there a backtrace?
>
> #0 0x00007ffff3757067 in raise () from /lib/x86_64-linux-gnu/libc.so.6
> #1 0x00007ffff3758448 in abort () from /lib/x86_64-linux-gnu/libc.so.6
> #2 0x00007ffff3750266 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #3 0x00007ffff3750312 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
> #4 0x0000555555b4b0bb in bdrv_drain_recurse
> (bs=bs@entry=0x555557bd6010) at /x/qemu/block/io.c:164
> #5 0x0000555555b4b7ad in bdrv_drained_begin (bs=0x555557bd6010) at
> /x/qemu/block/io.c:231
> #6 0x0000555555b4b802 in bdrv_parent_drained_begin
> (bs=0x5555568c1a00) at /x/qemu/block/io.c:53
> #7 bdrv_drained_begin (bs=bs@entry=0x5555568c1a00) at /x/qemu/block/io.c:228
> #8 0x0000555555b4be1e in bdrv_co_drain_bh_cb (opaque=0x7fff9aaece40)
> at /x/qemu/block/io.c:190
> #9 0x0000555555bb431e in aio_bh_call (bh=0x55555750e5f0) at
> /x/qemu/util/async.c:90
> #10 aio_bh_poll (ctx=ctx@entry=0x555556718090) at /x/qemu/util/async.c:118
> #11 0x0000555555bb72eb in aio_poll (ctx=0x555556718090,
> blocking=blocking@entry=true) at /x/qemu/util/aio-posix.c:682
> #12 0x00005555559443ce in iothread_run (opaque=0x555556717b80) at
> /x/qemu/iothread.c:59
> #13 0x00007ffff3ad50a4 in start_thread () from
> /lib/x86_64-linux-gnu/libpthread.so.0
> #14 0x00007ffff380a87d in clone () from /lib/x86_64-linux-gnu/libc.so.6
Hmm, looks like a separate bug to me. In addition please apply this (the
assertion here is correct I think, but all callers are not audited yet):
diff --git a/block.c b/block.c
index 6e906ec..447d908 100644
--- a/block.c
+++ b/block.c
@@ -1737,6 +1737,9 @@ static void bdrv_replace_child_noperm(BdrvChild *child,
{
BlockDriverState *old_bs = child->bs;
+ if (old_bs && new_bs) {
+ assert(bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs));
+ }
if (old_bs) {
if (old_bs->quiesce_counter && child->role->drained_end) {
child->role->drained_end(child);
diff --git a/block/mirror.c b/block/mirror.c
index ca4baa5..a23ca9e 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1147,6 +1147,7 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
return;
}
mirror_top_bs->total_sectors = bs->total_sectors;
+ bdrv_set_aio_context(mirror_top_bs, bdrv_get_aio_context(bs));
/* bdrv_append takes ownership of the mirror_top_bs reference, need to keep
* it alive until block_job_create() even if bs has no parent. */
next prev parent reply other threads:[~2017-03-22 9:19 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-03-17 16:55 [Qemu-devel] Assertion failure taking external snapshot with virtio drive + iothread Ed Swierk
2017-03-17 17:11 ` Paolo Bonzini
2017-03-17 17:15 ` Paolo Bonzini
2017-03-17 17:32 ` Ed Swierk
2017-03-17 18:10 ` Paolo Bonzini
2017-03-17 19:27 ` Paolo Bonzini
2017-03-20 21:54 ` Ed Swierk
2017-03-21 1:48 ` Ed Swierk
2017-03-21 5:26 ` Fam Zheng
2017-03-21 12:20 ` Ed Swierk
2017-03-21 12:50 ` Fam Zheng
2017-03-21 13:05 ` Ed Swierk
2017-03-22 9:19 ` Fam Zheng [this message]
2017-03-22 17:36 ` Ed Swierk
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170322091000.GA25375@lemon.lan \
--to=famz@redhat.com \
--cc=eswierk@skyportsystems.com \
--cc=kwolf@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.