qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Fam Zheng <famz@redhat.com>
To: qemu-devel@nongnu.org
Cc: Jeff Cody <jcody@redhat.com>, Kevin Wolf <kwolf@redhat.com>,
	Max Reitz <mreitz@redhat.com>,
	qemu-block@nongnu.org, pbonzini@redhat.com, stefanha@redhat.com
Subject: [Qemu-devel] [PATCH for-2.6 2/3] mirror: Skip BH for mirror_exit if in main loop
Date: Fri, 22 Apr 2016 14:35:06 +0800	[thread overview]
Message-ID: <1461306907-2837-3-git-send-email-famz@redhat.com> (raw)
In-Reply-To: <1461306907-2837-1-git-send-email-famz@redhat.com>

Commit 5a7e7a0ba moved mirror_exit to a BH handler but didn't add any
protection against guest requests that could sneak in before the BH is
dispatched. For example, this could happen (assuming a code base at that
commit):

        main_loop_wait # 1
          os_host_main_loop_wait
            g_main_context_dispatch
              aio_ctx_dispatch
                aio_dispatch
                  ...
                    mirror_run
                      bdrv_drain
    (a)               block_job_defer_to_main_loop
          qemu_iohandler_poll
            virtio_queue_host_notifier_read
              ...
                virtio_submit_multiwrite
    (b)           blk_aio_multiwrite

        main_loop_wait # 2
          <snip>
                aio_dispatch
                  aio_bh_poll
    (c)             mirror_exit

At (a) we know the BDS has no pending request. However, the same
main_loop_wait call is going to dispatch iohandlers (EventNotifier
events), which may lead to a new I/O from guest. So the invariant is
already broken at (c). Data loss.

Commit f3926945c8 made iohandler to use aio API.  The order of
virtio_queue_host_notifier_read and block_job_defer_to_main_loop within
a main_loop_wait becomes unpredictable, and even worse, if the host
notifier event arrives at the next main_loop_wait call, the
unpredictable order between mirror_exit and
virtio_queue_host_notifier_read is also a trouble. As shown below, this
commit made the bug easier to trigger:

    - Bug case 1:

        main_loop_wait # 1
          os_host_main_loop_wait
            g_main_context_dispatch
              aio_ctx_dispatch (qemu_aio_context)
                ...
                  mirror_run
                    bdrv_drain
    (a)             block_job_defer_to_main_loop
              aio_ctx_dispatch (iohandler_ctx)
                virtio_queue_host_notifier_read
                  ...
                    virtio_submit_multiwrite
    (b)               blk_aio_multiwrite

        main_loop_wait # 2
          ...
                aio_dispatch
                  aio_bh_poll
    (c)             mirror_exit

    - Bug case 2:

        main_loop_wait # 1
          os_host_main_loop_wait
            g_main_context_dispatch
              aio_ctx_dispatch (qemu_aio_context)
                ...
                  mirror_run
                    bdrv_drain
    (a)             block_job_defer_to_main_loop

        main_loop_wait # 2
          ...
            aio_ctx_dispatch (iohandler_ctx)
              virtio_queue_host_notifier_read
                ...
                  virtio_submit_multiwrite
    (b)             blk_aio_multiwrite
              aio_dispatch
                aio_bh_poll
    (c)           mirror_exit

In both cases, (b) breaks the invariant wanted by (a) and (c).

Unfortunately, the request loss has been silent, until 3f09bfbc7be added
an assertion at (c) to check the invariant in
bdrv_replace_in_backing_chain. Max reproduced an assertion failure after
that commit, by doing active committing while the guest is running
bonnie++.

2.5 added bdrv_drained_begin at (a) to protect the dataplane case from
similar problems, but we never realize the main loop bug until now.

As a bandage, this patch undoes the change of 5a7e7a0bad17 in
non-dataplane case, and calls mirror_replace directly in mirror_run.

The next step after 2.6 is to complete bdrv_drained_begin complete
(including both fixing the dispatching code in the main loop and marking
host event notifiers etc. as external) and then this special casing is
not necessary.

Launchpad Bug: 1570134

Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/mirror.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/block/mirror.c b/block/mirror.c
index 6c3fe43..bc77773 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -726,6 +726,13 @@ immediate_exit:
     /* Before we switch to target in mirror_exit, make sure data doesn't
      * change. */
     bdrv_drained_begin(s->common.bs);
+    if (qemu_get_aio_context() == bdrv_get_aio_context(bs)) {
+        /* FIXME: if we are in the main loop, the iohandler doesn't honor
+         * bdrv_drained_begin yet, and guest requests can sneak in before BH
+         * callback runs. Do the replace now to avoid it. */
+        mirror_replace(s, &data->ret);
+        data->should_replace = false;
+    }
     block_job_defer_to_main_loop(&s->common, mirror_exit, data);
 }
 
-- 
2.8.0

  parent reply	other threads:[~2016-04-22  6:35 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-22  6:35 [Qemu-devel] [PATCH for-2.6 0/3] block: Fix assertion failure at mirror exit Fam Zheng
2016-04-22  6:35 ` [Qemu-devel] [PATCH for-2.6 1/3] mirror: Extract mirror_replace Fam Zheng
2016-04-22  6:35 ` Fam Zheng [this message]
2016-04-22  9:37   ` [Qemu-devel] [PATCH for-2.6 2/3] mirror: Skip BH for mirror_exit if in main loop Fam Zheng
2016-04-22  6:35 ` [Qemu-devel] [PATCH for-2.6 3/3] block: Update comment of bdrv_drained_begin Fam Zheng
2016-04-22  6:46 ` [Qemu-devel] [PATCH for-2.6 0/3] block: Fix assertion failure at mirror exit Fam Zheng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1461306907-2837-3-git-send-email-famz@redhat.com \
    --to=famz@redhat.com \
    --cc=jcody@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).