qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Stefan Hajnoczi <stefanha@redhat.com>
To: qemu-devel@nongnu.org
Cc: "Klaus Jensen" <its@irrelevant.dk>,
	kwolf@redhat.com, "Cleber Rosa" <crosa@redhat.com>,
	"John Snow" <jsnow@redhat.com>, "Hanna Reitz" <hreitz@redhat.com>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Keith Busch" <kbusch@kernel.org>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Peter Xu" <peterx@redhat.com>,
	qemu-block@nongnu.org, "Stefan Hajnoczi" <stefanha@redhat.com>,
	"Vladimir Sementsov-Ogievskiy" <vsementsov@yandex-team.ru>,
	"David Hildenbrand" <david@redhat.com>,
	"Fam Zheng" <fam@euphon.net>
Subject: [PATCH v3 2/5] test-bdrv-drain: avoid race with BH in IOThread drain test
Date: Tue, 12 Sep 2023 19:10:34 -0400	[thread overview]
Message-ID: <20230912231037.826804-3-stefanha@redhat.com> (raw)
In-Reply-To: <20230912231037.826804-1-stefanha@redhat.com>

This patch fixes a race condition in test-bdrv-drain that is difficult
to reproduce. test-bdrv-drain sometimes fails without an error message
on the block pull request sent by Kevin Wolf on Sep 4, 2023. I was able
to reproduce it locally and found that "block-backend: process I/O in
the current AioContext" (in this patch series) is the first commit where
it reproduces.

I do not know why "block-backend: process I/O in the current AioContext"
exposes this bug. It might be related to the fact that the test's preadv
request runs in the main thread instead of IOThread a after my commit.
That might simply change the timing of the test.

Now on to the race condition in test-bdrv-drain. The main thread
schedules a BH in IOThread a and then drains the BDS:

  aio_bh_schedule_oneshot(ctx_a, test_iothread_main_thread_bh, &data);

  /* The request is running on the IOThread a. Draining its block device
   * will make sure that it has completed as far as the BDS is concerned,
   * but the drain in this thread can continue immediately after
   * bdrv_dec_in_flight() and aio_ret might be assigned only slightly
   * later. */
  do_drain_begin(drain_type, bs);

If the BH completes before do_drain_begin() then there is nothing to
worry about.

If the BH invokes bdrv_flush() before do_drain_begin(), then
do_drain_begin() waits for it to complete.

The problematic case is when do_drain_begin() runs before the BH enters
bdrv_flush(). Then do_drain_begin() misses the BH and the drain
mechanism has failed in quiescing I/O.

Fix this by incrementing the in_flight counter so that do_drain_begin()
waits for test_iothread_main_thread_bh().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 tests/unit/test-bdrv-drain.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/tests/unit/test-bdrv-drain.c b/tests/unit/test-bdrv-drain.c
index ccc453c29e..67a79aa3f0 100644
--- a/tests/unit/test-bdrv-drain.c
+++ b/tests/unit/test-bdrv-drain.c
@@ -512,6 +512,7 @@ static void test_iothread_main_thread_bh(void *opaque)
      * executed during drain, otherwise this would deadlock. */
     aio_context_acquire(bdrv_get_aio_context(data->bs));
     bdrv_flush(data->bs);
+    bdrv_dec_in_flight(data->bs); /* incremented by test_iothread_common() */
     aio_context_release(bdrv_get_aio_context(data->bs));
 }
 
@@ -583,6 +584,13 @@ static void test_iothread_common(enum drain_type drain_type, int drain_thread)
             aio_context_acquire(ctx_a);
         }
 
+        /*
+         * Increment in_flight so that do_drain_begin() waits for
+         * test_iothread_main_thread_bh(). This prevents the race between
+         * test_iothread_main_thread_bh() in IOThread a and do_drain_begin() in
+         * this thread. test_iothread_main_thread_bh() decrements in_flight.
+         */
+        bdrv_inc_in_flight(bs);
         aio_bh_schedule_oneshot(ctx_a, test_iothread_main_thread_bh, &data);
 
         /* The request is running on the IOThread a. Draining its block device
-- 
2.41.0



  parent reply	other threads:[~2023-09-12 23:11 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-12 23:10 [PATCH v3 0/5] block-backend: process I/O in the current AioContext Stefan Hajnoczi
2023-09-12 23:10 ` [PATCH v3 1/5] block: remove AIOCBInfo->get_aio_context() Stefan Hajnoczi
2023-09-13 15:59   ` Eric Blake
2023-09-14  6:44   ` Klaus Jensen
2023-09-12 23:10 ` Stefan Hajnoczi [this message]
2023-09-13 16:08   ` [PATCH v3 2/5] test-bdrv-drain: avoid race with BH in IOThread drain test Eric Blake
2023-09-14 12:01     ` Stefan Hajnoczi
2023-09-12 23:10 ` [PATCH v3 3/5] block-backend: process I/O in the current AioContext Stefan Hajnoczi
2023-09-13 16:20   ` Eric Blake
2023-09-12 23:10 ` [PATCH v3 4/5] block-backend: process zoned requests " Stefan Hajnoczi
2023-09-13 16:22   ` Eric Blake
2023-09-12 23:10 ` [PATCH v3 5/5] block-coroutine-wrapper: use qemu_get_current_aio_context() Stefan Hajnoczi
2023-09-13 16:23   ` Eric Blake
2023-09-15 14:39 ` [PATCH v3 0/5] block-backend: process I/O in the current AioContext Kevin Wolf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230912231037.826804-3-stefanha@redhat.com \
    --to=stefanha@redhat.com \
    --cc=crosa@redhat.com \
    --cc=david@redhat.com \
    --cc=fam@euphon.net \
    --cc=hreitz@redhat.com \
    --cc=its@irrelevant.dk \
    --cc=jsnow@redhat.com \
    --cc=kbusch@kernel.org \
    --cc=kwolf@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=philmd@linaro.org \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=vsementsov@yandex-team.ru \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).