qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PULL for-2.11-rc1 v2 0/2] Block patches
@ 2017-11-08 19:20 Stefan Hajnoczi
  2017-11-08 19:20 ` [Qemu-devel] [PULL for-2.11-rc1 v2 1/2] tests-aio-multithread: fix /aio/multi/schedule race condition Stefan Hajnoczi
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Stefan Hajnoczi @ 2017-11-08 19:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, Stefan Hajnoczi

The following changes since commit b0fbe46ad82982b289a44ee2495b59b0bad8a842:

  Update version for v2.11.0-rc0 release (2017-11-07 16:05:28 +0000)

are available in the git repository at:

  git://github.com/stefanha/qemu.git tags/block-pull-request

for you to fetch changes up to ef6dada8b44e1e7c4bec5c1115903af9af415b50:

  util/async: use atomic_mb_set in qemu_bh_cancel (2017-11-08 19:09:15 +0000)

----------------------------------------------------------------
Pull request

v2:
 * v1 emails 2/3 and 3/3 weren't sent due to an email failure
 * Included Sergio's updated wording in the commit description

----------------------------------------------------------------

Sergio Lopez (1):
  util/async: use atomic_mb_set in qemu_bh_cancel

Stefan Hajnoczi (1):
  tests-aio-multithread: fix /aio/multi/schedule race condition

 tests/test-aio-multithread.c | 5 ++---
 util/async.c                 | 2 +-
 2 files changed, 3 insertions(+), 4 deletions(-)

-- 
2.13.6

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Qemu-devel] [PULL for-2.11-rc1 v2 1/2] tests-aio-multithread: fix /aio/multi/schedule race condition
  2017-11-08 19:20 [Qemu-devel] [PULL for-2.11-rc1 v2 0/2] Block patches Stefan Hajnoczi
@ 2017-11-08 19:20 ` Stefan Hajnoczi
  2017-11-08 19:20 ` [Qemu-devel] [PULL for-2.11-rc1 v2 2/2] util/async: use atomic_mb_set in qemu_bh_cancel Stefan Hajnoczi
  2017-11-13 10:04 ` [Qemu-devel] [PULL for-2.11-rc1 v2 0/2] Block patches Peter Maydell
  2 siblings, 0 replies; 4+ messages in thread
From: Stefan Hajnoczi @ 2017-11-08 19:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, Stefan Hajnoczi

test_multi_co_schedule_entry() set to_schedule[id] in the final loop
iteration before terminating the coroutine.  There is a race condition
where the main thread attempts to enter the terminating or terminated
coroutine when signalling coroutines to stop:

  atomic_mb_set(&now_stopping, true);
  for (i = 0; i < NUM_CONTEXTS; i++) {
      ctx_run(i, finish_cb, NULL);  <--- enters dead coroutine!
      to_schedule[i] = NULL;
  }

Make sure only to set to_schedule[id] if this coroutine really needs to
be scheduled!

Reported-by: "R.Nageswara Sastry" <nasastry@in.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20171106190233.1175-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 tests/test-aio-multithread.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/tests/test-aio-multithread.c b/tests/test-aio-multithread.c
index 549d784915..d396185972 100644
--- a/tests/test-aio-multithread.c
+++ b/tests/test-aio-multithread.c
@@ -144,17 +144,16 @@ static void finish_cb(void *opaque)
 static coroutine_fn void test_multi_co_schedule_entry(void *opaque)
 {
     g_assert(to_schedule[id] == NULL);
-    atomic_mb_set(&to_schedule[id], qemu_coroutine_self());
 
     while (!atomic_mb_read(&now_stopping)) {
         int n;
 
         n = g_test_rand_int_range(0, NUM_CONTEXTS);
         schedule_next(n);
+
+        atomic_mb_set(&to_schedule[id], qemu_coroutine_self());
         qemu_coroutine_yield();
-
         g_assert(to_schedule[id] == NULL);
-        atomic_mb_set(&to_schedule[id], qemu_coroutine_self());
     }
 }
 
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [Qemu-devel] [PULL for-2.11-rc1 v2 2/2] util/async: use atomic_mb_set in qemu_bh_cancel
  2017-11-08 19:20 [Qemu-devel] [PULL for-2.11-rc1 v2 0/2] Block patches Stefan Hajnoczi
  2017-11-08 19:20 ` [Qemu-devel] [PULL for-2.11-rc1 v2 1/2] tests-aio-multithread: fix /aio/multi/schedule race condition Stefan Hajnoczi
@ 2017-11-08 19:20 ` Stefan Hajnoczi
  2017-11-13 10:04 ` [Qemu-devel] [PULL for-2.11-rc1 v2 0/2] Block patches Peter Maydell
  2 siblings, 0 replies; 4+ messages in thread
From: Stefan Hajnoczi @ 2017-11-08 19:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, Sergio Lopez, Stefan Hajnoczi

From: Sergio Lopez <slp@redhat.com>

Commit b7a745d added a qemu_bh_cancel call to the completion function
as an optimization to prevent it from unnecessarily rescheduling itself.

This completion function is scheduled from worker_thread, after setting
the state of a ThreadPoolElement to THREAD_DONE.

This was considered to be safe, as the completion function restarts the
loop just after the call to qemu_bh_cancel. But, as this loop lacks a HW
memory barrier, the read of req->state may actually happen _before_ the
call, seeing it still as THREAD_QUEUED, and ending the completion
function without having processed a pending TPE linked at pool->head:

         worker thread             |            I/O thread
------------------------------------------------------------------------
                                   | speculatively read req->state
req->state = THREAD_DONE;          |
qemu_bh_schedule(p->completion_bh) |
  bh->scheduled = 1;               |
                                   | qemu_bh_cancel(p->completion_bh)
                                   |   bh->scheduled = 0;
                                   | if (req->state == THREAD_DONE)
                                   |   // sees THREAD_QUEUED

The source of the misunderstanding was that qemu_bh_cancel is now being
used by the _consumer_ rather than the producer, and therefore now needs
to have acquire semantics just like e.g. aio_bh_poll.

In some situations, if there are no other independent requests in the
same aio context that could eventually trigger the scheduling of the
completion function, the omitted TPE and all operations pending on it
will get stuck forever.

[Added Sergio's updated wording about the HW memory barrier.
--Stefan]

Signed-off-by: Sergio Lopez <slp@redhat.com>
Message-id: 20171108063447.2842-1-slp@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 util/async.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/util/async.c b/util/async.c
index 355af73ee7..0e1bd8780a 100644
--- a/util/async.c
+++ b/util/async.c
@@ -174,7 +174,7 @@ void qemu_bh_schedule(QEMUBH *bh)
  */
 void qemu_bh_cancel(QEMUBH *bh)
 {
-    bh->scheduled = 0;
+    atomic_mb_set(&bh->scheduled, 0);
 }
 
 /* This func is async.The bottom half will do the delete action at the finial
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [Qemu-devel] [PULL for-2.11-rc1 v2 0/2] Block patches
  2017-11-08 19:20 [Qemu-devel] [PULL for-2.11-rc1 v2 0/2] Block patches Stefan Hajnoczi
  2017-11-08 19:20 ` [Qemu-devel] [PULL for-2.11-rc1 v2 1/2] tests-aio-multithread: fix /aio/multi/schedule race condition Stefan Hajnoczi
  2017-11-08 19:20 ` [Qemu-devel] [PULL for-2.11-rc1 v2 2/2] util/async: use atomic_mb_set in qemu_bh_cancel Stefan Hajnoczi
@ 2017-11-13 10:04 ` Peter Maydell
  2 siblings, 0 replies; 4+ messages in thread
From: Peter Maydell @ 2017-11-13 10:04 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: QEMU Developers

On 8 November 2017 at 19:20, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> The following changes since commit b0fbe46ad82982b289a44ee2495b59b0bad8a842:
>
>   Update version for v2.11.0-rc0 release (2017-11-07 16:05:28 +0000)
>
> are available in the git repository at:
>
>   git://github.com/stefanha/qemu.git tags/block-pull-request
>
> for you to fetch changes up to ef6dada8b44e1e7c4bec5c1115903af9af415b50:
>
>   util/async: use atomic_mb_set in qemu_bh_cancel (2017-11-08 19:09:15 +0000)
>
> ----------------------------------------------------------------
> Pull request
>
> v2:
>  * v1 emails 2/3 and 3/3 weren't sent due to an email failure
>  * Included Sergio's updated wording in the commit description
>
> ----------------------------------------------------------------

Applied, thanks.

-- PMM

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-11-13 10:04 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-11-08 19:20 [Qemu-devel] [PULL for-2.11-rc1 v2 0/2] Block patches Stefan Hajnoczi
2017-11-08 19:20 ` [Qemu-devel] [PULL for-2.11-rc1 v2 1/2] tests-aio-multithread: fix /aio/multi/schedule race condition Stefan Hajnoczi
2017-11-08 19:20 ` [Qemu-devel] [PULL for-2.11-rc1 v2 2/2] util/async: use atomic_mb_set in qemu_bh_cancel Stefan Hajnoczi
2017-11-13 10:04 ` [Qemu-devel] [PULL for-2.11-rc1 v2 0/2] Block patches Peter Maydell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).