* [Qemu-devel] [PATCH v2 for-2.11 0/4] Fix segfault in blockjob race condition
@ 2017-11-21 2:23 Jeff Cody
2017-11-21 2:23 ` [Qemu-devel] [PATCH v2 for-2.11 1/4] blockjob: do not allow coroutine double entry or entry-after-completion Jeff Cody
` (3 more replies)
0 siblings, 4 replies; 13+ messages in thread
From: Jeff Cody @ 2017-11-21 2:23 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-block, mreitz, stefanha, famz, pbonzini, kwolf
Changes from v1 -> v2:
Patch 1: Updated docs in blockjob_int.h (Thanks Stefan)
Patch 2/3: Squashed, and used const char * to hold the __func__ name of
the original scheduler (Thanks Paolo)
Patch 4: Unchanged.
Patch 5: Dropped the qcow format from the test; it was so slow that the
test timed out, and it added no new dimension to the test.
# git-backport-diff -r qemu/master.. -u github/bz1508708
001/4:[0003] [FC] 'blockjob: do not allow coroutine double entry or entry-after-completion'
002/4:[down] 'coroutine: abort if we try to schedule or enter a pending coroutine'
003/4:[----] [--] 'qemu-iotests: add option in common.qemu for mismatch only'
004/4:[0002] [FC] 'qemu-iotest: add test for blockjob coroutine race condition'
This series fixes a race condition segfault when using iothreads with
blockjobs.
The qemu iotest in this series is a reproducer, as is the reproducer
script attached in this bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1508708
There are two additional patches that try to catch this sort of scenario
with an abort, before a segfault or memory corruption occurs.
Jeff Cody (4):
blockjob: do not allow coroutine double entry or
entry-after-completion
coroutine: abort if we try to schedule or enter a pending coroutine
qemu-iotests: add option in common.qemu for mismatch only
qemu-iotest: add test for blockjob coroutine race condition
blockjob.c | 9 ++--
include/block/blockjob_int.h | 3 +-
include/qemu/coroutine_int.h | 6 +++
tests/qemu-iotests/200 | 99 ++++++++++++++++++++++++++++++++++++++++++
tests/qemu-iotests/200.out | 14 ++++++
tests/qemu-iotests/common.qemu | 8 +++-
tests/qemu-iotests/group | 1 +
util/async.c | 11 +++++
util/qemu-coroutine-sleep.c | 11 +++++
util/qemu-coroutine.c | 11 +++++
10 files changed, 168 insertions(+), 5 deletions(-)
create mode 100755 tests/qemu-iotests/200
create mode 100644 tests/qemu-iotests/200.out
--
2.9.5
* [Qemu-devel] [PATCH v2 for-2.11 1/4] blockjob: do not allow coroutine double entry or entry-after-completion
2017-11-21 2:23 [Qemu-devel] [PATCH v2 for-2.11 0/4] Fix segfault in blockjob race condition Jeff Cody
@ 2017-11-21 2:23 ` Jeff Cody
2017-11-21 10:49 ` Stefan Hajnoczi
2017-11-21 2:23 ` [Qemu-devel] [PATCH v2 for-2.11 2/4] coroutine: abort if we try to schedule or enter a pending coroutine Jeff Cody
` (2 subsequent siblings)
3 siblings, 1 reply; 13+ messages in thread
From: Jeff Cody @ 2017-11-21 2:23 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-block, mreitz, stefanha, famz, pbonzini, kwolf
When block_job_sleep_ns() is called, the coroutine is scheduled for
future execution. If we allow the job to be re-entered prior to the
scheduled time, we introduce a race condition in which a coroutine can be
entered recursively, or even entered after the coroutine is deleted.
The job->busy flag indicates that a blockjob's coroutine is busy
executing. block_job_enter() honors the busy flag, and will not enter
a coroutine if it is set. If we put a job to sleep, we need to leave
the busy flag set, so that subsequent calls to block_job_enter()
are prevented.
This fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1508708
Also, in block_job_start(), set the relevant job flags (.busy, .paused)
before creating the coroutine, not just before executing it.
Signed-off-by: Jeff Cody <jcody@redhat.com>
---
blockjob.c | 9 ++++++---
include/block/blockjob_int.h | 3 ++-
2 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/blockjob.c b/blockjob.c
index 3a0c491..e181295 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -291,10 +291,10 @@ void block_job_start(BlockJob *job)
{
assert(job && !block_job_started(job) && job->paused &&
job->driver && job->driver->start);
- job->co = qemu_coroutine_create(block_job_co_entry, job);
job->pause_count--;
job->busy = true;
job->paused = false;
+ job->co = qemu_coroutine_create(block_job_co_entry, job);
bdrv_coroutine_enter(blk_bs(job->blk), job->co);
}
@@ -797,11 +797,14 @@ void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns)
return;
}
- job->busy = false;
+ /* We need to leave job->busy set here, because when we have
+ * put a coroutine to 'sleep', we have scheduled it to run in
+ * the future. We cannot enter that same coroutine again before
+ * it wakes and runs, otherwise we risk double-entry or entry after
+ * completion. */
if (!block_job_should_pause(job)) {
co_aio_sleep_ns(blk_get_aio_context(job->blk), type, ns);
}
- job->busy = true;
block_job_pause_point(job);
}
diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
index f13ad05..43f3be2 100644
--- a/include/block/blockjob_int.h
+++ b/include/block/blockjob_int.h
@@ -143,7 +143,8 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
* @ns: How many nanoseconds to stop for.
*
* Put the job to sleep (assuming that it wasn't canceled) for @ns
- * nanoseconds. Canceling the job will interrupt the wait immediately.
+ * nanoseconds. Canceling the job will not interrupt the wait, so the
+ * cancel will not be processed until the coroutine wakes up.
*/
void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns);
--
2.9.5
* [Qemu-devel] [PATCH v2 for-2.11 2/4] coroutine: abort if we try to schedule or enter a pending coroutine
2017-11-21 2:23 [Qemu-devel] [PATCH v2 for-2.11 0/4] Fix segfault in blockjob race condition Jeff Cody
2017-11-21 2:23 ` [Qemu-devel] [PATCH v2 for-2.11 1/4] blockjob: do not allow coroutine double entry or entry-after-completion Jeff Cody
@ 2017-11-21 2:23 ` Jeff Cody
2017-11-21 10:59 ` Stefan Hajnoczi
` (2 more replies)
2017-11-21 2:23 ` [Qemu-devel] [PATCH v2 for-2.11 3/4] qemu-iotests: add option in common.qemu for mismatch only Jeff Cody
2017-11-21 2:23 ` [Qemu-devel] [PATCH v2 for-2.11 4/4] qemu-iotest: add test for blockjob coroutine race condition Jeff Cody
3 siblings, 3 replies; 13+ messages in thread
From: Jeff Cody @ 2017-11-21 2:23 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-block, mreitz, stefanha, famz, pbonzini, kwolf
The previous patch fixed a race condition in which coroutines could be
executed twice, or executed after coroutine deletion.
We can detect common scenarios when this happens, and print an error
message and abort before we corrupt memory / data, or segfault.
This patch will abort if an attempt to enter a coroutine is made while
it is currently pending execution, either in a specific AioContext bh,
or pending execution via a timer. It will also abort if a coroutine
is scheduled, before a prior scheduled run has occured.
We cannot rely on the existing co->caller check for recursive re-entry
to catch this, as the coroutine may run and exit with
COROUTINE_TERMINATE before the scheduled coroutine executes.
(This is the scenario that was occuring and fixed in the previous
patch).
Signed-off-by: Jeff Cody <jcody@redhat.com>
---
include/qemu/coroutine_int.h | 6 ++++++
util/async.c | 11 +++++++++++
util/qemu-coroutine-sleep.c | 11 +++++++++++
util/qemu-coroutine.c | 11 +++++++++++
4 files changed, 39 insertions(+)
diff --git a/include/qemu/coroutine_int.h b/include/qemu/coroutine_int.h
index cb98892..56e4c48 100644
--- a/include/qemu/coroutine_int.h
+++ b/include/qemu/coroutine_int.h
@@ -53,6 +53,12 @@ struct Coroutine {
/* Only used when the coroutine has yielded. */
AioContext *ctx;
+
+ /* Used to catch and abort on illegal co-routine entry.
+ * Will contain the name of the function that had first
+ * scheduled the coroutine. */
+ const char *scheduled;
+
QSIMPLEQ_ENTRY(Coroutine) co_queue_next;
QSLIST_ENTRY(Coroutine) co_scheduled_next;
};
diff --git a/util/async.c b/util/async.c
index 0e1bd87..49174b3 100644
--- a/util/async.c
+++ b/util/async.c
@@ -388,6 +388,7 @@ static void co_schedule_bh_cb(void *opaque)
QSLIST_REMOVE_HEAD(&straight, co_scheduled_next);
trace_aio_co_schedule_bh_cb(ctx, co);
aio_context_acquire(ctx);
+ atomic_set(&co->scheduled, NULL);
qemu_coroutine_enter(co);
aio_context_release(ctx);
}
@@ -438,6 +439,16 @@ fail:
void aio_co_schedule(AioContext *ctx, Coroutine *co)
{
trace_aio_co_schedule(ctx, co);
+ const char *scheduled = atomic_read(&co->scheduled);
+
+ if (scheduled) {
+ fprintf(stderr,
+ "%s: Co-routine was already scheduled in '%s'\n",
+ __func__, scheduled);
+ abort();
+ }
+ atomic_set(&co->scheduled, __func__);
+
QSLIST_INSERT_HEAD_ATOMIC(&ctx->scheduled_coroutines,
co, co_scheduled_next);
qemu_bh_schedule(ctx->co_schedule_bh);
diff --git a/util/qemu-coroutine-sleep.c b/util/qemu-coroutine-sleep.c
index 9c56550..38dc4c8 100644
--- a/util/qemu-coroutine-sleep.c
+++ b/util/qemu-coroutine-sleep.c
@@ -13,6 +13,7 @@
#include "qemu/osdep.h"
#include "qemu/coroutine.h"
+#include "qemu/coroutine_int.h"
#include "qemu/timer.h"
#include "block/aio.h"
@@ -25,6 +26,7 @@ static void co_sleep_cb(void *opaque)
{
CoSleepCB *sleep_cb = opaque;
+ atomic_set(&sleep_cb->co->scheduled, NULL);
aio_co_wake(sleep_cb->co);
}
@@ -34,6 +36,15 @@ void coroutine_fn co_aio_sleep_ns(AioContext *ctx, QEMUClockType type,
CoSleepCB sleep_cb = {
.co = qemu_coroutine_self(),
};
+ const char *scheduled = atomic_read(&sleep_cb.co->scheduled);
+
+ if (scheduled) {
+ fprintf(stderr,
+ "%s: Co-routine was already scheduled in '%s'\n",
+ __func__, scheduled);
+ abort();
+ }
+ atomic_set(&sleep_cb.co->scheduled, __func__);
sleep_cb.ts = aio_timer_new(ctx, type, SCALE_NS, co_sleep_cb, &sleep_cb);
timer_mod(sleep_cb.ts, qemu_clock_get_ns(type) + ns);
qemu_coroutine_yield();
diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
index d6095c1..fbfd0ad 100644
--- a/util/qemu-coroutine.c
+++ b/util/qemu-coroutine.c
@@ -106,9 +106,20 @@ void qemu_aio_coroutine_enter(AioContext *ctx, Coroutine *co)
{
Coroutine *self = qemu_coroutine_self();
CoroutineAction ret;
+ const char *scheduled = atomic_read(&co->scheduled);
trace_qemu_aio_coroutine_enter(ctx, self, co, co->entry_arg);
+ /* if the Coroutine has already been scheduled, entering it again will
+ * cause us to enter it twice, potentially even after the coroutine has
+ * been deleted */
+ if (scheduled) {
+ fprintf(stderr,
+ "%s: Co-routine was already scheduled in '%s'\n",
+ __func__, scheduled);
+ abort();
+ }
+
if (co->caller) {
fprintf(stderr, "Co-routine re-entered recursively\n");
abort();
--
2.9.5
* [Qemu-devel] [PATCH v2 for-2.11 3/4] qemu-iotests: add option in common.qemu for mismatch only
2017-11-21 2:23 [Qemu-devel] [PATCH v2 for-2.11 0/4] Fix segfault in blockjob race condition Jeff Cody
2017-11-21 2:23 ` [Qemu-devel] [PATCH v2 for-2.11 1/4] blockjob: do not allow coroutine double entry or entry-after-completion Jeff Cody
2017-11-21 2:23 ` [Qemu-devel] [PATCH v2 for-2.11 2/4] coroutine: abort if we try to schedule or enter a pending coroutine Jeff Cody
@ 2017-11-21 2:23 ` Jeff Cody
2017-11-21 2:23 ` [Qemu-devel] [PATCH v2 for-2.11 4/4] qemu-iotest: add test for blockjob coroutine race condition Jeff Cody
3 siblings, 0 replies; 13+ messages in thread
From: Jeff Cody @ 2017-11-21 2:23 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-block, mreitz, stefanha, famz, pbonzini, kwolf
Add an option to echo the response to a QMP / HMP command only on a
mismatch. Useful for ignoring all normal responses while still catching
things like segfaults.
Signed-off-by: Jeff Cody <jcody@redhat.com>
---
tests/qemu-iotests/common.qemu | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/tests/qemu-iotests/common.qemu b/tests/qemu-iotests/common.qemu
index 7b3052d..85f66b8 100644
--- a/tests/qemu-iotests/common.qemu
+++ b/tests/qemu-iotests/common.qemu
@@ -50,6 +50,8 @@ _in_fd=4
#
# If $silent is set to anything but an empty string, then
# response is not echoed out.
+# If $mismatch_only is set, only non-matching responses will
+# be echoed.
function _timed_wait_for()
{
local h=${1}
@@ -58,14 +60,18 @@ function _timed_wait_for()
QEMU_STATUS[$h]=0
while IFS= read -t ${QEMU_COMM_TIMEOUT} resp <&${QEMU_OUT[$h]}
do
- if [ -z "${silent}" ]; then
+ if [ -z "${silent}" ] && [ -z "${mismatch_only}" ]; then
echo "${resp}" | _filter_testdir | _filter_qemu \
| _filter_qemu_io | _filter_qmp | _filter_hmp
fi
grep -q "${*}" < <(echo "${resp}")
if [ $? -eq 0 ]; then
return
+ elif [ -z "${silent}" ] && [ -n "${mismatch_only}" ]; then
+ echo "${resp}" | _filter_testdir | _filter_qemu \
+ | _filter_qemu_io | _filter_qmp | _filter_hmp
fi
+
done
QEMU_STATUS[$h]=-1
if [ -z "${qemu_error_no_exit}" ]; then
--
2.9.5
* [Qemu-devel] [PATCH v2 for-2.11 4/4] qemu-iotest: add test for blockjob coroutine race condition
2017-11-21 2:23 [Qemu-devel] [PATCH v2 for-2.11 0/4] Fix segfault in blockjob race condition Jeff Cody
` (2 preceding siblings ...)
2017-11-21 2:23 ` [Qemu-devel] [PATCH v2 for-2.11 3/4] qemu-iotests: add option in common.qemu for mismatch only Jeff Cody
@ 2017-11-21 2:23 ` Jeff Cody
3 siblings, 0 replies; 13+ messages in thread
From: Jeff Cody @ 2017-11-21 2:23 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-block, mreitz, stefanha, famz, pbonzini, kwolf
Signed-off-by: Jeff Cody <jcody@redhat.com>
---
tests/qemu-iotests/200 | 99 ++++++++++++++++++++++++++++++++++++++++++++++
tests/qemu-iotests/200.out | 14 +++++++
tests/qemu-iotests/group | 1 +
3 files changed, 114 insertions(+)
create mode 100755 tests/qemu-iotests/200
create mode 100644 tests/qemu-iotests/200.out
diff --git a/tests/qemu-iotests/200 b/tests/qemu-iotests/200
new file mode 100755
index 0000000..d8787dd
--- /dev/null
+++ b/tests/qemu-iotests/200
@@ -0,0 +1,99 @@
+#!/bin/bash
+#
+# Block job co-routine race condition test.
+#
+# See: https://bugzilla.redhat.com/show_bug.cgi?id=1508708
+#
+# Copyright (C) 2017 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=jcody@redhat.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+here=`pwd`
+status=1 # failure is the default!
+
+_cleanup()
+{
+ _cleanup_qemu
+ rm -f "${TEST_IMG}" "${BACKING_IMG}"
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+. ./common.qemu
+
+_supported_fmt qcow2 qed
+_supported_proto file
+_supported_os Linux
+
+BACKING_IMG="${TEST_DIR}/backing.img"
+TEST_IMG="${TEST_DIR}/test.img"
+
+${QEMU_IMG} create -f $IMGFMT "${BACKING_IMG}" 512M | _filter_img_create
+${QEMU_IMG} create -f $IMGFMT -F $IMGFMT "${TEST_IMG}" -b "${BACKING_IMG}" 512M | _filter_img_create
+
+${QEMU_IO} -c "write -P 0xa5 512 300M" "${BACKING_IMG}" | _filter_qemu_io
+
+echo
+echo === Starting QEMU VM ===
+echo
+qemu_comm_method="qmp"
+_launch_qemu -device pci-bridge,id=bridge1,chassis_nr=1,bus=pci.0 \
+ -object iothread,id=iothread0 \
+ -device virtio-scsi-pci,bus=bridge1,addr=0x1f,id=scsi0,iothread=iothread0 \
+ -drive file="${TEST_IMG}",media=disk,if=none,cache=none,id=drive_sysdisk,aio=native,format=$IMGFMT \
+ -device scsi-hd,drive=drive_sysdisk,bus=scsi0.0,id=sysdisk,bootindex=0
+h1=$QEMU_HANDLE
+
+_send_qemu_cmd $h1 "{ 'execute': 'qmp_capabilities' }" 'return'
+
+echo
+echo === Sending stream/cancel, checking for SIGSEGV only ===
+echo
+for (( i=1;i<500;i++ ))
+do
+ mismatch_only='y' qemu_error_no_exit='n' _send_qemu_cmd $h1 \
+ "{
+ 'execute': 'block-stream',
+ 'arguments': {
+ 'device': 'drive_sysdisk',
+ 'speed': 10000000,
+ 'on-error': 'report',
+ 'job-id': 'job-$i'
+ }
+ }
+ {
+ 'execute': 'block-job-cancel',
+ 'arguments': {
+ 'device': 'job-$i'
+ }
+ }" \
+ "{.*{.*}.*}" # should match all well-formed QMP responses
+done
+
+silent='y' _send_qemu_cmd $h1 "{ 'execute': 'quit' }" 'return'
+
+echo "$i iterations performed"
+
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/200.out b/tests/qemu-iotests/200.out
new file mode 100644
index 0000000..af6a809
--- /dev/null
+++ b/tests/qemu-iotests/200.out
@@ -0,0 +1,14 @@
+QA output created by 200
+Formatting 'TEST_DIR/backing.img', fmt=IMGFMT size=536870912
+Formatting 'TEST_DIR/test.img', fmt=IMGFMT size=536870912 backing_file=TEST_DIR/backing.img backing_fmt=IMGFMT
+wrote 314572800/314572800 bytes at offset 512
+300 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+=== Starting QEMU VM ===
+
+{"return": {}}
+
+=== Sending stream/cancel, checking for SIGSEGV only ===
+
+500 iterations performed
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 24e5ad1..25d9adf 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -194,3 +194,4 @@
194 rw auto migration quick
195 rw auto quick
197 rw auto quick
+200 rw auto
--
2.9.5
* Re: [Qemu-devel] [PATCH v2 for-2.11 1/4] blockjob: do not allow coroutine double entry or entry-after-completion
2017-11-21 2:23 ` [Qemu-devel] [PATCH v2 for-2.11 1/4] blockjob: do not allow coroutine double entry or entry-after-completion Jeff Cody
@ 2017-11-21 10:49 ` Stefan Hajnoczi
2017-11-21 13:12 ` Paolo Bonzini
0 siblings, 1 reply; 13+ messages in thread
From: Stefan Hajnoczi @ 2017-11-21 10:49 UTC (permalink / raw)
To: Jeff Cody; +Cc: qemu-devel, qemu-block, mreitz, famz, pbonzini, kwolf
On Mon, Nov 20, 2017 at 09:23:23PM -0500, Jeff Cody wrote:
> @@ -291,10 +291,10 @@ void block_job_start(BlockJob *job)
> {
> assert(job && !block_job_started(job) && job->paused &&
> job->driver && job->driver->start);
> - job->co = qemu_coroutine_create(block_job_co_entry, job);
> job->pause_count--;
> job->busy = true;
> job->paused = false;
> + job->co = qemu_coroutine_create(block_job_co_entry, job);
> bdrv_coroutine_enter(blk_bs(job->blk), job->co);
> }
Please see discussion on v1 about this hunk.
The rest looks good.
* Re: [Qemu-devel] [PATCH v2 for-2.11 2/4] coroutine: abort if we try to schedule or enter a pending coroutine
2017-11-21 2:23 ` [Qemu-devel] [PATCH v2 for-2.11 2/4] coroutine: abort if we try to schedule or enter a pending coroutine Jeff Cody
@ 2017-11-21 10:59 ` Stefan Hajnoczi
2017-11-21 13:11 ` Paolo Bonzini
2017-11-21 12:20 ` Eric Blake
2017-11-21 13:47 ` Kevin Wolf
2 siblings, 1 reply; 13+ messages in thread
From: Stefan Hajnoczi @ 2017-11-21 10:59 UTC (permalink / raw)
To: Jeff Cody; +Cc: qemu-devel, qemu-block, mreitz, famz, pbonzini, kwolf
On Mon, Nov 20, 2017 at 09:23:24PM -0500, Jeff Cody wrote:
> @@ -438,6 +439,16 @@ fail:
> void aio_co_schedule(AioContext *ctx, Coroutine *co)
> {
> trace_aio_co_schedule(ctx, co);
> + const char *scheduled = atomic_read(&co->scheduled);
> +
> + if (scheduled) {
> + fprintf(stderr,
> + "%s: Co-routine was already scheduled in '%s'\n",
> + __func__, scheduled);
> + abort();
> + }
> + atomic_set(&co->scheduled, __func__);
According to docs/devel/atomics.txt atomic_set()/atomic_read() are
weakly ordered. They require memory barriers to provide guarantees
about ordering. Your patch doesn't include barriers or comments about
where the implicit barriers are.
The docs recommend using the following instead of
atomic_read()/atomic_set() to get ordering:
typeof(*ptr) atomic_mb_read(ptr)
void atomic_mb_set(ptr, val)
* Re: [Qemu-devel] [PATCH v2 for-2.11 2/4] coroutine: abort if we try to schedule or enter a pending coroutine
2017-11-21 2:23 ` [Qemu-devel] [PATCH v2 for-2.11 2/4] coroutine: abort if we try to schedule or enter a pending coroutine Jeff Cody
2017-11-21 10:59 ` Stefan Hajnoczi
@ 2017-11-21 12:20 ` Eric Blake
2017-11-21 13:47 ` Kevin Wolf
2 siblings, 0 replies; 13+ messages in thread
From: Eric Blake @ 2017-11-21 12:20 UTC (permalink / raw)
To: Jeff Cody, qemu-devel; +Cc: kwolf, famz, qemu-block, mreitz, stefanha, pbonzini
On 11/20/2017 08:23 PM, Jeff Cody wrote:
> The previous patch fixed a race condition in which coroutines could be
> executed twice, or executed after coroutine deletion.
>
> We can detect common scenarios when this happens, and print an error
> message and abort before we corrupt memory / data, or segfault.
>
> This patch will abort if an attempt to enter a coroutine is made while
> it is currently pending execution, either in a specific AioContext bh,
> or pending execution via a timer. It will also abort if a coroutine
> is scheduled, before a prior scheduled run has occured.
s/occured/occurred/
>
> We cannot rely on the existing co->caller check for recursive re-entry
> to catch this, as the coroutine may run and exit with
> COROUTINE_TERMINATE before the scheduled coroutine executes.
>
> (This is the scenario that was occuring and fixed in the previous
s/occuring/occurring/
> patch).
>
> Signed-off-by: Jeff Cody <jcody@redhat.com>
> ---
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
* Re: [Qemu-devel] [PATCH v2 for-2.11 2/4] coroutine: abort if we try to schedule or enter a pending coroutine
2017-11-21 10:59 ` Stefan Hajnoczi
@ 2017-11-21 13:11 ` Paolo Bonzini
0 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2017-11-21 13:11 UTC (permalink / raw)
To: Stefan Hajnoczi, Jeff Cody; +Cc: qemu-devel, qemu-block, mreitz, famz, kwolf
On 21/11/2017 11:59, Stefan Hajnoczi wrote:
> On Mon, Nov 20, 2017 at 09:23:24PM -0500, Jeff Cody wrote:
>> @@ -438,6 +439,16 @@ fail:
>> void aio_co_schedule(AioContext *ctx, Coroutine *co)
>> {
>> trace_aio_co_schedule(ctx, co);
>> + const char *scheduled = atomic_read(&co->scheduled);
>> +
>> + if (scheduled) {
>> + fprintf(stderr,
>> + "%s: Co-routine was already scheduled in '%s'\n",
>> + __func__, scheduled);
>> + abort();
>> + }
>> + atomic_set(&co->scheduled, __func__);
>
> According to docs/devel/atomics.txt atomic_set()/atomic_read() are
> weakly ordered. They require memory barriers to provide guarantees
> about ordering. Your patch doesn't include barriers or comments about
> where the implicit barriers are.
>
> The docs recommend using the following instead of
> atomic_read()/atomic_set() to get ordering:
>
> typeof(*ptr) atomic_mb_read(ptr)
> void atomic_mb_set(ptr, val)
Even with atomic_mb_read/atomic_mb_set we should document what memory
ordering is required for correctness (i.e. what should be ordered
against what, or equivalently what operation has release/acquire semantics).
My reasoning is that the NULL write to co->scheduled should be ordered
before a write of co (anywhere), but the same is true for co->ctx so the
NULL write is ordered by the smp_wmb in qemu_aio_coroutine_enter. So
that is fine but it needs a comment.
Dually, the read should be ordered after a read of co, but that's
handled by the callers via atomic_rcu_read if they need the ordering.
No comment needed here I think.
What's left is the __func__ write, where atomic_mb_set should be used
indeed. Or, perhaps better:
const char *scheduled = atomic_cmpxchg(&co->scheduled,
NULL, __func__);
if (scheduled) {
...
}
... which also removes the atomic_read.
Thanks,
Paolo
* Re: [Qemu-devel] [PATCH v2 for-2.11 1/4] blockjob: do not allow coroutine double entry or entry-after-completion
2017-11-21 10:49 ` Stefan Hajnoczi
@ 2017-11-21 13:12 ` Paolo Bonzini
2017-11-21 13:26 ` Jeff Cody
0 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2017-11-21 13:12 UTC (permalink / raw)
To: Stefan Hajnoczi, Jeff Cody; +Cc: qemu-devel, qemu-block, mreitz, famz, kwolf
On 21/11/2017 11:49, Stefan Hajnoczi wrote:
> On Mon, Nov 20, 2017 at 09:23:23PM -0500, Jeff Cody wrote:
>> @@ -291,10 +291,10 @@ void block_job_start(BlockJob *job)
>> {
>> assert(job && !block_job_started(job) && job->paused &&
>> job->driver && job->driver->start);
>> - job->co = qemu_coroutine_create(block_job_co_entry, job);
>> job->pause_count--;
>> job->busy = true;
>> job->paused = false;
>> + job->co = qemu_coroutine_create(block_job_co_entry, job);
>> bdrv_coroutine_enter(blk_bs(job->blk), job->co);
>> }
>
> Please see discussion on v1 about this hunk.
>
> The rest looks good.
I'm okay with this hunk, but I would appreciate it if the commit message
said why it's okay to delay block job cancellation until after
block_job_sleep_ns returns.
Thanks,
Paolo
* Re: [Qemu-devel] [PATCH v2 for-2.11 1/4] blockjob: do not allow coroutine double entry or entry-after-completion
2017-11-21 13:12 ` Paolo Bonzini
@ 2017-11-21 13:26 ` Jeff Cody
0 siblings, 0 replies; 13+ messages in thread
From: Jeff Cody @ 2017-11-21 13:26 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Stefan Hajnoczi, qemu-devel, qemu-block, mreitz, famz, kwolf
On Tue, Nov 21, 2017 at 02:12:32PM +0100, Paolo Bonzini wrote:
> On 21/11/2017 11:49, Stefan Hajnoczi wrote:
> > On Mon, Nov 20, 2017 at 09:23:23PM -0500, Jeff Cody wrote:
> >> @@ -291,10 +291,10 @@ void block_job_start(BlockJob *job)
> >> {
> >> assert(job && !block_job_started(job) && job->paused &&
> >> job->driver && job->driver->start);
> >> - job->co = qemu_coroutine_create(block_job_co_entry, job);
> >> job->pause_count--;
> >> job->busy = true;
> >> job->paused = false;
> >> + job->co = qemu_coroutine_create(block_job_co_entry, job);
> >> bdrv_coroutine_enter(blk_bs(job->blk), job->co);
> >> }
> >
> > Please see discussion on v1 about this hunk.
> >
> > The rest looks good.
>
> I'm okay with this hunk, but I would appreciate that the commit message
> said why it's okay to delay block job cancellation after
> block_job_sleep_ns returns.
>
Stefan is right in his reply to my v1, so I'll go ahead and drop this hunk
for v3. I'll also add the info you requested to the commit message.
Jeff
* Re: [Qemu-devel] [PATCH v2 for-2.11 2/4] coroutine: abort if we try to schedule or enter a pending coroutine
2017-11-21 2:23 ` [Qemu-devel] [PATCH v2 for-2.11 2/4] coroutine: abort if we try to schedule or enter a pending coroutine Jeff Cody
2017-11-21 10:59 ` Stefan Hajnoczi
2017-11-21 12:20 ` Eric Blake
@ 2017-11-21 13:47 ` Kevin Wolf
2017-11-21 15:11 ` Paolo Bonzini
2 siblings, 1 reply; 13+ messages in thread
From: Kevin Wolf @ 2017-11-21 13:47 UTC (permalink / raw)
To: Jeff Cody; +Cc: qemu-devel, qemu-block, mreitz, stefanha, famz, pbonzini
Am 21.11.2017 um 03:23 hat Jeff Cody geschrieben:
> The previous patch fixed a race condition in which coroutines could be
> executed twice, or executed after coroutine deletion.
>
> We can detect common scenarios when this happens, and print an error
> message and abort before we corrupt memory / data, or segfault.
>
> This patch will abort if an attempt to enter a coroutine is made while
> it is currently pending execution, either in a specific AioContext bh,
> or pending execution via a timer. It will also abort if a coroutine
> is scheduled, before a prior scheduled run has occured.
>
> We cannot rely on the existing co->caller check for recursive re-entry
> to catch this, as the coroutine may run and exit with
> COROUTINE_TERMINATE before the scheduled coroutine executes.
>
> (This is the scenario that was occuring and fixed in the previous
> patch).
>
> Signed-off-by: Jeff Cody <jcody@redhat.com>
> ---
> include/qemu/coroutine_int.h | 6 ++++++
> util/async.c | 11 +++++++++++
> util/qemu-coroutine-sleep.c | 11 +++++++++++
> util/qemu-coroutine.c | 11 +++++++++++
> 4 files changed, 39 insertions(+)
>
> diff --git a/include/qemu/coroutine_int.h b/include/qemu/coroutine_int.h
> index cb98892..56e4c48 100644
> --- a/include/qemu/coroutine_int.h
> +++ b/include/qemu/coroutine_int.h
> @@ -53,6 +53,12 @@ struct Coroutine {
>
> /* Only used when the coroutine has yielded. */
> AioContext *ctx;
> +
> + /* Used to catch and abort on illegal co-routine entry.
> + * Will contain the name of the function that had first
> + * scheduled the coroutine. */
> + const char *scheduled;
Not sure if it makes any difference in practice, but I just want to
mention that the new field is right after a cacheline boundary and
the only field that is used in qemu_aio_coroutine_enter() and accesses
this second cacheline.
I'm not paying much attention to this kind of thing in most contexts,
but entering a coroutine is a hot path that we want to be fast, so maybe
it's worth having a second look.
Kevin
* Re: [Qemu-devel] [PATCH v2 for-2.11 2/4] coroutine: abort if we try to schedule or enter a pending coroutine
2017-11-21 13:47 ` Kevin Wolf
@ 2017-11-21 15:11 ` Paolo Bonzini
0 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2017-11-21 15:11 UTC (permalink / raw)
To: Kevin Wolf, Jeff Cody; +Cc: qemu-devel, qemu-block, mreitz, stefanha, famz
On 21/11/2017 14:47, Kevin Wolf wrote:
> Am 21.11.2017 um 03:23 hat Jeff Cody geschrieben:
>> The previous patch fixed a race condition in which coroutines could be
>> executed twice, or executed after coroutine deletion.
>>
>> We can detect common scenarios when this happens, and print an error
>> message and abort before we corrupt memory / data, or segfault.
>>
>> This patch will abort if an attempt to enter a coroutine is made while
>> it is currently pending execution, either in a specific AioContext bh,
>> or pending execution via a timer. It will also abort if a coroutine
>> is scheduled, before a prior scheduled run has occured.
>>
>> We cannot rely on the existing co->caller check for recursive re-entry
>> to catch this, as the coroutine may run and exit with
>> COROUTINE_TERMINATE before the scheduled coroutine executes.
>>
>> (This is the scenario that was occuring and fixed in the previous
>> patch).
>>
>> Signed-off-by: Jeff Cody <jcody@redhat.com>
>> ---
>> include/qemu/coroutine_int.h | 6 ++++++
>> util/async.c | 11 +++++++++++
>> util/qemu-coroutine-sleep.c | 11 +++++++++++
>> util/qemu-coroutine.c | 11 +++++++++++
>> 4 files changed, 39 insertions(+)
>>
>> diff --git a/include/qemu/coroutine_int.h b/include/qemu/coroutine_int.h
>> index cb98892..56e4c48 100644
>> --- a/include/qemu/coroutine_int.h
>> +++ b/include/qemu/coroutine_int.h
>> @@ -53,6 +53,12 @@ struct Coroutine {
>>
>> /* Only used when the coroutine has yielded. */
>> AioContext *ctx;
>> +
>> + /* Used to catch and abort on illegal co-routine entry.
>> + * Will contain the name of the function that had first
>> + * scheduled the coroutine. */
>> + const char *scheduled;
>
> Not sure if it makes any difference in practice, but I just want to
> mention that the new field is right after a cacheline boundary and
> the only field that is used in qemu_aio_coroutine_enter() and accesses
> this second cacheline.
>
> I'm not paying much attention to this kind of thing in most contexts,
> but entering a coroutine is a hot path that we want to be fast, so maybe
> it's worth having a second look.
Makes sense! Since co_queue_wakeup is used on *yield*, maybe the order
should be: ctx, scheduled, co_queue_next, co_queue_wakeup,
co_scheduled_next.
Thanks,
Paolo