From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Tue, 30 May 2017 19:35:02 +0800
From: Fam Zheng
To: Roman Pen
Cc: Kevin Wolf, qemu-devel@nongnu.org, Stefan Hajnoczi, Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH v2 1/1] coroutine-lock: do not touch coroutine after another one has been entered
Message-ID: <20170530113502.GC27787@lemon.lan>
References: <20170530100736.3338-1-roman.penyaev@profitbricks.com>
In-Reply-To: <20170530100736.3338-1-roman.penyaev@profitbricks.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Tue, 05/30 12:07, Roman Pen wrote:
> Submission of requests on Linux AIO is a bit tricky and can lead to
> request completions on the submission path:
>
> 44713c9e8547 ("linux-aio: Handle io_submit() failure gracefully")
> 0ed93d84edab ("linux-aio: process completions from ioq_submit()")
>
> That means any coroutine which has yielded in order to wait for
> completion can be resumed from the submission path and eventually
> terminated (freed).
>
> The following use-after-free crash was observed when IO throttling
> was enabled:
>
>  Program received signal SIGSEGV, Segmentation fault.
>  [Switching to Thread 0x7f5813dff700 (LWP 56417)]
>  virtqueue_unmap_sg (elem=0x7f5804009a30, len=1, vq=<optimized out>) at virtio.c:252
>  (gdb) bt
>  #0  virtqueue_unmap_sg (elem=0x7f5804009a30, len=1, vq=<optimized out>) at virtio.c:252
>                               ^^^^^^^^^^^^^^
>                               remember the address
>
>  #1  virtqueue_fill (vq=0x5598b20d21b0, elem=0x7f5804009a30, len=1, idx=0) at virtio.c:282
>  #2  virtqueue_push (vq=0x5598b20d21b0, elem=elem@entry=0x7f5804009a30, len=<optimized out>) at virtio.c:308
>  #3  virtio_blk_req_complete (req=req@entry=0x7f5804009a30, status=status@entry=0 '\000') at virtio-blk.c:61
>  #4  virtio_blk_rw_complete (opaque=<optimized out>, ret=0) at virtio-blk.c:126
>  #5  blk_aio_complete (acb=0x7f58040068d0) at block-backend.c:923
>  #6  coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at coroutine-ucontext.c:78
>
>  (gdb) p *elem
>  $8 = {index = 77, out_num = 2, in_num = 1,
>        in_addr = 0x7f5804009ad8, out_addr = 0x7f5804009ae0,
>        in_sg = 0x0, out_sg = 0x7f5804009a50}
>        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>        'in_sg' and 'out_sg' are invalid: e.g. it is impossible for
>        'in_sg' to be zero; instead its value must be equal to:
>
>  (gdb) p/x 0x7f5804009ad8 + sizeof(elem->in_addr[0]) + 2 * sizeof(elem->out_addr[0])
>  $26 = 0x7f5804009af0
>
> It seems 'elem' was corrupted. Meanwhile another thread raised an abort:
>
>  Thread 12 (Thread 0x7f57f2ffd700 (LWP 56426)):
>  #0  raise () from /lib/x86_64-linux-gnu/libc.so.6
>  #1  abort () from /lib/x86_64-linux-gnu/libc.so.6
>  #2  qemu_coroutine_enter (co=0x7f5804009af0) at qemu-coroutine.c:113
>  #3  qemu_co_queue_run_restart (co=0x7f5804009a30) at qemu-coroutine-lock.c:60
>  #4  qemu_coroutine_enter (co=0x7f5804009a30) at qemu-coroutine.c:119
>                               ^^^^^^^^^^^^^^
>                               WTF??
>                               this is equal to 'elem' from the crashed thread
>
>  #5  qemu_co_queue_run_restart (co=0x7f57e7f16ae0) at qemu-coroutine-lock.c:60
>  #6  qemu_coroutine_enter (co=0x7f57e7f16ae0) at qemu-coroutine.c:119
>  #7  qemu_co_queue_run_restart (co=0x7f5807e112a0) at qemu-coroutine-lock.c:60
>  #8  qemu_coroutine_enter (co=0x7f5807e112a0) at qemu-coroutine.c:119
>  #9  qemu_co_queue_run_restart (co=0x7f5807f17820) at qemu-coroutine-lock.c:60
>  #10 qemu_coroutine_enter (co=0x7f5807f17820) at qemu-coroutine.c:119
>  #11 qemu_co_queue_run_restart (co=0x7f57e7f18e10) at qemu-coroutine-lock.c:60
>  #12 qemu_coroutine_enter (co=0x7f57e7f18e10) at qemu-coroutine.c:119
>  #13 qemu_co_enter_next (queue=queue@entry=0x5598b1e742d0) at qemu-coroutine-lock.c:106
>  #14 timer_cb (blk=0x5598b1e74280, is_write=<optimized out>) at throttle-groups.c:419
>
> The crash can be explained by the access of the 'co' object from the
> loop inside qemu_co_queue_run_restart():
>
>    while ((next = QSIMPLEQ_FIRST(&co->co_queue_wakeup))) {
>        QSIMPLEQ_REMOVE_HEAD(&co->co_queue_wakeup, co_queue_next);
>        ^^^^^^^^^^^^^^^^^^^^
>        'co' is accessed on each iteration, but 'co' may already
>        have been freed
>
>        qemu_coroutine_enter(next);
>    }
>
> When the 'next' coroutine is resumed (entered), it can in its turn
> resume 'co', and eventually free it. That is why we see that 'co'
> (which was freed) has the same address as 'elem' from the first
> backtrace.
>
> The fix is obvious: use a temporary queue and do not touch the
> coroutine after the first qemu_coroutine_enter() is invoked.
>
> The issue is quite rare and happens every ~12 hours under very high IO
> and CPU load (building the Linux kernel with -j512 inside the guest)
> when IO throttling is enabled. With the fix applied the guest has been
> running for ~35 hours and is still alive so far.
>
> Signed-off-by: Roman Pen
> Cc: Paolo Bonzini
> Cc: Fam Zheng
> Cc: Stefan Hajnoczi
> Cc: Kevin Wolf
> Cc: qemu-devel@nongnu.org
> ---
> v2:
>   Comment tweaks suggested by Paolo.
>
>  util/qemu-coroutine-lock.c | 14 ++++++++++++--
>  util/qemu-coroutine.c      |  5 +++++
>  2 files changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/util/qemu-coroutine-lock.c b/util/qemu-coroutine-lock.c
> index 6328eed26bc6..d589d8c66d5e 100644
> --- a/util/qemu-coroutine-lock.c
> +++ b/util/qemu-coroutine-lock.c
> @@ -77,10 +77,20 @@ void coroutine_fn qemu_co_queue_wait(CoQueue *queue, CoMutex *mutex)
>  void qemu_co_queue_run_restart(Coroutine *co)
>  {
>      Coroutine *next;
> +    QSIMPLEQ_HEAD(, Coroutine) tmp_queue_wakeup =
> +        QSIMPLEQ_HEAD_INITIALIZER(tmp_queue_wakeup);
>  
>      trace_qemu_co_queue_run_restart(co);
> -    while ((next = QSIMPLEQ_FIRST(&co->co_queue_wakeup))) {
> -        QSIMPLEQ_REMOVE_HEAD(&co->co_queue_wakeup, co_queue_next);
> +
> +    /* Because "co" has yielded, any coroutine that we wakeup can resume it.
> +     * If this happens and "co" terminates, co->co_queue_wakeup becomes
> +     * invalid memory. Therefore, use a temporary queue and do not touch
> +     * the "co" coroutine as soon as you enter another one.
> +     */
> +    QSIMPLEQ_CONCAT(&tmp_queue_wakeup, &co->co_queue_wakeup);
> +
> +    while ((next = QSIMPLEQ_FIRST(&tmp_queue_wakeup))) {
> +        QSIMPLEQ_REMOVE_HEAD(&tmp_queue_wakeup, co_queue_next);
>          qemu_coroutine_enter(next);
>      }
>  }
> diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
> index 486af9a62275..d6095c1d5aa4 100644
> --- a/util/qemu-coroutine.c
> +++ b/util/qemu-coroutine.c
> @@ -126,6 +126,11 @@ void qemu_aio_coroutine_enter(AioContext *ctx, Coroutine *co)
>  
>      qemu_co_queue_run_restart(co);
>  
> +    /* Beware, if ret == COROUTINE_YIELD and qemu_co_queue_run_restart()
> +     * has started any other coroutine, "co" might have been reentered
> +     * and even freed by now! So be careful and do not touch it.
> +     */
> +
>      switch (ret) {
>      case COROUTINE_YIELD:
>          return;
> -- 
> 2.11.0

Reviewed-by: Fam Zheng