Re: [Qemu-devel] [PATCH v3 1/1] coroutine-lock: do not touch coroutine after another one has been entered

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Stefan Hajnoczi <stefanha@gmail.com>
To: Roman Pen <roman.penyaev@profitbricks.com>
Cc: Kevin Wolf <kwolf@redhat.com>, Fam Zheng <famz@redhat.com>,
	qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v3 1/1] coroutine-lock: do not touch coroutine after another one has been entered
Date: Wed, 7 Jun 2017 14:54:30 +0100	[thread overview]
Message-ID: <20170607135430.GB21990@stefanha-x1.localdomain> (raw)
In-Reply-To: <20170601160847.23720-1-roman.penyaev@profitbricks.com>

[-- Attachment #1: Type: text/plain, Size: 5130 bytes --]

On Thu, Jun 01, 2017 at 06:08:47PM +0200, Roman Pen wrote:
> Submission of requests on linux aio is a bit tricky and can lead to
> requests completions on submission path:
> 
> 44713c9e8547 ("linux-aio: Handle io_submit() failure gracefully")
> 0ed93d84edab ("linux-aio: process completions from ioq_submit()")
> 
> That means that any coroutine which has been yielded in order to wait
> for completion can be resumed from submission path and be eventually
> terminated (freed).
> 
> The following use-after-free crash was observed when IO throttling
> was enabled:
> 
>  Program received signal SIGSEGV, Segmentation fault.
>  [Switching to Thread 0x7f5813dff700 (LWP 56417)]
>  virtqueue_unmap_sg (elem=0x7f5804009a30, len=1, vq=<optimized out>) at virtio.c:252
>  (gdb) bt
>  #0  virtqueue_unmap_sg (elem=0x7f5804009a30, len=1, vq=<optimized out>) at virtio.c:252
>                               ^^^^^^^^^^^^^^
>                               remember the address
> 
>  #1  virtqueue_fill (vq=0x5598b20d21b0, elem=0x7f5804009a30, len=1, idx=0) at virtio.c:282
>  #2  virtqueue_push (vq=0x5598b20d21b0, elem=elem@entry=0x7f5804009a30, len=<optimized out>) at virtio.c:308
>  #3  virtio_blk_req_complete (req=req@entry=0x7f5804009a30, status=status@entry=0 '\000') at virtio-blk.c:61
>  #4  virtio_blk_rw_complete (opaque=<optimized out>, ret=0) at virtio-blk.c:126
>  #5  blk_aio_complete (acb=0x7f58040068d0) at block-backend.c:923
>  #6  coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at coroutine-ucontext.c:78
> 
>  (gdb) p * elem
>  $8 = {index = 77, out_num = 2, in_num = 1,
>        in_addr = 0x7f5804009ad8, out_addr = 0x7f5804009ae0,
>        in_sg = 0x0, out_sg = 0x7f5804009a50}
>        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>        'in_sg' and 'out_sg' are invalid.
>        e.g. it is impossible that 'in_sg' is zero,
>        instead its value must be equal to:
> 
>        (gdb) p/x 0x7f5804009ad8 + sizeof(elem->in_addr[0]) + 2 * sizeof(elem->out_addr[0])
>        $26 = 0x7f5804009af0
> 
> Seems 'elem' was corrupted.  Meanwhile another thread raised an abort:
> 
>  Thread 12 (Thread 0x7f57f2ffd700 (LWP 56426)):
>  #0  raise () from /lib/x86_64-linux-gnu/libc.so.6
>  #1  abort () from /lib/x86_64-linux-gnu/libc.so.6
>  #2  qemu_coroutine_enter (co=0x7f5804009af0) at qemu-coroutine.c:113
>  #3  qemu_co_queue_run_restart (co=0x7f5804009a30) at qemu-coroutine-lock.c:60
>  #4  qemu_coroutine_enter (co=0x7f5804009a30) at qemu-coroutine.c:119
>                            ^^^^^^^^^^^^^^^^^^
>                            WTF?? this is equal to elem from crashed thread
> 
>  #5  qemu_co_queue_run_restart (co=0x7f57e7f16ae0) at qemu-coroutine-lock.c:60
>  #6  qemu_coroutine_enter (co=0x7f57e7f16ae0) at qemu-coroutine.c:119
>  #7  qemu_co_queue_run_restart (co=0x7f5807e112a0) at qemu-coroutine-lock.c:60
>  #8  qemu_coroutine_enter (co=0x7f5807e112a0) at qemu-coroutine.c:119
>  #9  qemu_co_queue_run_restart (co=0x7f5807f17820) at qemu-coroutine-lock.c:60
>  #10 qemu_coroutine_enter (co=0x7f5807f17820) at qemu-coroutine.c:119
>  #11 qemu_co_queue_run_restart (co=0x7f57e7f18e10) at qemu-coroutine-lock.c:60
>  #12 qemu_coroutine_enter (co=0x7f57e7f18e10) at qemu-coroutine.c:119
>  #13 qemu_co_enter_next (queue=queue@entry=0x5598b1e742d0) at qemu-coroutine-lock.c:106
>  #14 timer_cb (blk=0x5598b1e74280, is_write=<optimized out>) at throttle-groups.c:419
> 
> Crash can be explained by access of 'co' object from the loop inside
> qemu_co_queue_run_restart():
> 
>   while ((next = QSIMPLEQ_FIRST(&co->co_queue_wakeup))) {
>       QSIMPLEQ_REMOVE_HEAD(&co->co_queue_wakeup, co_queue_next);
>                            ^^^^^^^^^^^^^^^^^^^^
>                            on each iteration 'co' is accessed,
>                            but 'co' can be already freed
> 
>       qemu_coroutine_enter(next);
>   }
> 
> When 'next' coroutine is resumed (entered) it can in its turn resume
> 'co', and eventually free it.  That's why we see 'co' (which was freed)
> has the same address as 'elem' from the first backtrace.
> 
> The fix is obvious: use temporary queue and do not touch coroutine after
> first qemu_coroutine_enter() is invoked.
> 
> The issue is quite rare and happens every ~12 hours on very high IO
> and CPU load (building linux kernel with -j512 inside guest) when IO
> throttling is enabled.  With the fix applied guest is running ~35 hours
> and is still alive so far.
> 
> Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Fam Zheng <famz@redhat.com>
> Cc: Stefan Hajnoczi <stefanha@redhat.com>
> Cc: Kevin Wolf <kwolf@redhat.com>
> Cc: qemu-devel@nongnu.org
> ---
>  v3:
>      Comments tweaks suggested by Stefan.
>  v2:
>      Comments tweaks suggested by Paolo.
> 
>  util/qemu-coroutine-lock.c | 19 +++++++++++++++++--
>  util/qemu-coroutine.c      |  5 +++++
>  2 files changed, 22 insertions(+), 2 deletions(-)

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

     prev parent reply	other threads:[~2017-06-07 13:54 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-01 16:08 [Qemu-devel] [PATCH v3 1/1] coroutine-lock: do not touch coroutine after another one has been entered Roman Pen
2017-06-02  8:35 ` Stefan Hajnoczi
2017-06-02 15:29 ` Roman Penyaev
2017-06-07 13:54 ` Stefan Hajnoczi [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170607135430.GB21990@stefanha-x1.localdomain \
    --to=stefanha@gmail.com \
    --cc=famz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=roman.penyaev@profitbricks.com \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).