All of lore.kernel.org
 help / color / mirror / Atom feed
From: Fam Zheng <famz@redhat.com>
To: Stefan Hajnoczi <stefanha@redhat.com>
Cc: qemu-devel@nongnu.org, Paolo Bonzini <pbonzini@redhat.com>,
	Alberto Garcia <berto@igalia.com>,
	qemu-block@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] aio-posix: honor is_external in AioContext polling
Date: Tue, 24 Jan 2017 20:04:31 +0800	[thread overview]
Message-ID: <20170124120431.GB13714@lemon.Home> (raw)
In-Reply-To: <20170124095350.16679-1-stefanha@redhat.com>

On Tue, 01/24 09:53, Stefan Hajnoczi wrote:
> AioHandlers marked ->is_external must be skipped when aio_node_check()
> fails.  bdrv_drained_begin() needs this to prevent dataplane from
> submitting new I/O requests while another thread accesses the device and
> relies on it being quiesced.
> 
> This patch fixes the following segfault:
> 
>   Program terminated with signal SIGSEGV, Segmentation fault.
>   #0  0x00005577f6127dad in bdrv_io_plug (bs=0x5577f7ae52f0) at qemu/block/io.c:2650
>   2650            bdrv_io_plug(child->bs);
>   [Current thread is 1 (Thread 0x7ff5c4bd1c80 (LWP 10917))]
>   (gdb) bt
>   #0  0x00005577f6127dad in bdrv_io_plug (bs=0x5577f7ae52f0) at qemu/block/io.c:2650
>   #1  0x00005577f6114363 in blk_io_plug (blk=0x5577f7b8ba20) at qemu/block/block-backend.c:1561
>   #2  0x00005577f5d4091d in virtio_blk_handle_vq (s=0x5577f9ada030, vq=0x5577f9b3d2a0) at qemu/hw/block/virtio-blk.c:589
>   #3  0x00005577f5d4240d in virtio_blk_data_plane_handle_output (vdev=0x5577f9ada030, vq=0x5577f9b3d2a0) at qemu/hw/block/dataplane/virtio-blk.c:158
>   #4  0x00005577f5d88acd in virtio_queue_notify_aio_vq (vq=0x5577f9b3d2a0) at qemu/hw/virtio/virtio.c:1304
>   #5  0x00005577f5d8aaaf in virtio_queue_host_notifier_aio_poll (opaque=0x5577f9b3d308) at qemu/hw/virtio/virtio.c:2134
>   #6  0x00005577f60ca077 in run_poll_handlers_once (ctx=0x5577f79ddbb0) at qemu/aio-posix.c:493
>   #7  0x00005577f60ca268 in try_poll_mode (ctx=0x5577f79ddbb0, blocking=true) at qemu/aio-posix.c:569
>   #8  0x00005577f60ca331 in aio_poll (ctx=0x5577f79ddbb0, blocking=true) at qemu/aio-posix.c:601
>   #9  0x00005577f612722a in bdrv_flush (bs=0x5577f7c20970) at qemu/block/io.c:2403
>   #10 0x00005577f60c1b2d in bdrv_close (bs=0x5577f7c20970) at qemu/block.c:2322
>   #11 0x00005577f60c20e7 in bdrv_delete (bs=0x5577f7c20970) at qemu/block.c:2465
>   #12 0x00005577f60c3ecf in bdrv_unref (bs=0x5577f7c20970) at qemu/block.c:3425
>   #13 0x00005577f60bf951 in bdrv_root_unref_child (child=0x5577f7a2de70) at qemu/block.c:1361
>   #14 0x00005577f6112162 in blk_remove_bs (blk=0x5577f7b8ba20) at qemu/block/block-backend.c:491
>   #15 0x00005577f6111b1b in blk_remove_all_bs () at qemu/block/block-backend.c:245
>   #16 0x00005577f60c1db6 in bdrv_close_all () at qemu/block.c:2382
>   #17 0x00005577f5e60cca in main (argc=20, argv=0x7ffea6eb8398, envp=0x7ffea6eb8440) at qemu/vl.c:4684
> 
> The key thing is that bdrv_close() uses bdrv_drained_begin() and
> virtio_queue_host_notifier_aio_poll() must not be called.
> 
> Thanks to Fam Zheng <famz@redhat.com> for identifying the root cause of
> this crash.
> 
> Reported-by: Alberto Garcia <berto@igalia.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  aio-posix.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/aio-posix.c b/aio-posix.c
> index 9453d83..a8d7090 100644
> --- a/aio-posix.c
> +++ b/aio-posix.c
> @@ -508,7 +508,8 @@ static bool run_poll_handlers_once(AioContext *ctx)
>  
>      QLIST_FOREACH_RCU(node, &ctx->aio_handlers, node) {
>          if (!node->deleted && node->io_poll &&
> -                node->io_poll(node->opaque)) {
> +            aio_node_check(ctx, node->is_external) &&
> +            node->io_poll(node->opaque)) {
>              progress = true;
>          }
>  
> -- 
> 2.9.3
> 
> 

The patch is not wrong and I believe it is enough to fix the crash, however it's
not enough...

All in all I think we should skip external handlers regardless of
aio_disable_external(), or even skip try_poll_mode, in nested aio_poll()'s. The
reasons are 1) many nested aio_poll()'s don't have bdrv_drained_begin, so this
check is not sufficient; 2) aio_poll() on qemu_aio_context doesn't look at
ioeventfd before, but this was changed by adding try_poll_mode(), which is not
very correct.

These two factors combined together make it possible for bdrv_flush() etc to
spin longer than necessary, if not forever, when the guest keeps submitting more
requests with ioeventfd.

Fam

  reply	other threads:[~2017-01-24 12:04 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-24  9:53 [Qemu-devel] [PATCH] aio-posix: honor is_external in AioContext polling Stefan Hajnoczi
2017-01-24 12:04 ` Fam Zheng [this message]
2017-01-24 12:15   ` Paolo Bonzini
2017-01-24 12:47     ` Fam Zheng
2017-01-24 12:51       ` Paolo Bonzini
2017-01-24 13:40 ` Fam Zheng
2017-01-24 14:31 ` Alberto Garcia
2017-01-25 13:16 ` Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170124120431.GB13714@lemon.Home \
    --to=famz@redhat.com \
    --cc=berto@igalia.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.