From: Fam Zheng <famz@redhat.com>
To: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	qemu-devel <qemu-devel@nongnu.org>,
	Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] "iothread: release iothread around aio_poll" causes random hangs at startup
Date: Wed, 10 Jun 2015 17:34:08 +0800
Message-ID: <20150610093408.GC11648@ad.nay.redhat.com>
In-Reply-To: <557800E0.5020202@de.ibm.com>

On Wed, 06/10 11:18, Christian Borntraeger wrote:
> On 10.06.2015 at 04:12, Fam Zheng wrote:
> > On Tue, 06/09 11:01, Christian Borntraeger wrote:
> >> On 09.06.2015 at 04:28, Fam Zheng wrote:
> >>> On Tue, 06/02 16:36, Christian Borntraeger wrote:
> >>>> Paolo,
> >>>>
> >>>> I bisected 
> >>>> commit a0710f7995f914e3044e5899bd8ff6c43c62f916
> >>>> Author:     Paolo Bonzini <pbonzini@redhat.com>
> >>>> AuthorDate: Fri Feb 20 17:26:52 2015 +0100
> >>>> Commit:     Kevin Wolf <kwolf@redhat.com>
> >>>> CommitDate: Tue Apr 28 15:36:08 2015 +0200
> >>>>
> >>>>     iothread: release iothread around aio_poll
> >>>>
> >>>> to cause a problem with hanging guests.
> >>>>
> >>>> Running many guests, each booted with a kernel/ramdisk (via -kernel) and
> >>>> several null block devices, results in hangs. All hanging
> >>>> guests are in the partition detection code, waiting for an I/O to return,
> >>>> so very early in boot, possibly on the first I/O.
> >>>>
> >>>> Reverting that commit "fixes" the hangs.
> >>>> Any ideas?
> >>>
> >>> Christian, I can't reproduce this on my x86 box with virtio-blk-pci. Do you
> >>> have a reproducer for x86? Or could you collect backtraces for all the threads
> >>> in QEMU when it hangs?
> >>>
> >>> My long shot is that the main loop is blocked at aio_context_acquire(ctx),
> >>> while the iothread of that ctx is blocked at aio_poll(ctx, blocking).
> >>
> >> Here is a backtrace on s390. I need two or more disks (one is not enough).
> > 
> > It shows iothreads and main loop are all waiting for events, and the vcpu
> > threads are running guest code.
> > 
> > It could be the requests being leaked. Do you see this problem with a regular
> > file based image or null-co driver? Maybe we're missing something about the
> > AioContext in block/null.c.
> 
> It seems to run with normal file-based images. As soon as I have two or more null-aio
> devices, it hangs fairly quickly when doing a reboot loop.
> 

Ahh! If it's a reboot loop, the device reset path may be the problem. I suspect the
completion BH used by null-aio may be messed up, which is why I wonder whether
null-co:// would work for you. Could you test that?
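
To make the suspicion about the completion BH concrete, here is a toy model of
a deferred completion. It is only a sketch: every name in it (ToyBH, ToyLoop,
toy_bh_schedule, toy_loop_poll, toy_request_completes) is hypothetical and none
of it is QEMU's real API. It assumes that a completion queued on one event loop
fires only when that same loop is polled again, so a request whose loop is
never polled again looks like an I/O that never returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all names are hypothetical, not QEMU's API. */
typedef void ToyCompletionFn(void *opaque, int ret);

typedef struct ToyBH {
    ToyCompletionFn *cb;   /* completion callback to invoke later */
    void *opaque;          /* argument passed back to the callback */
    bool scheduled;        /* true from scheduling until the loop runs it */
} ToyBH;

#define TOY_MAX_PENDING 8

typedef struct ToyLoop {
    ToyBH *pending[TOY_MAX_PENDING];
    size_t n_pending;
} ToyLoop;

/* Queue a completion on one specific loop; nothing runs yet. */
void toy_bh_schedule(ToyLoop *loop, ToyBH *bh)
{
    assert(loop->n_pending < TOY_MAX_PENDING);
    bh->scheduled = true;
    loop->pending[loop->n_pending++] = bh;
}

/* One iteration of a loop: run every completion queued on it.
 * Returns the number of callbacks that ran. */
size_t toy_loop_poll(ToyLoop *loop)
{
    size_t ran = 0;

    for (size_t i = 0; i < loop->n_pending; i++) {
        ToyBH *bh = loop->pending[i];

        bh->scheduled = false;
        bh->cb(bh->opaque, 0);
        ran++;
    }
    loop->n_pending = 0;
    return ran;
}

/* Completion callback: mark the request as finished. */
void toy_request_done(void *opaque, int ret)
{
    (void)ret;
    *(bool *)opaque = true;
}

/* Schedule one completion on "owner", then poll either the owning loop
 * or an unrelated one. Returns whether the request completed. */
bool toy_request_completes(bool poll_owning_loop)
{
    ToyLoop owner = {0};
    ToyLoop other = {0};
    bool done = false;
    ToyBH bh = { .cb = toy_request_done, .opaque = &done, .scheduled = false };

    toy_bh_schedule(&owner, &bh);
    toy_loop_poll(poll_owning_loop ? &owner : &other);
    return done;
}
```

In this model toy_request_completes(true) reports a finished request, while
toy_request_completes(false) leaves it pending forever. Whether something like
this really happens to null-aio across a reset is exactly what the null-co://
test should help tell.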

Also, could you try the patch below with null-aio://, too?

Thanks,
Fam

---

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index cd539aa..c87b444 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -652,15 +652,11 @@ static void virtio_blk_reset(VirtIODevice *vdev)
 {
     VirtIOBlock *s = VIRTIO_BLK(vdev);
 
-    if (s->dataplane) {
-        virtio_blk_data_plane_stop(s->dataplane);
-    }
-
-    /*
-     * This should cancel pending requests, but can't do nicely until there
-     * are per-device request lists.
-     */
     blk_drain_all();
+    if (s->dataplane) {
+        virtio_blk_data_plane_stop(s->dataplane);
+    }
+
     blk_set_enable_write_cache(s->blk, s->original_wce);
 }
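
The reordering in the hunk above can be illustrated with a small toy model.
This is a sketch under a loud assumption that has not been established in this
thread: that a request still in flight when the dataplane is stopped can no
longer be completed by the drain that follows. All names (ToyDevice, toy_drain,
toy_dataplane_stop, toy_reset_*) are hypothetical and are not QEMU code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not QEMU code. */
typedef struct ToyDevice {
    bool dataplane_running; /* ASSUMPTION: completions are delivered only while true */
    int in_flight;          /* requests submitted but not yet completed */
} ToyDevice;

/* Complete everything in flight, if something can still deliver it. */
void toy_drain(ToyDevice *d)
{
    assert(d->in_flight >= 0);
    if (d->dataplane_running) {
        d->in_flight = 0;
    }
}

/* Stop the dataplane; pending requests are left untouched. */
void toy_dataplane_stop(ToyDevice *d)
{
    d->dataplane_running = false;
}

/* Order before the patch: stop first, then drain.
 * Returns the number of requests left stranded. */
int toy_reset_stop_then_drain(ToyDevice *d)
{
    toy_dataplane_stop(d);
    toy_drain(d);
    return d->in_flight;
}

/* Order in the patch: drain first, then stop.
 * Returns the number of requests left stranded. */
int toy_reset_drain_then_stop(ToyDevice *d)
{
    toy_drain(d);
    toy_dataplane_stop(d);
    return d->in_flight;
}
```

With three requests in flight, toy_reset_stop_then_drain() leaves all three
stranded in this model, while toy_reset_drain_then_stop() leaves none. If the
real code behaves differently, the patch may not change anything, which is why
the test result matters.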
