From: "Michael S. Tsirkin" <mst@redhat.com>
To: Fam Zheng <famz@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org
Subject: Re: [PATCH] virtio: Remove virtio device during shutdown
Date: Fri, 13 Mar 2015 15:17:23 +0100
Message-ID: <20150313151657-mutt-send-email-mst@redhat.com>
In-Reply-To: <20150312233552.GB32054@ad.nay.redhat.com>

On Fri, Mar 13, 2015 at 07:35:52AM +0800, Fam Zheng wrote:
> On Thu, 03/12 17:22, Michael S. Tsirkin wrote:
> > On Wed, Mar 11, 2015 at 06:11:35PM +0800, Fam Zheng wrote:
> > > On Wed, 03/11 10:06, Michael S. Tsirkin wrote:
> > > > On Wed, Mar 11, 2015 at 04:09:17PM +0800, Fam Zheng wrote:
> > > > > Currently shutdown is nop for virtio devices, but the core code could
> > > > > remove things behind us such as MSI-X handler etc. For example in the
> > > > > > case of virtio-scsi-pci, the device may still try to send interrupts,
> > > > > which will be on IRQ lines seeing MSI-X disabled. Those interrupts will
> > > > > be unhandled, and may cause flood.
> > > 
> > > Here is the problem I want to solve - file system driver hang:
> > > 
> > > > If fs code happens to hit __wait_on_buffer right after pci_device_shutdown
> > > > has disabled msix, it will never make progress because the requests it waits
> > > > for will never be completed. So the system hangs.
> > 
> > Paolo says that pci reset of virtio scsi device guarantees
> > that all outstanding requests complete.
> > 
> > If true and implemented correctly, I don't see what else
> > needs to be done.
> > 
> > You will need to debug this some more.
> 
> First of all, I was wrong about the fs driver above; scratch that. Sorry for
> the confusion.
> 
> Regarding the hang in shutdown, Ulrich Obergfell has already pointed out that
> the vcpu is "busy/stuck in interrupt processing":
> 
> https://bugzilla.redhat.com/attachment.cgi?id=998391 (RHBZ 1199155)
> 
> Summary: The reason it is stuck is that an IRQ from virtio-scsi-pci is not
> handled. Why is there that IRQ? Because pci core code disabled msix. Why is it
> not handled?  Because it's done behind virtio-scsi, who still is waiting for
> msix.
> 
> "Hence, the interrupt will not be acknowledged and the guest becomes flooded
> with IRQ 11 interrupt."
> 
> Fortunately it's not a livelock for upstream, because of:
> 
>     commit 184564efae4d775225c8fe3b762a56956fb1f827
>     Author: Zhang Haoyu <zhanghy@sangfor.com>
>     Date:   Thu Sep 11 16:47:04 2014 +0800
> 
>     kvm: ioapic: conditionally delay irq delivery duringeoi broadcast
> 
> But we still should do the shutdown right.
> 
> I also propose to not shutdown msix from pci core shutdown if the device
> doesn't have shutdown function:
> 
> http://www.spinics.net/lists/kernel/msg1944041.html

Makes sense.
Can you bounce this one to me please?
I'll ack.
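
For illustration only, the rule proposed in the quoted text ("not shutdown
msix from pci core shutdown if the device doesn't have shutdown function")
can be modelled in user space roughly as below. This is a toy sketch, not
the kernel code and not the linked patch: every name here (toy_dev,
toy_driver, toy_core_shutdown, toy_quiesce, toy_unhandled_irq_possible) is
hypothetical, and the "quiesced" flag is an assumption standing in for
whatever a real driver does to stop its device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names are kernel APIs. */
struct toy_dev;

struct toy_driver {
    /* NULL when the driver provides no shutdown callback. */
    void (*shutdown)(struct toy_dev *dev);
};

struct toy_dev {
    const struct toy_driver *drv;
    bool msix_enabled; /* MSI-X still set up for this device */
    bool quiesced;     /* assumption: driver has stopped the device */
};

/* Example driver callback: stop the device so it raises no more interrupts. */
void toy_quiesce(struct toy_dev *dev)
{
    dev->quiesced = true;
}

/*
 * Core shutdown following the proposed rule: if the driver has no
 * shutdown callback, leave MSI-X alone, because nothing has told the
 * device to stop sending interrupts.
 */
void toy_core_shutdown(struct toy_dev *dev)
{
    assert(dev != NULL);

    if (dev->drv == NULL || dev->drv->shutdown == NULL)
        return; /* no callback: keep MSI-X enabled */

    dev->drv->shutdown(dev);
    dev->msix_enabled = false;
}

/*
 * The hazard described in the thread: the device may still interrupt,
 * but MSI-X has been disabled behind the driver's back.
 */
bool toy_unhandled_irq_possible(const struct toy_dev *dev)
{
    return !dev->quiesced && !dev->msix_enabled;
}
```

In this model, a device whose driver has no callback keeps MSI-X enabled
after toy_core_shutdown(), so toy_unhandled_irq_possible() stays false; a
device whose driver uses toy_quiesce() is stopped first and only then
loses MSI-X.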

> With that patch applied, the "nop" .shutdown in virtio-pci shouldn't hurt
> much.
> 
> Regarding handling the requests, now I don't know if we really care about them
> at shutdown. As you said, waiting for requests may cause more hangs.
> 
> Ideas?
> 
> Fam
