From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Michael S. Tsirkin" Subject: Re: [PATCH] virtio: Remove virtio device during shutdown Date: Thu, 12 Mar 2015 17:22:40 +0100 Message-ID: <20150312172138-mutt-send-email-mst@redhat.com> References: <1426061357-4440-1-git-send-email-famz@redhat.com> <20150311095814-mutt-send-email-mst@redhat.com> <20150311101135.GA13653@ad.nay.redhat.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Content-Disposition: inline In-Reply-To: <20150311101135.GA13653@ad.nay.redhat.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: virtualization-bounces@lists.linux-foundation.org Errors-To: virtualization-bounces@lists.linux-foundation.org To: Fam Zheng Cc: Paolo Bonzini , linux-kernel@vger.kernel.org, virtualization@lists.linux-foundation.org List-Id: virtualization@lists.linuxfoundation.org On Wed, Mar 11, 2015 at 06:11:35PM +0800, Fam Zheng wrote: > On Wed, 03/11 10:06, Michael S. Tsirkin wrote: > > On Wed, Mar 11, 2015 at 04:09:17PM +0800, Fam Zheng wrote: > > > Currently shutdown is nop for virtio devices, but the core code could > > > remove things behind us such as MSI-X handler etc. For example in the > > > case of virtio-scsi-pci, the device may still try to send interupts, > > > which will be on IRQ lines seeing MSI-X disabled. Those interrupts will > > > be unhandled, and may cause flood. > > Here is the problem I want to solve - file system driver hang: > > If a fs code happen to hit __wait_on_buffer right after pci pci_device_shutdown > disabled msix, it will never make progress because the requests it waits for > will never be completed. So the system hangs. Paolo says that pci reset of virtio scsi device guarantees that all outstanding requests complete. If true and implemented correctly, I don't see what else needs to be done. You will need to debug this some more. > In other words we will want to reset virtio device before pci_device_shutdown > AND wake up all waiters. > > Unfortunately, neither your patch nor mine does that, because virtio bus can be > shutdown after pci bus (thanks to Jason for pointing out this). In that case, > any completion after disabling msix is lost. > > Maybe we need both the pci shutdown handler to reset the device and the virtio > shutdown handler to remove the device? > > Fam > > > > > This sounds very tentative. Do you, in fact, observe some problems > > with virtio scsi? How to reproduce them? this needs to go > > into the commit messages. > > OK, my bad. > > > > > > Remove the device in "shutdown" callback to allow device drivers clean > > > up things. > > > > > > Signed-off-by: Fam Zheng > > > > I'm concerned this will cause more hangs on shutdown: one > > of the reasons for reboot is device mal-functioning. > > How about we just reset devices instead? Something like > > the below (untested). > > > > Signed-off-by: Michael S. Tsirkin > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c > > index 5ce2aa4..0769941 100644 > > --- a/drivers/virtio/virtio.c > > +++ b/drivers/virtio/virtio.c > > @@ -269,6 +269,17 @@ static int virtio_dev_remove(struct device *_d) > > return 0; > > } > > > > +static void virtio_dev_shutdown(struct device *_d) > > +{ > > + struct virtio_device *dev = dev_to_virtio(_d); > > + /* > > + * Reset the device to make it stop sending interrupts, DMA, etc. > > + * We are shutting down, no need for full cleanup. > > + */ > > + dev->config->reset(dev); > > + > > +} > > + > > static struct bus_type virtio_bus = { > > .name = "virtio", > > .match = virtio_dev_match, > > @@ -276,6 +288,7 @@ static struct bus_type virtio_bus = { > > .uevent = virtio_uevent, > > .probe = virtio_dev_probe, > > .remove = virtio_dev_remove, > > + .shutdown = virtio_dev_shutdown, > > }; > > > > bool virtio_device_is_legacy_only(struct virtio_device_id id) From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1161042AbbCLQXo (ORCPT ); Thu, 12 Mar 2015 12:23:44 -0400 Received: from mx1.redhat.com ([209.132.183.28]:59093 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932273AbbCLQWy (ORCPT ); Thu, 12 Mar 2015 12:22:54 -0400 Date: Thu, 12 Mar 2015 17:22:40 +0100 From: "Michael S. Tsirkin" To: Fam Zheng Cc: linux-kernel@vger.kernel.org, Rusty Russell , virtualization@lists.linux-foundation.org, Paolo Bonzini , Jason Wang Subject: Re: [PATCH] virtio: Remove virtio device during shutdown Message-ID: <20150312172138-mutt-send-email-mst@redhat.com> References: <1426061357-4440-1-git-send-email-famz@redhat.com> <20150311095814-mutt-send-email-mst@redhat.com> <20150311101135.GA13653@ad.nay.redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150311101135.GA13653@ad.nay.redhat.com> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Mar 11, 2015 at 06:11:35PM +0800, Fam Zheng wrote: > On Wed, 03/11 10:06, Michael S. Tsirkin wrote: > > On Wed, Mar 11, 2015 at 04:09:17PM +0800, Fam Zheng wrote: > > > Currently shutdown is nop for virtio devices, but the core code could > > > remove things behind us such as MSI-X handler etc. For example in the > > > case of virtio-scsi-pci, the device may still try to send interupts, > > > which will be on IRQ lines seeing MSI-X disabled. Those interrupts will > > > be unhandled, and may cause flood. > > Here is the problem I want to solve - file system driver hang: > > If a fs code happen to hit __wait_on_buffer right after pci pci_device_shutdown > disabled msix, it will never make progress because the requests it waits for > will never be completed. So the system hangs. Paolo says that pci reset of virtio scsi device guarantees that all outstanding requests complete. If true and implemented correctly, I don't see what else needs to be done. You will need to debug this some more. > In other words we will want to reset virtio device before pci_device_shutdown > AND wake up all waiters. > > Unfortunately, neither your patch nor mine does that, because virtio bus can be > shutdown after pci bus (thanks to Jason for pointing out this). In that case, > any completion after disabling msix is lost. > > Maybe we need both the pci shutdown handler to reset the device and the virtio > shutdown handler to remove the device? > > Fam > > > > > This sounds very tentative. Do you, in fact, observe some problems > > with virtio scsi? How to reproduce them? this needs to go > > into the commit messages. > > OK, my bad. > > > > > > Remove the device in "shutdown" callback to allow device drivers clean > > > up things. > > > > > > Signed-off-by: Fam Zheng > > > > I'm concerned this will cause more hangs on shutdown: one > > of the reasons for reboot is device mal-functioning. > > How about we just reset devices instead? Something like > > the below (untested). > > > > Signed-off-by: Michael S. Tsirkin > > > > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c > > index 5ce2aa4..0769941 100644 > > --- a/drivers/virtio/virtio.c > > +++ b/drivers/virtio/virtio.c > > @@ -269,6 +269,17 @@ static int virtio_dev_remove(struct device *_d) > > return 0; > > } > > > > +static void virtio_dev_shutdown(struct device *_d) > > +{ > > + struct virtio_device *dev = dev_to_virtio(_d); > > + /* > > + * Reset the device to make it stop sending interrupts, DMA, etc. > > + * We are shutting down, no need for full cleanup. > > + */ > > + dev->config->reset(dev); > > + > > +} > > + > > static struct bus_type virtio_bus = { > > .name = "virtio", > > .match = virtio_dev_match, > > @@ -276,6 +288,7 @@ static struct bus_type virtio_bus = { > > .uevent = virtio_uevent, > > .probe = virtio_dev_probe, > > .remove = virtio_dev_remove, > > + .shutdown = virtio_dev_shutdown, > > }; > > > > bool virtio_device_is_legacy_only(struct virtio_device_id id)