From: "Michael S. Tsirkin" <mst@redhat.com>
To: Eric Auger <eauger@redhat.com>
Cc: linux-kernel@vger.kernel.org,
"Hongyu Ning" <hongyu.ning@linux.intel.com>,
"Jason Wang" <jasowang@redhat.com>,
"Xuan Zhuo" <xuanzhuo@linux.alibaba.com>,
"Eugenio Pérez" <eperezma@redhat.com>,
virtualization@lists.linux.dev
Subject: Re: [PATCH] virtio: break and reset virtio devices on device_shutdown()
Date: Mon, 24 Feb 2025 03:12:01 -0500 [thread overview]
Message-ID: <20250224030953-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <7fa37894-7d73-4087-a849-2957f31ad7f4@redhat.com>
On Mon, Feb 24, 2025 at 08:49:09AM +0100, Eric Auger wrote:
> Hi Michael,
>
> On 2/21/25 12:42 AM, Michael S. Tsirkin wrote:
> > Hongyu reported a hang on kexec in a VM. QEMU reported invalid memory
> > accesses during the hang.
> >
> > Invalid read at addr 0x102877002, size 2, region '(null)', reason: rejected
> > Invalid write at addr 0x102877A44, size 2, region '(null)', reason: rejected
> > ...
> >
> > It was traced down to virtio-console. Kexec works fine if virtio-console
> > is not in use.
> >
> > The issue is that virtio-console continues to write to the MMIO even after
> > underlying virtio-pci device is reset.
> >
> > Additionally, Eric noticed that IOMMUs are reset before devices, if
> > devices are not reset on shutdown they continue to poke at guest memory
> > and get errors from the IOMMU. Some devices get wedged then.
> >
> > The problem can be solved by breaking all virtio devices on virtio
> > bus shutdown, then resetting them.
> >
> > Reported-by: Eric Auger <eauger@redhat.com>
> > Reported-by: Hongyu Ning <hongyu.ning@linux.intel.com>
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> Tested-by: Eric Auger <eric.auger@redhat.com>
>
> Thanks!
>
> Eric
Will be in the next pull. Thanks!
Having said that
> > ---
> > drivers/virtio/virtio.c | 31 +++++++++++++++++++++++++++++++
> > 1 file changed, 31 insertions(+)
> >
> > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > index c1cc1157b380..e5b29520d3b2 100644
> > --- a/drivers/virtio/virtio.c
> > +++ b/drivers/virtio/virtio.c
> > @@ -377,6 +377,36 @@ static void virtio_dev_remove(struct device *_d)
> > of_node_put(dev->dev.of_node);
> > }
> >
> > +static void virtio_dev_shutdown(struct device *_d)
> > +{
> > + struct virtio_device *dev = dev_to_virtio(_d);
> > + struct virtio_driver *drv = drv_to_virtio(dev->dev.driver);
> > +
> > + /*
> > + * Stop accesses to or from the device.
> > + * We only need to do it if there's a driver - no accesses otherwise.
> > + */
> > + if (!drv)
> > + return;
> > +
> > + /*
> > + * Some devices get wedged if you kick them after they are
> > + * reset. Mark all vqs as broken to make sure we don't.
> > + */
> > + virtio_break_device(dev);
> > + /*
> > + * The below virtio_synchronize_cbs() guarantees that any interrupt
> > + * for this line arriving after virtio_synchronize_vqs() has completed
> > + * is guaranteed to see vq->broken as true.
> > + */
> > + virtio_synchronize_cbs(dev);
> > + /*
> > + * As IOMMUs are reset on shutdown, this will block device access to memory.
> > + * Some devices get wedged if this happens, so reset to make sure it does not.
> > + */
> > + dev->config->reset(dev);
I feel wedging devices to the point reset does not recover them
if the driver is buggy, isn't good. This is something we should
maybe fix if it's observed with qemu. Let's discuss on that list.
> > +}
> > +
> > static const struct bus_type virtio_bus = {
> > .name = "virtio",
> > .match = virtio_dev_match,
> > @@ -384,6 +414,7 @@ static const struct bus_type virtio_bus = {
> > .uevent = virtio_uevent,
> > .probe = virtio_dev_probe,
> > .remove = virtio_dev_remove,
> > + .shutdown = virtio_dev_shutdown,
> > };
> >
> > int __register_virtio_driver(struct virtio_driver *driver, struct module *owner)
next prev parent reply other threads:[~2025-02-24 8:12 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-20 23:42 [PATCH] virtio: break and reset virtio devices on device_shutdown() Michael S. Tsirkin
2025-02-21 1:11 ` Jason Wang
2025-02-21 1:36 ` Michael S. Tsirkin
2025-02-24 2:07 ` Jason Wang
2025-02-24 7:49 ` Eric Auger
2025-02-24 8:12 ` Michael S. Tsirkin [this message]
2025-02-24 19:31 ` Michael S. Tsirkin
2025-02-25 14:11 ` Eric Auger
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250224030953-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=eauger@redhat.com \
--cc=eperezma@redhat.com \
--cc=hongyu.ning@linux.intel.com \
--cc=jasowang@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=virtualization@lists.linux.dev \
--cc=xuanzhuo@linux.alibaba.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).