From: "Michael S. Tsirkin" <mst@redhat.com>
To: Eric Auger <eauger@redhat.com>
Cc: "Ning, Hongyu" <hongyu.ning@linux.intel.com>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
"Jason Wang" <jasowang@redhat.com>,
"Xuan Zhuo" <xuanzhuo@linux.alibaba.com>,
"Eugenio Pérez" <eperezma@redhat.com>,
"Zhenzhong Duan" <zhenzhong.duan@intel.com>,
virtualization@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] virtio: Remove virtio devices on device_shutdown()
Date: Fri, 14 Feb 2025 07:16:30 -0500 [thread overview]
Message-ID: <20250214070904-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <83b43e73-8599-44ff-8657-6d5f2f9b2de5@redhat.com>
On Fri, Feb 14, 2025 at 08:56:56AM +0100, Eric Auger wrote:
> Hi,
>
> On 2/14/25 8:21 AM, Ning, Hongyu wrote:
> >
> >
> > On 2025/2/6 16:59, Eric Auger wrote:
> >> Hi,
> >>
> >> On 2/4/25 12:46 PM, Eric Auger wrote:
> >>> Hi,
> >>>
> >>> On 2/3/25 3:48 PM, Michael S. Tsirkin wrote:
> >>>> On Fri, Jan 31, 2025 at 10:53:15AM +0100, Eric Auger wrote:
> >>>>> Hi Kirill, Michael
> >>>>>
> >>>>> On 8/8/24 9:51 AM, Kirill A. Shutemov wrote:
> >>>>>> Hongyu reported a hang on kexec in a VM. QEMU reported invalid memory
> >>>>>> accesses during the hang.
> >>>>>>
> >>>>>> Invalid read at addr 0x102877002, size 2, region '(null)',
> >>>>>> reason: rejected
> >>>>>> Invalid write at addr 0x102877A44, size 2, region '(null)',
> >>>>>> reason: rejected
> >>>>>> ...
> >>>>>>
> >>>>>> It was traced down to virtio-console. Kexec works fine if virtio-
> >>>>>> console
> >>>>>> is not in use.
> >>>>>>
> >>>>>> Looks like virtio-console continues to write to the MMIO even after
> >>>>>> underlying virtio-pci device is removed.
> >>>>>>
> >>>>>> The problem can be mitigated by removing all virtio devices on virtio
> >>>>>> bus shutdown.
> >>>>>>
> >>>>>> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> >>>>>> Reported-by: Hongyu Ning <hongyu.ning@linux.intel.com>
> >>>>>
> >>>>> Gentle ping on that patch that seems to have fallen though the cracks.
> >>>>>
> >>>>> I think this fix is really needed. I have another test case with a
> >>>>> rebooting guest exposed with virtio-net (backed by vhost-net) and
> >>>>> viommu. Since there is currently no shutdown for the virtio-net, on
> >>>>> reboot, the IOMMU is disabled through the native_machine_shutdown()/
> >>>>> x86_platform.iommu_shutdown() while the virtio-net is still alive.
> >>>>>
> >>>>> Normally device_shutdown() should call virtio-net shutdown before the
> >>>>> IOMMU tear down and we wouldn't see any spurious transactions after
> >>>>> iommu shutdown.
> >>>>>
> >>>>> With that fix, the above test case is fixed and I do not see spurious
> >>>>> vhost IOTLB miss spurious requests.
> >>>>>
> >>>>> For more details, see qemu thread ([PATCH] hw/virtio/vhost: Disable
> >>>>> IOTLB callbacks when IOMMU gets disabled,
> >>>>> https://lore.kernel.org/all/20250120173339.865681-1-
> >>>>> eric.auger@redhat.com/)
> >>>>>
> >>>>>
> >>>>> Reviewed-by: Eric Auger <eric.auger@redhat.com>
> >>>>> Tested-by: Eric Auger <eric.auger@redhat.com>
> >>>>>
> >>>>> Thanks
> >>>>>
> >>>>> Eric
> >>>>>
> >>>>>> ---
> >>>>>> drivers/virtio/virtio.c | 10 ++++++++++
> >>>>>> 1 file changed, 10 insertions(+)
> >>>>>>
> >>>>>> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> >>>>>> index a9b93e99c23a..6c2f908eb22c 100644
> >>>>>> --- a/drivers/virtio/virtio.c
> >>>>>> +++ b/drivers/virtio/virtio.c
> >>>>>> @@ -356,6 +356,15 @@ static void virtio_dev_remove(struct device *_d)
> >>>>>> of_node_put(dev->dev.of_node);
> >>>>>> }
> >>>>>> +static void virtio_dev_shutdown(struct device *_d)
> >>>>>> +{
> >>>>>> + struct virtio_device *dev = dev_to_virtio(_d);
> >>>>>> + struct virtio_driver *drv = drv_to_virtio(dev->dev.driver);
> >>>>>> +
> >>>>>> + if (drv && drv->remove)
> >>>>>> + drv->remove(dev);
> >>>>
> >>>>
> >>>> I am concerned that full remove is a heavyweight operation.
> >>>> Do not want to slow down reboots even more.
> >>>> How about just doing a reset, instead?
> >>>
> >>> I tested with
> >>>
> >>> static void virtio_dev_shutdown(struct device *_d)
> >>> {
> >>> struct virtio_device *dev = dev_to_virtio(_d);
> >>>
> >>> virtio_reset_device(dev);
> >>> }
> >>>
> >>>
> >>> and it fixes my issue.
> >>>
> >>> Kirill, would that fix you issue too?
> >
> > Hi,
> >
> > sorry for my late response, I synced with Kirill offline and did a retest.
> >
> > The issue is still reproduced on my side, kexec will be stuck in case of
> > "console=hvc0" append in kernel cmdline and even with such patch applied.
>
> Thanks for testing!
>
> Michael, it looks like the initial patch from Kyrill is the one that
> fixes both issues. virtio_reset_device() usage does not work for the
> initial bug report while it works for me. Other ideas?
>
> Thanks
>
> Eric
Ah, wait a second.
Looks like virtio-console continues to write to the MMIO even after
underlying virtio-pci device is removed.
Hmm. I am not sure why that is a problem, but I assume some hypervisors just
hang the system if you try to kick them after reset.
Unfortunate that spec did not disallow it.
If we want to prevent that, we want to do something like this:
/*
* Some devices get wedged if you kick them after they are
* reset. Mark all vqs as broken to make sure we don't.
*/
virtio_break_device(dev);
/*
* The below virtio_synchronize_cbs() guarantees that any
* interrupt for this line arriving after
* virtio_synchronize_vqs() has completed is guaranteed to see
* vq->broken as true.
*/
virtio_synchronize_cbs(dev);
dev->config->reset(dev);
I assume this still works for you, yes?
> >
> > my kernel code base is 6.14.0-rc2.
> >
> > let me know if any more experiments needed.
> >
> > ---
> > drivers/virtio/virtio.c | 8 ++++++++
> > 1 file changed, 8 insertions(+)
> >
> > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > index ba37665188b5..f9f885d04763 100644
> > --- a/drivers/virtio/virtio.c
> > +++ b/drivers/virtio/virtio.c
> > @@ -395,6 +395,13 @@ static const struct cpumask
> > *virtio_irq_get_affinity(struct device *_d,
> > return dev->config->get_vq_affinity(dev, irq_vec);
> > }
> >
> > +static void virtio_dev_shutdown(struct device *_d)
> > +{
> > + struct virtio_device *dev = dev_to_virtio(_d);
> > +
> > + virtio_reset_device(dev);
> > +}
> > +
> > static const struct bus_type virtio_bus = {
> > .name = "virtio",
> > .match = virtio_dev_match,
> > @@ -403,6 +410,7 @@ static const struct bus_type virtio_bus = {
> > .probe = virtio_dev_probe,
> > .remove = virtio_dev_remove,
> > .irq_get_affinity = virtio_irq_get_affinity,
> > + .shutdown = virtio_dev_shutdown,
> > };
> >
> > int __register_virtio_driver(struct virtio_driver *driver, struct
> > module *owner)
> > --
> > 2.43.0
> >
> >
> >> gentle ping.
> >>
> >> this also fixes another issue with qemu vSMMU + virtio-scsi-pci. With
> >> the above addition I get rid of spurious warning in qemu on guest reboot.
> >>
> >> qemu-system-aarch64: virtio: zero sized buffers are not allowed
> >> qemu-system-aarch64: vhost vring error in virtqueue 0: Invalid
> >> argument (22)
> >>
> >> Would you mind if I respin?
> >>
> >> Thanks
> >>
> >> Eric
> >>
> >>
> >>
> >>
> >>>
> >>> Thanks
> >>>
> >>> Eric
> >>>>
> >>>>>> +}
> >>>>>> +
> >>>>>> static const struct bus_type virtio_bus = {
> >>>>>> .name = "virtio",
> >>>>>> .match = virtio_dev_match,
> >>>>>> @@ -363,6 +372,7 @@ static const struct bus_type virtio_bus = {
> >>>>>> .uevent = virtio_uevent,
> >>>>>> .probe = virtio_dev_probe,
> >>>>>> .remove = virtio_dev_remove,
> >>>>>> + .shutdown = virtio_dev_shutdown,
> >>>>>> };
> >>>>>> int __register_virtio_driver(struct virtio_driver *driver,
> >>>>>> struct module *owner)
> >>>>
> >>>
> >>
> >
next prev parent reply other threads:[~2025-02-14 12:16 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-08 7:51 [PATCH] virtio: Remove virtio devices on device_shutdown() Kirill A. Shutemov
2024-08-08 11:37 ` Jiri Pirko
2024-08-08 12:11 ` Kirill A. Shutemov
2024-08-08 15:28 ` Jiri Pirko
2024-08-08 12:10 ` Michael S. Tsirkin
2024-08-08 13:15 ` Kirill A. Shutemov
2024-08-08 15:03 ` Michael S. Tsirkin
2024-08-08 15:28 ` Kirill A. Shutemov
2024-11-04 10:22 ` Kirill A. Shutemov
2025-01-31 9:53 ` Eric Auger
2025-01-31 11:55 ` Michael S. Tsirkin
2025-02-03 14:48 ` Michael S. Tsirkin
2025-02-04 11:46 ` Eric Auger
2025-02-06 8:59 ` Eric Auger
2025-02-06 10:04 ` Kirill A. Shutemov
2025-02-06 16:27 ` Eric Auger
2025-02-14 7:21 ` Ning, Hongyu
2025-02-14 7:56 ` Eric Auger
2025-02-14 12:16 ` Michael S. Tsirkin [this message]
2025-02-17 3:29 ` Jason Wang
2025-02-17 7:49 ` Ning, Hongyu
2025-02-17 9:25 ` Eric Auger
2025-02-17 16:59 ` Eric Auger
2025-02-17 23:20 ` Michael S. Tsirkin
2025-02-17 23:57 ` Ning, Hongyu
2025-02-18 0:12 ` Michael S. Tsirkin
2025-02-18 1:49 ` Ning, Hongyu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250214070904-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=eauger@redhat.com \
--cc=eperezma@redhat.com \
--cc=hongyu.ning@linux.intel.com \
--cc=jasowang@redhat.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=virtualization@lists.linux.dev \
--cc=xuanzhuo@linux.alibaba.com \
--cc=zhenzhong.duan@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).