* [PATCH] hw/virtio: Add a protection against duplicate vu_scmi_stop calls
@ 2023-07-20 10:10 Milan Zamazal
2023-07-20 10:55 ` Thomas Huth
2023-07-20 12:30 ` Fabiano Rosas
0 siblings, 2 replies; 7+ messages in thread
From: Milan Zamazal @ 2023-07-20 10:10 UTC (permalink / raw)
To: qemu-devel
Cc: Milan Zamazal, Thomas Huth, Fabiano Rosas, Michael S. Tsirkin,
Peter Maydell, Laurent Vivier, Paolo Bonzini, Richard Henderson,
Alex Bennée
The QEMU CI fails in virtio-scmi test occasionally. As reported by
Thomas Huth, this happens most likely when the system is loaded and it
fails with the following error:
qemu-system-aarch64: ../../devel/qemu/hw/pci/msix.c:659:
msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier && dev->msix_vector_release_notifier' failed.
../../devel/qemu/tests/qtest/libqtest.c:200: kill_qemu() detected QEMU death from signal 6 (Aborted) (core dumped)
As discovered by Fabiano Rosas, the cause is a duplicate invocation of
msix_unset_vector_notifiers via duplicate vu_scmi_stop calls:
msix_unset_vector_notifiers
virtio_pci_set_guest_notifiers
vu_scmi_stop
vu_scmi_disconnect
...
qemu_chr_write_buffer
msix_unset_vector_notifiers
virtio_pci_set_guest_notifiers
vu_scmi_stop
vu_scmi_set_status
...
qemu_cleanup
While vu_scmi_stop calls are protected by vhost_dev_is_started()
check, it's apparently not enough. vhost-user-blk and vhost-user-gpio
use an extra protection, see f5b22d06fb (vhost: recheck dev state in
the vhost_migration_log routine) for the motivation. Let's use the
same in vhost-user-scmi, which fixes the failure above.
Fixes: a5dab090e142 ("hw/virtio: Add boilerplate for vhost-user-scmi device")
Signed-off-by: Milan Zamazal <mzamazal@redhat.com>
---
hw/virtio/vhost-user-scmi.c | 7 +++++++
include/hw/virtio/vhost-user-scmi.h | 1 +
2 files changed, 8 insertions(+)
diff --git a/hw/virtio/vhost-user-scmi.c b/hw/virtio/vhost-user-scmi.c
index d386fb2df9..918bb7dcf7 100644
--- a/hw/virtio/vhost-user-scmi.c
+++ b/hw/virtio/vhost-user-scmi.c
@@ -63,6 +63,7 @@ static int vu_scmi_start(VirtIODevice *vdev)
error_report("Error starting vhost-user-scmi: %d", ret);
goto err_guest_notifiers;
}
+ scmi->started_vu = true;
/*
* guest_notifier_mask/pending not used yet, so just unmask
@@ -90,6 +91,12 @@ static void vu_scmi_stop(VirtIODevice *vdev)
struct vhost_dev *vhost_dev = &scmi->vhost_dev;
int ret;
+ /* vhost_dev_is_started() check in the callers is not fully reliable. */
+ if (!scmi->started_vu) {
+ return;
+ }
+ scmi->started_vu = false;
+
if (!k->set_guest_notifiers) {
return;
}
diff --git a/include/hw/virtio/vhost-user-scmi.h b/include/hw/virtio/vhost-user-scmi.h
index 6175a74ebd..c90db77dd5 100644
--- a/include/hw/virtio/vhost-user-scmi.h
+++ b/include/hw/virtio/vhost-user-scmi.h
@@ -25,6 +25,7 @@ struct VHostUserSCMI {
VirtQueue *cmd_vq;
VirtQueue *event_vq;
bool connected;
+ bool started_vu;
};
#endif /* _QEMU_VHOST_USER_SCMI_H */
--
2.40.1
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] hw/virtio: Add a protection against duplicate vu_scmi_stop calls
2023-07-20 10:10 [PATCH] hw/virtio: Add a protection against duplicate vu_scmi_stop calls Milan Zamazal
@ 2023-07-20 10:55 ` Thomas Huth
2023-07-20 12:30 ` Fabiano Rosas
1 sibling, 0 replies; 7+ messages in thread
From: Thomas Huth @ 2023-07-20 10:55 UTC (permalink / raw)
To: Milan Zamazal, qemu-devel
Cc: Fabiano Rosas, Michael S. Tsirkin, Peter Maydell, Laurent Vivier,
Paolo Bonzini, Richard Henderson, Alex Bennée
On 20/07/2023 12.10, Milan Zamazal wrote:
> The QEMU CI fails in virtio-scmi test occasionally. As reported by
> Thomas Huth, this happens most likely when the system is loaded and it
> fails with the following error:
>
> qemu-system-aarch64: ../../devel/qemu/hw/pci/msix.c:659:
> msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier && dev->msix_vector_release_notifier' failed.
> ../../devel/qemu/tests/qtest/libqtest.c:200: kill_qemu() detected QEMU death from signal 6 (Aborted) (core dumped)
>
> As discovered by Fabiano Rosas, the cause is a duplicate invocation of
> msix_unset_vector_notifiers via duplicate vu_scmi_stop calls:
>
> msix_unset_vector_notifiers
> virtio_pci_set_guest_notifiers
> vu_scmi_stop
> vu_scmi_disconnect
> ...
> qemu_chr_write_buffer
>
> msix_unset_vector_notifiers
> virtio_pci_set_guest_notifiers
> vu_scmi_stop
> vu_scmi_set_status
> ...
> qemu_cleanup
>
> While vu_scmi_stop calls are protected by vhost_dev_is_started()
> check, it's apparently not enough. vhost-user-blk and vhost-user-gpio
> use an extra protection, see f5b22d06fb (vhost: recheck dev state in
> the vhost_migration_log routine) for the motivation. Let's use the
> same in vhost-user-scmi, which fixes the failure above.
>
> Fixes: a5dab090e142 ("hw/virtio: Add boilerplate for vhost-user-scmi device")
> Signed-off-by: Milan Zamazal <mzamazal@redhat.com>
> ---
> hw/virtio/vhost-user-scmi.c | 7 +++++++
> include/hw/virtio/vhost-user-scmi.h | 1 +
> 2 files changed, 8 insertions(+)
Thanks, that fixes the problem for me!
Tested-by: Thomas Huth <thuth@redhat.com>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] hw/virtio: Add a protection against duplicate vu_scmi_stop calls
2023-07-20 10:10 [PATCH] hw/virtio: Add a protection against duplicate vu_scmi_stop calls Milan Zamazal
2023-07-20 10:55 ` Thomas Huth
@ 2023-07-20 12:30 ` Fabiano Rosas
2023-08-03 7:10 ` Milan Zamazal
1 sibling, 1 reply; 7+ messages in thread
From: Fabiano Rosas @ 2023-07-20 12:30 UTC (permalink / raw)
To: Milan Zamazal, qemu-devel
Cc: Milan Zamazal, Thomas Huth, Michael S. Tsirkin, Peter Maydell,
Laurent Vivier, Paolo Bonzini, Richard Henderson,
Alex Bennée
Milan Zamazal <mzamazal@redhat.com> writes:
> The QEMU CI fails in virtio-scmi test occasionally. As reported by
> Thomas Huth, this happens most likely when the system is loaded and it
> fails with the following error:
>
> qemu-system-aarch64: ../../devel/qemu/hw/pci/msix.c:659:
> msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier && dev->msix_vector_release_notifier' failed.
> ../../devel/qemu/tests/qtest/libqtest.c:200: kill_qemu() detected QEMU death from signal 6 (Aborted) (core dumped)
>
> As discovered by Fabiano Rosas, the cause is a duplicate invocation of
> msix_unset_vector_notifiers via duplicate vu_scmi_stop calls:
>
> msix_unset_vector_notifiers
> virtio_pci_set_guest_notifiers
> vu_scmi_stop
> vu_scmi_disconnect
> ...
> qemu_chr_write_buffer
>
> msix_unset_vector_notifiers
> virtio_pci_set_guest_notifiers
> vu_scmi_stop
> vu_scmi_set_status
> ...
> qemu_cleanup
>
> While vu_scmi_stop calls are protected by vhost_dev_is_started()
> check, it's apparently not enough. vhost-user-blk and vhost-user-gpio
> use an extra protection, see f5b22d06fb (vhost: recheck dev state in
> the vhost_migration_log routine) for the motivation. Let's use the
> same in vhost-user-scmi, which fixes the failure above.
>
> Fixes: a5dab090e142 ("hw/virtio: Add boilerplate for vhost-user-scmi device")
> Signed-off-by: Milan Zamazal <mzamazal@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] hw/virtio: Add a protection against duplicate vu_scmi_stop calls
2023-07-20 12:30 ` Fabiano Rosas
@ 2023-08-03 7:10 ` Milan Zamazal
2023-08-03 7:38 ` Thomas Huth
0 siblings, 1 reply; 7+ messages in thread
From: Milan Zamazal @ 2023-08-03 7:10 UTC (permalink / raw)
To: qemu-devel
Cc: Fabiano Rosas, Thomas Huth, Michael S. Tsirkin, Peter Maydell,
Laurent Vivier, Paolo Bonzini, Richard Henderson,
Alex Bennée
Fabiano Rosas <farosas@suse.de> writes:
> Milan Zamazal <mzamazal@redhat.com> writes:
>
>> The QEMU CI fails in virtio-scmi test occasionally. As reported by
>> Thomas Huth, this happens most likely when the system is loaded and it
>> fails with the following error:
>>
>> qemu-system-aarch64: ../../devel/qemu/hw/pci/msix.c:659:
>> msix_unset_vector_notifiers: Assertion
>> `dev->msix_vector_use_notifier && dev->msix_vector_release_notifier'
>> failed.
>> ../../devel/qemu/tests/qtest/libqtest.c:200: kill_qemu() detected
>> QEMU death from signal 6 (Aborted) (core dumped)
>>
>> As discovered by Fabiano Rosas, the cause is a duplicate invocation of
>> msix_unset_vector_notifiers via duplicate vu_scmi_stop calls:
>>
>> msix_unset_vector_notifiers
>> virtio_pci_set_guest_notifiers
>> vu_scmi_stop
>> vu_scmi_disconnect
>> ...
>> qemu_chr_write_buffer
>>
>> msix_unset_vector_notifiers
>> virtio_pci_set_guest_notifiers
>> vu_scmi_stop
>> vu_scmi_set_status
>> ...
>> qemu_cleanup
>>
>> While vu_scmi_stop calls are protected by vhost_dev_is_started()
>> check, it's apparently not enough. vhost-user-blk and vhost-user-gpio
>> use an extra protection, see f5b22d06fb (vhost: recheck dev state in
>> the vhost_migration_log routine) for the motivation. Let's use the
>> same in vhost-user-scmi, which fixes the failure above.
>>
>> Fixes: a5dab090e142 ("hw/virtio: Add boilerplate for vhost-user-scmi device")
>> Signed-off-by: Milan Zamazal <mzamazal@redhat.com>
>
> Reviewed-by: Fabiano Rosas <farosas@suse.de>
Please note that this bug fix should IMO definitely go to 8.1, to not
have a bug in vhost-user-scmi and to not have broken tests. Any chance
to get it merged?
Thanks,
Milan
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] hw/virtio: Add a protection against duplicate vu_scmi_stop calls
2023-08-03 7:10 ` Milan Zamazal
@ 2023-08-03 7:38 ` Thomas Huth
2023-08-03 9:50 ` Michael S. Tsirkin
0 siblings, 1 reply; 7+ messages in thread
From: Thomas Huth @ 2023-08-03 7:38 UTC (permalink / raw)
To: Milan Zamazal, qemu-devel
Cc: Fabiano Rosas, Michael S. Tsirkin, Peter Maydell, Laurent Vivier,
Paolo Bonzini, Richard Henderson, Alex Bennée
On 03/08/2023 09.10, Milan Zamazal wrote:
> Fabiano Rosas <farosas@suse.de> writes:
>
>> Milan Zamazal <mzamazal@redhat.com> writes:
>>
>>> The QEMU CI fails in virtio-scmi test occasionally. As reported by
>>> Thomas Huth, this happens most likely when the system is loaded and it
>>> fails with the following error:
>>>
>>> qemu-system-aarch64: ../../devel/qemu/hw/pci/msix.c:659:
>>> msix_unset_vector_notifiers: Assertion
>>> `dev->msix_vector_use_notifier && dev->msix_vector_release_notifier'
>>> failed.
>>> ../../devel/qemu/tests/qtest/libqtest.c:200: kill_qemu() detected
>>> QEMU death from signal 6 (Aborted) (core dumped)
>>>
>>> As discovered by Fabiano Rosas, the cause is a duplicate invocation of
>>> msix_unset_vector_notifiers via duplicate vu_scmi_stop calls:
>>>
>>> msix_unset_vector_notifiers
>>> virtio_pci_set_guest_notifiers
>>> vu_scmi_stop
>>> vu_scmi_disconnect
>>> ...
>>> qemu_chr_write_buffer
>>>
>>> msix_unset_vector_notifiers
>>> virtio_pci_set_guest_notifiers
>>> vu_scmi_stop
>>> vu_scmi_set_status
>>> ...
>>> qemu_cleanup
>>>
>>> While vu_scmi_stop calls are protected by vhost_dev_is_started()
>>> check, it's apparently not enough. vhost-user-blk and vhost-user-gpio
>>> use an extra protection, see f5b22d06fb (vhost: recheck dev state in
>>> the vhost_migration_log routine) for the motivation. Let's use the
>>> same in vhost-user-scmi, which fixes the failure above.
>>>
>>> Fixes: a5dab090e142 ("hw/virtio: Add boilerplate for vhost-user-scmi device")
>>> Signed-off-by: Milan Zamazal <mzamazal@redhat.com>
>>
>> Reviewed-by: Fabiano Rosas <farosas@suse.de>
>
> Please note that this bug fix should IMO definitely go to 8.1, to not
> have a bug in vhost-user-scmi and to not have broken tests. Any chance
> to get it merged?
If nobody else is planning a pull request with this patch included, I can
take it for my next PR (since it is fixing the CI, I just saw another
failure here:
https://gitlab.com/qemu-project/qemu/-/jobs/4790457938#L4784 )
Thomas
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] hw/virtio: Add a protection against duplicate vu_scmi_stop calls
2023-08-03 7:38 ` Thomas Huth
@ 2023-08-03 9:50 ` Michael S. Tsirkin
2023-08-03 10:46 ` Thomas Huth
0 siblings, 1 reply; 7+ messages in thread
From: Michael S. Tsirkin @ 2023-08-03 9:50 UTC (permalink / raw)
To: Thomas Huth
Cc: Milan Zamazal, qemu-devel, Fabiano Rosas, Peter Maydell,
Laurent Vivier, Paolo Bonzini, Richard Henderson,
Alex Bennée
On Thu, Aug 03, 2023 at 09:38:20AM +0200, Thomas Huth wrote:
> On 03/08/2023 09.10, Milan Zamazal wrote:
> > Fabiano Rosas <farosas@suse.de> writes:
> >
> > > Milan Zamazal <mzamazal@redhat.com> writes:
> > >
> > > > The QEMU CI fails in virtio-scmi test occasionally. As reported by
> > > > Thomas Huth, this happens most likely when the system is loaded and it
> > > > fails with the following error:
> > > >
> > > > qemu-system-aarch64: ../../devel/qemu/hw/pci/msix.c:659:
> > > > msix_unset_vector_notifiers: Assertion
> > > > `dev->msix_vector_use_notifier && dev->msix_vector_release_notifier'
> > > > failed.
> > > > ../../devel/qemu/tests/qtest/libqtest.c:200: kill_qemu() detected
> > > > QEMU death from signal 6 (Aborted) (core dumped)
> > > >
> > > > As discovered by Fabiano Rosas, the cause is a duplicate invocation of
> > > > msix_unset_vector_notifiers via duplicate vu_scmi_stop calls:
> > > >
> > > > msix_unset_vector_notifiers
> > > > virtio_pci_set_guest_notifiers
> > > > vu_scmi_stop
> > > > vu_scmi_disconnect
> > > > ...
> > > > qemu_chr_write_buffer
> > > >
> > > > msix_unset_vector_notifiers
> > > > virtio_pci_set_guest_notifiers
> > > > vu_scmi_stop
> > > > vu_scmi_set_status
> > > > ...
> > > > qemu_cleanup
> > > >
> > > > While vu_scmi_stop calls are protected by vhost_dev_is_started()
> > > > check, it's apparently not enough. vhost-user-blk and vhost-user-gpio
> > > > use an extra protection, see f5b22d06fb (vhost: recheck dev state in
> > > > the vhost_migration_log routine) for the motivation. Let's use the
> > > > same in vhost-user-scmi, which fixes the failure above.
> > > >
> > > > Fixes: a5dab090e142 ("hw/virtio: Add boilerplate for vhost-user-scmi device")
> > > > Signed-off-by: Milan Zamazal <mzamazal@redhat.com>
> > >
> > > Reviewed-by: Fabiano Rosas <farosas@suse.de>
> >
> > Please note that this bug fix should IMO definitely go to 8.1, to not
> > have a bug in vhost-user-scmi and to not have broken tests. Any chance
> > to get it merged?
>
> If nobody else is planning a pull request with this patch included, I can
> take it for my next PR (since it is fixing the CI, I just saw another
> failure here:
> https://gitlab.com/qemu-project/qemu/-/jobs/4790457938#L4784 )
>
> Thomas
>
I picked it up but if you like I can drop it.
--
MST
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] hw/virtio: Add a protection against duplicate vu_scmi_stop calls
2023-08-03 9:50 ` Michael S. Tsirkin
@ 2023-08-03 10:46 ` Thomas Huth
0 siblings, 0 replies; 7+ messages in thread
From: Thomas Huth @ 2023-08-03 10:46 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Milan Zamazal, qemu-devel, Fabiano Rosas, Peter Maydell,
Laurent Vivier, Paolo Bonzini, Richard Henderson,
Alex Bennée
On 03/08/2023 11.50, Michael S. Tsirkin wrote:
> On Thu, Aug 03, 2023 at 09:38:20AM +0200, Thomas Huth wrote:
>> On 03/08/2023 09.10, Milan Zamazal wrote:
>>> Fabiano Rosas <farosas@suse.de> writes:
>>>
>>>> Milan Zamazal <mzamazal@redhat.com> writes:
>>>>
>>>>> The QEMU CI fails in virtio-scmi test occasionally. As reported by
>>>>> Thomas Huth, this happens most likely when the system is loaded and it
>>>>> fails with the following error:
>>>>>
>>>>> qemu-system-aarch64: ../../devel/qemu/hw/pci/msix.c:659:
>>>>> msix_unset_vector_notifiers: Assertion
>>>>> `dev->msix_vector_use_notifier && dev->msix_vector_release_notifier'
>>>>> failed.
>>>>> ../../devel/qemu/tests/qtest/libqtest.c:200: kill_qemu() detected
>>>>> QEMU death from signal 6 (Aborted) (core dumped)
>>>>>
>>>>> As discovered by Fabiano Rosas, the cause is a duplicate invocation of
>>>>> msix_unset_vector_notifiers via duplicate vu_scmi_stop calls:
>>>>>
>>>>> msix_unset_vector_notifiers
>>>>> virtio_pci_set_guest_notifiers
>>>>> vu_scmi_stop
>>>>> vu_scmi_disconnect
>>>>> ...
>>>>> qemu_chr_write_buffer
>>>>>
>>>>> msix_unset_vector_notifiers
>>>>> virtio_pci_set_guest_notifiers
>>>>> vu_scmi_stop
>>>>> vu_scmi_set_status
>>>>> ...
>>>>> qemu_cleanup
>>>>>
>>>>> While vu_scmi_stop calls are protected by vhost_dev_is_started()
>>>>> check, it's apparently not enough. vhost-user-blk and vhost-user-gpio
>>>>> use an extra protection, see f5b22d06fb (vhost: recheck dev state in
>>>>> the vhost_migration_log routine) for the motivation. Let's use the
>>>>> same in vhost-user-scmi, which fixes the failure above.
>>>>>
>>>>> Fixes: a5dab090e142 ("hw/virtio: Add boilerplate for vhost-user-scmi device")
>>>>> Signed-off-by: Milan Zamazal <mzamazal@redhat.com>
>>>>
>>>> Reviewed-by: Fabiano Rosas <farosas@suse.de>
>>>
>>> Please note that this bug fix should IMO definitely go to 8.1, to not
>>> have a bug in vhost-user-scmi and to not have broken tests. Any chance
>>> to get it merged?
>>
>> If nobody else is planning a pull request with this patch included, I can
>> take it for my next PR (since it is fixing the CI, I just saw another
>> failure here:
>> https://gitlab.com/qemu-project/qemu/-/jobs/4790457938#L4784 )
>>
>> Thomas
>>
>
> I picked it up but if you like I can drop it.
I think it better fits into your tree, so if you plan a PR before the next
RC, then please go ahead. I'll drop it from my branch again.
Thomas
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2023-08-03 10:46 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-07-20 10:10 [PATCH] hw/virtio: Add a protection against duplicate vu_scmi_stop calls Milan Zamazal
2023-07-20 10:55 ` Thomas Huth
2023-07-20 12:30 ` Fabiano Rosas
2023-08-03 7:10 ` Milan Zamazal
2023-08-03 7:38 ` Thomas Huth
2023-08-03 9:50 ` Michael S. Tsirkin
2023-08-03 10:46 ` Thomas Huth
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).