* [PATCH v4] monitor/qmp: cleanup SocketChardev listener sources early to avoid fd handling race
@ 2025-11-25 14:07 Jie Song
2025-12-01 6:00 ` Markus Armbruster
0 siblings, 1 reply; 6+ messages in thread
From: Jie Song @ 2025-11-25 14:07 UTC (permalink / raw)
To: marcandre.lureau, eblake, armbru, berrange, qemu-devel
Cc: mail, songjie_yewu, Marc-André Lureau
From: Jie Song <songjie_yewu@cmss.chinamobile.com>
When starting a dummy QEMU process with "virsh version", monitor_init_qmp()
enables IOThread monitoring of the QMP fd by default. However, a race
condition exists during the initialization phase: the IOThread only removes
the main thread's fd watch when it reaches qio_net_listener_set_client_func_full(),
which may be delayed under high system load.
This creates a window between monitor_qmp_setup_handlers_bh() and
qio_net_listener_set_client_func_full() where both the main thread and
IOThread are simultaneously monitoring the same fd and processing events.
This race can cause either the main thread or the IOThread to hang and
become unresponsive.
Fix this by proactively cleaning up the listener's IO sources in
monitor_init_qmp() before the IOThread initializes QMP monitoring,
ensuring exclusive fd ownership and eliminating the race condition.
Signed-off-by: Jie Song <songjie_yewu@cmss.chinamobile.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
Changes in v4:
- Correct typos and add braces, reviewed by Marc-André Lureau.
- Link to v3:
https://lists.nongnu.org/archive/html/qemu-devel/2025-11/msg03504.html
- Link to v2:
https://lists.nongnu.org/archive/html/qemu-devel/2025-11/msg02443.html
- Link to v1:
https://lists.nongnu.org/archive/html/qemu-devel/2025-11/msg01621.html
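The window described in the commit message can be illustrated outside QEMU.
The following is only a minimal sketch, assuming a plain pipe in place of the
QMP listener socket and two back-to-back poll() calls in place of the main
thread and IOThread watches; demo_watch_ready() and demo_double_watch() are
hypothetical names, not QEMU functions. Both "watchers" see the fd as
readable, but only the first one finds an event to consume.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Return 1 if fd is readable right now, 0 if not, -1 on poll() error. */
static int demo_watch_ready(int fd)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    int n = poll(&p, 1, 0);

    if (n < 0) {
        return -1;
    }
    return (n == 1 && (p.revents & POLLIN)) ? 1 : 0;
}

/*
 * Simulate two watchers of one fd reacting to a single event.
 * Returns 1 if both watchers saw the fd as ready but only the first
 * could consume the event, 0 otherwise, -1 on setup failure.
 */
static int demo_double_watch(void)
{
    int fds[2];
    char byte;
    int flags, a_ready, b_ready, b_errno;
    ssize_t a_got, b_got;

    if (pipe(fds) < 0) {
        return -1;
    }
    /* Non-blocking, so the losing reader reports EAGAIN instead of hanging. */
    flags = fcntl(fds[0], F_GETFL);
    if (flags < 0 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) < 0 ||
        write(fds[1], "x", 1) != 1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* Both watchers are told the fd is ready for the same single event. */
    a_ready = demo_watch_ready(fds[0]);
    b_ready = demo_watch_ready(fds[0]);

    /* The first consumer wins; the second finds nothing left to read. */
    a_got = read(fds[0], &byte, 1);
    b_got = read(fds[0], &byte, 1);
    b_errno = errno;

    close(fds[0]);
    close(fds[1]);

    return a_ready == 1 && b_ready == 1 && a_got == 1 &&
           b_got == -1 && (b_errno == EAGAIN || b_errno == EWOULDBLOCK);
}
```

With a blocking fd the second read() would not return at all, which is the
kind of hang the patch avoids by making sure only one context watches the
listener.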
---
chardev/char-io.c | 8 ++++++++
chardev/char-socket.c | 10 ++++++++++
include/chardev/char-io.h | 2 ++
include/chardev/char.h | 2 ++
monitor/qmp.c | 5 +++++
5 files changed, 27 insertions(+)
diff --git a/chardev/char-io.c b/chardev/char-io.c
index 3be17b51ca..beac5cd245 100644
--- a/chardev/char-io.c
+++ b/chardev/char-io.c
@@ -182,3 +182,11 @@ int io_channel_send(QIOChannel *ioc, const void *buf, size_t len)
 {
     return io_channel_send_full(ioc, buf, len, NULL, 0);
 }
+
+void remove_listener_fd_in_watch(Chardev *chr)
+{
+    ChardevClass *cc = CHARDEV_GET_CLASS(chr);
+    if (cc->chr_listener_cleanup) {
+        cc->chr_listener_cleanup(chr);
+    }
+}
diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 26d2f11202..3f45dd2ecd 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -1570,6 +1570,15 @@ char_socket_get_connected(Object *obj, Error **errp)
     return s->state == TCP_CHARDEV_STATE_CONNECTED;
 }
 
+static void tcp_chr_listener_cleanup(Chardev *chr)
+{
+    SocketChardev *s = SOCKET_CHARDEV(chr);
+    if (s->listener) {
+        qio_net_listener_set_client_func_full(s->listener, NULL, NULL,
+                                              NULL, chr->gcontext);
+    }
+}
+
 static void char_socket_class_init(ObjectClass *oc, const void *data)
 {
     ChardevClass *cc = CHARDEV_CLASS(oc);
@@ -1587,6 +1596,7 @@ static void char_socket_class_init(ObjectClass *oc, const void *data)
     cc->chr_add_client = tcp_chr_add_client;
     cc->chr_add_watch = tcp_chr_add_watch;
     cc->chr_update_read_handler = tcp_chr_update_read_handler;
+    cc->chr_listener_cleanup = tcp_chr_listener_cleanup;
 
     object_class_property_add(oc, "addr", "SocketAddress",
                               char_socket_get_addr, NULL,
diff --git a/include/chardev/char-io.h b/include/chardev/char-io.h
index ac379ea70e..540131346d 100644
--- a/include/chardev/char-io.h
+++ b/include/chardev/char-io.h
@@ -43,4 +43,6 @@ int io_channel_send(QIOChannel *ioc, const void *buf, size_t len);
 int io_channel_send_full(QIOChannel *ioc, const void *buf, size_t len,
                          int *fds, size_t nfds);
 
+void remove_listener_fd_in_watch(Chardev *chr);
+
 #endif /* CHAR_IO_H */
diff --git a/include/chardev/char.h b/include/chardev/char.h
index b65e9981c1..192cad67d4 100644
--- a/include/chardev/char.h
+++ b/include/chardev/char.h
@@ -307,6 +307,8 @@ struct ChardevClass {
 
     /* handle various events */
     void (*chr_be_event)(Chardev *s, QEMUChrEvent event);
+
+    void (*chr_listener_cleanup)(Chardev *chr);
 };
 
 Chardev *qemu_chardev_new(const char *id, const char *typename,
diff --git a/monitor/qmp.c b/monitor/qmp.c
index cb99a12d94..7ae070dc8d 100644
--- a/monitor/qmp.c
+++ b/monitor/qmp.c
@@ -537,6 +537,11 @@ void monitor_init_qmp(Chardev *chr, bool pretty, Error **errp)
          * e.g. the chardev is in client mode, with wait=on.
          */
         remove_fd_in_watch(chr);
+        /*
+         * Clean up listener IO sources early to prevent racy fd
+         * handling between the main thread and the I/O thread.
+         */
+        remove_listener_fd_in_watch(chr);
         /*
          * We can't call qemu_chr_fe_set_handlers() directly here
          * since chardev might be running in the monitor I/O
--
2.43.0
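
The patch adds an optional per-class hook and a generic wrapper that calls it
only when a backend provides one. As a minimal standalone sketch of that
dispatch pattern (the Demo* types and demo_* functions are hypothetical
stand-ins, not QEMU's ChardevClass or Chardev):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a chardev and its class; not QEMU's types. */
typedef struct DemoChardev DemoChardev;

typedef struct DemoChardevClass {
    /* Optional hook: backends without a listener leave this NULL. */
    void (*listener_cleanup)(DemoChardev *chr);
} DemoChardevClass;

struct DemoChardev {
    const DemoChardevClass *klass;
    int listener_watches;   /* number of active listener sources */
};

/* Backend that owns a listener: drop all of its watches. */
static void demo_socket_listener_cleanup(DemoChardev *chr)
{
    chr->listener_watches = 0;
}

static const DemoChardevClass demo_socket_class = {
    .listener_cleanup = demo_socket_listener_cleanup,
};

/* Backend with no listener: the hook stays NULL. */
static const DemoChardevClass demo_plain_class = {
    .listener_cleanup = NULL,
};

/* Generic entry point: safe to call on any backend. */
static void demo_remove_listener_watch(DemoChardev *chr)
{
    assert(chr != NULL);
    if (chr->klass->listener_cleanup) {
        chr->klass->listener_cleanup(chr);
    }
}
```

Because the wrapper checks the hook for NULL, the caller in monitor code does
not need to know which backend it was handed; only backends that register a
cleanup function are affected.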
* Re: [PATCH v4] monitor/qmp: cleanup SocketChardev listener sources early to avoid fd handling race
2025-11-25 14:07 [PATCH v4] monitor/qmp: cleanup SocketChardev listener sources early to avoid fd handling race Jie Song
@ 2025-12-01 6:00 ` Markus Armbruster
2025-12-01 7:31 ` Marc-André Lureau
2025-12-01 13:04 ` Jie Song
0 siblings, 2 replies; 6+ messages in thread
From: Markus Armbruster @ 2025-12-01 6:00 UTC (permalink / raw)
To: Jie Song
Cc: marcandre.lureau, eblake, berrange, qemu-devel, songjie_yewu,
Marc-André Lureau
Jie Song, Marc-André, is this bug serious enough and the fix safe enough
to still go into 10.2?
Jie Song <mail@jiesong.me> writes:
> From: Jie Song <songjie_yewu@cmss.chinamobile.com>
>
> When starting a dummy QEMU process with virsh version, monitor_init_qmp()
> enables IOThread monitoring of the QMP fd by default. However, a race
> condition exists during the initialization phase: the IOThread only removes
> the main thread's fd watch when it reaches qio_net_listener_set_client_func_full(),
> which may be delayed under high system load.
>
> This creates a window between monitor_qmp_setup_handlers_bh() and
> qio_net_listener_set_client_func_full() where both the main thread and
> IOThread are simultaneously monitoring the same fd and processing events.
> This race can cause either the main thread or the IOThread to hang and
> become unresponsive.
>
> Fix this by proactively cleaning up the listener's IO sources in
> monitor_init_qmp() before the IOThread initializes QMP monitoring,
> ensuring exclusive fd ownership and eliminating the race condition.
>
> Signed-off-by: Jie Song <songjie_yewu@cmss.chinamobile.com>
> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
* Re: [PATCH v4] monitor/qmp: cleanup SocketChardev listener sources early to avoid fd handling race
2025-12-01 6:00 ` Markus Armbruster
@ 2025-12-01 7:31 ` Marc-André Lureau
2025-12-01 13:04 ` Jie Song
1 sibling, 0 replies; 6+ messages in thread
From: Marc-André Lureau @ 2025-12-01 7:31 UTC (permalink / raw)
To: Markus Armbruster; +Cc: Jie Song, eblake, berrange, qemu-devel, songjie_yewu
Hi Markus
On Mon, Dec 1, 2025 at 10:00 AM Markus Armbruster <armbru@redhat.com> wrote:
>
> Jie Song, Marc-André, is this bug serious enough and the fix safe enough
> to still go into 10.2?
>
My feeling is that it's a bit late for 10.2, as I suppose this bug has
been present for a long time and the risk of regression is high. Also
not enough people reviewed it.
> Jie Song <mail@jiesong.me> writes:
>
> > From: Jie Song <songjie_yewu@cmss.chinamobile.com>
> >
> > When starting a dummy QEMU process with virsh version, monitor_init_qmp()
> > enables IOThread monitoring of the QMP fd by default. However, a race
> > condition exists during the initialization phase: the IOThread only removes
> > the main thread's fd watch when it reaches qio_net_listener_set_client_func_full(),
> > which may be delayed under high system load.
> >
> > This creates a window between monitor_qmp_setup_handlers_bh() and
> > qio_net_listener_set_client_func_full() where both the main thread and
> > IOThread are simultaneously monitoring the same fd and processing events.
> > This race can cause either the main thread or the IOThread to hang and
> > become unresponsive.
> >
> > Fix this by proactively cleaning up the listener's IO sources in
> > monitor_init_qmp() before the IOThread initializes QMP monitoring,
> > ensuring exclusive fd ownership and eliminating the race condition.
> >
> > Signed-off-by: Jie Song <songjie_yewu@cmss.chinamobile.com>
> > Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
>
--
Marc-André Lureau
* Re: [PATCH v4] monitor/qmp: cleanup SocketChardev listener sources early to avoid fd handling race
2025-12-01 6:00 ` Markus Armbruster
2025-12-01 7:31 ` Marc-André Lureau
@ 2025-12-01 13:04 ` Jie Song
2025-12-02 12:07 ` Markus Armbruster
1 sibling, 1 reply; 6+ messages in thread
From: Jie Song @ 2025-12-01 13:04 UTC (permalink / raw)
To: armbru
Cc: berrange, eblake, mail, marcandre.lureau, marcandre.lureau,
qemu-devel, songjie_yewu
Hi Markus,
> Jie Song, Marc-André, is this bug serious enough and the fix safe enough
> to still go into 10.2?
First, regarding the seriousness of this bug, although the probability of encountering
it in a production environment is relatively low, it has existed for quite some time.
Secondly, with regard to the safety of this fix, it has been verified successfully
in the test environment. However, it would be better if more people could help to
review it to further ensure its robustness.
>
> Jie Song <mail@jiesong.me> writes:
>
> > From: Jie Song <songjie_yewu@cmss.chinamobile.com>
> >
> > When starting a dummy QEMU process with virsh version, monitor_init_qmp()
> > enables IOThread monitoring of the QMP fd by default. However, a race
> > condition exists during the initialization phase: the IOThread only removes
> > the main thread's fd watch when it reaches qio_net_listener_set_client_func_full(),
> > which may be delayed under high system load.
> >
> > This creates a window between monitor_qmp_setup_handlers_bh() and
> > qio_net_listener_set_client_func_full() where both the main thread and
> > IOThread are simultaneously monitoring the same fd and processing events.
> > This race can cause either the main thread or the IOThread to hang and
> > become unresponsive.
> >
> > Fix this by proactively cleaning up the listener's IO sources in
> > monitor_init_qmp() before the IOThread initializes QMP monitoring,
> > ensuring exclusive fd ownership and eliminating the race condition.
> >
> > Signed-off-by: Jie Song <songjie_yewu@cmss.chinamobile.com>
> > Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Regards,
Jie Song
* Re: [PATCH v4] monitor/qmp: cleanup SocketChardev listener sources early to avoid fd handling race
2025-12-01 13:04 ` Jie Song
@ 2025-12-02 12:07 ` Markus Armbruster
2025-12-02 13:03 ` Jie Song
0 siblings, 1 reply; 6+ messages in thread
From: Markus Armbruster @ 2025-12-02 12:07 UTC (permalink / raw)
To: Jie Song
Cc: berrange, eblake, marcandre.lureau, marcandre.lureau, qemu-devel,
songjie_yewu
Jie Song <mail@jiesong.me> writes:
> Hi Markus,
>
>> Jie Song, Marc-André, is this bug serious enough and the fix safe enough
>> to still go into 10.2?
>
> First, regarding the seriousness of this bug, although the probability of encountering
> it in a production environment is relatively low, it has existed for quite some time.
>
> Secondly, with regard to the safety of this fix, it has been verified successfully
> in the test environment. However, it would be better if more people could help to
> review it to further ensure its robustness.
This confirms Marc-André's "too late for 10.2" feeling.
I'll track this patch for 11.0. More review would be nice, but if we
can't get it, I'll get the patch merged early in the development cycle.
Thank you!
* Re: [PATCH v4] monitor/qmp: cleanup SocketChardev listener sources early to avoid fd handling race
2025-12-02 12:07 ` Markus Armbruster
@ 2025-12-02 13:03 ` Jie Song
0 siblings, 0 replies; 6+ messages in thread
From: Jie Song @ 2025-12-02 13:03 UTC (permalink / raw)
To: armbru
Cc: berrange, eblake, mail, marcandre.lureau, marcandre.lureau,
qemu-devel, songjie_yewu
> Jie Song <mail@jiesong.me> writes:
>
> > Hi Markus,
> >
> >> Jie Song, Marc-André, is this bug serious enough and the fix safe enough
> >> to still go into 10.2?
> >
> > > First, regarding the seriousness of this bug, although the probability of encountering
> > > it in a production environment is relatively low, it has existed for quite some time.
> > >
> > > Secondly, with regard to the safety of this fix, it has been verified successfully
> > > in the test environment. However, it would be better if more people could help to
> > > review it to further ensure its robustness.
>
> This confirms Marc-André's "too late for 10.2" feeling.
>
> I'll track this patch for 11.0. More review would be nice, but if we
> can get it, I'll get the patch merged early in the development cycle.
>
> Thank you!
Hi Markus,
Thanks for the update and for tracking this patch for 11.0.
I’ll keep following up on this and will address any comments
or issues that come up.
Thanks again for your support.
Best regards,
Jie Song