* [PATCH 0/1] Fix racy SEGFAULT upon monitor_cleanup()
@ 2025-05-02 21:47 andrey.drobyshev
2025-05-02 21:47 ` [PATCH 1/1] monitor: don't wake up qmp_dispatcher_co coroutine upon cleanup andrey.drobyshev
0 siblings, 1 reply; 3+ messages in thread
From: andrey.drobyshev @ 2025-05-02 21:47 UTC (permalink / raw)
To: qemu-devel; +Cc: armbru, pbonzini, andrey.drobyshev, den
From: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
There's a race in monitor cleanup code which might result into SEGFAULT.
When monitor_cleanup() is launched, qmp_dispatcher_co coroutine pointer
is set to NULL (see Paolo's commit 3e6bed61 ("monitor: cleanup detection of
qmp_dispatcher_co shutting down")). If after that we manage to send
another QMP command while monitor is shutting down, we might end up
getting SIGSEGV on aio_co_wake(NULL).
Backtrace:
> #0 0x00005565359b7e19 in aio_co_wake (co=0x0) at ../util/async.c:693
> #1 0x0000556535814903 in qmp_dispatcher_co_wake () at ../monitor/qmp.c:361
> #2 0x0000556535814c17 in handle_qmp_command (opaque=0x55653767bb10, req=0x7fd574007090, err=0x0)
> at ../monitor/qmp.c:426
> #3 0x000055653598d8f1 in json_message_process_token
> (lexer=0x55653767bbd8, input=0x7fd5740055a0, type=JSON_RCURLY, x=61, y=0) at ../qobject/json-streamer.c:99
> #4 0x00005565359dd68b in json_lexer_feed_char (lexer=0x55653767bbd8, ch=125 '}', flush=false)
> at ../qobject/json-lexer.c:313
> #5 0x00005565359dd7f7 in json_lexer_feed (lexer=0x55653767bbd8, buffer=0x7fd57ecf4e50 "}", size=1)
> at ../qobject/json-lexer.c:350
> #6 0x000055653598d9e3 in json_message_parser_feed (parser=0x55653767bbc0, buffer=0x7fd57ecf4e50 "}", size=1)
> at ../qobject/json-streamer.c:121
> #7 0x0000556535814c72 in monitor_qmp_read (opaque=0x55653767bb10, buf=0x7fd57ecf4e50 "}", size=1)
> at ../monitor/qmp.c:433
> #8 0x000055653580b4ca in qemu_chr_be_write_impl (s=0x5565374466a0, buf=0x7fd57ecf4e50 "}", len=1)
> at ../chardev/char.c:214
> #9 0x000055653580b53b in qemu_chr_be_write (s=0x5565374466a0, buf=0x7fd57ecf4e50 "}", len=1) at ../chardev/char.c:226
> #10 0x0000556535806aea in tcp_chr_read (chan=0x7fd574001600, cond=G_IO_IN, opaque=0x5565374466a0)
> at ../chardev/char-socket.c:520
> #11 0x00005565356d7f6e in qio_channel_fd_source_dispatch
> (source=0x7fd574005780, callback=0x55653580696f <tcp_chr_read>, user_data=0x5565374466a0)
> at ../io/channel-watch.c:84
> #12 0x00007fd590ae6f4f in g_main_dispatch (context=0x556537671f50) at ../glib/gmain.c:3364
> #13 g_main_context_dispatch (context=0x556537671f50) at ../glib/gmain.c:4079
> #14 0x00007fd590b3c268 in g_main_context_iterate.constprop.0
> (context=0x556537671f50, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>)
> at ../glib/gmain.c:4155
> #15 0x00007fd590ae65a3 in g_main_loop_run (loop=0x556537672070) at ../glib/gmain.c:4353
> #16 0x000055653570dbb0 in iothread_run (opaque=0x556537345660) at ../iothread.c:70
> #17 0x000055653599baf6 in qemu_thread_start (args=0x556537672090) at ../util/qemu-thread-posix.c:541
> #18 0x00007fd58f49f7f2 in start_thread (arg=<optimized out>) at pthread_create.c:443
> #19 0x00007fd58f43f450 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
Unfortunately I couldn't reproduce the issue cleanly as the race window
is quite small. If I add sleep as follows, extending the race window
and imitating the thread being preempted by the kernel scheduler:
> diff --git a/monitor/qmp.c b/monitor/qmp.c
> index cb99a12d94..dd608ee717 100644
> --- a/monitor/qmp.c
> +++ b/monitor/qmp.c
> @@ -349,6 +349,7 @@ void coroutine_fn monitor_qmp_dispatcher_co(void *data)
> qmp_request_free(req_obj);
> }
> qatomic_set(&qmp_dispatcher_co, NULL);
> + sleep(1);
> }
>
> void qmp_dispatcher_co_wake(void)
Then the SEGFAULT can be reproduced like that:
> SRCDIR=/path/to/srcdir
> QMPSHELL=$SRCDIR/scripts/qmp/qmp-shell
> QMPSOCK=/path/to/qmpsock
>
> $QMPSHELL -p $QMPSOCK <<EOF
> system_powerdown
> EOF
>
> while /bin/true ; do
> $QMPSHELL -p $QMPSOCK <<EOF
> query-status
> EOF
> done
Andrey Drobyshev (1):
monitor: don't wake up qmp_dispatcher_co coroutine upon cleanup
monitor/qmp.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--
2.43.5
^ permalink raw reply [flat|nested] 3+ messages in thread
* [PATCH 1/1] monitor: don't wake up qmp_dispatcher_co coroutine upon cleanup
2025-05-02 21:47 [PATCH 0/1] Fix racy SEGFAULT upon monitor_cleanup() andrey.drobyshev
@ 2025-05-02 21:47 ` andrey.drobyshev
2025-05-02 22:06 ` Philippe Mathieu-Daudé
0 siblings, 1 reply; 3+ messages in thread
From: andrey.drobyshev @ 2025-05-02 21:47 UTC (permalink / raw)
To: qemu-devel; +Cc: armbru, pbonzini, andrey.drobyshev, den
From: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Since the commit 3e6bed61 ("monitor: cleanup detection of qmp_dispatcher_co
shutting down"), coroutine pointer qmp_dispatcher_co is set to NULL upon
cleanup. If a QMP command is sent after monitor_cleanup() (e.g. after
shutdown), this may lead to SEGFAULT on aio_co_wake(NULL).
As mentioned in the comment inside monitor_cleanup(), the intention is to
allow incoming requests while shutting down, but simply leave them
without any response. Let's do exactly that, and if qmp_dispatcher_co
coroutine pointer has already been set to NULL, let's simply skip the
aio_co_wake() part.
Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
---
monitor/qmp.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/monitor/qmp.c b/monitor/qmp.c
index 2f46cf9e49..cb99a12d94 100644
--- a/monitor/qmp.c
+++ b/monitor/qmp.c
@@ -356,7 +356,8 @@ void qmp_dispatcher_co_wake(void)
/* Write request before reading qmp_dispatcher_co_busy. */
smp_mb__before_rmw();
- if (!qatomic_xchg(&qmp_dispatcher_co_busy, true)) {
+ if (!qatomic_xchg(&qmp_dispatcher_co_busy, true) &&
+ qatomic_read(&qmp_dispatcher_co)) {
aio_co_wake(qmp_dispatcher_co);
}
}
--
2.43.5
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH 1/1] monitor: don't wake up qmp_dispatcher_co coroutine upon cleanup
2025-05-02 21:47 ` [PATCH 1/1] monitor: don't wake up qmp_dispatcher_co coroutine upon cleanup andrey.drobyshev
@ 2025-05-02 22:06 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 3+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-05-02 22:06 UTC (permalink / raw)
To: andrey.drobyshev, qemu-devel; +Cc: armbru, pbonzini, den, qemu-block
On 2/5/25 23:47, andrey.drobyshev@virtuozzo.com wrote:
> From: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
>
> Since the commit 3e6bed61 ("monitor: cleanup detection of qmp_dispatcher_co
> shutting down"), coroutine pointer qmp_dispatcher_co is set to NULL upon
> cleanup. If a QMP command is sent after monitor_cleanup() (e.g. after
> shutdown), this may lead to SEGFAULT on aio_co_wake(NULL).
>
> As mentioned in the comment inside monitor_cleanup(), the intention is to
> allow incoming requests while shutting down, but simply leave them
> without any response. Let's do exactly that, and if qmp_dispatcher_co
> coroutine pointer has already been set to NULL, let's simply skip the
> aio_co_wake() part.
>
Cc: qemu-stable@nongnu.org
> Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
> ---
> monitor/qmp.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/monitor/qmp.c b/monitor/qmp.c
> index 2f46cf9e49..cb99a12d94 100644
> --- a/monitor/qmp.c
> +++ b/monitor/qmp.c
> @@ -356,7 +356,8 @@ void qmp_dispatcher_co_wake(void)
> /* Write request before reading qmp_dispatcher_co_busy. */
> smp_mb__before_rmw();
>
> - if (!qatomic_xchg(&qmp_dispatcher_co_busy, true)) {
> + if (!qatomic_xchg(&qmp_dispatcher_co_busy, true) &&
> + qatomic_read(&qmp_dispatcher_co)) {
> aio_co_wake(qmp_dispatcher_co);
> }
> }
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2025-05-02 22:07 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-02 21:47 [PATCH 0/1] Fix racy SEGFAULT upon monitor_cleanup() andrey.drobyshev
2025-05-02 21:47 ` [PATCH 1/1] monitor: don't wake up qmp_dispatcher_co coroutine upon cleanup andrey.drobyshev
2025-05-02 22:06 ` Philippe Mathieu-Daudé
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).