qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/1] Fix racy SEGFAULT upon monitor_cleanup()
@ 2025-05-02 21:47 andrey.drobyshev
  2025-05-02 21:47 ` [PATCH 1/1] monitor: don't wake up qmp_dispatcher_co coroutine upon cleanup andrey.drobyshev
  0 siblings, 1 reply; 3+ messages in thread
From: andrey.drobyshev @ 2025-05-02 21:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: armbru, pbonzini, andrey.drobyshev, den

From: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>

There's a race in monitor cleanup code which might result into SEGFAULT.
When monitor_cleanup() is launched, qmp_dispatcher_co coroutine pointer
is set to NULL (see Paolo's commit 3e6bed61 ("monitor: cleanup detection of
qmp_dispatcher_co shutting down")).  If after that we manage to send
another QMP command while monitor is shutting down, we might end up
getting SIGSEGV on aio_co_wake(NULL).

Backtrace:

> #0  0x00005565359b7e19 in aio_co_wake (co=0x0) at ../util/async.c:693
> #1  0x0000556535814903 in qmp_dispatcher_co_wake () at ../monitor/qmp.c:361
> #2  0x0000556535814c17 in handle_qmp_command (opaque=0x55653767bb10, req=0x7fd574007090, err=0x0)
>     at ../monitor/qmp.c:426
> #3  0x000055653598d8f1 in json_message_process_token
>     (lexer=0x55653767bbd8, input=0x7fd5740055a0, type=JSON_RCURLY, x=61, y=0) at ../qobject/json-streamer.c:99
> #4  0x00005565359dd68b in json_lexer_feed_char (lexer=0x55653767bbd8, ch=125 '}', flush=false)
>     at ../qobject/json-lexer.c:313
> #5  0x00005565359dd7f7 in json_lexer_feed (lexer=0x55653767bbd8, buffer=0x7fd57ecf4e50 "}", size=1)
>     at ../qobject/json-lexer.c:350
> #6  0x000055653598d9e3 in json_message_parser_feed (parser=0x55653767bbc0, buffer=0x7fd57ecf4e50 "}", size=1)
>     at ../qobject/json-streamer.c:121
> #7  0x0000556535814c72 in monitor_qmp_read (opaque=0x55653767bb10, buf=0x7fd57ecf4e50 "}", size=1)
>     at ../monitor/qmp.c:433
> #8  0x000055653580b4ca in qemu_chr_be_write_impl (s=0x5565374466a0, buf=0x7fd57ecf4e50 "}", len=1)
>     at ../chardev/char.c:214
> #9  0x000055653580b53b in qemu_chr_be_write (s=0x5565374466a0, buf=0x7fd57ecf4e50 "}", len=1) at ../chardev/char.c:226
> #10 0x0000556535806aea in tcp_chr_read (chan=0x7fd574001600, cond=G_IO_IN, opaque=0x5565374466a0)
>     at ../chardev/char-socket.c:520
> #11 0x00005565356d7f6e in qio_channel_fd_source_dispatch
>     (source=0x7fd574005780, callback=0x55653580696f <tcp_chr_read>, user_data=0x5565374466a0)
>     at ../io/channel-watch.c:84
> #12 0x00007fd590ae6f4f in g_main_dispatch (context=0x556537671f50) at ../glib/gmain.c:3364
> #13 g_main_context_dispatch (context=0x556537671f50) at ../glib/gmain.c:4079
> #14 0x00007fd590b3c268 in g_main_context_iterate.constprop.0
>     (context=0x556537671f50, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>)
>     at ../glib/gmain.c:4155
> #15 0x00007fd590ae65a3 in g_main_loop_run (loop=0x556537672070) at ../glib/gmain.c:4353
> #16 0x000055653570dbb0 in iothread_run (opaque=0x556537345660) at ../iothread.c:70
> #17 0x000055653599baf6 in qemu_thread_start (args=0x556537672090) at ../util/qemu-thread-posix.c:541
> #18 0x00007fd58f49f7f2 in start_thread (arg=<optimized out>) at pthread_create.c:443
> #19 0x00007fd58f43f450 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Unfortunately I couldn't reproduce the issue cleanly as the race window
is quite small.  If I add sleep as follows, extending the race window
and imitating the thread being preempted by the kernel scheduler:

> diff --git a/monitor/qmp.c b/monitor/qmp.c
> index cb99a12d94..dd608ee717 100644
> --- a/monitor/qmp.c
> +++ b/monitor/qmp.c
> @@ -349,6 +349,7 @@ void coroutine_fn monitor_qmp_dispatcher_co(void *data)
>          qmp_request_free(req_obj);
>      }
>      qatomic_set(&qmp_dispatcher_co, NULL);
> +    sleep(1);
>  }
>  
>  void qmp_dispatcher_co_wake(void)

Then the SEGFAULT can be reproduced like that:

> SRCDIR=/path/to/srcdir
> QMPSHELL=$SRCDIR/scripts/qmp/qmp-shell
> QMPSOCK=/path/to/qmpsock
> 
> $QMPSHELL -p $QMPSOCK <<EOF
>     system_powerdown
> EOF
> 
> while /bin/true ; do
>     $QMPSHELL -p $QMPSOCK <<EOF
>         query-status
> EOF
> done

Andrey Drobyshev (1):
  monitor: don't wake up qmp_dispatcher_co coroutine upon cleanup

 monitor/qmp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

-- 
2.43.5



^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH 1/1] monitor: don't wake up qmp_dispatcher_co coroutine upon cleanup
  2025-05-02 21:47 [PATCH 0/1] Fix racy SEGFAULT upon monitor_cleanup() andrey.drobyshev
@ 2025-05-02 21:47 ` andrey.drobyshev
  2025-05-02 22:06   ` Philippe Mathieu-Daudé
  0 siblings, 1 reply; 3+ messages in thread
From: andrey.drobyshev @ 2025-05-02 21:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: armbru, pbonzini, andrey.drobyshev, den

From: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>

Since the commit 3e6bed61 ("monitor: cleanup detection of qmp_dispatcher_co
shutting down"), coroutine pointer qmp_dispatcher_co is set to NULL upon
cleanup.  If a QMP command is sent after monitor_cleanup() (e.g. after
shutdown), this may lead to SEGFAULT on aio_co_wake(NULL).

As mentioned in the comment inside monitor_cleanup(), the intention is to
allow incoming requests while shutting down, but simply leave them
without any response.  Let's do exactly that, and if qmp_dispatcher_co
coroutine pointer has already been set to NULL, let's simply skip the
aio_co_wake() part.

Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
---
 monitor/qmp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/monitor/qmp.c b/monitor/qmp.c
index 2f46cf9e49..cb99a12d94 100644
--- a/monitor/qmp.c
+++ b/monitor/qmp.c
@@ -356,7 +356,8 @@ void qmp_dispatcher_co_wake(void)
     /* Write request before reading qmp_dispatcher_co_busy.  */
     smp_mb__before_rmw();
 
-    if (!qatomic_xchg(&qmp_dispatcher_co_busy, true)) {
+    if (!qatomic_xchg(&qmp_dispatcher_co_busy, true) &&
+            qatomic_read(&qmp_dispatcher_co)) {
         aio_co_wake(qmp_dispatcher_co);
     }
 }
-- 
2.43.5



^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH 1/1] monitor: don't wake up qmp_dispatcher_co coroutine upon cleanup
  2025-05-02 21:47 ` [PATCH 1/1] monitor: don't wake up qmp_dispatcher_co coroutine upon cleanup andrey.drobyshev
@ 2025-05-02 22:06   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 3+ messages in thread
From: Philippe Mathieu-Daudé @ 2025-05-02 22:06 UTC (permalink / raw)
  To: andrey.drobyshev, qemu-devel; +Cc: armbru, pbonzini, den, qemu-block

On 2/5/25 23:47, andrey.drobyshev@virtuozzo.com wrote:
> From: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
> 
> Since the commit 3e6bed61 ("monitor: cleanup detection of qmp_dispatcher_co
> shutting down"), coroutine pointer qmp_dispatcher_co is set to NULL upon
> cleanup.  If a QMP command is sent after monitor_cleanup() (e.g. after
> shutdown), this may lead to SEGFAULT on aio_co_wake(NULL).
> 
> As mentioned in the comment inside monitor_cleanup(), the intention is to
> allow incoming requests while shutting down, but simply leave them
> without any response.  Let's do exactly that, and if qmp_dispatcher_co
> coroutine pointer has already been set to NULL, let's simply skip the
> aio_co_wake() part.
> 

Cc: qemu-stable@nongnu.org

> Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
> ---
>   monitor/qmp.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/monitor/qmp.c b/monitor/qmp.c
> index 2f46cf9e49..cb99a12d94 100644
> --- a/monitor/qmp.c
> +++ b/monitor/qmp.c
> @@ -356,7 +356,8 @@ void qmp_dispatcher_co_wake(void)
>       /* Write request before reading qmp_dispatcher_co_busy.  */
>       smp_mb__before_rmw();
>   
> -    if (!qatomic_xchg(&qmp_dispatcher_co_busy, true)) {
> +    if (!qatomic_xchg(&qmp_dispatcher_co_busy, true) &&
> +            qatomic_read(&qmp_dispatcher_co)) {
>           aio_co_wake(qmp_dispatcher_co);
>       }
>   }



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2025-05-02 22:07 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-02 21:47 [PATCH 0/1] Fix racy SEGFAULT upon monitor_cleanup() andrey.drobyshev
2025-05-02 21:47 ` [PATCH 1/1] monitor: don't wake up qmp_dispatcher_co coroutine upon cleanup andrey.drobyshev
2025-05-02 22:06   ` Philippe Mathieu-Daudé

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).