qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] chardev: Fix issues found by vhost-user-test
@ 2025-05-15 22:20 Fabiano Rosas
  2025-05-15 22:20 ` [PATCH 1/4] chardev: Fix QIOChannel refcount Fabiano Rosas
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Fabiano Rosas @ 2025-05-15 22:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: marcandre.lureau, berrange, pbonzini

Running vhost-user-test with ASAN on a loaded machine reveals several
intermittent issues. These show up every time I test the qtest tree so
I'm trying to get rid of them.

1- UAF of IOWatchPoll.
   This one is self explanatory, ASAN caught it.

2- Reference counting issues in glib. It seems it's possible to unref
   a source while adding a callback to it, and glib asserts. This
   shows up on all architectures, only on the ASAN build after
   hundreds of iterations.

3- Extra yank_unregister_function call leads to abort(). This shows up
   on all architectures, but it's quite hidden due to vhost-user-test
   using a dedicated server thread which dies and causes timeouts in
   the test.

   Manifests as assert(s->fds_num) failing. Only on the ASAN build,
   after tens of iterations (quite common).

Thanks

Fabiano Rosas (4):
  chardev: Fix QIOChannel refcount
  chardev: Don't attempt to unregister yank function more than once
  chardev: Consolidate yank registration
  chardev: Introduce a lock for hup_source

 chardev/char-io.c      |  5 +++++
 chardev/char-socket.c  | 38 ++++++++++++--------------------------
 chardev/char.c         |  2 ++
 include/chardev/char.h |  1 +
 4 files changed, 20 insertions(+), 26 deletions(-)

-- 
2.35.3



^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/4] chardev: Fix QIOChannel refcount
  2025-05-15 22:20 [PATCH 0/4] chardev: Fix issues found by vhost-user-test Fabiano Rosas
@ 2025-05-15 22:20 ` Fabiano Rosas
  2025-05-19 10:49   ` Daniel P. Berrangé
  2025-05-15 22:20 ` [PATCH 2/4] chardev: Don't attempt to unregister yank function more than once Fabiano Rosas
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Fabiano Rosas @ 2025-05-15 22:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: marcandre.lureau, berrange, pbonzini

The IOWatchPoll holds a reference to the iochannel while the "child"
source (iwp->src) is removed from the context and freed. Freeing the
source leads to the iochannel being also freed at
qio_channel_fd_source_finalize().

Later, io_watch_poll_prepare() tries to create another source with the
same iochannel and hits an use after free:

==8241==ERROR: AddressSanitizer: heap-use-after-free on address 0x514000000040
READ of size 8 at 0x514000000040 thread T2
    #0 0x561c2d272fcd in object_get_class  ../qom/object.c:1043:17
    #1 0x561c2d338f84 in QIO_CHANNEL_GET_CLASS  include/io/channel.h:29:1
    #2 0x561c2d33b26f in qio_channel_create_watch  ../io/channel.c:388:30
    #3 0x561c2d2f0993 in io_watch_poll_prepare  ../chardev/char-io.c:65:20
    ...

0x514000000040 is located 0 bytes inside of 392-byte region [0x514000000040,0x5140000001c8)
freed by thread T2 here:
    #0 0x561c2d2319a5 in free
    #1 0x7fb2c0926638 in g_free
    #2 0x561c2d276507 in object_finalize  ../qom/object.c:734:9
    #3 0x561c2d271d0d in object_unref  ../qom/object.c:1231:9
    #4 0x561c2d32ef1d in qio_channel_fd_source_finalize  ../io/channel-watch.c:95:5
    #5 0x7fb2c091d124 in g_source_unref_internal ../glib/gmain.c:2298
    #6 0x561c2d2f0b6c in io_watch_poll_prepare  ../chardev/char-io.c:71:9
    ...

previously allocated by thread T3 (connect) here:
    #0 0x561c2d231c69 in malloc
    #1 0x7fb2c0926518 in g_malloc
    #2 0x561c2d27246e in object_new_with_type  ../qom/object.c:767:15
    #3 0x561c2d272530 in object_new  ../qom/object.c:789:12
    #4 0x561c2d320193 in qio_channel_socket_new  ../io/channel-socket.c:64:31
    #5 0x561c2d308013 in tcp_chr_connect_client_async  ../chardev/char-socket.c:1181:12
    #6 0x561c2d3002e7 in qmp_chardev_open_socket_client  ../chardev/char-socket.c:1281:9
    ...

Fix the issue by incrementing the iochannel reference count when the
IOWatchPoll takes a reference and decrementing when it is finalized.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 chardev/char-io.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/chardev/char-io.c b/chardev/char-io.c
index 3be17b51ca..d9b11f335f 100644
--- a/chardev/char-io.c
+++ b/chardev/char-io.c
@@ -88,6 +88,9 @@ static gboolean io_watch_poll_dispatch(GSource *source, GSourceFunc callback,
 static void io_watch_poll_finalize(GSource *source)
 {
     IOWatchPoll *iwp = io_watch_poll_from_source(source);
+
+    object_unref(OBJECT(iwp->ioc));
+
     if (iwp->src) {
         g_source_destroy(iwp->src);
         g_source_unref(iwp->src);
@@ -117,6 +120,8 @@ GSource *io_add_watch_poll(Chardev *chr,
     iwp->fd_can_read = fd_can_read;
     iwp->opaque = user_data;
     iwp->ioc = ioc;
+    object_ref(OBJECT(iwp->ioc));
+
     iwp->fd_read = (GSourceFunc) fd_read;
     iwp->src = NULL;
     iwp->context = context;
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/4] chardev: Don't attempt to unregister yank function more than once
  2025-05-15 22:20 [PATCH 0/4] chardev: Fix issues found by vhost-user-test Fabiano Rosas
  2025-05-15 22:20 ` [PATCH 1/4] chardev: Fix QIOChannel refcount Fabiano Rosas
@ 2025-05-15 22:20 ` Fabiano Rosas
  2025-05-19 10:52   ` Daniel P. Berrangé
  2025-05-15 22:20 ` [PATCH 3/4] chardev: Consolidate yank registration Fabiano Rosas
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Fabiano Rosas @ 2025-05-15 22:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: marcandre.lureau, berrange, pbonzini

tcp_chr_free_connection() can be called multiple times in succession,
in which case the yank function will get as argument a NULL s->sioc
that has been cleared by the previous tcp_chr_free_connection() call.

This leads to an abort() at yank_unregister_function().

 #0  __GI_raise (sig=6) at ../sysdeps/unix/sysv/linux/raise.c:51
 #1  __GI_abort () at abort.c:79
 #2  qtest_check_status (s=0x513000005600) at ../tests/qtest/libqtest.c:209
 #3  qtest_wait_qemu (s=0x513000005600) at ../tests/qtest/libqtest.c:273
 #4  qtest_kill_qemu (s=0x513000005600) at ../tests/qtest/libqtest.c:285
 #5  kill_qemu_hook_func (s=0x513000005600) at ../tests/qtest/libqtest.c:294
 #6  g_hook_list_invoke (hook_list=0x55ea9cc750c0 <abrt_hooks>, may_recurse=0) at ../glib/ghook.c:534
 #7  sigabrt_handler (signo=6) at ../tests/qtest/libqtest.c:299
 #8  <signal handler called>
 #9  __GI_raise (sig=6) at ../sysdeps/unix/sysv/linux/raise.c:51
 #10 __GI_abort () at abort.c:79
 #11 yank_unregister_function (instance=0x7fb26f2ea9a0,
     func=0x55ea9bcc0a10 <char_socket_yank_iochannel>, opaque=0x0) at
     ../util/yank.c:151
 #12 tcp_chr_free_connection (chr=0x51300000ffc0) at ../chardev/char-socket.c:385
 #13 tcp_chr_disconnect_locked (chr=0x51300000ffc0) at ../chardev/char-socket.c:477
 #14 tcp_chr_disconnect (chr=0x51300000ffc0) at ../chardev/char-socket.c:495
 #15 tcp_chr_hup (channel=0x514000000040, cond=G_IO_HUP, opaque=0x51300000ffc0) at ../chardev/char-socket.c:536
 #16 qio_channel_fd_source_dispatch (source=0x50c0000b5fc0, callback=0x55ea9bcd6770 <tcp_chr_hup>,
     user_data=0x51300000ffc0) at ../io/channel-watch.c:84
 #17 g_main_dispatch (context=0x50f000000040) at ../glib/gmain.c:3381
 #18 g_main_context_dispatch (context=context@entry=0x50f000000040) at ../glib/gmain.c:4099
 #19 g_main_context_iterate (context=0x50f000000040, block=block@entry=1, dispatch=dispatch@entry=1,
     self=<optimized out>) at ../glib/gmain.c:4175
 #20 g_main_loop_run (loop=0x502000055690) at ../glib/gmain.c:4373

Commit ebae6477dc ("chardev: check if the chardev is registered for
yanking") seems to have encountered a similar issue, but checking
s->registered_yank is not a complete solution because that flag
pertains to the yank instance, not to each individual function.

Skip the yank_unregister_function() in case s->sioc is already NULL,
which indicates the last yank function was already removed.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
Can tcp_chr_free_connection() race with itself? I'm assuming no.

Could we just make yank_unregister_instance() remove all yank
functions at once? Those asserts/abort in the yank code are a bit
masochistic.
---
 chardev/char-socket.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index e8dd2931dc..8ae225d953 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -378,7 +378,8 @@ static void tcp_chr_free_connection(Chardev *chr)
 
     tcp_set_msgfds(chr, NULL, 0);
     remove_fd_in_watch(chr);
-    if (s->registered_yank &&
+
+    if (s->registered_yank && s->sioc &&
         (s->state == TCP_CHARDEV_STATE_CONNECTING
         || s->state == TCP_CHARDEV_STATE_CONNECTED)) {
         yank_unregister_function(CHARDEV_YANK_INSTANCE(chr->label),
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/4] chardev: Consolidate yank registration
  2025-05-15 22:20 [PATCH 0/4] chardev: Fix issues found by vhost-user-test Fabiano Rosas
  2025-05-15 22:20 ` [PATCH 1/4] chardev: Fix QIOChannel refcount Fabiano Rosas
  2025-05-15 22:20 ` [PATCH 2/4] chardev: Don't attempt to unregister yank function more than once Fabiano Rosas
@ 2025-05-15 22:20 ` Fabiano Rosas
  2025-05-19 10:53   ` Daniel P. Berrangé
  2025-05-15 22:20 ` [PATCH 4/4] chardev: Introduce a lock for hup_source Fabiano Rosas
  2025-05-24 17:50 ` [PATCH 0/4] chardev: Fix issues found by vhost-user-test Marc-André Lureau
  4 siblings, 1 reply; 11+ messages in thread
From: Fabiano Rosas @ 2025-05-15 22:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: marcandre.lureau, berrange, pbonzini

There's currently five places where the yank function is being
registered and they all come right before tcp_chr_new_client(). Fold
them into it.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 chardev/char-socket.c | 31 ++++++-------------------------
 1 file changed, 6 insertions(+), 25 deletions(-)

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 8ae225d953..d16608f1ed 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -904,6 +904,12 @@ static int tcp_chr_new_client(Chardev *chr, QIOChannelSocket *sioc)
     s->sioc = sioc;
     object_ref(OBJECT(sioc));
 
+    if (s->registered_yank) {
+        yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
+                               char_socket_yank_iochannel,
+                               QIO_CHANNEL(sioc));
+    }
+
     qio_channel_set_blocking(s->ioc, false, NULL);
 
     if (s->do_nodelay) {
@@ -944,11 +950,6 @@ static int tcp_chr_add_client(Chardev *chr, int fd)
     }
     tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
     tcp_chr_set_client_ioc_name(chr, sioc);
-    if (s->registered_yank) {
-        yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
-                               char_socket_yank_iochannel,
-                               QIO_CHANNEL(sioc));
-    }
     ret = tcp_chr_new_client(chr, sioc);
     object_unref(OBJECT(sioc));
     return ret;
@@ -963,11 +964,6 @@ static void tcp_chr_accept(QIONetListener *listener,
 
     tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
     tcp_chr_set_client_ioc_name(chr, cioc);
-    if (s->registered_yank) {
-        yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
-                               char_socket_yank_iochannel,
-                               QIO_CHANNEL(cioc));
-    }
     tcp_chr_new_client(chr, cioc);
 }
 
@@ -983,11 +979,6 @@ static int tcp_chr_connect_client_sync(Chardev *chr, Error **errp)
         object_unref(OBJECT(sioc));
         return -1;
     }
-    if (s->registered_yank) {
-        yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
-                               char_socket_yank_iochannel,
-                               QIO_CHANNEL(sioc));
-    }
     tcp_chr_new_client(chr, sioc);
     object_unref(OBJECT(sioc));
     return 0;
@@ -1003,11 +994,6 @@ static void tcp_chr_accept_server_sync(Chardev *chr)
     tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
     sioc = qio_net_listener_wait_client(s->listener);
     tcp_chr_set_client_ioc_name(chr, sioc);
-    if (s->registered_yank) {
-        yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
-                               char_socket_yank_iochannel,
-                               QIO_CHANNEL(sioc));
-    }
     tcp_chr_new_client(chr, sioc);
     object_unref(OBJECT(sioc));
 }
@@ -1181,11 +1167,6 @@ static void tcp_chr_connect_client_async(Chardev *chr)
     tcp_chr_change_state(s, TCP_CHARDEV_STATE_CONNECTING);
     sioc = qio_channel_socket_new();
     tcp_chr_set_client_ioc_name(chr, sioc);
-    if (s->registered_yank) {
-        yank_register_function(CHARDEV_YANK_INSTANCE(chr->label),
-                               char_socket_yank_iochannel,
-                               QIO_CHANNEL(sioc));
-    }
     /*
      * Normally code would use the qio_channel_socket_connect_async
      * method which uses a QIOTask + qio_task_set_error internally
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/4] chardev: Introduce a lock for hup_source
  2025-05-15 22:20 [PATCH 0/4] chardev: Fix issues found by vhost-user-test Fabiano Rosas
                   ` (2 preceding siblings ...)
  2025-05-15 22:20 ` [PATCH 3/4] chardev: Consolidate yank registration Fabiano Rosas
@ 2025-05-15 22:20 ` Fabiano Rosas
  2025-05-19 11:00   ` Daniel P. Berrangé
  2025-05-24 17:50 ` [PATCH 0/4] chardev: Fix issues found by vhost-user-test Marc-André Lureau
  4 siblings, 1 reply; 11+ messages in thread
From: Fabiano Rosas @ 2025-05-15 22:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: marcandre.lureau, berrange, pbonzini

It's possible for the hup_source to have its reference decremented by
remove_hup_source() while it's still being added to the context,
leading to asserts in glib:

g_source_set_callback_indirect: assertion 'g_atomic_int_get
(&source->ref_count) > 0'

g_source_attach: assertion 'g_atomic_int_get (&source->ref_count) > 0'
failed

Add a lock to serialize removal and creation.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 chardev/char-socket.c  | 4 ++++
 chardev/char.c         | 2 ++
 include/chardev/char.h | 1 +
 3 files changed, 7 insertions(+)

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index d16608f1ed..88db9acd0d 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -374,7 +374,9 @@ static void tcp_chr_free_connection(Chardev *chr)
         s->read_msgfds_num = 0;
     }
 
+    qemu_mutex_lock(&chr->hup_source_lock);
     remove_hup_source(s);
+    qemu_mutex_unlock(&chr->hup_source_lock);
 
     tcp_set_msgfds(chr, NULL, 0);
     remove_fd_in_watch(chr);
@@ -613,6 +615,7 @@ static void update_ioc_handlers(SocketChardev *s)
                                      tcp_chr_read, chr,
                                      chr->gcontext);
 
+    qemu_mutex_lock(&chr->hup_source_lock);
     remove_hup_source(s);
     s->hup_source = qio_channel_create_watch(s->ioc, G_IO_HUP);
     /*
@@ -634,6 +637,7 @@ static void update_ioc_handlers(SocketChardev *s)
     g_source_set_callback(s->hup_source, (GSourceFunc)tcp_chr_hup,
                           chr, NULL);
     g_source_attach(s->hup_source, chr->gcontext);
+    qemu_mutex_unlock(&chr->hup_source_lock);
 }
 
 static void tcp_chr_connect(void *opaque)
diff --git a/chardev/char.c b/chardev/char.c
index bbebd246c3..d03f698b38 100644
--- a/chardev/char.c
+++ b/chardev/char.c
@@ -279,6 +279,7 @@ static void char_init(Object *obj)
     chr->handover_yank_instance = false;
     chr->logfd = -1;
     qemu_mutex_init(&chr->chr_write_lock);
+    qemu_mutex_init(&chr->hup_source_lock);
 
     /*
      * Assume if chr_update_read_handler is implemented it will
@@ -316,6 +317,7 @@ static void char_finalize(Object *obj)
         close(chr->logfd);
     }
     qemu_mutex_destroy(&chr->chr_write_lock);
+    qemu_mutex_destroy(&chr->hup_source_lock);
 }
 
 static const TypeInfo char_type_info = {
diff --git a/include/chardev/char.h b/include/chardev/char.h
index 429852f8d9..064184153d 100644
--- a/include/chardev/char.h
+++ b/include/chardev/char.h
@@ -60,6 +60,7 @@ struct Chardev {
     Object parent_obj;
 
     QemuMutex chr_write_lock;
+    QemuMutex hup_source_lock;
     CharBackend *be;
     char *label;
     char *filename;
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/4] chardev: Fix QIOChannel refcount
  2025-05-15 22:20 ` [PATCH 1/4] chardev: Fix QIOChannel refcount Fabiano Rosas
@ 2025-05-19 10:49   ` Daniel P. Berrangé
  0 siblings, 0 replies; 11+ messages in thread
From: Daniel P. Berrangé @ 2025-05-19 10:49 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, marcandre.lureau, pbonzini

On Thu, May 15, 2025 at 07:20:11PM -0300, Fabiano Rosas wrote:
> The IOWatchPoll holds a reference to the iochannel while the "child"
> source (iwp->src) is removed from the context and freed. Freeing the
> source leads to the iochannel being also freed at
> qio_channel_fd_source_finalize().
> 
> Later, io_watch_poll_prepare() tries to create another source with the
> same iochannel and hits an use after free:
> 
> ==8241==ERROR: AddressSanitizer: heap-use-after-free on address 0x514000000040
> READ of size 8 at 0x514000000040 thread T2
>     #0 0x561c2d272fcd in object_get_class  ../qom/object.c:1043:17
>     #1 0x561c2d338f84 in QIO_CHANNEL_GET_CLASS  include/io/channel.h:29:1
>     #2 0x561c2d33b26f in qio_channel_create_watch  ../io/channel.c:388:30
>     #3 0x561c2d2f0993 in io_watch_poll_prepare  ../chardev/char-io.c:65:20
>     ...
> 
> 0x514000000040 is located 0 bytes inside of 392-byte region [0x514000000040,0x5140000001c8)
> freed by thread T2 here:
>     #0 0x561c2d2319a5 in free
>     #1 0x7fb2c0926638 in g_free
>     #2 0x561c2d276507 in object_finalize  ../qom/object.c:734:9
>     #3 0x561c2d271d0d in object_unref  ../qom/object.c:1231:9
>     #4 0x561c2d32ef1d in qio_channel_fd_source_finalize  ../io/channel-watch.c:95:5
>     #5 0x7fb2c091d124 in g_source_unref_internal ../glib/gmain.c:2298
>     #6 0x561c2d2f0b6c in io_watch_poll_prepare  ../chardev/char-io.c:71:9
>     ...
> 
> previously allocated by thread T3 (connect) here:
>     #0 0x561c2d231c69 in malloc
>     #1 0x7fb2c0926518 in g_malloc
>     #2 0x561c2d27246e in object_new_with_type  ../qom/object.c:767:15
>     #3 0x561c2d272530 in object_new  ../qom/object.c:789:12
>     #4 0x561c2d320193 in qio_channel_socket_new  ../io/channel-socket.c:64:31
>     #5 0x561c2d308013 in tcp_chr_connect_client_async  ../chardev/char-socket.c:1181:12
>     #6 0x561c2d3002e7 in qmp_chardev_open_socket_client  ../chardev/char-socket.c:1281:9
>     ...
> 
> Fix the issue by incrementing the iochannel reference count when the
> IOWatchPoll takes a reference and decrementing when it is finalized.
> 
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>  chardev/char-io.c | 5 +++++
>  1 file changed, 5 insertions(+)

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 2/4] chardev: Don't attempt to unregister yank function more than once
  2025-05-15 22:20 ` [PATCH 2/4] chardev: Don't attempt to unregister yank function more than once Fabiano Rosas
@ 2025-05-19 10:52   ` Daniel P. Berrangé
  0 siblings, 0 replies; 11+ messages in thread
From: Daniel P. Berrangé @ 2025-05-19 10:52 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, marcandre.lureau, pbonzini

On Thu, May 15, 2025 at 07:20:12PM -0300, Fabiano Rosas wrote:
> tcp_chr_free_connection() can be called multiple times in succession,
> in which case the yank function will get as argument a NULL s->sioc
> that has been cleared by the previous tcp_chr_free_connection() call.
> 
> This leads to an abort() at yank_unregister_function().
> 
>  #0  __GI_raise (sig=6) at ../sysdeps/unix/sysv/linux/raise.c:51
>  #1  __GI_abort () at abort.c:79
>  #2  qtest_check_status (s=0x513000005600) at ../tests/qtest/libqtest.c:209
>  #3  qtest_wait_qemu (s=0x513000005600) at ../tests/qtest/libqtest.c:273
>  #4  qtest_kill_qemu (s=0x513000005600) at ../tests/qtest/libqtest.c:285
>  #5  kill_qemu_hook_func (s=0x513000005600) at ../tests/qtest/libqtest.c:294
>  #6  g_hook_list_invoke (hook_list=0x55ea9cc750c0 <abrt_hooks>, may_recurse=0) at ../glib/ghook.c:534
>  #7  sigabrt_handler (signo=6) at ../tests/qtest/libqtest.c:299
>  #8  <signal handler called>
>  #9  __GI_raise (sig=6) at ../sysdeps/unix/sysv/linux/raise.c:51
>  #10 __GI_abort () at abort.c:79
>  #11 yank_unregister_function (instance=0x7fb26f2ea9a0,
>      func=0x55ea9bcc0a10 <char_socket_yank_iochannel>, opaque=0x0) at
>      ../util/yank.c:151
>  #12 tcp_chr_free_connection (chr=0x51300000ffc0) at ../chardev/char-socket.c:385
>  #13 tcp_chr_disconnect_locked (chr=0x51300000ffc0) at ../chardev/char-socket.c:477
>  #14 tcp_chr_disconnect (chr=0x51300000ffc0) at ../chardev/char-socket.c:495
>  #15 tcp_chr_hup (channel=0x514000000040, cond=G_IO_HUP, opaque=0x51300000ffc0) at ../chardev/char-socket.c:536
>  #16 qio_channel_fd_source_dispatch (source=0x50c0000b5fc0, callback=0x55ea9bcd6770 <tcp_chr_hup>,
>      user_data=0x51300000ffc0) at ../io/channel-watch.c:84
>  #17 g_main_dispatch (context=0x50f000000040) at ../glib/gmain.c:3381
>  #18 g_main_context_dispatch (context=context@entry=0x50f000000040) at ../glib/gmain.c:4099
>  #19 g_main_context_iterate (context=0x50f000000040, block=block@entry=1, dispatch=dispatch@entry=1,
>      self=<optimized out>) at ../glib/gmain.c:4175
>  #20 g_main_loop_run (loop=0x502000055690) at ../glib/gmain.c:4373
> 
> Commit ebae6477dc ("chardev: check if the chardev is registered for
> yanking") seems to have encountered a similar issue, but checking
> s->registered_yank is not a complete solution because that flag
> pertains to the yank instance, not to each individual function.
> 
> Skip the yank_unregister_function() in case s->sioc is already NULL,
> which indicates the last yank function was already removed.
> 
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
> Can tcp_chr_free_connection() race with itself? I'm assuming no.
> 
> Could we just make yank_unregister_instance() remove all yank
> functions at once? Those asserts/abort in the yank code are a bit
> masochistic.
> ---
>  chardev/char-socket.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/4] chardev: Consolidate yank registration
  2025-05-15 22:20 ` [PATCH 3/4] chardev: Consolidate yank registration Fabiano Rosas
@ 2025-05-19 10:53   ` Daniel P. Berrangé
  0 siblings, 0 replies; 11+ messages in thread
From: Daniel P. Berrangé @ 2025-05-19 10:53 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, marcandre.lureau, pbonzini

On Thu, May 15, 2025 at 07:20:13PM -0300, Fabiano Rosas wrote:
> There's currently five places where the yank function is being
> registered and they all come right before tcp_chr_new_client(). Fold
> them into it.
> 
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>  chardev/char-socket.c | 31 ++++++-------------------------
>  1 file changed, 6 insertions(+), 25 deletions(-)

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4/4] chardev: Introduce a lock for hup_source
  2025-05-15 22:20 ` [PATCH 4/4] chardev: Introduce a lock for hup_source Fabiano Rosas
@ 2025-05-19 11:00   ` Daniel P. Berrangé
  2025-05-19 14:21     ` Fabiano Rosas
  0 siblings, 1 reply; 11+ messages in thread
From: Daniel P. Berrangé @ 2025-05-19 11:00 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, marcandre.lureau, pbonzini

On Thu, May 15, 2025 at 07:20:14PM -0300, Fabiano Rosas wrote:
> It's possible for the hup_source to have its reference decremented by
> remove_hup_source() while it's still being added to the context,
> leading to asserts in glib:

IIUC this must mean that

  tcp_chr_free_connection

is being called concurrently with

  update_ioc_handlers

I'm wondering if that is really intended, or a sign of a deeper
bug that we'll just paper over if we add the mutex proposed here.

Are you able to provide stack traces showing the 2 concurrent
operations that are triggering this problem ?

> 
> g_source_set_callback_indirect: assertion 'g_atomic_int_get
> (&source->ref_count) > 0'
> 
> g_source_attach: assertion 'g_atomic_int_get (&source->ref_count) > 0'
> failed
> 
> Add a lock to serialize removal and creation.
> 
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>  chardev/char-socket.c  | 4 ++++
>  chardev/char.c         | 2 ++
>  include/chardev/char.h | 1 +
>  3 files changed, 7 insertions(+)
> 
> diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> index d16608f1ed..88db9acd0d 100644
> --- a/chardev/char-socket.c
> +++ b/chardev/char-socket.c
> @@ -374,7 +374,9 @@ static void tcp_chr_free_connection(Chardev *chr)
>          s->read_msgfds_num = 0;
>      }
>  
> +    qemu_mutex_lock(&chr->hup_source_lock);
>      remove_hup_source(s);
> +    qemu_mutex_unlock(&chr->hup_source_lock);
>  
>      tcp_set_msgfds(chr, NULL, 0);
>      remove_fd_in_watch(chr);
> @@ -613,6 +615,7 @@ static void update_ioc_handlers(SocketChardev *s)
>                                       tcp_chr_read, chr,
>                                       chr->gcontext);
>  
> +    qemu_mutex_lock(&chr->hup_source_lock);
>      remove_hup_source(s);
>      s->hup_source = qio_channel_create_watch(s->ioc, G_IO_HUP);
>      /*
> @@ -634,6 +637,7 @@ static void update_ioc_handlers(SocketChardev *s)
>      g_source_set_callback(s->hup_source, (GSourceFunc)tcp_chr_hup,
>                            chr, NULL);
>      g_source_attach(s->hup_source, chr->gcontext);
> +    qemu_mutex_unlock(&chr->hup_source_lock);
>  }
>  
>  static void tcp_chr_connect(void *opaque)
> diff --git a/chardev/char.c b/chardev/char.c
> index bbebd246c3..d03f698b38 100644
> --- a/chardev/char.c
> +++ b/chardev/char.c
> @@ -279,6 +279,7 @@ static void char_init(Object *obj)
>      chr->handover_yank_instance = false;
>      chr->logfd = -1;
>      qemu_mutex_init(&chr->chr_write_lock);
> +    qemu_mutex_init(&chr->hup_source_lock);
>  
>      /*
>       * Assume if chr_update_read_handler is implemented it will
> @@ -316,6 +317,7 @@ static void char_finalize(Object *obj)
>          close(chr->logfd);
>      }
>      qemu_mutex_destroy(&chr->chr_write_lock);
> +    qemu_mutex_destroy(&chr->hup_source_lock);
>  }
>  
>  static const TypeInfo char_type_info = {
> diff --git a/include/chardev/char.h b/include/chardev/char.h
> index 429852f8d9..064184153d 100644
> --- a/include/chardev/char.h
> +++ b/include/chardev/char.h
> @@ -60,6 +60,7 @@ struct Chardev {
>      Object parent_obj;
>  
>      QemuMutex chr_write_lock;
> +    QemuMutex hup_source_lock;
>      CharBackend *be;
>      char *label;
>      char *filename;
> -- 
> 2.35.3
> 

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 4/4] chardev: Introduce a lock for hup_source
  2025-05-19 11:00   ` Daniel P. Berrangé
@ 2025-05-19 14:21     ` Fabiano Rosas
  0 siblings, 0 replies; 11+ messages in thread
From: Fabiano Rosas @ 2025-05-19 14:21 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: qemu-devel, marcandre.lureau, pbonzini

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Thu, May 15, 2025 at 07:20:14PM -0300, Fabiano Rosas wrote:
>> It's possible for the hup_source to have its reference decremented by
>> remove_hup_source() while it's still being added to the context,
>> leading to asserts in glib:
>
> IIUC this must mean that
>
>   tcp_chr_free_connection
>
> is being called concurrently with
>
>   update_ioc_handlers
>
> I'm wondering if that is really intended, or a sign of a deeper
> bug that we'll just paper over if we add the mutex proposed here.
>

Yeah... I can't tell, I'm new to this code. But I agree that this smells
of a bug somewhere else.

> Are you able to provide stack traces showing the 2 concurrent
> operations that are triggering this problem ?
>

I wasn't able to, it triggers in the glib subprocess which is a pain to
debug. I'll give it another try now that there's fixes for the other
bugs.

>> 
>> g_source_set_callback_indirect: assertion 'g_atomic_int_get
>> (&source->ref_count) > 0'
>> 
>> g_source_attach: assertion 'g_atomic_int_get (&source->ref_count) > 0'
>> failed
>> 
>> Add a lock to serialize removal and creation.
>> 
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> ---
>>  chardev/char-socket.c  | 4 ++++
>>  chardev/char.c         | 2 ++
>>  include/chardev/char.h | 1 +
>>  3 files changed, 7 insertions(+)
>> 
>> diff --git a/chardev/char-socket.c b/chardev/char-socket.c
>> index d16608f1ed..88db9acd0d 100644
>> --- a/chardev/char-socket.c
>> +++ b/chardev/char-socket.c
>> @@ -374,7 +374,9 @@ static void tcp_chr_free_connection(Chardev *chr)
>>          s->read_msgfds_num = 0;
>>      }
>>  
>> +    qemu_mutex_lock(&chr->hup_source_lock);
>>      remove_hup_source(s);
>> +    qemu_mutex_unlock(&chr->hup_source_lock);
>>  
>>      tcp_set_msgfds(chr, NULL, 0);
>>      remove_fd_in_watch(chr);
>> @@ -613,6 +615,7 @@ static void update_ioc_handlers(SocketChardev *s)
>>                                       tcp_chr_read, chr,
>>                                       chr->gcontext);
>>  
>> +    qemu_mutex_lock(&chr->hup_source_lock);
>>      remove_hup_source(s);
>>      s->hup_source = qio_channel_create_watch(s->ioc, G_IO_HUP);
>>      /*
>> @@ -634,6 +637,7 @@ static void update_ioc_handlers(SocketChardev *s)
>>      g_source_set_callback(s->hup_source, (GSourceFunc)tcp_chr_hup,
>>                            chr, NULL);
>>      g_source_attach(s->hup_source, chr->gcontext);
>> +    qemu_mutex_unlock(&chr->hup_source_lock);
>>  }
>>  
>>  static void tcp_chr_connect(void *opaque)
>> diff --git a/chardev/char.c b/chardev/char.c
>> index bbebd246c3..d03f698b38 100644
>> --- a/chardev/char.c
>> +++ b/chardev/char.c
>> @@ -279,6 +279,7 @@ static void char_init(Object *obj)
>>      chr->handover_yank_instance = false;
>>      chr->logfd = -1;
>>      qemu_mutex_init(&chr->chr_write_lock);
>> +    qemu_mutex_init(&chr->hup_source_lock);
>>  
>>      /*
>>       * Assume if chr_update_read_handler is implemented it will
>> @@ -316,6 +317,7 @@ static void char_finalize(Object *obj)
>>          close(chr->logfd);
>>      }
>>      qemu_mutex_destroy(&chr->chr_write_lock);
>> +    qemu_mutex_destroy(&chr->hup_source_lock);
>>  }
>>  
>>  static const TypeInfo char_type_info = {
>> diff --git a/include/chardev/char.h b/include/chardev/char.h
>> index 429852f8d9..064184153d 100644
>> --- a/include/chardev/char.h
>> +++ b/include/chardev/char.h
>> @@ -60,6 +60,7 @@ struct Chardev {
>>      Object parent_obj;
>>  
>>      QemuMutex chr_write_lock;
>> +    QemuMutex hup_source_lock;
>>      CharBackend *be;
>>      char *label;
>>      char *filename;
>> -- 
>> 2.35.3
>> 
>
> With regards,
> Daniel


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/4] chardev: Fix issues found by vhost-user-test
  2025-05-15 22:20 [PATCH 0/4] chardev: Fix issues found by vhost-user-test Fabiano Rosas
                   ` (3 preceding siblings ...)
  2025-05-15 22:20 ` [PATCH 4/4] chardev: Introduce a lock for hup_source Fabiano Rosas
@ 2025-05-24 17:50 ` Marc-André Lureau
  4 siblings, 0 replies; 11+ messages in thread
From: Marc-André Lureau @ 2025-05-24 17:50 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, berrange, pbonzini

Hi

On Fri, May 16, 2025 at 12:21 AM Fabiano Rosas <farosas@suse.de> wrote:
>
> Running vhost-user-test with ASAN on a loaded machine reveals several
> intermittent issues. These show up every time I test the qtest tree so
> I'm trying to get rid of them.
>
> 1- UAF of IOWatchPoll.
>    This one is self explanatory, ASAN caught it.
>
> 2- Reference counting issues in glib. It seems it's possible to unref
>    a source while adding a callback to it, and glib asserts. This
>    shows up on all architectures, only on the ASAN build after
>    hundreds of iterations.
>
> 3- Extra yank_unregister_function call leads to abort(). This shows up
>    on all architectures, but it's quite hidden due to vhost-user-test
>    using a dedicated server thread which dies and causes timeouts in
>    the test.
>
>    Manifests as assert(s->fds_num) failing. Only on the ASAN build,
>    after tens of iterations (quite common).
>
> Thanks
>
> Fabiano Rosas (4):
>   chardev: Fix QIOChannel refcount
>   chardev: Don't attempt to unregister yank function more than once
>   chardev: Consolidate yank registration
>   chardev: Introduce a lock for hup_source

Daniel, would you take the first 3 (or all) patches in your next I/O PR ?

>
>  chardev/char-io.c      |  5 +++++
>  chardev/char-socket.c  | 38 ++++++++++++--------------------------
>  chardev/char.c         |  2 ++
>  include/chardev/char.h |  1 +
>  4 files changed, 20 insertions(+), 26 deletions(-)
>
> --
> 2.35.3
>
>


-- 
Marc-André Lureau


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2025-05-24 17:50 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-15 22:20 [PATCH 0/4] chardev: Fix issues found by vhost-user-test Fabiano Rosas
2025-05-15 22:20 ` [PATCH 1/4] chardev: Fix QIOChannel refcount Fabiano Rosas
2025-05-19 10:49   ` Daniel P. Berrangé
2025-05-15 22:20 ` [PATCH 2/4] chardev: Don't attempt to unregister yank function more than once Fabiano Rosas
2025-05-19 10:52   ` Daniel P. Berrangé
2025-05-15 22:20 ` [PATCH 3/4] chardev: Consolidate yank registration Fabiano Rosas
2025-05-19 10:53   ` Daniel P. Berrangé
2025-05-15 22:20 ` [PATCH 4/4] chardev: Introduce a lock for hup_source Fabiano Rosas
2025-05-19 11:00   ` Daniel P. Berrangé
2025-05-19 14:21     ` Fabiano Rosas
2025-05-24 17:50 ` [PATCH 0/4] chardev: Fix issues found by vhost-user-test Marc-André Lureau

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).