From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: Fabiano Rosas <farosas@suse.de>,
qemu-devel@nongnu.org, marcandre.lureau@redhat.com,
pbonzini@redhat.com
Subject: Re: [PATCH 4/4] chardev: Introduce a lock for hup_source
Date: Fri, 6 Mar 2026 14:49:22 +0000 [thread overview]
Message-ID: <aarpcvImyt35Hido@redhat.com> (raw)
In-Reply-To: <CAFEAcA88RR05VXhz=j_+EiDYHUUtsMW0kX9PV04B1toK_dd95Q@mail.gmail.com>
On Fri, Mar 06, 2026 at 01:57:45PM +0000, Peter Maydell wrote:
> On Wed, 4 Mar 2026 at 10:34, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > On Sat, Feb 28, 2026 at 03:16:11PM +0000, Peter Maydell wrote:
> > > I had a look at this, and although I couldn't get a specific
> > > backtrace, I got enough information from asan leak backtraces
> > > and looking at the code that I think I understand where the race
> > > is here:
> > >
> > > vhost-user-test.c:test_server_create_chr() sets up a chardev with:
> > > qemu_chr_new()
> > > qemu_chr_fe_init()
> > > qemu_chr_fe_set_handlers()
> > >
> > > For the "connect-fail" and "flags-mismatch" tests, it includes in
> > > the options to the chardev "reconnect-ms=1000".
> > >
> > > qemu_chr_new() ends up in tcp_chr_open(), which calls
> > > qmp_chardev_open_socket_client().
> > >
> > > If reconnect_ms is not 0, qmp_chardev_open_socket_client() calls
> > > tcp_chr_connect_client_async() instead of doing the connect synchronously.
> > > That kicks off a qio_task_new() to do the connecting, and immediately
> > > returns, so we return successfully from qemu_chr_new(). So
> > > test_server_create_chr() will continue execution into qemu_chr_fe_init()
> > > and qemu_chr_fe_set_handlers().
> > >
> > > qemu_chr_fe_set_handlers() calls qemu_chr_be_update_read_handlers(), which
> > > calls the chr_update_read_handler method for the ChardevClass, which here
> > > is tcp_chr_update_read_handler(). That function calls update_ioc_handlers().
> > >
> > > Meanwhile, the thread that got kicked off to do the connect is
> > > running in parallel. If the connect succeeds then qemu_chr_socket_conneted()
> > > will call tcp_chr_connect(), which will also call
> > > update_ioc_handlers(). (If the connect fails then presumably we end
> > > up in the tcp_char_disconnect or otherwise destroying the chardev,
> > > though I haven't figured out exactly what happens in this case.)
> > >
> > > None of this appears to be taking any kind of lock while it modifies
> > > the SocketChardev fields, so we could have two threads in
> > > update_ioc_handlers() simultaneously.
> >
> > It is subtle, as the QIOTask object deals with 2 separate callbacks.
> >
> > The callback that is provided to qio_task_new() is what is invoked when
> > the task is marked as completed. This is guaranteed to always be
> > invoked in main event loop context.
> >
> > The callback that is provided to qio_task_run_in_thread() is what runs
> > in the throwaway background thread. When this callback returns, the
> > QIOTask creates a glib idle source in the main event loop to run the
> > completion callback.
> >
> >
> > So wrt char-socket.c, tcp_chr_connect_client_task will run in the
> > background thread, but it only calls QIOChannel APIs so is safe.
> >
> > The qemu_chr_socket_connected method will run in main event loop
> > context, so it does not need locking wrt other chardev I/O
> > callbacks.
>
> This is starting to sound like the problem is the test code. The
> test code is calling qemu_chr_new() and qemu_chr_fe_set_handlers()
> from a random thread it created by directly calling g_thread_new().
> So these calls from this thread do not have any kind of lock and
> can run in parallel with the qemu_chr_socket_connected method from
> the main event loop.
>
> I guess we need to take a lock here, but which one?
>
> > > How is this intended to work ? Is the test case code incorrect to assume
> > > it can work with the chardev it got back from qemu_chr_new(), or should
> > > the chardev be handling this case?
> >
> > The test is really rather had to understand, but I feel like something
> > in the test ought to be looking at the CHR_EVENT_OPENED event to
> > identify when the connection is actually established ?
>
> The way to get a CHR_EVENT_OPENED would be to set the frontend
> handlers with qemu_chr_fe_set_handlers(). It's the call to set
> the frontend handlers that is already going wrong. (But the test
> doesn't really need to care about the EVENT_OPENED: it can
> just wait til it gets the "you have some data to read" callback.)
I'm rather feeling that the test program would be better off not
using chardevs at all, and instead using QIOChannelSocket directly.
The chardevs are horribly complex when trying to deal with
connection oriented semantics, as they are really not a connection
based API; the event stuff is a hacky afterthought that barely
gets us by.
With regards,
Daniel
--
|: https://berrange.com ~~ https://hachyderm.io/@berrange :|
|: https://libvirt.org ~~ https://entangle-photo.org :|
|: https://pixelfed.art/berrange ~~ https://fstop138.berrange.com :|
next prev parent reply other threads:[~2026-03-06 14:50 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-15 22:20 [PATCH 0/4] chardev: Fix issues found by vhost-user-test Fabiano Rosas
2025-05-15 22:20 ` [PATCH 1/4] chardev: Fix QIOChannel refcount Fabiano Rosas
2025-05-19 10:49 ` Daniel P. Berrangé
2025-05-15 22:20 ` [PATCH 2/4] chardev: Don't attempt to unregister yank function more than once Fabiano Rosas
2025-05-19 10:52 ` Daniel P. Berrangé
2025-05-15 22:20 ` [PATCH 3/4] chardev: Consolidate yank registration Fabiano Rosas
2025-05-19 10:53 ` Daniel P. Berrangé
2025-05-15 22:20 ` [PATCH 4/4] chardev: Introduce a lock for hup_source Fabiano Rosas
2025-05-19 11:00 ` Daniel P. Berrangé
2025-05-19 14:21 ` Fabiano Rosas
2026-02-28 15:16 ` Peter Maydell
2026-03-04 10:34 ` Daniel P. Berrangé
2026-03-06 13:57 ` Peter Maydell
2026-03-06 14:49 ` Daniel P. Berrangé [this message]
2026-03-06 14:55 ` Peter Maydell
2025-05-24 17:50 ` [PATCH 0/4] chardev: Fix issues found by vhost-user-test Marc-André Lureau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aarpcvImyt35Hido@redhat.com \
--to=berrange@redhat.com \
--cc=farosas@suse.de \
--cc=marcandre.lureau@redhat.com \
--cc=pbonzini@redhat.com \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.