qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: mail@jiesong.me
To: eblake@redhat.com
Cc: armbru@redhat.com, berrange@redhat.com, mail@jiesong.me,
	qemu-devel@nongnu.org, songjie_yewu@cmss.chinamobile.com
Subject: Re: [PATCH] monitor/qmp: cleanup socket listener sources early to avoid fd handling race
Date: Thu, 13 Nov 2025 23:10:23 +0800	[thread overview]
Message-ID: <20251113151023.41440-1-mail@jiesong.me> (raw)
In-Reply-To: <cslgtf2v45soo57j2qfrkefitokjkhpywllz6wait4d2nx5yt6@q34i6bk37gvk>

> On Tue, Nov 11, 2025 at 11:01:44PM +0800, Jie Song wrote:
> > From: Jie Song <songjie_yewu@cmss.chinamobile.com>
> > 
> > When starting a dummy QEMU process with virsh, monitor_init_qmp() enables
> > IOThread monitoring of the QMP fd by default. However, a race condition
> > exists during the initialization phase: the IOThread only removes the
> > main thread's fd watch when it reaches qio_net_listener_set_client_func_full(),
> > which may be delayed under high system load.
> > 
> > This creates a window between monitor_qmp_setup_handlers_bh() and
> > qio_net_listener_set_client_func_full() where both the main thread and
> > IOThread are simultaneously monitoring the same fd and processing events.
> > This race can cause either the main thread or the IOThread to hang and
> > become unresponsive.
> > 
> > Fix this by proactively cleaning up the listener's IO sources in
> > monitor_init_qmp() before the IOThread initializes QMP monitoring,
> > ensuring exclusive fd ownership and eliminating the race condition.
> > 
> > The fix introduces socket_chr_listener_cleanup() to destroy and unref
> > all existing IO sources on the socket chardev listener, guaranteeing
> > that no concurrent fd monitoring occurs during the transition to
> > IOThread handling.
> > 
> > Signed-off-by: Jie Song <songjie_yewu@cmss.chinamobile.com>
> > ---
> >  chardev/char-socket.c         | 18 ++++++++++++++++++
> >  include/chardev/char-socket.h |  2 ++
> >  monitor/qmp.c                 |  6 ++++++
> >  3 files changed, 26 insertions(+)
> > 
> > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> > index 62852e3caf..073a9da855 100644
> > --- a/chardev/char-socket.c
> > +++ b/chardev/char-socket.c
> > @@ -656,6 +656,24 @@ static void tcp_chr_telnet_destroy(SocketChardev *s)
> >      }
> >  }
> >  
> > +void socket_chr_listener_cleanup(Chardev *chr)
> > +{
> > +    SocketChardev *s = SOCKET_CHARDEV(chr);
> > +
> > +    if (s->listener) {
> > +        QIONetListener *listener = s->listener;
> > +        size_t i;
> > +
> > +        for (i = 0; i < listener->nsioc; i++) {
> 
> This directly accesses listener->nsioc outside of net-listener.c.
> I've got a pending patch that frowns on this type of usage (here's the
> link to v2; v3 is coming soon):
> 
> https://lore.kernel.org/qemu-devel/20251108230525.3169174-14-eblake@redhat.com/T/#m69a13da54c24ad55351b6a004ec1c0cba7a7b49c
> 
> But it might be possible to do what you want without peeking inside
> the listener; have you tested calling
> qio_net_listener_set_client_func_full() to change the callback to NULL
> prior to doing the handover to iothread, and then reregistering
> tcp_chr_accept after that point?
> 
> -- 
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.
> Virtualization:  qemu.org | libguestfs.org

Hi Eric,

Thanks a lot for the detailed feedback! You're absolutely right—the current
patch does have issues with encapsulation by directly accessing internal 
listener structures like nsioc. 

I took your advice and tested it: before the I/O thread initializes 
the QMP listener, call qio_net_listener_set_client_func_full() with the callback
set to NULL. This effectively purges the main thread's fd watch without any 
new helper functions, and then the I/O thread can safely re-register tcp_chr_accept. 

Besides, Daniel also raised a solid concern in his reply: the chardev backend 
isn't guaranteed to be a SocketChardev type, so blindly calling a socket-specific 
cleanup could lead to crashes. 
https://lore.kernel.org/qemu-devel/20251112145743.15075-1-mail@jiesong.me/

The v1 patch needs to be modified. I'll iterate on a v2 incorporating 
both your NULL-callback advice and Daniel's robustness improvements. 

Best regards,
Jie Song 


  reply	other threads:[~2025-11-13 15:11 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-11 15:01 [PATCH] monitor/qmp: cleanup socket listener sources early to avoid fd handling race Jie Song
2025-11-12  8:59 ` Markus Armbruster
2025-11-12 15:31   ` Jie Song
2025-11-12  9:05 ` Daniel P. Berrangé
2025-11-12 14:57   ` Jie Song
2025-11-12 21:48 ` Eric Blake
2025-11-13 15:10   ` mail [this message]
2025-11-13 15:13   ` Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20251113151023.41440-1-mail@jiesong.me \
    --to=mail@jiesong.me \
    --cc=armbru@redhat.com \
    --cc=berrange@redhat.com \
    --cc=eblake@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=songjie_yewu@cmss.chinamobile.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).