qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: qemu-devel@nongnu.org, Juraj Marcin <jmarcin@redhat.com>,
	Fabiano Rosas <farosas@suse.de>
Subject: Re: [PATCH v2 2/3] io/tls: Make qio_channel_tls_bye() always synchronous
Date: Mon, 15 Sep 2025 16:41:25 -0400	[thread overview]
Message-ID: <aMh59RW2nPEEuF0a@x1.local> (raw)
In-Reply-To: <aMhdoWKdY_lxghVu@redhat.com>

On Mon, Sep 15, 2025 at 07:40:33PM +0100, Daniel P. Berrangé wrote:
> On Fri, Sep 12, 2025 at 11:36:51AM -0400, Peter Xu wrote:
> > On Fri, Sep 12, 2025 at 12:27:52PM +0100, Daniel P. Berrangé wrote:
> > > On Thu, Sep 11, 2025 at 05:23:54PM -0400, Peter Xu wrote:
> > > > No issue I hit, the change is only from code observation when I am looking
> > > > at a TLS premature termination issue.
> > > > 
> > > > qio_channel_tls_bye() API needs to be synchronous.  When it's not, the
> > > > previous impl will attach an asynchronous task retrying but only until when
> > > > the channel gets the relevant GIO event. It may be problematic, because the
> > > > caller of qio_channel_tls_bye() may have invoked channel close() before
> > > > that, leading to premature termination of the TLS session.
> > > > 
> > > > Remove the asynchronous handling, instead retry it immediately.  Currently,
> > > > the only two possible cases that may lead to async task is either INTERRUPT
> > > > or EAGAIN.  It should be suffice to spin retry as of now, until a solid
> > > > proof showing that a more complicated retry logic is needed.
> > > > 
> > > > With that, we can remove the whole async model for the bye task.
> > > > 
> > > > When at it, making the function return bool, which looks like a common
> > > > pattern in QEMU when errp is used.
> > > > 
> > > > Side note on the tracepoints: previously the tracepoint bye_complete()
> > > > isn't used.  Start to use it in this patch.  bye_pending() and bye_cancel()
> > > > can be dropped now.  Adding bye_retry() instead.
> > > > 
> > > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > > ---
> > > >  include/io/channel-tls.h |  5 ++-
> > > >  io/channel-tls.c         | 86 +++++-----------------------------------
> > > >  io/trace-events          |  3 +-
> > > >  3 files changed, 15 insertions(+), 79 deletions(-)
> > > > 
> 
> > > > diff --git a/io/channel-tls.c b/io/channel-tls.c
> > > > index 5a2c8188ce..8510a187a8 100644
> > > > --- a/io/channel-tls.c
> > > > +++ b/io/channel-tls.c
> > > > @@ -253,84 +253,25 @@ void qio_channel_tls_handshake(QIOChannelTLS *ioc,
> > > >      qio_channel_tls_handshake_task(ioc, task, context);
> > > >  }
> > > >  
> > > > -static gboolean qio_channel_tls_bye_io(QIOChannel *ioc, GIOCondition condition,
> > > > -                                       gpointer user_data);
> > > > -
> > > > -static void qio_channel_tls_bye_task(QIOChannelTLS *ioc, QIOTask *task,
> > > > -                                     GMainContext *context)
> > > > +bool qio_channel_tls_bye(QIOChannelTLS *ioc, Error **errp)
> > > >  {
> > > > -    GIOCondition condition;
> > > > -    QIOChannelTLSData *data;
> > > >      int status;
> > > > -    Error *err = NULL;
> > > >  
> > > > -    status = qcrypto_tls_session_bye(ioc->session, &err);
> > > > +    trace_qio_channel_tls_bye_start(ioc);
> > > > +retry:
> > > > +    status = qcrypto_tls_session_bye(ioc->session, errp);
> > > >  
> > > >      if (status < 0) {
> > > >          trace_qio_channel_tls_bye_fail(ioc);
> > > 
> > > snip
> > > 
> > > > +        return false;
> > > > +    } else if (status != QCRYPTO_TLS_BYE_COMPLETE) {
> > > > +        /* BYE event must be synchronous, retry immediately */
> > > > +        trace_qio_channel_tls_bye_retry(ioc, status);
> > > > +        goto retry;
> > > >      }
> > > 
> > > We cannot do this. If the gnutls_bye() API needs to perform
> > > socket I/O, and so when we're running over a non-blocking
> > > socket we must expect EAGAIN. With this handling, QEMU will
> > > busy loop burning 100% CPU when the socket is not ready.
> > 
> > Right.  That was the plan when drafting this, the hope is spinning will
> > almost not happen, and even if it happens it'll finish soon (migration is
> > completing, it means network mustn't be so bad), unless the network is
> > stuck exactly at when we send the bye().
> 
> Don't forget that the machine can be running 5 migrations
> of different VMs concurrently, and so may not be as quick
> to finish sending traffic as we expect. Since QEMU's mig
> protocol is essentially undirectional, I wonder if the
> send buffer could still be full of VMstate data waiting
> to be sent ? Perhaps its fine, but I don't like relying
> on luck, or hard-to-prove scenarios.

Very rare conditional spinning is, IMHO, totally OK.  I wished all our
problems are as simple as cpu spinning.. then it's super straightforward to
debug when hit, and also benign.  We can add whatever smart tech after that.

I would just guess even if with this patch goes in, we will never observe
it considering the write buffer shouldn't be more than tens of MBs..

Said that..

> 
> > > A second point is that from a QIOChannel POV, we need to
> > > ensure that all APIs can be used in a non-blocking scenario.
> > > This is why in the QIOChannelSocket impl connect/listen APIs
> > > we provide both _sync and _async variants of the APIs, or
> > > in the QIOChannelTLS impl, the handshake API is always
> > > async with a callback to be invokved on completion.
> > 
> > I agree.  The issue is if so, migration code needs to be always be prepared
> > with a possible async op even if in 99.9999% cases it won't happen... we
> > need to complicate the multifd logic a lot for this, but the gain is
> > little..
> > 
> > This series still used patch 1 to fix the problem (rather than do real BYE
> > on preempt channels, for example) only because it's the easiest, after all
> > it's still a contract in tls channel impl to allow premature termination
> > for explicit shutdown()s on the host.
> > 
> > If we want to do 100% graceful shutdowns, we'll need to apply this to all
> > channels, and the async-possible model can definitely add more complexity
> > more than multifd.  I hope it won't be necessary.. but just to mention it.
> 
> Even if the migration code is relying on non-blocking sockets
> for most of its work, at the time we're ready to invoke "bye",
> perhaps the migration code could simply call
> 
>  qio_channel_set_blocking(ioc, true)
> 
> to switch the socket over to blocking mode.

... I think this is a good idea and should solve the problem indeed.  I
hope there's no loophole that it could still trigger the async path.

I'll respin with that, thanks!

-- 
Peter Xu



  reply	other threads:[~2025-09-15 20:44 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-11 21:23 [PATCH v2 0/3] migration/tls: Graceful shutdowns for main and postcopy channels Peter Xu
2025-09-11 21:23 ` [PATCH v2 1/3] io/crypto: Move tls premature termination handling into QIO layer Peter Xu
2025-09-12 11:18   ` Daniel P. Berrangé
2025-09-12 15:24     ` Peter Xu
2025-09-15 18:31       ` Daniel P. Berrangé
2025-09-12 12:05   ` Juraj Marcin
2025-09-18 14:12   ` Fabiano Rosas
2025-09-11 21:23 ` [PATCH v2 2/3] io/tls: Make qio_channel_tls_bye() always synchronous Peter Xu
2025-09-12 11:27   ` Daniel P. Berrangé
2025-09-12 15:36     ` Peter Xu
2025-09-15 18:40       ` Daniel P. Berrangé
2025-09-15 20:41         ` Peter Xu [this message]
2025-09-18 14:47   ` Fabiano Rosas
2025-09-18 18:12     ` Peter Xu
2025-09-11 21:23 ` [PATCH v2 3/3] migration: Make migration_has_failed() work even for CANCELLING Peter Xu
2025-09-12 12:07   ` Juraj Marcin
2025-09-18 14:52   ` Fabiano Rosas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aMh59RW2nPEEuF0a@x1.local \
    --to=peterx@redhat.com \
    --cc=berrange@redhat.com \
    --cc=farosas@suse.de \
    --cc=jmarcin@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).