All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: "Daniel P. Berrangé" <berrange@redhat.com>
Cc: qemu-devel@nongnu.org, Juraj Marcin <jmarcin@redhat.com>,
	Fabiano Rosas <farosas@suse.de>
Subject: Re: [PATCH v2 2/3] io/tls: Make qio_channel_tls_bye() always synchronous
Date: Mon, 15 Sep 2025 16:41:25 -0400	[thread overview]
Message-ID: <aMh59RW2nPEEuF0a@x1.local> (raw)
In-Reply-To: <aMhdoWKdY_lxghVu@redhat.com>

On Mon, Sep 15, 2025 at 07:40:33PM +0100, Daniel P. Berrangé wrote:
> On Fri, Sep 12, 2025 at 11:36:51AM -0400, Peter Xu wrote:
> > On Fri, Sep 12, 2025 at 12:27:52PM +0100, Daniel P. Berrangé wrote:
> > > On Thu, Sep 11, 2025 at 05:23:54PM -0400, Peter Xu wrote:
> > > > No issue I hit, the change is only from code observation when I am looking
> > > > at a TLS premature termination issue.
> > > > 
> > > > qio_channel_tls_bye() API needs to be synchronous.  When it's not, the
> > > > previous impl will attach an asynchronous task retrying but only until when
> > > > the channel gets the relevant GIO event. It may be problematic, because the
> > > > caller of qio_channel_tls_bye() may have invoked channel close() before
> > > > that, leading to premature termination of the TLS session.
> > > > 
> > > > Remove the asynchronous handling, instead retry it immediately.  Currently,
> > > > the only two possible cases that may lead to async task is either INTERRUPT
> > > > or EAGAIN.  It should be suffice to spin retry as of now, until a solid
> > > > proof showing that a more complicated retry logic is needed.
> > > > 
> > > > With that, we can remove the whole async model for the bye task.
> > > > 
> > > > When at it, making the function return bool, which looks like a common
> > > > pattern in QEMU when errp is used.
> > > > 
> > > > Side note on the tracepoints: previously the tracepoint bye_complete()
> > > > isn't used.  Start to use it in this patch.  bye_pending() and bye_cancel()
> > > > can be dropped now.  Adding bye_retry() instead.
> > > > 
> > > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > > ---
> > > >  include/io/channel-tls.h |  5 ++-
> > > >  io/channel-tls.c         | 86 +++++-----------------------------------
> > > >  io/trace-events          |  3 +-
> > > >  3 files changed, 15 insertions(+), 79 deletions(-)
> > > > 
> 
> > > > diff --git a/io/channel-tls.c b/io/channel-tls.c
> > > > index 5a2c8188ce..8510a187a8 100644
> > > > --- a/io/channel-tls.c
> > > > +++ b/io/channel-tls.c
> > > > @@ -253,84 +253,25 @@ void qio_channel_tls_handshake(QIOChannelTLS *ioc,
> > > >      qio_channel_tls_handshake_task(ioc, task, context);
> > > >  }
> > > >  
> > > > -static gboolean qio_channel_tls_bye_io(QIOChannel *ioc, GIOCondition condition,
> > > > -                                       gpointer user_data);
> > > > -
> > > > -static void qio_channel_tls_bye_task(QIOChannelTLS *ioc, QIOTask *task,
> > > > -                                     GMainContext *context)
> > > > +bool qio_channel_tls_bye(QIOChannelTLS *ioc, Error **errp)
> > > >  {
> > > > -    GIOCondition condition;
> > > > -    QIOChannelTLSData *data;
> > > >      int status;
> > > > -    Error *err = NULL;
> > > >  
> > > > -    status = qcrypto_tls_session_bye(ioc->session, &err);
> > > > +    trace_qio_channel_tls_bye_start(ioc);
> > > > +retry:
> > > > +    status = qcrypto_tls_session_bye(ioc->session, errp);
> > > >  
> > > >      if (status < 0) {
> > > >          trace_qio_channel_tls_bye_fail(ioc);
> > > 
> > > snip
> > > 
> > > > +        return false;
> > > > +    } else if (status != QCRYPTO_TLS_BYE_COMPLETE) {
> > > > +        /* BYE event must be synchronous, retry immediately */
> > > > +        trace_qio_channel_tls_bye_retry(ioc, status);
> > > > +        goto retry;
> > > >      }
> > > 
> > > We cannot do this. If the gnutls_bye() API needs to perform
> > > socket I/O, and so when we're running over a non-blocking
> > > socket we must expect EAGAIN. With this handling, QEMU will
> > > busy loop burning 100% CPU when the socket is not ready.
> > 
> > Right.  That was the plan when drafting this, the hope is spinning will
> > almost not happen, and even if it happens it'll finish soon (migration is
> > completing, it means network mustn't be so bad), unless the network is
> > stuck exactly at when we send the bye().
> 
> Don't forget that the machine can be running 5 migrations
> of different VMs concurrently, and so may not be as quick
> to finish sending traffic as we expect. Since QEMU's mig
> protocol is essentially undirectional, I wonder if the
> send buffer could still be full of VMstate data waiting
> to be sent ? Perhaps its fine, but I don't like relying
> on luck, or hard-to-prove scenarios.

Very rare conditional spinning is, IMHO, totally OK.  I wished all our
problems are as simple as cpu spinning.. then it's super straightforward to
debug when hit, and also benign.  We can add whatever smart tech after that.

I would just guess even if with this patch goes in, we will never observe
it considering the write buffer shouldn't be more than tens of MBs..

Said that..

> 
> > > A second point is that from a QIOChannel POV, we need to
> > > ensure that all APIs can be used in a non-blocking scenario.
> > > This is why in the QIOChannelSocket impl connect/listen APIs
> > > we provide both _sync and _async variants of the APIs, or
> > > in the QIOChannelTLS impl, the handshake API is always
> > > async with a callback to be invokved on completion.
> > 
> > I agree.  The issue is if so, migration code needs to be always be prepared
> > with a possible async op even if in 99.9999% cases it won't happen... we
> > need to complicate the multifd logic a lot for this, but the gain is
> > little..
> > 
> > This series still used patch 1 to fix the problem (rather than do real BYE
> > on preempt channels, for example) only because it's the easiest, after all
> > it's still a contract in tls channel impl to allow premature termination
> > for explicit shutdown()s on the host.
> > 
> > If we want to do 100% graceful shutdowns, we'll need to apply this to all
> > channels, and the async-possible model can definitely add more complexity
> > more than multifd.  I hope it won't be necessary.. but just to mention it.
> 
> Even if the migration code is relying on non-blocking sockets
> for most of its work, at the time we're ready to invoke "bye",
> perhaps the migration code could simply call
> 
>  qio_channel_set_blocking(ioc, true)
> 
> to switch the socket over to blocking mode.

... I think this is a good idea and should solve the problem indeed.  I
hope there's no loophole that it could still trigger the async path.

I'll respin with that, thanks!

-- 
Peter Xu



  reply	other threads:[~2025-09-15 20:44 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-11 21:23 [PATCH v2 0/3] migration/tls: Graceful shutdowns for main and postcopy channels Peter Xu
2025-09-11 21:23 ` [PATCH v2 1/3] io/crypto: Move tls premature termination handling into QIO layer Peter Xu
2025-09-12 11:18   ` Daniel P. Berrangé
2025-09-12 15:24     ` Peter Xu
2025-09-15 18:31       ` Daniel P. Berrangé
2025-09-12 12:05   ` Juraj Marcin
2025-09-18 14:12   ` Fabiano Rosas
2025-09-11 21:23 ` [PATCH v2 2/3] io/tls: Make qio_channel_tls_bye() always synchronous Peter Xu
2025-09-12 11:27   ` Daniel P. Berrangé
2025-09-12 15:36     ` Peter Xu
2025-09-15 18:40       ` Daniel P. Berrangé
2025-09-15 20:41         ` Peter Xu [this message]
2025-09-18 14:47   ` Fabiano Rosas
2025-09-18 18:12     ` Peter Xu
2025-09-11 21:23 ` [PATCH v2 3/3] migration: Make migration_has_failed() work even for CANCELLING Peter Xu
2025-09-12 12:07   ` Juraj Marcin
2025-09-18 14:52   ` Fabiano Rosas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aMh59RW2nPEEuF0a@x1.local \
    --to=peterx@redhat.com \
    --cc=berrange@redhat.com \
    --cc=farosas@suse.de \
    --cc=jmarcin@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.