qemu-devel.nongnu.org archive mirror
* [RFC PATCH v3 0/8] crypto,io,migration: Add support to gnutls_bye()
@ 2025-02-07 19:53 Fabiano Rosas
  2025-02-07 19:53 ` [RFC PATCH v3 1/8] crypto: Allow gracefully ending the TLS session Fabiano Rosas
                   ` (7 more replies)
  0 siblings, 8 replies; 12+ messages in thread
From: Fabiano Rosas @ 2025-02-07 19:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Maciej S . Szmigiero, Daniel P . Berrangé

(cover-letter update I forgot on v2:)

This series now contains the two approaches we've been discussing to
avoid the TLS termination error on the multifd_recv threads.

The source machine now ends the TLS session with gnutls_bye() and the
destination treats a premature termination as an error. The only
exception is the src <9.1 case, which has a compatibility issue: there,
the presence of multifd-tls-clean-termination=false causes the
destination to (always) ignore a premature termination error.

changes in v3:

Reordered the patches to have the io/crypto stuff at the start and the
compat property before the code that breaks compat.

Commit message improvements.

Turned assert into a warning when gnutls_bye() fails but migration
succeeded (should never happen).

Other minor fixes requested by Daniel.

CI run: https://gitlab.com/farosas/qemu/-/pipelines/1661172595

v2:
https://lore.kernel.org/r/20250207142758.6936-1-farosas@suse.de

v1:
https://lore.kernel.org/r/20250206175824.22664-1-farosas@suse.de

Hi,

We've been discussing a way to stop multifd recv threads from getting
an error at the end of migration when the source threads close the
iochannel without ending the TLS session.

The original issue was introduced by commit 1d457daf86
("migration/multifd: Further remove the SYNC on complete") which
altered the synchronization of the source and destination in a manner
that causes the destination to already be waiting at recv() when the
source closes the connection.

One approach would be to issue gnutls_bye() at the source after all
the data has been sent. The destination would then gracefully exit
when it gets EOF.

Aside from stopping the recv thread from seeing an error, this also
creates a contract that all connections should be closed only after
the TLS session is ended. This helps to avoid masking a legitimate
issue where the connection is closed prematurely.
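
To make the contract above concrete, here is a minimal sketch of the
decision the destination ends up making at EOF. It is not code from this
series; demo_classify_eof() and its two inputs are hypothetical names
that only model "did the peer end the TLS session before closing" and
"are we in the compat mode for old sources".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical verdicts for an EOF seen by a receiving thread. */
enum demo_eof_verdict {
    DEMO_EOF_CLEAN = 0,   /* peer ended the TLS session first */
    DEMO_EOF_ERROR = -1,  /* connection dropped mid-session */
};

/*
 * saw_close_notify: the peer sent the TLS close_notify alert (what
 *                   gnutls_bye() emits) before the transport EOF.
 * relaxed_eof:      compat mode for senders that never call gnutls_bye().
 */
static enum demo_eof_verdict demo_classify_eof(bool saw_close_notify,
                                               bool relaxed_eof)
{
    if (saw_close_notify || relaxed_eof) {
        return DEMO_EOF_CLEAN;
    }
    return DEMO_EOF_ERROR;
}
```

With relaxed_eof false, only an EOF that follows a proper session
termination counts as clean, which is what keeps a genuinely premature
close from being masked.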

Fabiano Rosas (8):
  crypto: Allow gracefully ending the TLS session
  io: tls: Add qio_channel_tls_bye
  crypto: Remove qcrypto_tls_session_get_handshake_status
  io: Add flags argument to qio_channel_readv_full_all_eof
  io: Add a read flag for relaxed EOF
  migration/multifd: Terminate the TLS connection
  migration/multifd: Add a compat property for TLS termination
  migration: Check migration error after loadvm

 crypto/tlssession.c                 | 96 ++++++++++++++++++-----------
 hw/core/machine.c                   |  1 +
 hw/remote/mpqemu-link.c             |  2 +-
 include/crypto/tlssession.h         | 46 ++++++++------
 include/io/channel-tls.h            | 12 ++++
 include/io/channel.h                |  3 +
 io/channel-tls.c                    | 92 ++++++++++++++++++++++++++-
 io/channel.c                        |  9 ++-
 io/trace-events                     |  5 ++
 migration/migration.h               | 33 ++++++++++
 migration/multifd.c                 | 53 +++++++++++++++-
 migration/multifd.h                 |  2 +
 migration/options.c                 |  2 +
 migration/savevm.c                  |  6 +-
 migration/tls.c                     |  5 ++
 migration/tls.h                     |  2 +-
 tests/unit/test-crypto-tlssession.c | 12 ++--
 17 files changed, 305 insertions(+), 76 deletions(-)

-- 
2.35.3




* [RFC PATCH v3 1/8] crypto: Allow gracefully ending the TLS session
  2025-02-07 19:53 [RFC PATCH v3 0/8] crypto,io,migration: Add support to gnutls_bye() Fabiano Rosas
@ 2025-02-07 19:53 ` Fabiano Rosas
  2025-02-07 19:53 ` [RFC PATCH v3 2/8] io: tls: Add qio_channel_tls_bye Fabiano Rosas
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Fabiano Rosas @ 2025-02-07 19:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Maciej S . Szmigiero, Daniel P . Berrangé

QEMU's TLS session code provides no way to call gnutls_bye() to
terminate a TLS session. Callers of qcrypto_tls_session_read() can
choose to ignore a GNUTLS_E_PREMATURE_TERMINATION error by setting the
gracefulTermination argument.

The QIOChannelTLS ignores the premature termination error whenever
shutdown() has already been issued. This turned out to be insufficient
for the migration code because shutdown() might not have been issued
before the connection is terminated.

Add support for calling gnutls_bye() in the tlssession layer so users
of QIOChannelTLS can clearly identify the end of a TLS session.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
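Note for readers: a rough standalone sketch of the return convention
used by the new function (negative on failure, otherwise a status that
says whether the termination finished or which direction it is waiting
on). The DEMO_* names and error values are placeholders, not the real
GnuTLS or QEMU definitions; only the direction convention (0 = reading,
1 = writing) is taken from gnutls_record_get_direction().

```c
#include <assert.h>

/* Placeholder result codes; real values live in <gnutls/gnutls.h>. */
enum {
    DEMO_E_SUCCESS = 0,
    DEMO_E_AGAIN = -1,
    DEMO_E_INTERRUPTED = -2,
    DEMO_E_FATAL = -3,
};

enum demo_bye_status {
    DEMO_BYE_COMPLETE,
    DEMO_BYE_SENDING,
    DEMO_BYE_RECVING,
};

/*
 * ret:       result of the termination call
 * direction: 0 if the library was reading when it stopped, 1 if writing
 */
static int demo_bye_result(int ret, int direction)
{
    if (ret == DEMO_E_SUCCESS) {
        return DEMO_BYE_COMPLETE;
    }
    if (ret == DEMO_E_AGAIN || ret == DEMO_E_INTERRUPTED) {
        /* Not an error: the caller should wait and call again. */
        return direction ? DEMO_BYE_SENDING : DEMO_BYE_RECVING;
    }
    return -1; /* anything else is a real failure */
}
```

A caller checks "< 0" first and only then interprets the value as a
status, the same order the io layer uses in the next patch.
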
 crypto/tlssession.c         | 41 +++++++++++++++++++++++++++++++++++++
 include/crypto/tlssession.h | 22 ++++++++++++++++++++
 2 files changed, 63 insertions(+)

diff --git a/crypto/tlssession.c b/crypto/tlssession.c
index 77286e23f4..d769d7a304 100644
--- a/crypto/tlssession.c
+++ b/crypto/tlssession.c
@@ -585,6 +585,40 @@ qcrypto_tls_session_get_handshake_status(QCryptoTLSSession *session)
     }
 }
 
+int
+qcrypto_tls_session_bye(QCryptoTLSSession *session, Error **errp)
+{
+    int ret;
+
+    if (!session->handshakeComplete) {
+        return 0;
+    }
+
+    ret = gnutls_bye(session->handle, GNUTLS_SHUT_WR);
+
+    if (!ret) {
+        return QCRYPTO_TLS_BYE_COMPLETE;
+    }
+
+    if (ret == GNUTLS_E_INTERRUPTED || ret == GNUTLS_E_AGAIN) {
+        int direction = gnutls_record_get_direction(session->handle);
+        return direction ? QCRYPTO_TLS_BYE_SENDING : QCRYPTO_TLS_BYE_RECVING;
+    }
+
+    if (session->rerr || session->werr) {
+        error_setg(errp, "TLS termination failed: %s: %s", gnutls_strerror(ret),
+                   error_get_pretty(session->rerr ?
+                                    session->rerr : session->werr));
+    } else {
+        error_setg(errp, "TLS termination failed: %s", gnutls_strerror(ret));
+    }
+
+    error_free(session->rerr);
+    error_free(session->werr);
+    session->rerr = session->werr = NULL;
+
+    return -1;
+}
 
 int
 qcrypto_tls_session_get_key_size(QCryptoTLSSession *session,
@@ -699,6 +733,13 @@ qcrypto_tls_session_get_handshake_status(QCryptoTLSSession *sess)
 }
 
 
+int
+qcrypto_tls_session_bye(QCryptoTLSSession *session, Error **errp)
+{
+    return QCRYPTO_TLS_BYE_COMPLETE;
+}
+
+
 int
 qcrypto_tls_session_get_key_size(QCryptoTLSSession *sess,
                                  Error **errp)
diff --git a/include/crypto/tlssession.h b/include/crypto/tlssession.h
index f694a5c3c5..c0f64ce989 100644
--- a/include/crypto/tlssession.h
+++ b/include/crypto/tlssession.h
@@ -323,6 +323,28 @@ typedef enum {
 QCryptoTLSSessionHandshakeStatus
 qcrypto_tls_session_get_handshake_status(QCryptoTLSSession *sess);
 
+typedef enum {
+    QCRYPTO_TLS_BYE_COMPLETE,
+    QCRYPTO_TLS_BYE_SENDING,
+    QCRYPTO_TLS_BYE_RECVING,
+} QCryptoTLSSessionByeStatus;
+
+/**
+ * qcrypto_tls_session_bye:
+ * @session: the TLS session object
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Start, or continue, a TLS termination sequence. If the underlying
+ * data channel is non-blocking, then this method may return control
+ * before the termination is complete. The return value will indicate
+ * whether the termination has completed, or is waiting to send or
+ * receive data. In the latter cases, the caller should set up an event
+ * loop watch and call this method again once the underlying data
+ * channel is ready to read or write again.
+ */
+int
+qcrypto_tls_session_bye(QCryptoTLSSession *session, Error **errp);
+
 /**
  * qcrypto_tls_session_get_key_size:
  * @sess: the TLS session object
-- 
2.35.3




* [RFC PATCH v3 2/8] io: tls: Add qio_channel_tls_bye
  2025-02-07 19:53 [RFC PATCH v3 0/8] crypto,io,migration: Add support to gnutls_bye() Fabiano Rosas
  2025-02-07 19:53 ` [RFC PATCH v3 1/8] crypto: Allow gracefully ending the TLS session Fabiano Rosas
@ 2025-02-07 19:53 ` Fabiano Rosas
  2025-02-07 19:53 ` [RFC PATCH v3 3/8] crypto: Remove qcrypto_tls_session_get_handshake_status Fabiano Rosas
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Fabiano Rosas @ 2025-02-07 19:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Maciej S . Szmigiero, Daniel P . Berrangé

Add a task dispatcher for gnutls_bye() similar to
qio_channel_tls_handshake_task(). The gnutls_bye() call might be
interrupted, so it needs to be rescheduled.

The migration code will make use of this to help the migration
destination identify a premature EOF. Once the session termination is
in place, any EOF that happens before the source issued gnutls_bye()
will be considered an error.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
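Note for readers: the reschedule pattern can be sketched without glib
as a plain retry loop. Everything here is hypothetical (struct
demo_session, demo_session_bye(), demo_run_bye()); the fake session
simply reports "still sending" a fixed number of times. The real code
arms a one-shot channel watch instead of looping, but the shape is the
same: try, pick the I/O condition from the status, wait, try again.

```c
#include <assert.h>
#include <poll.h>

enum demo_bye_status {
    DEMO_BYE_COMPLETE,
    DEMO_BYE_SENDING,
    DEMO_BYE_RECVING,
};

/* Hypothetical session: `pending` more calls will report "not done". */
struct demo_session {
    int pending;
    int calls;
};

static int demo_session_bye(struct demo_session *s)
{
    s->calls++;
    if (s->pending > 0) {
        s->pending--;
        return DEMO_BYE_SENDING;
    }
    return DEMO_BYE_COMPLETE;
}

/* Which poll(2) event to wait for before retrying. */
static short demo_bye_wait_events(int status)
{
    return status == DEMO_BYE_SENDING ? POLLOUT : POLLIN;
}

/* Retry until done; returns how many times a watch would be armed. */
static int demo_run_bye(struct demo_session *s)
{
    int watches = 0;

    for (;;) {
        int status = demo_session_bye(s);

        if (status < 0) {
            return -1;
        }
        if (status == DEMO_BYE_COMPLETE) {
            return watches;
        }
        /* A real caller would block on these events here. */
        (void)demo_bye_wait_events(status);
        watches++;
    }
}
```
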
 include/io/channel-tls.h | 12 ++++++
 io/channel-tls.c         | 84 ++++++++++++++++++++++++++++++++++++++++
 io/trace-events          |  5 +++
 3 files changed, 101 insertions(+)

diff --git a/include/io/channel-tls.h b/include/io/channel-tls.h
index 26c67f17e2..7e9023570d 100644
--- a/include/io/channel-tls.h
+++ b/include/io/channel-tls.h
@@ -49,8 +49,20 @@ struct QIOChannelTLS {
     QCryptoTLSSession *session;
     QIOChannelShutdown shutdown;
     guint hs_ioc_tag;
+    guint bye_ioc_tag;
 };
 
+/**
+ * qio_channel_tls_bye:
+ * @ioc: the TLS channel object
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Perform the TLS session termination. This method will return
+ * immediately and the termination will continue in the background,
+ * provided the main loop is running.
+ */
+void qio_channel_tls_bye(QIOChannelTLS *ioc, Error **errp);
+
 /**
  * qio_channel_tls_new_server:
  * @master: the underlying channel object
diff --git a/io/channel-tls.c b/io/channel-tls.c
index aab630e5ae..517ce190a4 100644
--- a/io/channel-tls.c
+++ b/io/channel-tls.c
@@ -247,6 +247,85 @@ void qio_channel_tls_handshake(QIOChannelTLS *ioc,
     qio_channel_tls_handshake_task(ioc, task, context);
 }
 
+static gboolean qio_channel_tls_bye_io(QIOChannel *ioc, GIOCondition condition,
+                                       gpointer user_data);
+
+static void qio_channel_tls_bye_task(QIOChannelTLS *ioc, QIOTask *task,
+                                     GMainContext *context)
+{
+    GIOCondition condition;
+    QIOChannelTLSData *data;
+    int status;
+    Error *err = NULL;
+
+    status = qcrypto_tls_session_bye(ioc->session, &err);
+
+    if (status < 0) {
+        trace_qio_channel_tls_bye_fail(ioc);
+        qio_task_set_error(task, err);
+        qio_task_complete(task);
+        return;
+    }
+
+    if (status == QCRYPTO_TLS_BYE_COMPLETE) {
+        qio_task_complete(task);
+        return;
+    }
+
+    data = g_new0(typeof(*data), 1);
+    data->task = task;
+    data->context = context;
+
+    if (context) {
+        g_main_context_ref(context);
+    }
+
+    if (status == QCRYPTO_TLS_BYE_SENDING) {
+        condition = G_IO_OUT;
+    } else {
+        condition = G_IO_IN;
+    }
+
+    trace_qio_channel_tls_bye_pending(ioc, status);
+    ioc->bye_ioc_tag = qio_channel_add_watch_full(ioc->master, condition,
+                                                  qio_channel_tls_bye_io,
+                                                  data, NULL, context);
+}
+
+
+static gboolean qio_channel_tls_bye_io(QIOChannel *ioc, GIOCondition condition,
+                                       gpointer user_data)
+{
+    QIOChannelTLSData *data = user_data;
+    QIOTask *task = data->task;
+    GMainContext *context = data->context;
+    QIOChannelTLS *tioc = QIO_CHANNEL_TLS(qio_task_get_source(task));
+
+    tioc->bye_ioc_tag = 0;
+    g_free(data);
+    qio_channel_tls_bye_task(tioc, task, context);
+
+    if (context) {
+        g_main_context_unref(context);
+    }
+
+    return FALSE;
+}
+
+static void propagate_error(QIOTask *task, gpointer opaque)
+{
+    qio_task_propagate_error(task, opaque);
+}
+
+void qio_channel_tls_bye(QIOChannelTLS *ioc, Error **errp)
+{
+    QIOTask *task;
+
+    task = qio_task_new(OBJECT(ioc), propagate_error, errp, NULL);
+
+    trace_qio_channel_tls_bye_start(ioc);
+    qio_channel_tls_bye_task(ioc, task, NULL);
+}
 
 static void qio_channel_tls_init(Object *obj G_GNUC_UNUSED)
 {
@@ -379,6 +458,11 @@ static int qio_channel_tls_close(QIOChannel *ioc,
         g_clear_handle_id(&tioc->hs_ioc_tag, g_source_remove);
     }
 
+    if (tioc->bye_ioc_tag) {
+        trace_qio_channel_tls_bye_cancel(ioc);
+        g_clear_handle_id(&tioc->bye_ioc_tag, g_source_remove);
+    }
+
     return qio_channel_close(tioc->master, errp);
 }
 
diff --git a/io/trace-events b/io/trace-events
index d4c0f84a9a..dc3a63ba1f 100644
--- a/io/trace-events
+++ b/io/trace-events
@@ -44,6 +44,11 @@ qio_channel_tls_handshake_pending(void *ioc, int status) "TLS handshake pending
 qio_channel_tls_handshake_fail(void *ioc) "TLS handshake fail ioc=%p"
 qio_channel_tls_handshake_complete(void *ioc) "TLS handshake complete ioc=%p"
 qio_channel_tls_handshake_cancel(void *ioc) "TLS handshake cancel ioc=%p"
+qio_channel_tls_bye_start(void *ioc) "TLS termination start ioc=%p"
+qio_channel_tls_bye_pending(void *ioc, int status) "TLS termination pending ioc=%p status=%d"
+qio_channel_tls_bye_fail(void *ioc) "TLS termination fail ioc=%p"
+qio_channel_tls_bye_complete(void *ioc) "TLS termination complete ioc=%p"
+qio_channel_tls_bye_cancel(void *ioc) "TLS termination cancel ioc=%p"
 qio_channel_tls_credentials_allow(void *ioc) "TLS credentials allow ioc=%p"
 qio_channel_tls_credentials_deny(void *ioc) "TLS credentials deny ioc=%p"
 
-- 
2.35.3




* [RFC PATCH v3 3/8] crypto: Remove qcrypto_tls_session_get_handshake_status
  2025-02-07 19:53 [RFC PATCH v3 0/8] crypto,io,migration: Add support to gnutls_bye() Fabiano Rosas
  2025-02-07 19:53 ` [RFC PATCH v3 1/8] crypto: Allow gracefully ending the TLS session Fabiano Rosas
  2025-02-07 19:53 ` [RFC PATCH v3 2/8] io: tls: Add qio_channel_tls_bye Fabiano Rosas
@ 2025-02-07 19:53 ` Fabiano Rosas
  2025-02-07 19:53 ` [RFC PATCH v3 4/8] io: Add flags argument to qio_channel_readv_full_all_eof Fabiano Rosas
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Fabiano Rosas @ 2025-02-07 19:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Maciej S . Szmigiero, Daniel P . Berrangé

The correct way of calling qcrypto_tls_session_handshake() requires
calling qcrypto_tls_session_get_handshake_status() right after it, so
there's no reason to have a separate method.

Refactor qcrypto_tls_session_handshake() to report the status in its
own return value and alter the callers accordingly.

No functional change.

Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 crypto/tlssession.c                 | 63 +++++++++++------------------
 include/crypto/tlssession.h         | 32 ++++-----------
 io/channel-tls.c                    |  7 ++--
 tests/unit/test-crypto-tlssession.c | 12 ++----
 4 files changed, 39 insertions(+), 75 deletions(-)

diff --git a/crypto/tlssession.c b/crypto/tlssession.c
index d769d7a304..6d8f8df623 100644
--- a/crypto/tlssession.c
+++ b/crypto/tlssession.c
@@ -546,45 +546,35 @@ qcrypto_tls_session_handshake(QCryptoTLSSession *session,
                               Error **errp)
 {
     int ret = gnutls_handshake(session->handle);
-    if (ret == 0) {
+    if (!ret) {
         session->handshakeComplete = true;
-    } else {
-        if (ret == GNUTLS_E_INTERRUPTED ||
-            ret == GNUTLS_E_AGAIN) {
-            ret = 1;
-        } else {
-            if (session->rerr || session->werr) {
-                error_setg(errp, "TLS handshake failed: %s: %s",
-                           gnutls_strerror(ret),
-                           error_get_pretty(session->rerr ?
-                                            session->rerr : session->werr));
-            } else {
-                error_setg(errp, "TLS handshake failed: %s",
-                           gnutls_strerror(ret));
-            }
-            ret = -1;
-        }
-    }
-    error_free(session->rerr);
-    error_free(session->werr);
-    session->rerr = session->werr = NULL;
-
-    return ret;
-}
-
-
-QCryptoTLSSessionHandshakeStatus
-qcrypto_tls_session_get_handshake_status(QCryptoTLSSession *session)
-{
-    if (session->handshakeComplete) {
         return QCRYPTO_TLS_HANDSHAKE_COMPLETE;
-    } else if (gnutls_record_get_direction(session->handle) == 0) {
-        return QCRYPTO_TLS_HANDSHAKE_RECVING;
+    }
+
+    if (ret == GNUTLS_E_INTERRUPTED || ret == GNUTLS_E_AGAIN) {
+        int direction = gnutls_record_get_direction(session->handle);
+        return direction ? QCRYPTO_TLS_HANDSHAKE_SENDING :
+            QCRYPTO_TLS_HANDSHAKE_RECVING;
+    }
+
+    if (session->rerr || session->werr) {
+        error_setg(errp, "TLS handshake failed: %s: %s",
+                   gnutls_strerror(ret),
+                   error_get_pretty(session->rerr ?
+                                    session->rerr : session->werr));
     } else {
-        return QCRYPTO_TLS_HANDSHAKE_SENDING;
+        error_setg(errp, "TLS handshake failed: %s",
+                   gnutls_strerror(ret));
     }
+
+    error_free(session->rerr);
+    error_free(session->werr);
+    session->rerr = session->werr = NULL;
+
+    return -1;
 }
 
+
 int
 qcrypto_tls_session_bye(QCryptoTLSSession *session, Error **errp)
 {
@@ -726,13 +716,6 @@ qcrypto_tls_session_handshake(QCryptoTLSSession *sess,
 }
 
 
-QCryptoTLSSessionHandshakeStatus
-qcrypto_tls_session_get_handshake_status(QCryptoTLSSession *sess)
-{
-    return QCRYPTO_TLS_HANDSHAKE_COMPLETE;
-}
-
-
 int
 qcrypto_tls_session_bye(QCryptoTLSSession *session, Error **errp)
 {
diff --git a/include/crypto/tlssession.h b/include/crypto/tlssession.h
index c0f64ce989..d77ae0d423 100644
--- a/include/crypto/tlssession.h
+++ b/include/crypto/tlssession.h
@@ -75,12 +75,14 @@
  *                                      GINT_TO_POINTER(fd));
  *
  *    while (1) {
- *       if (qcrypto_tls_session_handshake(sess, errp) < 0) {
+ *       int ret = qcrypto_tls_session_handshake(sess, errp);
+ *
+ *       if (ret < 0) {
  *           qcrypto_tls_session_free(sess);
  *           return -1;
  *       }
  *
- *       switch(qcrypto_tls_session_get_handshake_status(sess)) {
+ *       switch(ret) {
  *       case QCRYPTO_TLS_HANDSHAKE_COMPLETE:
  *           if (qcrypto_tls_session_check_credentials(sess, errp) < )) {
  *               qcrypto_tls_session_free(sess);
@@ -170,7 +172,7 @@ G_DEFINE_AUTOPTR_CLEANUP_FUNC(QCryptoTLSSession, qcrypto_tls_session_free)
  *
  * Validate the peer's credentials after a successful
  * TLS handshake. It is an error to call this before
- * qcrypto_tls_session_get_handshake_status() returns
+ * qcrypto_tls_session_handshake() returns
  * QCRYPTO_TLS_HANDSHAKE_COMPLETE
  *
  * Returns 0 if the credentials validated, -1 on error
@@ -226,7 +228,7 @@ void qcrypto_tls_session_set_callbacks(QCryptoTLSSession *sess,
  * registered with qcrypto_tls_session_set_callbacks()
  *
  * It is an error to call this before
- * qcrypto_tls_session_get_handshake_status() returns
+ * qcrypto_tls_session_handshake() returns
  * QCRYPTO_TLS_HANDSHAKE_COMPLETE
  *
  * Returns: the number of bytes sent,
@@ -256,7 +258,7 @@ ssize_t qcrypto_tls_session_write(QCryptoTLSSession *sess,
  * opposed to an error.
  *
  * It is an error to call this before
- * qcrypto_tls_session_get_handshake_status() returns
+ * qcrypto_tls_session_handshake() returns
  * QCRYPTO_TLS_HANDSHAKE_COMPLETE
  *
  * Returns: the number of bytes received,
@@ -289,8 +291,7 @@ size_t qcrypto_tls_session_check_pending(QCryptoTLSSession *sess);
  * the underlying data channel is non-blocking, then
  * this method may return control before the handshake
  * is complete. On non-blocking channels the
- * qcrypto_tls_session_get_handshake_status() method
- * should be used to determine whether the handshake
+ * return value determines whether the handshake
  * has completed, or is waiting to send or receive
  * data. In the latter cases, the caller should setup
  * an event loop watch and call this method again
@@ -306,23 +307,6 @@ typedef enum {
     QCRYPTO_TLS_HANDSHAKE_RECVING,
 } QCryptoTLSSessionHandshakeStatus;
 
-/**
- * qcrypto_tls_session_get_handshake_status:
- * @sess: the TLS session object
- *
- * Check the status of the TLS handshake. This
- * is used with non-blocking data channels to
- * determine whether the handshake is waiting
- * to send or receive further data to/from the
- * remote peer.
- *
- * Once this returns QCRYPTO_TLS_HANDSHAKE_COMPLETE
- * it is permitted to send/receive payload data on
- * the channel
- */
-QCryptoTLSSessionHandshakeStatus
-qcrypto_tls_session_get_handshake_status(QCryptoTLSSession *sess);
-
 typedef enum {
     QCRYPTO_TLS_BYE_COMPLETE,
     QCRYPTO_TLS_BYE_SENDING,
diff --git a/io/channel-tls.c b/io/channel-tls.c
index 517ce190a4..ecde6b57bf 100644
--- a/io/channel-tls.c
+++ b/io/channel-tls.c
@@ -162,16 +162,17 @@ static void qio_channel_tls_handshake_task(QIOChannelTLS *ioc,
                                            GMainContext *context)
 {
     Error *err = NULL;
-    QCryptoTLSSessionHandshakeStatus status;
+    int status;
 
-    if (qcrypto_tls_session_handshake(ioc->session, &err) < 0) {
+    status = qcrypto_tls_session_handshake(ioc->session, &err);
+
+    if (status < 0) {
         trace_qio_channel_tls_handshake_fail(ioc);
         qio_task_set_error(task, err);
         qio_task_complete(task);
         return;
     }
 
-    status = qcrypto_tls_session_get_handshake_status(ioc->session);
     if (status == QCRYPTO_TLS_HANDSHAKE_COMPLETE) {
         trace_qio_channel_tls_handshake_complete(ioc);
         if (qcrypto_tls_session_check_credentials(ioc->session,
diff --git a/tests/unit/test-crypto-tlssession.c b/tests/unit/test-crypto-tlssession.c
index 3395f73560..554054e934 100644
--- a/tests/unit/test-crypto-tlssession.c
+++ b/tests/unit/test-crypto-tlssession.c
@@ -158,8 +158,7 @@ static void test_crypto_tls_session_psk(void)
             rv = qcrypto_tls_session_handshake(serverSess,
                                                &error_abort);
             g_assert(rv >= 0);
-            if (qcrypto_tls_session_get_handshake_status(serverSess) ==
-                QCRYPTO_TLS_HANDSHAKE_COMPLETE) {
+            if (rv == QCRYPTO_TLS_HANDSHAKE_COMPLETE) {
                 serverShake = true;
             }
         }
@@ -167,8 +166,7 @@ static void test_crypto_tls_session_psk(void)
             rv = qcrypto_tls_session_handshake(clientSess,
                                                &error_abort);
             g_assert(rv >= 0);
-            if (qcrypto_tls_session_get_handshake_status(clientSess) ==
-                QCRYPTO_TLS_HANDSHAKE_COMPLETE) {
+            if (rv == QCRYPTO_TLS_HANDSHAKE_COMPLETE) {
                 clientShake = true;
             }
         }
@@ -352,8 +350,7 @@ static void test_crypto_tls_session_x509(const void *opaque)
             rv = qcrypto_tls_session_handshake(serverSess,
                                                &error_abort);
             g_assert(rv >= 0);
-            if (qcrypto_tls_session_get_handshake_status(serverSess) ==
-                QCRYPTO_TLS_HANDSHAKE_COMPLETE) {
+            if (rv == QCRYPTO_TLS_HANDSHAKE_COMPLETE) {
                 serverShake = true;
             }
         }
@@ -361,8 +358,7 @@ static void test_crypto_tls_session_x509(const void *opaque)
             rv = qcrypto_tls_session_handshake(clientSess,
                                                &error_abort);
             g_assert(rv >= 0);
-            if (qcrypto_tls_session_get_handshake_status(clientSess) ==
-                QCRYPTO_TLS_HANDSHAKE_COMPLETE) {
+            if (rv == QCRYPTO_TLS_HANDSHAKE_COMPLETE) {
                 clientShake = true;
             }
         }
-- 
2.35.3




* [RFC PATCH v3 4/8] io: Add flags argument to qio_channel_readv_full_all_eof
  2025-02-07 19:53 [RFC PATCH v3 0/8] crypto,io,migration: Add support to gnutls_bye() Fabiano Rosas
                   ` (2 preceding siblings ...)
  2025-02-07 19:53 ` [RFC PATCH v3 3/8] crypto: Remove qcrypto_tls_session_get_handshake_status Fabiano Rosas
@ 2025-02-07 19:53 ` Fabiano Rosas
  2025-02-10  9:04   ` Daniel P. Berrangé
  2025-02-07 19:53 ` [RFC PATCH v3 5/8] io: Add a read flag for relaxed EOF Fabiano Rosas
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 12+ messages in thread
From: Fabiano Rosas @ 2025-02-07 19:53 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Maciej S . Szmigiero, Daniel P . Berrangé,
	Elena Ufimtseva, Jagannathan Raman

We want to pass flags into qio_channel_tls_readv() but
qio_channel_readv_full_all_eof() doesn't take a flags argument. Add
one and pass 0 from all existing callers.

No functional change.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
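Note for readers: the "no functional change" property comes from every
existing wrapper passing 0 for the new argument. A tiny sketch of that
pattern with made-up names (demo_read_full_all_eof() and
demo_read_all_eof() are not the QEMU functions, and the body just
records the flags it was given):

```c
#include <assert.h>
#include <stddef.h>

/* Records the flags seen by the innermost call, for illustration. */
static int demo_last_flags = -1;

/* The "full" variant gains a flags argument and forwards it down. */
static int demo_read_full_all_eof(void *buf, size_t len, int flags)
{
    (void)buf;
    (void)len;
    demo_last_flags = flags;
    return 1; /* pretend all data was read */
}

/* Existing convenience wrapper: behaviour unchanged, flags == 0. */
static int demo_read_all_eof(void *buf, size_t len)
{
    return demo_read_full_all_eof(buf, len, 0);
}
```
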
 hw/remote/mpqemu-link.c | 2 +-
 include/io/channel.h    | 2 ++
 io/channel.c            | 9 ++++++---
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/hw/remote/mpqemu-link.c b/hw/remote/mpqemu-link.c
index e25f97680d..49885a1db6 100644
--- a/hw/remote/mpqemu-link.c
+++ b/hw/remote/mpqemu-link.c
@@ -110,7 +110,7 @@ static ssize_t mpqemu_read(QIOChannel *ioc, void *buf, size_t len, int **fds,
         bql_unlock();
     }
 
-    ret = qio_channel_readv_full_all_eof(ioc, &iov, 1, fds, nfds, errp);
+    ret = qio_channel_readv_full_all_eof(ioc, &iov, 1, fds, nfds, 0, errp);
 
     if (drop_bql && !iothread && !qemu_in_coroutine()) {
         bql_lock();
diff --git a/include/io/channel.h b/include/io/channel.h
index bdf0bca92a..58940eead5 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -885,6 +885,7 @@ void qio_channel_set_aio_fd_handler(QIOChannel *ioc,
  * @niov: the length of the @iov array
  * @fds: an array of file handles to read
  * @nfds: number of file handles in @fds
+ * @flags: read flags (QIO_CHANNEL_READ_FLAG_*)
  * @errp: pointer to a NULL-initialized error object
  *
  *
@@ -903,6 +904,7 @@ int coroutine_mixed_fn qio_channel_readv_full_all_eof(QIOChannel *ioc,
                                                       const struct iovec *iov,
                                                       size_t niov,
                                                       int **fds, size_t *nfds,
+                                                      int flags,
                                                       Error **errp);
 
 /**
diff --git a/io/channel.c b/io/channel.c
index e3f17c24a0..ebd9322765 100644
--- a/io/channel.c
+++ b/io/channel.c
@@ -115,7 +115,8 @@ int coroutine_mixed_fn qio_channel_readv_all_eof(QIOChannel *ioc,
                                                  size_t niov,
                                                  Error **errp)
 {
-    return qio_channel_readv_full_all_eof(ioc, iov, niov, NULL, NULL, errp);
+    return qio_channel_readv_full_all_eof(ioc, iov, niov, NULL, NULL, 0,
+                                          errp);
 }
 
 int coroutine_mixed_fn qio_channel_readv_all(QIOChannel *ioc,
@@ -130,6 +131,7 @@ int coroutine_mixed_fn qio_channel_readv_full_all_eof(QIOChannel *ioc,
                                                       const struct iovec *iov,
                                                       size_t niov,
                                                       int **fds, size_t *nfds,
+                                                      int flags,
                                                       Error **errp)
 {
     int ret = -1;
@@ -155,7 +157,7 @@ int coroutine_mixed_fn qio_channel_readv_full_all_eof(QIOChannel *ioc,
     while ((nlocal_iov > 0) || local_fds) {
         ssize_t len;
         len = qio_channel_readv_full(ioc, local_iov, nlocal_iov, local_fds,
-                                     local_nfds, 0, errp);
+                                     local_nfds, flags, errp);
         if (len == QIO_CHANNEL_ERR_BLOCK) {
             if (qemu_in_coroutine()) {
                 qio_channel_yield(ioc, G_IO_IN);
@@ -222,7 +224,8 @@ int coroutine_mixed_fn qio_channel_readv_full_all(QIOChannel *ioc,
                                                   int **fds, size_t *nfds,
                                                   Error **errp)
 {
-    int ret = qio_channel_readv_full_all_eof(ioc, iov, niov, fds, nfds, errp);
+    int ret = qio_channel_readv_full_all_eof(ioc, iov, niov, fds, nfds, 0,
+                                             errp);
 
     if (ret == 0) {
         error_setg(errp, "Unexpected end-of-file before all data were read");
-- 
2.35.3




* [RFC PATCH v3 5/8] io: Add a read flag for relaxed EOF
  2025-02-07 19:53 [RFC PATCH v3 0/8] crypto,io,migration: Add support to gnutls_bye() Fabiano Rosas
                   ` (3 preceding siblings ...)
  2025-02-07 19:53 ` [RFC PATCH v3 4/8] io: Add flags argument to qio_channel_readv_full_all_eof Fabiano Rosas
@ 2025-02-07 19:53 ` Fabiano Rosas
  2025-02-07 19:53 ` [RFC PATCH v3 6/8] migration/multifd: Terminate the TLS connection Fabiano Rosas
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 12+ messages in thread
From: Fabiano Rosas @ 2025-02-07 19:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Maciej S . Szmigiero, Daniel P . Berrangé

Add a read flag that can inform a channel that it's ok to receive an
EOF at any moment. Channels that have some form of strict EOF
tracking, such as TLS session termination, may choose to ignore EOF
errors with the use of this flag.

This is being added for compatibility with older migration streams
that do not include a TLS termination step.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
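Note for readers: the one-line change relies on "&" binding tighter
than "||" in C, so the expression reads as "relaxed EOF requested OR
read side already shut down". A standalone sketch with explicit
parentheses follows. The two read flag values mirror the defines in
this patch; DEMO_SHUTDOWN_READ is an assumed value used only for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_READ_FLAG_MSG_PEEK    0x1
#define DEMO_READ_FLAG_RELAXED_EOF 0x2
#define DEMO_SHUTDOWN_READ         0x1 /* assumed value */

/* True when an EOF without TLS termination should not be an error. */
static bool demo_tolerate_eof(int flags, int shutdown_state)
{
    return (flags & DEMO_READ_FLAG_RELAXED_EOF) ||
           (shutdown_state & DEMO_SHUTDOWN_READ);
}
```

Note that MSG_PEEK alone does not relax EOF handling; only the new bit
or a prior read shutdown does.
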
 include/io/channel.h | 1 +
 io/channel-tls.c     | 1 +
 2 files changed, 2 insertions(+)

diff --git a/include/io/channel.h b/include/io/channel.h
index 58940eead5..62b657109c 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -35,6 +35,7 @@ OBJECT_DECLARE_TYPE(QIOChannel, QIOChannelClass,
 #define QIO_CHANNEL_WRITE_FLAG_ZERO_COPY 0x1
 
 #define QIO_CHANNEL_READ_FLAG_MSG_PEEK 0x1
+#define QIO_CHANNEL_READ_FLAG_RELAXED_EOF 0x2
 
 typedef enum QIOChannelFeature QIOChannelFeature;
 
diff --git a/io/channel-tls.c b/io/channel-tls.c
index ecde6b57bf..caf8301a9e 100644
--- a/io/channel-tls.c
+++ b/io/channel-tls.c
@@ -359,6 +359,7 @@ static ssize_t qio_channel_tls_readv(QIOChannel *ioc,
             tioc->session,
             iov[i].iov_base,
             iov[i].iov_len,
+            flags & QIO_CHANNEL_READ_FLAG_RELAXED_EOF ||
             qatomic_load_acquire(&tioc->shutdown) & QIO_CHANNEL_SHUTDOWN_READ,
             errp);
         if (ret == QCRYPTO_TLS_SESSION_ERR_BLOCK) {
-- 
2.35.3




* [RFC PATCH v3 6/8] migration/multifd: Terminate the TLS connection
  2025-02-07 19:53 [RFC PATCH v3 0/8] crypto,io,migration: Add support to gnutls_bye() Fabiano Rosas
                   ` (4 preceding siblings ...)
  2025-02-07 19:53 ` [RFC PATCH v3 5/8] io: Add a read flag for relaxed EOF Fabiano Rosas
@ 2025-02-07 19:53 ` Fabiano Rosas
  2025-02-07 19:53 ` [RFC PATCH v3 7/8] migration/multifd: Add a compat property for TLS termination Fabiano Rosas
  2025-02-07 19:53 ` [RFC PATCH v3 8/8] migration: Check migration error after loadvm Fabiano Rosas
  7 siblings, 0 replies; 12+ messages in thread
From: Fabiano Rosas @ 2025-02-07 19:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Maciej S . Szmigiero, Daniel P . Berrangé

The multifd recv side has been getting a TLS error of
GNUTLS_E_PREMATURE_TERMINATION at the end of migration when the send
side closes the sockets without ending the TLS session. This has been
masked by the code not checking the migration error after loadvm.

Start ending the TLS session at multifd_send_shutdown() so the recv
side always sees a clean termination (EOF) and we can start to
differentiate that from an actual premature termination that might
happen in the middle of the migration.

There's nothing to be done if a previous migration error has already
broken the connection, so add a comment explaining it and ignore any
errors coming from gnutls_bye().

This doesn't break compat with older recv-side QEMUs because EOF has
always caused the recv thread to exit cleanly.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
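(Not part of the commit.) A simplified model of the shutdown policy
described above, assuming one "bye" attempt per channel whose TLS
handshake completed, with a failure only reported when the migration
itself has not already failed. struct chan, its fields and
end_tls_sessions() are hypothetical names used for illustration; the
real code operates on MultiFDSendParams and QIOChannel:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-channel state for the model. */
struct chan {
    bool tls_ready;      /* TLS handshake completed on this channel */
    bool bye_fails;      /* simulate a failing session termination */
    bool bye_attempted;  /* set once termination was attempted */
};

/*
 * Try to end the TLS session on every ready channel.
 * Returns the number of warnings emitted (0 or 1): the loop stops at
 * the first unexpected failure, mirroring the 'break' in the patch.
 */
static int end_tls_sessions(struct chan *c, size_t n, bool migration_failed)
{
    for (size_t i = 0; i < n; i++) {
        if (!c[i].tls_ready) {
            continue;
        }
        c[i].bye_attempted = true;
        /* Failures after a failed migration are expected: ignore them. */
        if (c[i].bye_fails && !migration_failed) {
            return 1;
        }
    }
    return 0;
}
```

In this model a failing channel stops the loop only when the migration
succeeded; when the migration has already failed, every ready channel
still gets its termination attempt and no warning is counted.
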
 migration/multifd.c | 38 +++++++++++++++++++++++++++++++++++++-
 migration/tls.c     |  5 +++++
 migration/tls.h     |  2 +-
 3 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index ab73d6d984..0296758c08 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -490,6 +490,36 @@ void multifd_send_shutdown(void)
         return;
     }
 
+    for (i = 0; i < migrate_multifd_channels(); i++) {
+        MultiFDSendParams *p = &multifd_send_state->params[i];
+
+        /* thread_created implies the TLS handshake has succeeded */
+        if (p->tls_thread_created && p->thread_created) {
+            Error *local_err = NULL;
+            /*
+             * The destination expects the TLS session to always be
+             * properly terminated. This helps to detect a premature
+             * termination in the middle of the stream.  Note that
+             * older QEMUs always break the connection on the source
+             * and the destination always sees
+             * GNUTLS_E_PREMATURE_TERMINATION.
+             */
+            migration_tls_channel_end(p->c, &local_err);
+
+            /*
+             * The above can return an error in case the migration has
+             * already failed. If the migration succeeded, errors are
+             * not expected but there's no need to kill the source.
+             */
+            if (local_err && !migration_has_failed(migrate_get_current())) {
+                warn_report(
+                    "multifd_send_%d: Failed to terminate TLS connection: %s",
+                    p->id, error_get_pretty(local_err));
+                break;
+            }
+        }
+    }
+
     multifd_send_terminate_threads();
 
     for (i = 0; i < migrate_multifd_channels(); i++) {
@@ -1141,7 +1171,13 @@ static void *multifd_recv_thread(void *opaque)
 
             ret = qio_channel_read_all_eof(p->c, (void *)p->packet,
                                            p->packet_len, &local_err);
-            if (ret == 0 || ret == -1) {   /* 0: EOF  -1: Error */
+            if (!ret) {
+                /* EOF */
+                assert(!local_err);
+                break;
+            }
+
+            if (ret == -1) {
                 break;
             }
 
diff --git a/migration/tls.c b/migration/tls.c
index fa03d9136c..5cbf952383 100644
--- a/migration/tls.c
+++ b/migration/tls.c
@@ -156,6 +156,11 @@ void migration_tls_channel_connect(MigrationState *s,
                               NULL);
 }
 
+void migration_tls_channel_end(QIOChannel *ioc, Error **errp)
+{
+    qio_channel_tls_bye(QIO_CHANNEL_TLS(ioc), errp);
+}
+
 bool migrate_channel_requires_tls_upgrade(QIOChannel *ioc)
 {
     if (!migrate_tls()) {
diff --git a/migration/tls.h b/migration/tls.h
index 5797d153cb..58b25e1228 100644
--- a/migration/tls.h
+++ b/migration/tls.h
@@ -36,7 +36,7 @@ void migration_tls_channel_connect(MigrationState *s,
                                    QIOChannel *ioc,
                                    const char *hostname,
                                    Error **errp);
-
+void migration_tls_channel_end(QIOChannel *ioc, Error **errp);
 /* Whether the QIO channel requires further TLS handshake? */
 bool migrate_channel_requires_tls_upgrade(QIOChannel *ioc);
 
-- 
2.35.3




* [RFC PATCH v3 7/8] migration/multifd: Add a compat property for TLS termination
  2025-02-07 19:53 [RFC PATCH v3 0/8] crypto,io,migration: Add support to gnutls_bye() Fabiano Rosas
                   ` (5 preceding siblings ...)
  2025-02-07 19:53 ` [RFC PATCH v3 6/8] migration/multifd: Terminate the TLS connection Fabiano Rosas
@ 2025-02-07 19:53 ` Fabiano Rosas
  2025-02-10 17:11   ` Peter Xu
  2025-02-07 19:53 ` [RFC PATCH v3 8/8] migration: Check migration error after loadvm Fabiano Rosas
  7 siblings, 1 reply; 12+ messages in thread
From: Fabiano Rosas @ 2025-02-07 19:53 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Xu, Maciej S . Szmigiero, Daniel P . Berrangé,
	Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Zhao Liu

We're currently changing the way the source multifd migration handles
the shutdown of the multifd channels when TLS is in use to perform a
clean termination by calling gnutls_bye().

Older src QEMUs will always close the channel without terminating the
TLS session. New dst QEMUs treat an unclean termination as an error.

Add the multifd-clean-tls-termination property (default true), which
can be switched off on the destination whenever a src QEMU <= 9.2 is
in use.

(Note that the compat property is only strictly necessary for src
QEMUs older than 9.1. Due to synchronization coincidences, src QEMUs
9.1 and 9.2 can put the destination in a condition where it doesn't
see the unclean termination. Still, make the property more inclusive
to facilitate potential backports.)

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
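(Not part of the commit.) A small sketch of the version ranges stated
above, assuming versions are compared as (major, minor) pairs. QEMU
does not detect the source version this way; the property is applied
through the machine-type compat table or set by the user. All helper
names here are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in mirroring the flag value from patch 5. */
#define READ_FLAG_RELAXED_EOF 0x2

/* Range covered by the compat property: source QEMU <= 9.2. */
static bool src_covered_by_compat(int major, int minor)
{
    return major < 9 || (major == 9 && minor <= 2);
}

/* Range where the property is strictly necessary: source QEMU < 9.1. */
static bool src_strictly_needs_compat(int major, int minor)
{
    return major < 9 || (major == 9 && minor < 1);
}

/* Read flags the destination would use for a given property value. */
static int recv_read_flags(bool clean_tls_termination)
{
    return clean_tls_termination ? 0 : READ_FLAG_RELAXED_EOF;
}
```

For a 9.1 or 9.2 source the first helper returns true and the second
returns false, which matches the note that the property is made more
inclusive than strictly required.
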
 hw/core/machine.c     |  1 +
 migration/migration.h | 33 +++++++++++++++++++++++++++++++++
 migration/multifd.c   | 15 +++++++++++++--
 migration/multifd.h   |  2 ++
 migration/options.c   |  2 ++
 5 files changed, 51 insertions(+), 2 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 254cc20c4c..02cff735b3 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -42,6 +42,7 @@ GlobalProperty hw_compat_9_2[] = {
     { "virtio-balloon-pci-transitional", "vectors", "0" },
     { "virtio-balloon-pci-non-transitional", "vectors", "0" },
     { "virtio-mem-pci", "vectors", "0" },
+    { "migration", "multifd-clean-tls-termination", "false" },
 };
 const size_t hw_compat_9_2_len = G_N_ELEMENTS(hw_compat_9_2);
 
diff --git a/migration/migration.h b/migration/migration.h
index 4c1fafc2b5..77def0b437 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -443,6 +443,39 @@ struct MigrationState {
      * Default value is false. (since 8.1)
      */
     bool multifd_flush_after_each_section;
+
+    /*
+     * This variable only makes sense when set on the machine that is
+     * the destination of a multifd migration with TLS enabled. It
+     * affects the behavior of the last send->recv iteration with
+     * regards to termination of the TLS session.
+     *
+     * When set:
+     *
+     * - the destination QEMU instance can expect to never get a
+     *   GNUTLS_E_PREMATURE_TERMINATION error. Manifested as the error
+     *   message: "The TLS connection was non-properly terminated".
+     *
+     * When clear:
+     *
+     * - the destination QEMU instance can expect to see a
+     *   GNUTLS_E_PREMATURE_TERMINATION error in any multifd channel
+     *   whenever the last recv() call of that channel happens after
+     *   the source QEMU instance has already issued shutdown() on the
+     *   channel.
+     *
+     *   Commit 637280aeb2 (since 9.1) introduced a side effect that
+     *   causes the destination instance to not be affected by the
+     *   premature termination, while commit 1d457daf86 (since 10.0)
+     *   causes the premature termination condition to be once again
+     *   reachable.
+     *
+     * NOTE: Regardless of the state of this option, a premature
+     * termination of the TLS connection might happen due to error at
+     * any moment prior to the last send->recv iteration.
+     */
+    bool multifd_clean_tls_termination;
+
     /*
      * This decides the size of guest memory chunk that will be used
      * to track dirty bitmap clearing.  The size of memory chunk will
diff --git a/migration/multifd.c b/migration/multifd.c
index 0296758c08..8045197be8 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -1151,6 +1151,7 @@ void multifd_recv_sync_main(void)
 
 static void *multifd_recv_thread(void *opaque)
 {
+    MigrationState *s = migrate_get_current();
     MultiFDRecvParams *p = opaque;
     Error *local_err = NULL;
     bool use_packets = multifd_use_packets();
@@ -1159,18 +1160,28 @@ static void *multifd_recv_thread(void *opaque)
     trace_multifd_recv_thread_start(p->id);
     rcu_register_thread();
 
+    if (!s->multifd_clean_tls_termination) {
+        p->read_flags = QIO_CHANNEL_READ_FLAG_RELAXED_EOF;
+    }
+
     while (true) {
         uint32_t flags = 0;
         bool has_data = false;
         p->normal_num = 0;
 
+
         if (use_packets) {
+            struct iovec iov = {
+                .iov_base = (void *)p->packet,
+                .iov_len = p->packet_len
+            };
+
             if (multifd_recv_should_exit()) {
                 break;
             }
 
-            ret = qio_channel_read_all_eof(p->c, (void *)p->packet,
-                                           p->packet_len, &local_err);
+            ret = qio_channel_readv_full_all_eof(p->c, &iov, 1, NULL, NULL,
+                                                 p->read_flags, &local_err);
             if (!ret) {
                 /* EOF */
                 assert(!local_err);
diff --git a/migration/multifd.h b/migration/multifd.h
index bd785b9873..cf408ff721 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -244,6 +244,8 @@ typedef struct {
     uint32_t zero_num;
     /* used for de-compression methods */
     void *compress_data;
+    /* Flags for the QIOChannel */
+    int read_flags;
 } MultiFDRecvParams;
 
 typedef struct {
diff --git a/migration/options.c b/migration/options.c
index 1ad950e397..feda354935 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -99,6 +99,8 @@ const Property migration_properties[] = {
                       clear_bitmap_shift, CLEAR_BITMAP_SHIFT_DEFAULT),
     DEFINE_PROP_BOOL("x-preempt-pre-7-2", MigrationState,
                      preempt_pre_7_2, false),
+    DEFINE_PROP_BOOL("multifd-clean-tls-termination", MigrationState,
+                     multifd_clean_tls_termination, true),
 
     /* Migration parameters */
     DEFINE_PROP_UINT8("x-throttle-trigger-threshold", MigrationState,
-- 
2.35.3




* [RFC PATCH v3 8/8] migration: Check migration error after loadvm
  2025-02-07 19:53 [RFC PATCH v3 0/8] crypto,io,migration: Add support to gnutls_bye() Fabiano Rosas
                   ` (6 preceding siblings ...)
  2025-02-07 19:53 ` [RFC PATCH v3 7/8] migration/multifd: Add a compat property for TLS termination Fabiano Rosas
@ 2025-02-07 19:53 ` Fabiano Rosas
  2025-02-10 17:12   ` Peter Xu
  7 siblings, 1 reply; 12+ messages in thread
From: Fabiano Rosas @ 2025-02-07 19:53 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Maciej S . Szmigiero, Daniel P . Berrangé

We're currently only checking the QEMUFile error after
qemu_loadvm_state(). This was causing a TLS termination error from
multifd recv threads to be ignored.

Start checking the migration error as well to avoid missing further
errors.

Regarding compatibility with the TLS termination error that was being
ignored: when a QEMU <= 9.2 is used as the migration source, the
recently added migration property multifd-clean-tls-termination needs
to be set to OFF on the *destination* machine.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
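(Not part of the commit.) A minimal sketch of the resulting error
precedence, assuming the three inputs are already known: the return
value of the load loop, whether a migration error was recorded, and the
QEMUFile error. final_load_status() is a hypothetical helper, not a
QEMU function:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Decide the final status after loading the VM state:
 * 1. a failure from the load loop itself is returned unchanged;
 * 2. otherwise a recorded migration error (for example from a multifd
 *    recv thread) becomes -EINVAL;
 * 3. otherwise the file error, which is 0 on success, is returned.
 */
static int final_load_status(int loadvm_ret, bool migration_has_error,
                             int file_error)
{
    if (loadvm_ret != 0) {
        return loadvm_ret;
    }
    if (migration_has_error) {
        return -EINVAL;
    }
    return file_error;
}
```

The point of the ordering is that an error recorded by another thread
is no longer hidden just because the main channel's file reports no
error.
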
 migration/savevm.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index bc375db282..4046faf009 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2940,7 +2940,11 @@ int qemu_loadvm_state(QEMUFile *f)
 
     /* When reaching here, it must be precopy */
     if (ret == 0) {
-        ret = qemu_file_get_error(f);
+        if (migrate_has_error(migrate_get_current())) {
+            ret = -EINVAL;
+        } else {
+            ret = qemu_file_get_error(f);
+        }
     }
 
     /*
-- 
2.35.3




* Re: [RFC PATCH v3 4/8] io: Add flags argument to qio_channel_readv_full_all_eof
  2025-02-07 19:53 ` [RFC PATCH v3 4/8] io: Add flags argument to qio_channel_readv_full_all_eof Fabiano Rosas
@ 2025-02-10  9:04   ` Daniel P. Berrangé
  0 siblings, 0 replies; 12+ messages in thread
From: Daniel P. Berrangé @ 2025-02-10  9:04 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: qemu-devel, Peter Xu, Maciej S . Szmigiero, Elena Ufimtseva,
	Jagannathan Raman

On Fri, Feb 07, 2025 at 04:53:55PM -0300, Fabiano Rosas wrote:
> We want to pass flags into qio_channel_tls_readv() but
> qio_channel_readv_full_all_eof() doesn't take a flags argument.
> 
> No functional change.
> 
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>  hw/remote/mpqemu-link.c | 2 +-
>  include/io/channel.h    | 2 ++
>  io/channel.c            | 9 ++++++---
>  3 files changed, 9 insertions(+), 4 deletions(-)

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [RFC PATCH v3 7/8] migration/multifd: Add a compat property for TLS termination
  2025-02-07 19:53 ` [RFC PATCH v3 7/8] migration/multifd: Add a compat property for TLS termination Fabiano Rosas
@ 2025-02-10 17:11   ` Peter Xu
  0 siblings, 0 replies; 12+ messages in thread
From: Peter Xu @ 2025-02-10 17:11 UTC (permalink / raw)
  To: Fabiano Rosas
  Cc: qemu-devel, Maciej S . Szmigiero, Daniel P . Berrangé,
	Eduardo Habkost, Marcel Apfelbaum, Philippe Mathieu-Daudé,
	Yanan Wang, Zhao Liu

On Fri, Feb 07, 2025 at 04:53:58PM -0300, Fabiano Rosas wrote:
> We're currently changing the way the source multifd migration handles
> the shutdown of the multifd channels when TLS is in use to perform a
> clean termination by calling gnutls_bye().
> 
> Older src QEMUs will always close the channel without terminating the
> TLS session. New dst QEMUs treat an unclean termination as an error.
> 
> Add the multifd-clean-tls-termination property (default true), which
> can be switched off on the destination whenever a src QEMU <= 9.2 is
> in use.
> 
> (Note that the compat property is only strictly necessary for src
> QEMUs older than 9.1. Due to synchronization coincidences, src QEMUs
> 9.1 and 9.2 can put the destination in a condition where it doesn't
> see the unclean termination. Still, make the property more inclusive
> to facilitate potential backports.)
> 
> Signed-off-by: Fabiano Rosas <farosas@suse.de>

Reviewed-by: Peter Xu <peterx@redhat.com>

One nitpick..

> ---
>  hw/core/machine.c     |  1 +
>  migration/migration.h | 33 +++++++++++++++++++++++++++++++++
>  migration/multifd.c   | 15 +++++++++++++--
>  migration/multifd.h   |  2 ++
>  migration/options.c   |  2 ++
>  5 files changed, 51 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index 254cc20c4c..02cff735b3 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -42,6 +42,7 @@ GlobalProperty hw_compat_9_2[] = {
>      { "virtio-balloon-pci-transitional", "vectors", "0" },
>      { "virtio-balloon-pci-non-transitional", "vectors", "0" },
>      { "virtio-mem-pci", "vectors", "0" },
> +    { "migration", "multifd-clean-tls-termination", "false" },
>  };
>  const size_t hw_compat_9_2_len = G_N_ELEMENTS(hw_compat_9_2);
>  
> diff --git a/migration/migration.h b/migration/migration.h
> index 4c1fafc2b5..77def0b437 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -443,6 +443,39 @@ struct MigrationState {
>       * Default value is false. (since 8.1)
>       */
>      bool multifd_flush_after_each_section;
> +
> +    /*
> +     * This variable only makes sense when set on the machine that is
> +     * the destination of a multifd migration with TLS enabled. It
> +     * affects the behavior of the last send->recv iteration with
> +     * regards to termination of the TLS session.
> +     *
> +     * When set:
> +     *
> +     * - the destination QEMU instance can expect to never get a
> +     *   GNUTLS_E_PREMATURE_TERMINATION error. Manifested as the error
> +     *   message: "The TLS connection was non-properly terminated".
> +     *
> +     * When clear:
> +     *
> +     * - the destination QEMU instance can expect to see a
> +     *   GNUTLS_E_PREMATURE_TERMINATION error in any multifd channel
> +     *   whenever the last recv() call of that channel happens after
> +     *   the source QEMU instance has already issued shutdown() on the
> +     *   channel.
> +     *
> +     *   Commit 637280aeb2 (since 9.1) introduced a side effect that
> +     *   causes the destination instance to not be affected by the
> +     *   premature termination, while commit 1d457daf86 (since 10.0)
> +     *   causes the premature termination condition to be once again
> +     *   reachable.
> +     *
> +     * NOTE: Regardless of the state of this option, a premature
> +     * termination of the TLS connection might happen due to error at
> +     * any moment prior to the last send->recv iteration.
> +     */
> +    bool multifd_clean_tls_termination;
> +
>      /*
>       * This decides the size of guest memory chunk that will be used
>       * to track dirty bitmap clearing.  The size of memory chunk will
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 0296758c08..8045197be8 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -1151,6 +1151,7 @@ void multifd_recv_sync_main(void)
>  
>  static void *multifd_recv_thread(void *opaque)
>  {
> +    MigrationState *s = migrate_get_current();
>      MultiFDRecvParams *p = opaque;
>      Error *local_err = NULL;
>      bool use_packets = multifd_use_packets();
> @@ -1159,18 +1160,28 @@ static void *multifd_recv_thread(void *opaque)
>      trace_multifd_recv_thread_start(p->id);
>      rcu_register_thread();
>  
> +    if (!s->multifd_clean_tls_termination) {
> +        p->read_flags = QIO_CHANNEL_READ_FLAG_RELAXED_EOF;
> +    }
> +
>      while (true) {
>          uint32_t flags = 0;
>          bool has_data = false;
>          p->normal_num = 0;
>  
> +

Extra newline (can be fixed when merging)

>          if (use_packets) {
> +            struct iovec iov = {
> +                .iov_base = (void *)p->packet,
> +                .iov_len = p->packet_len
> +            };
> +
>              if (multifd_recv_should_exit()) {
>                  break;
>              }
>  
> -            ret = qio_channel_read_all_eof(p->c, (void *)p->packet,
> -                                           p->packet_len, &local_err);
> +            ret = qio_channel_readv_full_all_eof(p->c, &iov, 1, NULL, NULL,
> +                                                 p->read_flags, &local_err);
>              if (!ret) {
>                  /* EOF */
>                  assert(!local_err);
> diff --git a/migration/multifd.h b/migration/multifd.h
> index bd785b9873..cf408ff721 100644
> --- a/migration/multifd.h
> +++ b/migration/multifd.h
> @@ -244,6 +244,8 @@ typedef struct {
>      uint32_t zero_num;
>      /* used for de-compression methods */
>      void *compress_data;
> +    /* Flags for the QIOChannel */
> +    int read_flags;
>  } MultiFDRecvParams;
>  
>  typedef struct {
> diff --git a/migration/options.c b/migration/options.c
> index 1ad950e397..feda354935 100644
> --- a/migration/options.c
> +++ b/migration/options.c
> @@ -99,6 +99,8 @@ const Property migration_properties[] = {
>                        clear_bitmap_shift, CLEAR_BITMAP_SHIFT_DEFAULT),
>      DEFINE_PROP_BOOL("x-preempt-pre-7-2", MigrationState,
>                       preempt_pre_7_2, false),
> +    DEFINE_PROP_BOOL("multifd-clean-tls-termination", MigrationState,
> +                     multifd_clean_tls_termination, true),
>  
>      /* Migration parameters */
>      DEFINE_PROP_UINT8("x-throttle-trigger-threshold", MigrationState,
> -- 
> 2.35.3
> 

-- 
Peter Xu




* Re: [RFC PATCH v3 8/8] migration: Check migration error after loadvm
  2025-02-07 19:53 ` [RFC PATCH v3 8/8] migration: Check migration error after loadvm Fabiano Rosas
@ 2025-02-10 17:12   ` Peter Xu
  0 siblings, 0 replies; 12+ messages in thread
From: Peter Xu @ 2025-02-10 17:12 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, Maciej S . Szmigiero, Daniel P . Berrangé

On Fri, Feb 07, 2025 at 04:53:59PM -0300, Fabiano Rosas wrote:
> We're currently only checking the QEMUFile error after
> qemu_loadvm_state(). This was causing a TLS termination error from
> multifd recv threads to be ignored.
> 
> Start checking the migration error as well to avoid missing further
> errors.
> 
> Regarding compatibility with the TLS termination error that was being
> ignored: when a QEMU <= 9.2 is used as the migration source, the
> recently added migration property multifd-clean-tls-termination needs
> to be set to OFF on the *destination* machine.
> 
> Signed-off-by: Fabiano Rosas <farosas@suse.de>

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu



