qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/2] migration mapped-ram fixes
@ 2024-03-13 21:28 Fabiano Rosas
  2024-03-13 21:28 ` [PATCH v2 1/2] migration: Fix iocs leaks during file and fd migration Fabiano Rosas
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Fabiano Rosas @ 2024-03-13 21:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: Daniel P . Berrangé, Peter Xu

Hi,

In this v2:

patch 1 - The fix for the ioc leaks, now including the main channel

patch 2 - A fix for an fd: migration case I thought I had written code
          for, but obviously didn't.

Thank you for your patience.

based-on: https://gitlab.com/peterx/qemu/-/commits/migration-stable
CI run: https://gitlab.com/farosas/qemu/-/pipelines/1212483701

Fabiano Rosas (2):
  migration: Fix iocs leaks during file and fd migration
  migration/multifd: Ensure we're not given a socket for file migration

 migration/fd.c   | 35 +++++++++++---------------
 migration/file.c | 65 ++++++++++++++++++++++++++++++++----------------
 migration/file.h |  1 +
 3 files changed, 60 insertions(+), 41 deletions(-)

-- 
2.35.3



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2 1/2] migration: Fix iocs leaks during file and fd migration
  2024-03-13 21:28 [PATCH v2 0/2] migration mapped-ram fixes Fabiano Rosas
@ 2024-03-13 21:28 ` Fabiano Rosas
  2024-03-14 15:10   ` Peter Xu
  2024-03-13 21:28 ` [PATCH v2 2/2] migration/multifd: Ensure we're not given a socket for file migration Fabiano Rosas
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Fabiano Rosas @ 2024-03-13 21:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: Daniel P . Berrangé, Peter Xu

The memory for the io channels is being leaked in three different ways
during file migration:

1) if the offset check fails we never drop the ioc reference;

2) we allocate an extra channel for no reason;

3) if multifd is enabled but channel creation fails when calling
   dup(), we leave the previous channels around along with the glib
   polling;

Fix all issues by restructuring the code to first allocate the
channels and only register the watches when all channels have been
created.

For multifd, the file and fd migrations can share code because both
are backed by a QIOChannelFile. For the non-multifd case, the fd needs
to be separate because it is backed by a QIOChannelSocket.

Fixes: 2dd7ee7a51 ("migration/multifd: Add incoming QIOChannelFile support")
Fixes: decdc76772 ("migration/multifd: Add mapped-ram support to fd: URI")
Reported-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/fd.c   | 29 +++++++-----------------
 migration/file.c | 58 ++++++++++++++++++++++++++++++------------------
 migration/file.h |  1 +
 3 files changed, 46 insertions(+), 42 deletions(-)

diff --git a/migration/fd.c b/migration/fd.c
index 4e2a63a73d..39a52e5c90 100644
--- a/migration/fd.c
+++ b/migration/fd.c
@@ -18,6 +18,7 @@
 #include "qapi/error.h"
 #include "channel.h"
 #include "fd.h"
+#include "file.h"
 #include "migration.h"
 #include "monitor/monitor.h"
 #include "io/channel-file.h"
@@ -80,7 +81,6 @@ static gboolean fd_accept_incoming_migration(QIOChannel *ioc,
 void fd_start_incoming_migration(const char *fdname, Error **errp)
 {
     QIOChannel *ioc;
-    QIOChannelFile *fioc;
     int fd = monitor_fd_param(monitor_cur(), fdname, errp);
     if (fd == -1) {
         return;
@@ -94,26 +94,13 @@ void fd_start_incoming_migration(const char *fdname, Error **errp)
         return;
     }
 
-    qio_channel_set_name(ioc, "migration-fd-incoming");
-    qio_channel_add_watch_full(ioc, G_IO_IN,
-                               fd_accept_incoming_migration,
-                               NULL, NULL,
-                               g_main_context_get_thread_default());
-
     if (migrate_multifd()) {
-        int channels = migrate_multifd_channels();
-
-        while (channels--) {
-            fioc = qio_channel_file_new_dupfd(fd, errp);
-            if (!fioc) {
-                return;
-            }
-
-            qio_channel_set_name(ioc, "migration-fd-incoming");
-            qio_channel_add_watch_full(QIO_CHANNEL(fioc), G_IO_IN,
-                                       fd_accept_incoming_migration,
-                                       NULL, NULL,
-                                       g_main_context_get_thread_default());
-        }
+        file_create_incoming_channels(ioc, errp);
+    } else {
+        qio_channel_set_name(ioc, "migration-fd-incoming");
+        qio_channel_add_watch_full(ioc, G_IO_IN,
+                                   fd_accept_incoming_migration,
+                                   NULL, NULL,
+                                   g_main_context_get_thread_default());
     }
 }
diff --git a/migration/file.c b/migration/file.c
index e56c5eb0a5..ddde0ca818 100644
--- a/migration/file.c
+++ b/migration/file.c
@@ -115,13 +115,46 @@ static gboolean file_accept_incoming_migration(QIOChannel *ioc,
     return G_SOURCE_REMOVE;
 }
 
+void file_create_incoming_channels(QIOChannel *ioc, Error **errp)
+{
+    int i, fd, channels = 1;
+    g_autofree QIOChannel **iocs = NULL;
+
+    if (migrate_multifd()) {
+        channels += migrate_multifd_channels();
+    }
+
+    iocs = g_new0(QIOChannel *, channels);
+    fd = QIO_CHANNEL_FILE(ioc)->fd;
+    iocs[0] = ioc;
+
+    for (i = 1; i < channels; i++) {
+        QIOChannelFile *fioc = qio_channel_file_new_dupfd(fd, errp);
+
+        if (!fioc) {
+            while (i) {
+                object_unref(iocs[--i]);
+            }
+            return;
+        }
+
+        iocs[i] = QIO_CHANNEL(fioc);
+    }
+
+    for (i = 0; i < channels; i++) {
+        qio_channel_set_name(iocs[i], "migration-file-incoming");
+        qio_channel_add_watch_full(iocs[i], G_IO_IN,
+                                   file_accept_incoming_migration,
+                                   NULL, NULL,
+                                   g_main_context_get_thread_default());
+    }
+}
+
 void file_start_incoming_migration(FileMigrationArgs *file_args, Error **errp)
 {
     g_autofree char *filename = g_strdup(file_args->filename);
     QIOChannelFile *fioc = NULL;
     uint64_t offset = file_args->offset;
-    int channels = 1;
-    int i = 0;
 
     trace_migration_file_incoming(filename);
 
@@ -132,28 +165,11 @@ void file_start_incoming_migration(FileMigrationArgs *file_args, Error **errp)
 
     if (offset &&
         qio_channel_io_seek(QIO_CHANNEL(fioc), offset, SEEK_SET, errp) < 0) {
+        object_unref(OBJECT(fioc));
         return;
     }
 
-    if (migrate_multifd()) {
-        channels += migrate_multifd_channels();
-    }
-
-    do {
-        QIOChannel *ioc = QIO_CHANNEL(fioc);
-
-        qio_channel_set_name(ioc, "migration-file-incoming");
-        qio_channel_add_watch_full(ioc, G_IO_IN,
-                                   file_accept_incoming_migration,
-                                   NULL, NULL,
-                                   g_main_context_get_thread_default());
-
-        fioc = qio_channel_file_new_dupfd(fioc->fd, errp);
-
-        if (!fioc) {
-            break;
-        }
-    } while (++i < channels);
+    file_create_incoming_channels(QIO_CHANNEL(fioc), errp);
 }
 
 int file_write_ramblock_iov(QIOChannel *ioc, const struct iovec *iov,
diff --git a/migration/file.h b/migration/file.h
index 9f71e87f74..7699c04677 100644
--- a/migration/file.h
+++ b/migration/file.h
@@ -20,6 +20,7 @@ void file_start_outgoing_migration(MigrationState *s,
 int file_parse_offset(char *filespec, uint64_t *offsetp, Error **errp);
 void file_cleanup_outgoing_migration(void);
 bool file_send_channel_create(gpointer opaque, Error **errp);
+void file_create_incoming_channels(QIOChannel *ioc, Error **errp);
 int file_write_ramblock_iov(QIOChannel *ioc, const struct iovec *iov,
                             int niov, RAMBlock *block, Error **errp);
 int multifd_file_recv_data(MultiFDRecvParams *p, Error **errp);
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 2/2] migration/multifd: Ensure we're not given a socket for file migration
  2024-03-13 21:28 [PATCH v2 0/2] migration mapped-ram fixes Fabiano Rosas
  2024-03-13 21:28 ` [PATCH v2 1/2] migration: Fix iocs leaks during file and fd migration Fabiano Rosas
@ 2024-03-13 21:28 ` Fabiano Rosas
  2024-03-14 15:10   ` Peter Xu
  2024-03-14 15:22 ` [PATCH v2 0/2] migration mapped-ram fixes Peter Xu
  2024-03-14 18:00 ` Peter Xu
  3 siblings, 1 reply; 14+ messages in thread
From: Fabiano Rosas @ 2024-03-13 21:28 UTC (permalink / raw)
  To: qemu-devel; +Cc: Daniel P . Berrangé, Peter Xu

When doing migration using the fd: URI, the incoming migration starts
before the user has passed the file descriptor to QEMU. This means
that the checks at migration_channels_and_transport_compatible()
happen too soon and we need to allow a migration channel of type
SOCKET_ADDRESS_TYPE_FD even though socket migration is not supported
with multifd.

The commit decdc76772 ("migration/multifd: Add mapped-ram support to
fd: URI") was supposed to add a second check prior to starting
migration to make sure a socket fd is not passed instead of a file fd,
but failed to do so.

Add the missing verification.

Fixes: decdc76772 ("migration/multifd: Add mapped-ram support to fd: URI")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/fd.c   | 8 ++++++++
 migration/file.c | 7 +++++++
 2 files changed, 15 insertions(+)

diff --git a/migration/fd.c b/migration/fd.c
index 39a52e5c90..c07030f715 100644
--- a/migration/fd.c
+++ b/migration/fd.c
@@ -22,6 +22,7 @@
 #include "migration.h"
 #include "monitor/monitor.h"
 #include "io/channel-file.h"
+#include "io/channel-socket.h"
 #include "io/channel-util.h"
 #include "options.h"
 #include "trace.h"
@@ -95,6 +96,13 @@ void fd_start_incoming_migration(const char *fdname, Error **errp)
     }
 
     if (migrate_multifd()) {
+        if (fd_is_socket(fd)) {
+            error_setg(errp,
+                       "Multifd migration to a socket FD is not supported");
+            object_unref(ioc);
+            return;
+        }
+
         file_create_incoming_channels(ioc, errp);
     } else {
         qio_channel_set_name(ioc, "migration-fd-incoming");
diff --git a/migration/file.c b/migration/file.c
index ddde0ca818..b6e8ba13f2 100644
--- a/migration/file.c
+++ b/migration/file.c
@@ -15,6 +15,7 @@
 #include "file.h"
 #include "migration.h"
 #include "io/channel-file.h"
+#include "io/channel-socket.h"
 #include "io/channel-util.h"
 #include "options.h"
 #include "trace.h"
@@ -58,6 +59,12 @@ bool file_send_channel_create(gpointer opaque, Error **errp)
     int fd = fd_args_get_fd();
 
     if (fd && fd != -1) {
+        if (fd_is_socket(fd)) {
+            error_setg(errp,
+                       "Multifd migration to a socket FD is not supported");
+            goto out;
+        }
+
         ioc = qio_channel_file_new_dupfd(fd, errp);
     } else {
         ioc = qio_channel_file_new_path(outgoing_args.fname, flags, 0, errp);
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/2] migration/multifd: Ensure we're not given a socket for file migration
  2024-03-13 21:28 ` [PATCH v2 2/2] migration/multifd: Ensure we're not given a socket for file migration Fabiano Rosas
@ 2024-03-14 15:10   ` Peter Xu
  2024-03-14 15:43     ` Peter Xu
  2024-03-14 16:44     ` Fabiano Rosas
  0 siblings, 2 replies; 14+ messages in thread
From: Peter Xu @ 2024-03-14 15:10 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, Daniel P . Berrangé

On Wed, Mar 13, 2024 at 06:28:24PM -0300, Fabiano Rosas wrote:
> When doing migration using the fd: URI, the incoming migration starts
> before the user has passed the file descriptor to QEMU. This means
> that the checks at migration_channels_and_transport_compatible()
> happen too soon and we need to allow a migration channel of type
> SOCKET_ADDRESS_TYPE_FD even though socket migration is not supported
> with multifd.

Hmm, bare with me if this is a stupid one.. why the incoming migration can
start _before_ the user passed in the fd?

IOW, why can't we rely on a single fd_is_socket() check for
SOCKET_ADDRESS_TYPE_FD in transport_supports_multi_channels()?

> 
> The commit decdc76772 ("migration/multifd: Add mapped-ram support to
> fd: URI") was supposed to add a second check prior to starting
> migration to make sure a socket fd is not passed instead of a file fd,
> but failed to do so.
> 
> Add the missing verification.
> 
> Fixes: decdc76772 ("migration/multifd: Add mapped-ram support to fd: URI")
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>  migration/fd.c   | 8 ++++++++
>  migration/file.c | 7 +++++++
>  2 files changed, 15 insertions(+)
> 
> diff --git a/migration/fd.c b/migration/fd.c
> index 39a52e5c90..c07030f715 100644
> --- a/migration/fd.c
> +++ b/migration/fd.c
> @@ -22,6 +22,7 @@
>  #include "migration.h"
>  #include "monitor/monitor.h"
>  #include "io/channel-file.h"
> +#include "io/channel-socket.h"
>  #include "io/channel-util.h"
>  #include "options.h"
>  #include "trace.h"
> @@ -95,6 +96,13 @@ void fd_start_incoming_migration(const char *fdname, Error **errp)
>      }
>  
>      if (migrate_multifd()) {
> +        if (fd_is_socket(fd)) {
> +            error_setg(errp,
> +                       "Multifd migration to a socket FD is not supported");
> +            object_unref(ioc);
> +            return;
> +        }
> +
>          file_create_incoming_channels(ioc, errp);
>      } else {
>          qio_channel_set_name(ioc, "migration-fd-incoming");
> diff --git a/migration/file.c b/migration/file.c
> index ddde0ca818..b6e8ba13f2 100644
> --- a/migration/file.c
> +++ b/migration/file.c
> @@ -15,6 +15,7 @@
>  #include "file.h"
>  #include "migration.h"
>  #include "io/channel-file.h"
> +#include "io/channel-socket.h"
>  #include "io/channel-util.h"
>  #include "options.h"
>  #include "trace.h"
> @@ -58,6 +59,12 @@ bool file_send_channel_create(gpointer opaque, Error **errp)
>      int fd = fd_args_get_fd();
>  
>      if (fd && fd != -1) {
> +        if (fd_is_socket(fd)) {
> +            error_setg(errp,
> +                       "Multifd migration to a socket FD is not supported");
> +            goto out;
> +        }
> +
>          ioc = qio_channel_file_new_dupfd(fd, errp);
>      } else {
>          ioc = qio_channel_file_new_path(outgoing_args.fname, flags, 0, errp);
> -- 
> 2.35.3
> 

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 1/2] migration: Fix iocs leaks during file and fd migration
  2024-03-13 21:28 ` [PATCH v2 1/2] migration: Fix iocs leaks during file and fd migration Fabiano Rosas
@ 2024-03-14 15:10   ` Peter Xu
  0 siblings, 0 replies; 14+ messages in thread
From: Peter Xu @ 2024-03-14 15:10 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, Daniel P . Berrangé

On Wed, Mar 13, 2024 at 06:28:23PM -0300, Fabiano Rosas wrote:
> The memory for the io channels is being leaked in three different ways
> during file migration:
> 
> 1) if the offset check fails we never drop the ioc reference;
> 
> 2) we allocate an extra channel for no reason;
> 
> 3) if multifd is enabled but channel creation fails when calling
>    dup(), we leave the previous channels around along with the glib
>    polling;
> 
> Fix all issues by restructuring the code to first allocate the
> channels and only register the watches when all channels have been
> created.
> 
> For multifd, the file and fd migrations can share code because both
> are backed by a QIOChannelFile. For the non-multifd case, the fd needs
> to be separate because it is backed by a QIOChannelSocket.
> 
> Fixes: 2dd7ee7a51 ("migration/multifd: Add incoming QIOChannelFile support")
> Fixes: decdc76772 ("migration/multifd: Add mapped-ram support to fd: URI")
> Reported-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/2] migration mapped-ram fixes
  2024-03-13 21:28 [PATCH v2 0/2] migration mapped-ram fixes Fabiano Rosas
  2024-03-13 21:28 ` [PATCH v2 1/2] migration: Fix iocs leaks during file and fd migration Fabiano Rosas
  2024-03-13 21:28 ` [PATCH v2 2/2] migration/multifd: Ensure we're not given a socket for file migration Fabiano Rosas
@ 2024-03-14 15:22 ` Peter Xu
  2024-03-14 16:55   ` Fabiano Rosas
  2024-03-14 18:00 ` Peter Xu
  3 siblings, 1 reply; 14+ messages in thread
From: Peter Xu @ 2024-03-14 15:22 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, Daniel P . Berrangé

On Wed, Mar 13, 2024 at 06:28:22PM -0300, Fabiano Rosas wrote:
> Hi,
> 
> In this v2:
> 
> patch 1 - The fix for the ioc leaks, now including the main channel
> 
> patch 2 - A fix for an fd: migration case I thought I had written code
>           for, but obviously didn't.

Maybe I found one more issue.. I'm looking at fd_start_outgoing_migration().

    ioc = qio_channel_new_fd(fd, errp);  <----- here the fd is consumed and
                                                then owned by the IOC
    if (!ioc) {
        close(fd);
        return;
    }

    outgoing_args.fd = fd;               <----- here we use the fd again,
                                                and "owned" by outgoing_args
                                                even if it shouldn't?

The problem is outgoing_args.fd will be cleaned up with a close().  I had a
feeling that it's possible it will close() something else if the fd reused
before that close() but after the IOC's.  We may want yet another dup() for
outgoing_args.fd?

If you agree, we may also want to avoid doing:

    outgoing_args.fd = -1;

We could assert it instead making sure no fd leak.

> 
> Thank you for your patience.
> 
> based-on: https://gitlab.com/peterx/qemu/-/commits/migration-stable
> CI run: https://gitlab.com/farosas/qemu/-/pipelines/1212483701
> 
> Fabiano Rosas (2):
>   migration: Fix iocs leaks during file and fd migration
>   migration/multifd: Ensure we're not given a socket for file migration
> 
>  migration/fd.c   | 35 +++++++++++---------------
>  migration/file.c | 65 ++++++++++++++++++++++++++++++++----------------
>  migration/file.h |  1 +
>  3 files changed, 60 insertions(+), 41 deletions(-)
> 
> -- 
> 2.35.3
> 

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/2] migration/multifd: Ensure we're not given a socket for file migration
  2024-03-14 15:10   ` Peter Xu
@ 2024-03-14 15:43     ` Peter Xu
  2024-03-14 16:50       ` Fabiano Rosas
  2024-03-14 16:44     ` Fabiano Rosas
  1 sibling, 1 reply; 14+ messages in thread
From: Peter Xu @ 2024-03-14 15:43 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, Daniel P . Berrangé

On Thu, Mar 14, 2024 at 11:10:12AM -0400, Peter Xu wrote:
> On Wed, Mar 13, 2024 at 06:28:24PM -0300, Fabiano Rosas wrote:
> > When doing migration using the fd: URI, the incoming migration starts
> > before the user has passed the file descriptor to QEMU. This means
> > that the checks at migration_channels_and_transport_compatible()
> > happen too soon and we need to allow a migration channel of type
> > SOCKET_ADDRESS_TYPE_FD even though socket migration is not supported
> > with multifd.
> 
> Hmm, bare with me if this is a stupid one.. why the incoming migration can
> start _before_ the user passed in the fd?
> 
> IOW, why can't we rely on a single fd_is_socket() check for
> SOCKET_ADDRESS_TYPE_FD in transport_supports_multi_channels()?
> 
> > 
> > The commit decdc76772 ("migration/multifd: Add mapped-ram support to
> > fd: URI") was supposed to add a second check prior to starting
> > migration to make sure a socket fd is not passed instead of a file fd,
> > but failed to do so.
> > 
> > Add the missing verification.
> > 
> > Fixes: decdc76772 ("migration/multifd: Add mapped-ram support to fd: URI")
> > Signed-off-by: Fabiano Rosas <farosas@suse.de>
> > ---
> >  migration/fd.c   | 8 ++++++++
> >  migration/file.c | 7 +++++++
> >  2 files changed, 15 insertions(+)
> > 
> > diff --git a/migration/fd.c b/migration/fd.c
> > index 39a52e5c90..c07030f715 100644
> > --- a/migration/fd.c
> > +++ b/migration/fd.c
> > @@ -22,6 +22,7 @@
> >  #include "migration.h"
> >  #include "monitor/monitor.h"
> >  #include "io/channel-file.h"
> > +#include "io/channel-socket.h"
> >  #include "io/channel-util.h"
> >  #include "options.h"
> >  #include "trace.h"
> > @@ -95,6 +96,13 @@ void fd_start_incoming_migration(const char *fdname, Error **errp)
> >      }
> >  
> >      if (migrate_multifd()) {
> > +        if (fd_is_socket(fd)) {
> > +            error_setg(errp,
> > +                       "Multifd migration to a socket FD is not supported");
> > +            object_unref(ioc);
> > +            return;
> > +        }

And... I just noticed this is forbiding multifd+socket+fd in general?  But
isn't that the majority of multifd usage when with libvirt over sockets?

Shouldn't it about fd's seekable-or-not instead when mapped-ram enabled
(IOW, migration_needs_seekable_channel() only)?

> > +
> >          file_create_incoming_channels(ioc, errp);
> >      } else {
> >          qio_channel_set_name(ioc, "migration-fd-incoming");
> > diff --git a/migration/file.c b/migration/file.c
> > index ddde0ca818..b6e8ba13f2 100644
> > --- a/migration/file.c
> > +++ b/migration/file.c
> > @@ -15,6 +15,7 @@
> >  #include "file.h"
> >  #include "migration.h"
> >  #include "io/channel-file.h"
> > +#include "io/channel-socket.h"
> >  #include "io/channel-util.h"
> >  #include "options.h"
> >  #include "trace.h"
> > @@ -58,6 +59,12 @@ bool file_send_channel_create(gpointer opaque, Error **errp)
> >      int fd = fd_args_get_fd();
> >  
> >      if (fd && fd != -1) {
> > +        if (fd_is_socket(fd)) {
> > +            error_setg(errp,
> > +                       "Multifd migration to a socket FD is not supported");
> > +            goto out;
> > +        }
> > +
> >          ioc = qio_channel_file_new_dupfd(fd, errp);
> >      } else {
> >          ioc = qio_channel_file_new_path(outgoing_args.fname, flags, 0, errp);
> > -- 
> > 2.35.3
> > 
> 
> -- 
> Peter Xu

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/2] migration/multifd: Ensure we're not given a socket for file migration
  2024-03-14 15:10   ` Peter Xu
  2024-03-14 15:43     ` Peter Xu
@ 2024-03-14 16:44     ` Fabiano Rosas
  2024-03-14 17:53       ` Peter Xu
  1 sibling, 1 reply; 14+ messages in thread
From: Fabiano Rosas @ 2024-03-14 16:44 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, Daniel P . Berrangé

Peter Xu <peterx@redhat.com> writes:

> On Wed, Mar 13, 2024 at 06:28:24PM -0300, Fabiano Rosas wrote:
>> When doing migration using the fd: URI, the incoming migration starts
>> before the user has passed the file descriptor to QEMU. This means
>> that the checks at migration_channels_and_transport_compatible()
>> happen too soon and we need to allow a migration channel of type
>> SOCKET_ADDRESS_TYPE_FD even though socket migration is not supported
>> with multifd.
>
> Hmm, bare with me if this is a stupid one.. why the incoming migration can
> start _before_ the user passed in the fd?

It's been a while since I looked at this. Looking into it once more
today, I think the issue is actually that we only fetch the fds from the
monitor at fd_start_outgoing|incoming_migration().

>
> IOW, why can't we rely on a single fd_is_socket() check for
> SOCKET_ADDRESS_TYPE_FD in transport_supports_multi_channels()?
>

There's no fd at that point. Just a string.

I think the right fix here would be to move the
monitor_fd_get/monitor_fd_param (why two different functions?) earlier
into migrate_uri_parse. And possibly also extend FileMigrationArgs to
contain an fd. Not sure how easy would that be.

>> 
>> The commit decdc76772 ("migration/multifd: Add mapped-ram support to
>> fd: URI") was supposed to add a second check prior to starting
>> migration to make sure a socket fd is not passed instead of a file fd,
>> but failed to do so.
>> 
>> Add the missing verification.
>> 
>> Fixes: decdc76772 ("migration/multifd: Add mapped-ram support to fd: URI")
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> ---
>>  migration/fd.c   | 8 ++++++++
>>  migration/file.c | 7 +++++++
>>  2 files changed, 15 insertions(+)
>> 
>> diff --git a/migration/fd.c b/migration/fd.c
>> index 39a52e5c90..c07030f715 100644
>> --- a/migration/fd.c
>> +++ b/migration/fd.c
>> @@ -22,6 +22,7 @@
>>  #include "migration.h"
>>  #include "monitor/monitor.h"
>>  #include "io/channel-file.h"
>> +#include "io/channel-socket.h"
>>  #include "io/channel-util.h"
>>  #include "options.h"
>>  #include "trace.h"
>> @@ -95,6 +96,13 @@ void fd_start_incoming_migration(const char *fdname, Error **errp)
>>      }
>>  
>>      if (migrate_multifd()) {
>> +        if (fd_is_socket(fd)) {
>> +            error_setg(errp,
>> +                       "Multifd migration to a socket FD is not supported");
>> +            object_unref(ioc);
>> +            return;
>> +        }
>> +
>>          file_create_incoming_channels(ioc, errp);
>>      } else {
>>          qio_channel_set_name(ioc, "migration-fd-incoming");
>> diff --git a/migration/file.c b/migration/file.c
>> index ddde0ca818..b6e8ba13f2 100644
>> --- a/migration/file.c
>> +++ b/migration/file.c
>> @@ -15,6 +15,7 @@
>>  #include "file.h"
>>  #include "migration.h"
>>  #include "io/channel-file.h"
>> +#include "io/channel-socket.h"
>>  #include "io/channel-util.h"
>>  #include "options.h"
>>  #include "trace.h"
>> @@ -58,6 +59,12 @@ bool file_send_channel_create(gpointer opaque, Error **errp)
>>      int fd = fd_args_get_fd();
>>  
>>      if (fd && fd != -1) {
>> +        if (fd_is_socket(fd)) {
>> +            error_setg(errp,
>> +                       "Multifd migration to a socket FD is not supported");
>> +            goto out;
>> +        }
>> +
>>          ioc = qio_channel_file_new_dupfd(fd, errp);
>>      } else {
>>          ioc = qio_channel_file_new_path(outgoing_args.fname, flags, 0, errp);
>> -- 
>> 2.35.3
>> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/2] migration/multifd: Ensure we're not given a socket for file migration
  2024-03-14 15:43     ` Peter Xu
@ 2024-03-14 16:50       ` Fabiano Rosas
  2024-03-14 17:58         ` Peter Xu
  0 siblings, 1 reply; 14+ messages in thread
From: Fabiano Rosas @ 2024-03-14 16:50 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, Daniel P . Berrangé

Peter Xu <peterx@redhat.com> writes:

> On Thu, Mar 14, 2024 at 11:10:12AM -0400, Peter Xu wrote:
>> On Wed, Mar 13, 2024 at 06:28:24PM -0300, Fabiano Rosas wrote:
>> > When doing migration using the fd: URI, the incoming migration starts
>> > before the user has passed the file descriptor to QEMU. This means
>> > that the checks at migration_channels_and_transport_compatible()
>> > happen too soon and we need to allow a migration channel of type
>> > SOCKET_ADDRESS_TYPE_FD even though socket migration is not supported
>> > with multifd.
>> 
>> Hmm, bare with me if this is a stupid one.. why the incoming migration can
>> start _before_ the user passed in the fd?
>> 
>> IOW, why can't we rely on a single fd_is_socket() check for
>> SOCKET_ADDRESS_TYPE_FD in transport_supports_multi_channels()?
>> 
>> > 
>> > The commit decdc76772 ("migration/multifd: Add mapped-ram support to
>> > fd: URI") was supposed to add a second check prior to starting
>> > migration to make sure a socket fd is not passed instead of a file fd,
>> > but failed to do so.
>> > 
>> > Add the missing verification.
>> > 
>> > Fixes: decdc76772 ("migration/multifd: Add mapped-ram support to fd: URI")
>> > Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> > ---
>> >  migration/fd.c   | 8 ++++++++
>> >  migration/file.c | 7 +++++++
>> >  2 files changed, 15 insertions(+)
>> > 
>> > diff --git a/migration/fd.c b/migration/fd.c
>> > index 39a52e5c90..c07030f715 100644
>> > --- a/migration/fd.c
>> > +++ b/migration/fd.c
>> > @@ -22,6 +22,7 @@
>> >  #include "migration.h"
>> >  #include "monitor/monitor.h"
>> >  #include "io/channel-file.h"
>> > +#include "io/channel-socket.h"
>> >  #include "io/channel-util.h"
>> >  #include "options.h"
>> >  #include "trace.h"
>> > @@ -95,6 +96,13 @@ void fd_start_incoming_migration(const char *fdname, Error **errp)
>> >      }
>> >  
>> >      if (migrate_multifd()) {
>> > +        if (fd_is_socket(fd)) {
>> > +            error_setg(errp,
>> > +                       "Multifd migration to a socket FD is not supported");
>> > +            object_unref(ioc);
>> > +            return;
>> > +        }
>
> And... I just noticed this is forbiding multifd+socket+fd in general?  But
> isn't that the majority of multifd usage when with libvirt over sockets?

I didn't think multifd supported socket fds, does it? I don't see code
to create the multiple channels anywhere. How would that work? Multiple
threads writing to a single socket fd? I'm a bit confused.

>
> Shouldn't it about fd's seekable-or-not instead when mapped-ram enabled
> (IOW, migration_needs_seekable_channel() only)?

Yes, that could be a validation to be done if we actually get the fd at
the right moment.

>
>> > +
>> >          file_create_incoming_channels(ioc, errp);
>> >      } else {
>> >          qio_channel_set_name(ioc, "migration-fd-incoming");
>> > diff --git a/migration/file.c b/migration/file.c
>> > index ddde0ca818..b6e8ba13f2 100644
>> > --- a/migration/file.c
>> > +++ b/migration/file.c
>> > @@ -15,6 +15,7 @@
>> >  #include "file.h"
>> >  #include "migration.h"
>> >  #include "io/channel-file.h"
>> > +#include "io/channel-socket.h"
>> >  #include "io/channel-util.h"
>> >  #include "options.h"
>> >  #include "trace.h"
>> > @@ -58,6 +59,12 @@ bool file_send_channel_create(gpointer opaque, Error **errp)
>> >      int fd = fd_args_get_fd();
>> >  
>> >      if (fd && fd != -1) {
>> > +        if (fd_is_socket(fd)) {
>> > +            error_setg(errp,
>> > +                       "Multifd migration to a socket FD is not supported");
>> > +            goto out;
>> > +        }
>> > +
>> >          ioc = qio_channel_file_new_dupfd(fd, errp);
>> >      } else {
>> >          ioc = qio_channel_file_new_path(outgoing_args.fname, flags, 0, errp);
>> > -- 
>> > 2.35.3
>> > 
>> 
>> -- 
>> Peter Xu


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/2] migration mapped-ram fixes
  2024-03-14 15:22 ` [PATCH v2 0/2] migration mapped-ram fixes Peter Xu
@ 2024-03-14 16:55   ` Fabiano Rosas
  2024-03-14 17:35     ` Peter Xu
  0 siblings, 1 reply; 14+ messages in thread
From: Fabiano Rosas @ 2024-03-14 16:55 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, Daniel P . Berrangé

Peter Xu <peterx@redhat.com> writes:

> On Wed, Mar 13, 2024 at 06:28:22PM -0300, Fabiano Rosas wrote:
>> Hi,
>> 
>> In this v2:
>> 
>> patch 1 - The fix for the ioc leaks, now including the main channel
>> 
>> patch 2 - A fix for an fd: migration case I thought I had written code
>>           for, but obviously didn't.
>
> Maybe I found one more issue.. I'm looking at fd_start_outgoing_migration().
>
>     ioc = qio_channel_new_fd(fd, errp);  <----- here the fd is consumed and
>                                                 then owned by the IOC
>     if (!ioc) {
>         close(fd);
>         return;
>     }
>
>     outgoing_args.fd = fd;               <----- here we use the fd again,
>                                                 and "owned" by outgoing_args
>                                                 even if it shouldn't?
>
> The problem is outgoing_args.fd will be cleaned up with a close().  I had a
> feeling that it's possible it will close() something else if the fd reused
> before that close() but after the IOC's.  We may want yet another dup() for
> outgoing_args.fd?

I think the right fix is to not close() it at
fd_cleanup_outgoing_migration(). That fd is already owned by the ioc.

>
> If you agree, we may also want to avoid doing:
>
>     outgoing_args.fd = -1;

We will always need this. This is just initialization of the field
because 0 is a valid fd value. Otherwise the file.c code can't know if
we're actually using an fd at all.

@file_send_channel_create:

int fd = fd_args_get_fd();

if (fd && fd != -1) {
    <new IOC from fd>
} else {
    <new IOC from file name>
}

>
> We could assert it instead making sure no fd leak.
>
>> 
>> Thank you for your patience.
>> 
>> based-on: https://gitlab.com/peterx/qemu/-/commits/migration-stable
>> CI run: https://gitlab.com/farosas/qemu/-/pipelines/1212483701
>> 
>> Fabiano Rosas (2):
>>   migration: Fix iocs leaks during file and fd migration
>>   migration/multifd: Ensure we're not given a socket for file migration
>> 
>>  migration/fd.c   | 35 +++++++++++---------------
>>  migration/file.c | 65 ++++++++++++++++++++++++++++++++----------------
>>  migration/file.h |  1 +
>>  3 files changed, 60 insertions(+), 41 deletions(-)
>> 
>> -- 
>> 2.35.3
>> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/2] migration mapped-ram fixes
  2024-03-14 16:55   ` Fabiano Rosas
@ 2024-03-14 17:35     ` Peter Xu
  0 siblings, 0 replies; 14+ messages in thread
From: Peter Xu @ 2024-03-14 17:35 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, Daniel P . Berrangé

On Thu, Mar 14, 2024 at 01:55:31PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Wed, Mar 13, 2024 at 06:28:22PM -0300, Fabiano Rosas wrote:
> >> Hi,
> >> 
> >> In this v2:
> >> 
> >> patch 1 - The fix for the ioc leaks, now including the main channel
> >> 
> >> patch 2 - A fix for an fd: migration case I thought I had written code
> >>           for, but obviously didn't.
> >
> > Maybe I found one more issue.. I'm looking at fd_start_outgoing_migration().
> >
> >     ioc = qio_channel_new_fd(fd, errp);  <----- here the fd is consumed and
> >                                                 then owned by the IOC
> >     if (!ioc) {
> >         close(fd);
> >         return;
> >     }
> >
> >     outgoing_args.fd = fd;               <----- here we use the fd again,
> >                                                 and "owned" by outgoing_args
> >                                                 even if it shouldn't?
> >
> > The problem is outgoing_args.fd will be cleaned up with a close().  I had a
> > feeling that it's possible it will close() something else if the fd reused
> > before that close() but after the IOC's.  We may want yet another dup() for
> > outgoing_args.fd?
> 
> I think the right fix is to not close() it at
> fd_cleanup_outgoing_migration(). That fd is already owned by the ioc.

But outgoing_args.fd can point to other things if the IOC (along with the
ioc->fd) is released.  Keeping outgoing_args.fd pointing to that fd index
should be dangerous because the integer can be reused.

> 
> >
> > If you agree, we may also want to avoid doing:
> >
> >     outgoing_args.fd = -1;
> 
> We will always need this. This is just initialization of the field
> because 0 is a valid fd value. Otherwise the file.c code can't know if
> we're actually using an fd at all.

I meant avoid setting it to -1 only in fd_start_outgoing_migration().
Using -1 to represent "no fd" is fine.

> 
> @file_send_channel_create:
> 
> int fd = fd_args_get_fd();
> 
> if (fd && fd != -1) {
>     <new IOC from fd>
> } else {
>     <new IOC from file name>
> }
> 
> >
> > We could assert it instead making sure no fd leak.
> >
> >> 
> >> Thank you for your patience.
> >> 
> >> based-on: https://gitlab.com/peterx/qemu/-/commits/migration-stable
> >> CI run: https://gitlab.com/farosas/qemu/-/pipelines/1212483701
> >> 
> >> Fabiano Rosas (2):
> >>   migration: Fix iocs leaks during file and fd migration
> >>   migration/multifd: Ensure we're not given a socket for file migration
> >> 
> >>  migration/fd.c   | 35 +++++++++++---------------
> >>  migration/file.c | 65 ++++++++++++++++++++++++++++++++----------------
> >>  migration/file.h |  1 +
> >>  3 files changed, 60 insertions(+), 41 deletions(-)
> >> 
> >> -- 
> >> 2.35.3
> >> 
> 

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/2] migration/multifd: Ensure we're not given a socket for file migration
  2024-03-14 16:44     ` Fabiano Rosas
@ 2024-03-14 17:53       ` Peter Xu
  0 siblings, 0 replies; 14+ messages in thread
From: Peter Xu @ 2024-03-14 17:53 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, Daniel P . Berrangé

On Thu, Mar 14, 2024 at 01:44:13PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Wed, Mar 13, 2024 at 06:28:24PM -0300, Fabiano Rosas wrote:
> >> When doing migration using the fd: URI, the incoming migration starts
> >> before the user has passed the file descriptor to QEMU. This means
> >> that the checks at migration_channels_and_transport_compatible()
> >> happen too soon and we need to allow a migration channel of type
> >> SOCKET_ADDRESS_TYPE_FD even though socket migration is not supported
> >> with multifd.
> >
> > Hmm, bare with me if this is a stupid one.. why the incoming migration can
> > start _before_ the user passed in the fd?
> 
> It's been a while since I looked at this. Looking into it once more
> today, I think the issue is actually that we only fetch the fds from the
> monitor at fd_start_outgoing|incoming_migration().

Yes that looks more reasonable.

It means we may want to touch up transport_supports_seeking()'s comment on
this if needed.

> 
> >
> > IOW, why can't we rely on a single fd_is_socket() check for
> > SOCKET_ADDRESS_TYPE_FD in transport_supports_multi_channels()?
> >
> 
> There's no fd at that point. Just a string.
> 
> I think the right fix here would be to move the
> monitor_fd_get/monitor_fd_param (why two different functions?)

Yes this is all confusing.

I think it makes sense to use the same guideline for both sides on the fd:
protocol, perhaps it means we should always use monitor_fd_param() to be
consistent and compatible.

> earlier into migrate_uri_parse. And possibly also extend
> FileMigrationArgs to contain an fd. Not sure how easy would that be.

But if so IIUC the 'filename' parameter will need to be optional if "fd"
exists.  While that will break QAPI for 8.2.

I think it's fine we keep what we do right now (with the above comment
fixed on why that check needs to be delayed, though), or if you find it
easy to unify the check in some undestructive way.

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/2] migration/multifd: Ensure we're not given a socket for file migration
  2024-03-14 16:50       ` Fabiano Rosas
@ 2024-03-14 17:58         ` Peter Xu
  0 siblings, 0 replies; 14+ messages in thread
From: Peter Xu @ 2024-03-14 17:58 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, Daniel P . Berrangé

On Thu, Mar 14, 2024 at 01:50:07PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Thu, Mar 14, 2024 at 11:10:12AM -0400, Peter Xu wrote:
> >> On Wed, Mar 13, 2024 at 06:28:24PM -0300, Fabiano Rosas wrote:
> >> > When doing migration using the fd: URI, the incoming migration starts
> >> > before the user has passed the file descriptor to QEMU. This means
> >> > that the checks at migration_channels_and_transport_compatible()
> >> > happen too soon and we need to allow a migration channel of type
> >> > SOCKET_ADDRESS_TYPE_FD even though socket migration is not supported
> >> > with multifd.
> >> 
> >> Hmm, bare with me if this is a stupid one.. why the incoming migration can
> >> start _before_ the user passed in the fd?
> >> 
> >> IOW, why can't we rely on a single fd_is_socket() check for
> >> SOCKET_ADDRESS_TYPE_FD in transport_supports_multi_channels()?
> >> 
> >> > 
> >> > The commit decdc76772 ("migration/multifd: Add mapped-ram support to
> >> > fd: URI") was supposed to add a second check prior to starting
> >> > migration to make sure a socket fd is not passed instead of a file fd,
> >> > but failed to do so.
> >> > 
> >> > Add the missing verification.
> >> > 
> >> > Fixes: decdc76772 ("migration/multifd: Add mapped-ram support to fd: URI")
> >> > Signed-off-by: Fabiano Rosas <farosas@suse.de>
> >> > ---
> >> >  migration/fd.c   | 8 ++++++++
> >> >  migration/file.c | 7 +++++++
> >> >  2 files changed, 15 insertions(+)
> >> > 
> >> > diff --git a/migration/fd.c b/migration/fd.c
> >> > index 39a52e5c90..c07030f715 100644
> >> > --- a/migration/fd.c
> >> > +++ b/migration/fd.c
> >> > @@ -22,6 +22,7 @@
> >> >  #include "migration.h"
> >> >  #include "monitor/monitor.h"
> >> >  #include "io/channel-file.h"
> >> > +#include "io/channel-socket.h"
> >> >  #include "io/channel-util.h"
> >> >  #include "options.h"
> >> >  #include "trace.h"
> >> > @@ -95,6 +96,13 @@ void fd_start_incoming_migration(const char *fdname, Error **errp)
> >> >      }
> >> >  
> >> >      if (migrate_multifd()) {
> >> > +        if (fd_is_socket(fd)) {
> >> > +            error_setg(errp,
> >> > +                       "Multifd migration to a socket FD is not supported");
> >> > +            object_unref(ioc);
> >> > +            return;
> >> > +        }
> >
> > And... I just noticed this is forbiding multifd+socket+fd in general?  But
> > isn't that the majority of multifd usage when with libvirt over sockets?
> 
> I didn't think multifd supported socket fds, does it? I don't see code
> to create the multiple channels anywhere. How would that work? Multiple
> threads writing to a single socket fd? I'm a bit confused.

You're probably right.

I somehow had the assumption that Libvirt always used fds to passover to
QEMU for migration, but indeed multifd at least shouldn't support it as I
read the code again..  It'll be good if Dan would help to clarify when fd
will be used in migrations.

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 0/2] migration mapped-ram fixes
  2024-03-13 21:28 [PATCH v2 0/2] migration mapped-ram fixes Fabiano Rosas
                   ` (2 preceding siblings ...)
  2024-03-14 15:22 ` [PATCH v2 0/2] migration mapped-ram fixes Peter Xu
@ 2024-03-14 18:00 ` Peter Xu
  3 siblings, 0 replies; 14+ messages in thread
From: Peter Xu @ 2024-03-14 18:00 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, Daniel P . Berrangé

On Wed, Mar 13, 2024 at 06:28:22PM -0300, Fabiano Rosas wrote:
> Hi,
> 
> In this v2:
> 
> patch 1 - The fix for the ioc leaks, now including the main channel
> 
> patch 2 - A fix for an fd: migration case I thought I had written code
>           for, but obviously didn't.

The two issues are separate.  I assume patch 2 will need a rework, but I
queued patch 1 first.  Thanks.

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2024-03-14 18:01 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-03-13 21:28 [PATCH v2 0/2] migration mapped-ram fixes Fabiano Rosas
2024-03-13 21:28 ` [PATCH v2 1/2] migration: Fix iocs leaks during file and fd migration Fabiano Rosas
2024-03-14 15:10   ` Peter Xu
2024-03-13 21:28 ` [PATCH v2 2/2] migration/multifd: Ensure we're not given a socket for file migration Fabiano Rosas
2024-03-14 15:10   ` Peter Xu
2024-03-14 15:43     ` Peter Xu
2024-03-14 16:50       ` Fabiano Rosas
2024-03-14 17:58         ` Peter Xu
2024-03-14 16:44     ` Fabiano Rosas
2024-03-14 17:53       ` Peter Xu
2024-03-14 15:22 ` [PATCH v2 0/2] migration mapped-ram fixes Peter Xu
2024-03-14 16:55   ` Fabiano Rosas
2024-03-14 17:35     ` Peter Xu
2024-03-14 18:00 ` Peter Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).