All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] Multifd fixes
@ 2024-08-01 17:40 Fabiano Rosas
  2024-08-01 17:41 ` [PATCH 1/2] migration: Fix cleanup of iochannel in file migration Fabiano Rosas
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Fabiano Rosas @ 2024-08-01 17:40 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu

Hi, a couple of multifd fixes to issues that Jim spotted while working
on mapped-ram for libvirt.

CI run: https://gitlab.com/farosas/qemu/-/pipelines/1397467740

Fabiano Rosas (2):
  migration: Fix cleanup of iochannel in file migration
  migration/multifd: Fix multifd_send_setup cleanup when channel
    creation fails

 migration/file.c    |  2 --
 migration/multifd.c | 26 +++++++++++++++-----------
 2 files changed, 15 insertions(+), 13 deletions(-)

-- 
2.35.3



^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/2] migration: Fix cleanup of iochannel in file migration
  2024-08-01 17:40 [PATCH 0/2] Multifd fixes Fabiano Rosas
@ 2024-08-01 17:41 ` Fabiano Rosas
  2024-08-01 18:39   ` Peter Xu
  2024-08-01 17:41 ` [PATCH 2/2] migration/multifd: Fix multifd_send_setup cleanup when channel creation fails Fabiano Rosas
  2024-08-02 12:47 ` [PATCH 0/2] Multifd fixes Fabiano Rosas
  2 siblings, 1 reply; 7+ messages in thread
From: Fabiano Rosas @ 2024-08-01 17:41 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, Jim Fehlig

The QIOChannelFile object already has its reference decremented by
g_autoptr. Trying to unref an extra time causes:

ERROR:../qom/object.c:1241:object_unref: assertion failed: (obj->ref > 0)

Fixes: a701c03dec ("migration: Drop reference to QIOChannel if file seeking fails")
Fixes: 6d3279655a ("migration: Fix file migration with fdset")
Reported-by: Jim Fehlig <jfehlig@suse.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/file.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/migration/file.c b/migration/file.c
index db870f2cf0..6451a21c86 100644
--- a/migration/file.c
+++ b/migration/file.c
@@ -112,7 +112,6 @@ void file_start_outgoing_migration(MigrationState *s,
         error_setg_errno(errp, errno,
                          "failed to truncate migration file to offset %" PRIx64,
                          offset);
-        object_unref(OBJECT(fioc));
         return;
     }
 
@@ -120,7 +119,6 @@ void file_start_outgoing_migration(MigrationState *s,
 
     ioc = QIO_CHANNEL(fioc);
     if (offset && qio_channel_io_seek(ioc, offset, SEEK_SET, errp) < 0) {
-        object_unref(OBJECT(fioc));
         return;
     }
     qio_channel_set_name(ioc, "migration-file-outgoing");
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/2] migration/multifd: Fix multifd_send_setup cleanup when channel creation fails
  2024-08-01 17:40 [PATCH 0/2] Multifd fixes Fabiano Rosas
  2024-08-01 17:41 ` [PATCH 1/2] migration: Fix cleanup of iochannel in file migration Fabiano Rosas
@ 2024-08-01 17:41 ` Fabiano Rosas
  2024-08-01 18:38   ` Peter Xu
  2024-08-02 12:47 ` [PATCH 0/2] Multifd fixes Fabiano Rosas
  2 siblings, 1 reply; 7+ messages in thread
From: Fabiano Rosas @ 2024-08-01 17:41 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Xu, qemu-stable, Jim Fehlig

When a channel fails to create, the code currently just returns. This
is wrong for two reasons:

1) Channel n+1 will not get to initialize it's semaphores, leading to
   an assert when terminate_threads tries to post to it:

 qemu-system-x86_64: ../util/qemu-thread-posix.c:92:
 qemu_mutex_lock_impl: Assertion `mutex->initialized' failed.

2) (theoretical) If channel n-1 already started creation it will
   defeat the purpose of the channels_created logic which is in place
   to avoid migrate_fd_cleanup() to run while channels are still being
   created.

   This cannot really happen today because the current failure cases
   for multifd_new_send_channel_create() are all synchronous,
   resulting from qio_channel_file_new_path() getting a bad
   filename. This would hit all channels equally.

   But I don't want to set a trap for future people, so have all
   channels try to create (even if failing), and only fail after the
   channels_created semaphore has been posted.

While here, remove the error_report_err call. There's one already at
migrate_fd_cleanup later on.

Cc: qemu-stable@nongnu.org
Reported-by: Jim Fehlig <jfehlig@suse.com>
Fixes: bd8b0a8f82 ("migration/multifd: Move multifd_send_setup error handling in to the function")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/multifd.c | 26 +++++++++++++++-----------
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index 0b4cbaddfe..552f9723c8 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -1156,7 +1156,6 @@ static bool multifd_new_send_channel_create(gpointer opaque, Error **errp)
 bool multifd_send_setup(void)
 {
     MigrationState *s = migrate_get_current();
-    Error *local_err = NULL;
     int thread_count, ret = 0;
     uint32_t page_count = MULTIFD_PACKET_SIZE / qemu_target_page_size();
     bool use_packets = multifd_use_packets();
@@ -1177,6 +1176,7 @@ bool multifd_send_setup(void)
 
     for (i = 0; i < thread_count; i++) {
         MultiFDSendParams *p = &multifd_send_state->params[i];
+        Error *local_err = NULL;
 
         qemu_sem_init(&p->sem, 0);
         qemu_sem_init(&p->sem_sync, 0);
@@ -1196,7 +1196,8 @@ bool multifd_send_setup(void)
         p->write_flags = 0;
 
         if (!multifd_new_send_channel_create(p, &local_err)) {
-            return false;
+            migrate_set_error(s, local_err);
+            ret = -1;
         }
     }
 
@@ -1209,24 +1210,27 @@ bool multifd_send_setup(void)
         qemu_sem_wait(&multifd_send_state->channels_created);
     }
 
+    if (ret) {
+        goto err;
+    }
+
     for (i = 0; i < thread_count; i++) {
         MultiFDSendParams *p = &multifd_send_state->params[i];
+        Error *local_err = NULL;
 
         ret = multifd_send_state->ops->send_setup(p, &local_err);
         if (ret) {
-            break;
+            migrate_set_error(s, local_err);
+            goto err;
         }
     }
 
-    if (ret) {
-        migrate_set_error(s, local_err);
-        error_report_err(local_err);
-        migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
-                          MIGRATION_STATUS_FAILED);
-        return false;
-    }
-
     return true;
+
+err:
+    migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
+                      MIGRATION_STATUS_FAILED);
+    return false;
 }
 
 bool multifd_recv(void)
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] migration/multifd: Fix multifd_send_setup cleanup when channel creation fails
  2024-08-01 17:41 ` [PATCH 2/2] migration/multifd: Fix multifd_send_setup cleanup when channel creation fails Fabiano Rosas
@ 2024-08-01 18:38   ` Peter Xu
  2024-08-01 19:14     ` Fabiano Rosas
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Xu @ 2024-08-01 18:38 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, qemu-stable, Jim Fehlig

On Thu, Aug 01, 2024 at 02:41:01PM -0300, Fabiano Rosas wrote:
> When a channel fails to create, the code currently just returns. This
> is wrong for two reasons:
> 
> 1) Channel n+1 will not get to initialize it's semaphores, leading to
>    an assert when terminate_threads tries to post to it:
> 
>  qemu-system-x86_64: ../util/qemu-thread-posix.c:92:
>  qemu_mutex_lock_impl: Assertion `mutex->initialized' failed.
> 
> 2) (theoretical) If channel n-1 already started creation it will
>    defeat the purpose of the channels_created logic which is in place
>    to avoid migrate_fd_cleanup() to run while channels are still being
>    created.
> 
>    This cannot really happen today because the current failure cases
>    for multifd_new_send_channel_create() are all synchronous,
>    resulting from qio_channel_file_new_path() getting a bad
>    filename. This would hit all channels equally.
> 
>    But I don't want to set a trap for future people, so have all
>    channels try to create (even if failing), and only fail after the
>    channels_created semaphore has been posted.
> 
> While here, remove the error_report_err call. There's one already at
> migrate_fd_cleanup later on.
> 
> Cc: qemu-stable@nongnu.org
> Reported-by: Jim Fehlig <jfehlig@suse.com>
> Fixes: bd8b0a8f82 ("migration/multifd: Move multifd_send_setup error handling in to the function")

Should it be this one instead?

b7b03eb614 ("migration/multifd: Add outgoing QIOChannelFile support")

> Signed-off-by: Fabiano Rosas <farosas@suse.de>

Reviewed-by: Peter Xu <peterx@redhat.com>

PS: what's your plan on your other multifd SendData series?  I got a bit
overloaded on downstream stuff and I still have plenty review debts
recently (CPR one of them.. needs follow ups), so just to say I may delay a
bit on reading that one.  I assume it's next-release stuff anyway, but let
me know otherwise.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] migration: Fix cleanup of iochannel in file migration
  2024-08-01 17:41 ` [PATCH 1/2] migration: Fix cleanup of iochannel in file migration Fabiano Rosas
@ 2024-08-01 18:39   ` Peter Xu
  0 siblings, 0 replies; 7+ messages in thread
From: Peter Xu @ 2024-08-01 18:39 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, Jim Fehlig

On Thu, Aug 01, 2024 at 02:41:00PM -0300, Fabiano Rosas wrote:
> The QIOChannelFile object already has its reference decremented by
> g_autoptr. Trying to unref an extra time causes:
> 
> ERROR:../qom/object.c:1241:object_unref: assertion failed: (obj->ref > 0)
> 
> Fixes: a701c03dec ("migration: Drop reference to QIOChannel if file seeking fails")
> Fixes: 6d3279655a ("migration: Fix file migration with fdset")
> Reported-by: Jim Fehlig <jfehlig@suse.com>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>

Ouch..

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] migration/multifd: Fix multifd_send_setup cleanup when channel creation fails
  2024-08-01 18:38   ` Peter Xu
@ 2024-08-01 19:14     ` Fabiano Rosas
  0 siblings, 0 replies; 7+ messages in thread
From: Fabiano Rosas @ 2024-08-01 19:14 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, qemu-stable, Jim Fehlig

Peter Xu <peterx@redhat.com> writes:

> On Thu, Aug 01, 2024 at 02:41:01PM -0300, Fabiano Rosas wrote:
>> When a channel fails to create, the code currently just returns. This
>> is wrong for two reasons:
>> 
>> 1) Channel n+1 will not get to initialize it's semaphores, leading to
>>    an assert when terminate_threads tries to post to it:
>> 
>>  qemu-system-x86_64: ../util/qemu-thread-posix.c:92:
>>  qemu_mutex_lock_impl: Assertion `mutex->initialized' failed.
>> 
>> 2) (theoretical) If channel n-1 already started creation it will
>>    defeat the purpose of the channels_created logic which is in place
>>    to avoid migrate_fd_cleanup() to run while channels are still being
>>    created.
>> 
>>    This cannot really happen today because the current failure cases
>>    for multifd_new_send_channel_create() are all synchronous,
>>    resulting from qio_channel_file_new_path() getting a bad
>>    filename. This would hit all channels equally.
>> 
>>    But I don't want to set a trap for future people, so have all
>>    channels try to create (even if failing), and only fail after the
>>    channels_created semaphore has been posted.
>> 
>> While here, remove the error_report_err call. There's one already at
>> migrate_fd_cleanup later on.
>> 
>> Cc: qemu-stable@nongnu.org
>> Reported-by: Jim Fehlig <jfehlig@suse.com>
>> Fixes: bd8b0a8f82 ("migration/multifd: Move multifd_send_setup error handling in to the function")
>
> Should it be this one instead?
>
> b7b03eb614 ("migration/multifd: Add outgoing QIOChannelFile support")

Yep, thanks. I'll fix it up.

>
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>
> Reviewed-by: Peter Xu <peterx@redhat.com>
>
> PS: what's your plan on your other multifd SendData series?  I got a bit
> overloaded on downstream stuff and I still have plenty review debts
> recently (CPR one of them.. needs follow ups), so just to say I may delay a
> bit on reading that one.  I assume it's next-release stuff anyway, but let
> me know otherwise.

That one is pretty ready. From my side I don't intend to change anything
else, save for review comments. And it's definitely 9.2 material.

I think CPR is more important at this point because it's been lagging
behind for a while.

I have a PR to send with these fixes and catch up on that virtio-net
discussion. After that I should be able to get some reviews done.

>
> Thanks,


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/2] Multifd fixes
  2024-08-01 17:40 [PATCH 0/2] Multifd fixes Fabiano Rosas
  2024-08-01 17:41 ` [PATCH 1/2] migration: Fix cleanup of iochannel in file migration Fabiano Rosas
  2024-08-01 17:41 ` [PATCH 2/2] migration/multifd: Fix multifd_send_setup cleanup when channel creation fails Fabiano Rosas
@ 2024-08-02 12:47 ` Fabiano Rosas
  2 siblings, 0 replies; 7+ messages in thread
From: Fabiano Rosas @ 2024-08-02 12:47 UTC (permalink / raw)
  To: qemu-devel, Fabiano Rosas; +Cc: Peter Xu

On Thu, 01 Aug 2024 14:40:59 -0300, Fabiano Rosas wrote:
> on mapped-ram for libvirt.
> 
> CI run: https://gitlab.com/farosas/qemu/-/pipelines/1397467740
> 
> Fabiano Rosas (2):
>   migration: Fix cleanup of iochannel in file migration
>   migration/multifd: Fix multifd_send_setup cleanup when channel
>     creation fails
> 
> [...]

Queued, thanks!


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-08-02 12:48 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-01 17:40 [PATCH 0/2] Multifd fixes Fabiano Rosas
2024-08-01 17:41 ` [PATCH 1/2] migration: Fix cleanup of iochannel in file migration Fabiano Rosas
2024-08-01 18:39   ` Peter Xu
2024-08-01 17:41 ` [PATCH 2/2] migration/multifd: Fix multifd_send_setup cleanup when channel creation fails Fabiano Rosas
2024-08-01 18:38   ` Peter Xu
2024-08-01 19:14     ` Fabiano Rosas
2024-08-02 12:47 ` [PATCH 0/2] Multifd fixes Fabiano Rosas

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.