qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Alex Williamson <alex.williamson@redhat.com>
To: Avihai Horon <avihaih@nvidia.com>
Cc: qemu-devel@nongnu.org, "Halil Pasic" <pasic@linux.ibm.com>,
	"Christian Borntraeger" <borntraeger@linux.ibm.com>,
	"Eric Farman" <farman@linux.ibm.com>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"David Hildenbrand" <david@redhat.com>,
	"Ilya Leoshkevich" <iii@linux.ibm.com>,
	"Thomas Huth" <thuth@redhat.com>,
	"Juan Quintela" <quintela@redhat.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	"Cornelia Huck" <cohuck@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Fam Zheng" <fam@euphon.net>, "Eric Blake" <eblake@redhat.com>,
	"Vladimir Sementsov-Ogievskiy" <vsementsov@yandex-team.ru>,
	"John Snow" <jsnow@redhat.com>,
	"Cédric Le Goater" <clg@redhat.com>,
	qemu-s390x@nongnu.org, qemu-block@nongnu.org,
	"Yishai Hadas" <yishaih@nvidia.com>,
	"Jason Gunthorpe" <jgg@nvidia.com>,
	"Maor Gottlieb" <maorg@nvidia.com>,
	"Kirti Wankhede" <kwankhede@nvidia.com>,
	"Tarun Gupta" <targupta@nvidia.com>,
	"Joao Martins" <joao.m.martins@oracle.com>
Subject: Re: [PATCH v9 07/14] vfio/migration: Block multiple devices migration
Date: Wed, 8 Feb 2023 10:22:42 -0700	[thread overview]
Message-ID: <20230208102242.5d028021.alex.williamson@redhat.com> (raw)
In-Reply-To: <238b17d1-17a3-e5d1-2973-4bda83928d6e@nvidia.com>

On Wed, 8 Feb 2023 15:08:15 +0200
Avihai Horon <avihaih@nvidia.com> wrote:

> On 08/02/2023 0:34, Alex Williamson wrote:
> > External email: Use caution opening links or attachments
> >
> >
> > On Mon, 6 Feb 2023 14:31:30 +0200
> > Avihai Horon <avihaih@nvidia.com> wrote:
> >  
> >> Currently VFIO migration doesn't implement some kind of intermediate
> >> quiescent state in which P2P DMAs are quiesced before stopping or
> >> running the device. This can cause problems in multi-device migration
> >> where the devices are doing P2P DMAs, since the devices are not stopped
> >> together at the same time.
> >>
> >> Until such support is added, block migration of multiple devices.
> >>
> >> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> >> ---
> >>   include/hw/vfio/vfio-common.h |  2 ++
> >>   hw/vfio/common.c              | 51 +++++++++++++++++++++++++++++++++++
> >>   hw/vfio/migration.c           |  6 +++++
> >>   3 files changed, 59 insertions(+)
> >>
> >> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> >> index e573f5a9f1..56b1683824 100644
> >> --- a/include/hw/vfio/vfio-common.h
> >> +++ b/include/hw/vfio/vfio-common.h
> >> @@ -218,6 +218,8 @@ typedef QLIST_HEAD(VFIOGroupList, VFIOGroup) VFIOGroupList;
> >>   extern VFIOGroupList vfio_group_list;
> >>
> >>   bool vfio_mig_active(void);
> >> +int vfio_block_multiple_devices_migration(Error **errp);
> >> +void vfio_unblock_multiple_devices_migration(void);
> >>   int64_t vfio_mig_bytes_transferred(void);
> >>
> >>   #ifdef CONFIG_LINUX
> >> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> >> index 3a35f4afad..01db41b735 100644
> >> --- a/hw/vfio/common.c
> >> +++ b/hw/vfio/common.c
> >> @@ -41,6 +41,7 @@
> >>   #include "qapi/error.h"
> >>   #include "migration/migration.h"
> >>   #include "migration/misc.h"
> >> +#include "migration/blocker.h"
> >>   #include "sysemu/tpm.h"
> >>
> >>   VFIOGroupList vfio_group_list =
> >> @@ -337,6 +338,56 @@ bool vfio_mig_active(void)
> >>       return true;
> >>   }
> >>
> >> +Error *multiple_devices_migration_blocker;
> >> +
> >> +static unsigned int vfio_migratable_device_num(void)
> >> +{
> >> +    VFIOGroup *group;
> >> +    VFIODevice *vbasedev;
> >> +    unsigned int device_num = 0;
> >> +
> >> +    QLIST_FOREACH(group, &vfio_group_list, next) {
> >> +        QLIST_FOREACH(vbasedev, &group->device_list, next) {
> >> +            if (vbasedev->migration) {
> >> +                device_num++;
> >> +            }
> >> +        }
> >> +    }
> >> +
> >> +    return device_num;
> >> +}
> >> +
> >> +int vfio_block_multiple_devices_migration(Error **errp)
> >> +{
> >> +    int ret;
> >> +
> >> +    if (vfio_migratable_device_num() != 2) {
> >> +        return 0;
> >> +    }
> >> +
> >> +    error_setg(&multiple_devices_migration_blocker,
> >> +               "Migration is currently not supported with multiple "
> >> +               "VFIO devices");
> >> +    ret = migrate_add_blocker(multiple_devices_migration_blocker, errp);
> >> +    if (ret < 0) {
> >> +        error_free(multiple_devices_migration_blocker);
> >> +        multiple_devices_migration_blocker = NULL;
> >> +    }
> >> +
> >> +    return ret;
> >> +}
> >> +
> >> +void vfio_unblock_multiple_devices_migration(void)
> >> +{
> >> +    if (vfio_migratable_device_num() != 2) {
> >> +        return;
> >> +    }
> >> +
> >> +    migrate_del_blocker(multiple_devices_migration_blocker);
> >> +    error_free(multiple_devices_migration_blocker);
> >> +    multiple_devices_migration_blocker = NULL;
> >> +}  
> > A couple awkward things here.  First I wish we could do something
> > cleaner or more intuitive than the != 2 test.  I get that we're trying
> > to do this on the addition of the 2nd device supporting migration, or
> > the removal of the next to last device independent of all other devices,
> > but I wonder if it wouldn't be better to remove the multiple-device
> > blocker after migration is torn down for the device so we can test
> > device >1 or ==1 in combination with whether
> > multiple_devices_migration_blocker is NULL.
> >
> > Which comes to the second awkwardness, if we fail to add the blocker we
> > free and clear the blocker, but when we tear down the device due to that
> > failure we'll remove the blocker that doesn't exist, free NULL, and
> > clear it again.  Thanks to the glib slist the migration blocker is
> > using, I think that all works, but I'd rather not be dependent on that
> > implementation to avoid a segfault here.  Incorporating a test of
> > multiple_devices_migration_blocker as above would avoid this too.  
> 
> You mean something like this?
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 3a35f4afad..f3e08eff58 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> 
> [...]
> 
> +int vfio_block_multiple_devices_migration(Error **errp)
> +{
> +    int ret;
> +
> +    if (vfio_migratable_device_num() <= 1 ||
> +        multiple_devices_migration_blocker) {
> +        return 0;
> +    }

Nit, I'd reverse the order of the test here and below, otherwise yes,
this is what I was thinking of.

> +
> +    error_setg(&multiple_devices_migration_blocker,
> +               "Migration is currently not supported with multiple "
> +               "VFIO devices");
> +    ret = migrate_add_blocker(multiple_devices_migration_blocker, errp);
> +    if (ret < 0) {
> +        error_free(multiple_devices_migration_blocker);
> +        multiple_devices_migration_blocker = NULL;
> +    }
> +
> +    return ret;
> +}
> +
> +void vfio_unblock_multiple_devices_migration(void)
> +{
> +    if (vfio_migratable_device_num() > 1 ||
> +        !multiple_devices_migration_blocker) {
> +        return;
> +    }
> +
> +    migrate_del_blocker(multiple_devices_migration_blocker);
> +    error_free(multiple_devices_migration_blocker);
> +    multiple_devices_migration_blocker = NULL;
> +}
> +
>   static bool vfio_devices_all_dirty_tracking(VFIOContainer *container)
>   {
>       VFIOGroup *group;
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 552c2313b2..15b446c0ec 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -880,6 +880,11 @@ int vfio_migration_probe(VFIODevice *vbasedev, 
> Error **errp)
>           goto add_blocker;
>       }
> 
> +    ret = vfio_block_multiple_devices_migration(errp);
> +    if (ret) {
> +        return ret;
> +    }
> +
>       trace_vfio_migration_probe(vbasedev->name, info->index);
>       g_free(info);
>       return 0;
> @@ -906,6 +911,7 @@ void vfio_migration_finalize(VFIODevice *vbasedev)
>           qemu_del_vm_change_state_handler(migration->vm_state);
>           unregister_savevm(VMSTATE_IF(vbasedev->dev), "vfio", vbasedev);
>           vfio_migration_exit(vbasedev);
> +        vfio_unblock_multiple_devices_migration();
>       }
> 
>       if (vbasedev->migration_blocker) {
> 
> 
> Maybe also negate the if conditions and put the add/remove blocker code 
> inside it? Is it more readable this way?

I think the previous aligns more with the success oriented flow that
Jason like to promote.  Thanks,

Alex

> E.g.:
> 
> +int vfio_block_multiple_devices_migration(Error **errp)
> +{
> +    int ret = 0;
> +
> +    if (vfio_migratable_device_num() > 1 &&
> +        !multiple_devices_migration_blocker) {
> +        error_setg(&multiple_devices_migration_blocker,
> +                   "Migration is currently not supported with multiple "
> +                   "VFIO devices");
> +        ret = migrate_add_blocker(multiple_devices_migration_blocker, 
> errp);
> +        if (ret < 0) {
> +            error_free(multiple_devices_migration_blocker);
> +            multiple_devices_migration_blocker = NULL;
> +        }
> +    }
> +
> +    return ret;
> +}
> +
> +void vfio_unblock_multiple_devices_migration(void)
> +{
> +    if (vfio_migratable_device_num() <= 1 &&
> +        multiple_devices_migration_blocker) {
> +        migrate_del_blocker(multiple_devices_migration_blocker);
> +        error_free(multiple_devices_migration_blocker);
> +        multiple_devices_migration_blocker = NULL;
> +    }
> +}
> 
> Thanks.
> 



  parent reply	other threads:[~2023-02-08 17:23 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-06 12:31 [PATCH v9 00/14] vfio/migration: Implement VFIO migration protocol v2 Avihai Horon
2023-02-06 12:31 ` [PATCH v9 01/14] linux-headers: Update to v6.2-rc1 Avihai Horon
2023-02-06 12:31 ` [PATCH v9 02/14] migration: No save_live_pending() method uses the QEMUFile parameter Avihai Horon
2023-02-06 12:31 ` [PATCH v9 03/14] vfio/migration: Fix NULL pointer dereference bug Avihai Horon
2023-02-06 12:31 ` [PATCH v9 04/14] vfio/migration: Allow migration without VFIO IOMMU dirty tracking support Avihai Horon
2023-02-06 12:31 ` [PATCH v9 05/14] migration/qemu-file: Add qemu_file_get_to_fd() Avihai Horon
2023-02-06 12:31 ` [PATCH v9 06/14] vfio/common: Change vfio_devices_all_running_and_saving() logic to equivalent one Avihai Horon
2023-02-06 12:31 ` [PATCH v9 07/14] vfio/migration: Block multiple devices migration Avihai Horon
2023-02-07 22:34   ` Alex Williamson
2023-02-08 13:08     ` Avihai Horon
2023-02-08 16:44       ` Cédric Le Goater
2023-02-08 17:16         ` Avihai Horon
2023-02-08 17:22       ` Alex Williamson [this message]
2023-02-08 17:35         ` Avihai Horon
2023-02-06 12:31 ` [PATCH v9 08/14] vfio/migration: Move migration v1 logic to vfio_migration_init() Avihai Horon
2023-02-06 12:31 ` [PATCH v9 09/14] vfio/migration: Rename functions/structs related to v1 protocol Avihai Horon
2023-02-06 12:31 ` [PATCH v9 10/14] vfio/migration: Implement VFIO migration protocol v2 Avihai Horon
2023-02-07 23:42   ` Alex Williamson
2023-02-08 13:15     ` Avihai Horon
2023-02-06 12:31 ` [PATCH v9 11/14] vfio/migration: Optimize vfio_save_pending() Avihai Horon
2023-02-06 12:31 ` [PATCH v9 12/14] vfio/migration: Remove VFIO migration protocol v1 Avihai Horon
2023-02-06 12:31 ` [PATCH v9 13/14] vfio: Alphabetize migration section of VFIO trace-events file Avihai Horon
2023-02-06 12:31 ` [PATCH v9 14/14] docs/devel: Align VFIO migration docs to v2 protocol Avihai Horon
2023-02-07 23:49   ` Alex Williamson
2023-02-08 13:18     ` Avihai Horon
2023-02-08 17:25   ` Cédric Le Goater
2023-02-08 17:40     ` Avihai Horon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230208102242.5d028021.alex.williamson@redhat.com \
    --to=alex.williamson@redhat.com \
    --cc=avihaih@nvidia.com \
    --cc=borntraeger@linux.ibm.com \
    --cc=clg@redhat.com \
    --cc=cohuck@redhat.com \
    --cc=david@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=eblake@redhat.com \
    --cc=fam@euphon.net \
    --cc=farman@linux.ibm.com \
    --cc=iii@linux.ibm.com \
    --cc=jgg@nvidia.com \
    --cc=joao.m.martins@oracle.com \
    --cc=jsnow@redhat.com \
    --cc=kwankhede@nvidia.com \
    --cc=maorg@nvidia.com \
    --cc=mst@redhat.com \
    --cc=pasic@linux.ibm.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-s390x@nongnu.org \
    --cc=quintela@redhat.com \
    --cc=richard.henderson@linaro.org \
    --cc=stefanha@redhat.com \
    --cc=targupta@nvidia.com \
    --cc=thuth@redhat.com \
    --cc=vsementsov@yandex-team.ru \
    --cc=yishaih@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).