From: Alex Williamson <alex.williamson@redhat.com>
To: Kirti Wankhede <kwankhede@nvidia.com>
Cc: cohuck@redhat.com, cjia@nvidia.com, aik@ozlabs.ru,
	Zhengxiao.zx@Alibaba-inc.com, shuangtai.tst@alibaba-inc.com,
	qemu-devel@nongnu.org, peterx@redhat.com, eauger@redhat.com,
	yi.l.liu@intel.com, quintela@redhat.com, ziye.yang@intel.com,
	armbru@redhat.com, mlevitsk@redhat.com, pasic@linux.ibm.com,
	felipe@nutanix.com, zhi.a.wang@intel.com, kevin.tian@intel.com,
	yan.y.zhao@intel.com, dgilbert@redhat.com,
	changpeng.liu@intel.com, eskultet@redhat.com, Ken.Xue@amd.com,
	jonathan.davies@nutanix.com, pbonzini@redhat.com
Subject: Re: [PATCH QEMU v25 07/17] vfio: Register SaveVMHandlers for VFIO device
Date: Tue, 23 Jun 2020 13:50:56 -0600
Message-ID: <20200623135056.0c4957a3@x1.home>
In-Reply-To: <542ced5e-0380-19b9-91d2-5f40c5857719@nvidia.com>

On Wed, 24 Jun 2020 00:51:06 +0530
Kirti Wankhede <kwankhede@nvidia.com> wrote:

> On 6/23/2020 4:20 AM, Alex Williamson wrote:
> > On Sun, 21 Jun 2020 01:51:16 +0530
> > Kirti Wankhede <kwankhede@nvidia.com> wrote:
> >   
> >> Define flags to be used as delimiter in migration file stream.
> >> Added .save_setup and .save_cleanup functions. Mapped & unmapped migration
> >> region from these functions at source during saving or pre-copy phase.
> >> Set VFIO device state depending on VM's state. During live migration, VM is
> >> running when .save_setup is called, _SAVING | _RUNNING state is set for VFIO
> >> device. During save-restore, VM is paused, _SAVING state is set for VFIO device.
> >>
> >> Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
> >> Reviewed-by: Neo Jia <cjia@nvidia.com>
> >> ---
> >>   hw/vfio/migration.c  | 92 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>   hw/vfio/trace-events |  2 ++
> >>   2 files changed, 94 insertions(+)
> >>
> >> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> >> index e30bd8768701..133bb5b1b3b2 100644
> >> --- a/hw/vfio/migration.c
> >> +++ b/hw/vfio/migration.c
> >> @@ -8,12 +8,15 @@
> >>    */
> >>   
> >>   #include "qemu/osdep.h"
> >> +#include "qemu/main-loop.h"
> >> +#include "qemu/cutils.h"
> >>   #include <linux/vfio.h>
> >>   
> >>   #include "sysemu/runstate.h"
> >>   #include "hw/vfio/vfio-common.h"
> >>   #include "cpu.h"
> >>   #include "migration/migration.h"
> >> +#include "migration/vmstate.h"
> >>   #include "migration/qemu-file.h"
> >>   #include "migration/register.h"
> >>   #include "migration/blocker.h"
> >> @@ -24,6 +27,17 @@
> >>   #include "pci.h"
> >>   #include "trace.h"
> >>   
> >> +/*
> >> + * Flags used as delimiter:
> >> + * 0xffffffff => MSB 32-bit all 1s
> >> + * 0xef10     => emulated (virtual) function IO
> >> + * 0x0000     => 16-bits reserved for flags
> >> + */
> >> +#define VFIO_MIG_FLAG_END_OF_STATE      (0xffffffffef100001ULL)
> >> +#define VFIO_MIG_FLAG_DEV_CONFIG_STATE  (0xffffffffef100002ULL)
> >> +#define VFIO_MIG_FLAG_DEV_SETUP_STATE   (0xffffffffef100003ULL)
> >> +#define VFIO_MIG_FLAG_DEV_DATA_STATE    (0xffffffffef100004ULL)
> >> +
> >>   static void vfio_migration_region_exit(VFIODevice *vbasedev)
> >>   {
> >>       VFIOMigration *migration = vbasedev->migration;
> >> @@ -126,6 +140,65 @@ static int vfio_migration_set_state(VFIODevice *vbasedev, uint32_t mask,
> >>       return 0;
> >>   }
> >>   
> >> +/* ---------------------------------------------------------------------- */
> >> +
> >> +static int vfio_save_setup(QEMUFile *f, void *opaque)
> >> +{
> >> +    VFIODevice *vbasedev = opaque;
> >> +    VFIOMigration *migration = vbasedev->migration;
> >> +    int ret;
> >> +
> >> +    trace_vfio_save_setup(vbasedev->name);
> >> +
> >> +    qemu_put_be64(f, VFIO_MIG_FLAG_DEV_SETUP_STATE);
> >> +
> >> +    if (migration->region.mmaps) {
> >> +        qemu_mutex_lock_iothread();
> >> +        ret = vfio_region_mmap(&migration->region);
> >> +        qemu_mutex_unlock_iothread();
> >> +        if (ret) {
> >> +            error_report("%s: Failed to mmap VFIO migration region %d: %s",
> >> +                         vbasedev->name, migration->region.nr,
> >> +                         strerror(-ret));
> >> +            return ret;  
> > 
> > OTOH to my previous comments, this shouldn't be fatal, right?  mmaps
> > are optional anyway so it should be sufficient to push an error report
> > to explain why this might be slower than normal, but we can still
> > proceed.
> >   
> 
> Right, defining region to be sparse mmap is optional.
> migration->region.mmaps is set if vendor driver defines sparse mmapable 
> regions and VFIO_REGION_INFO_FLAG_MMAP flag is set. If this flag is set 
> then error on mmap() should be fatal.
> 
> If there is no mmapable region, then migration will proceed.

It's both optional for the vendor to define sparse mmap support (or any
mmap support) and optional for the user to make use of it.  The user
can recover from an mmap failure by using read/write accesses.  The
vendor MUST support this.  It doesn't make sense to worry about
aborting the VM in replying to comments for 05/17, where it's not clear
how we proceed, yet intentionally cause a fatal error here when there
is a very clear path to proceed.
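
A minimal sketch of the non-fatal path described above, assuming a
simplified region backed by a plain file descriptor.  The demo_region
struct and both helpers are hypothetical stand-ins, not the QEMU
VFIORegion API: a failed mmap only produces a warning, and reads fall
back to pread() on the same fd.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for a migration region; not the QEMU VFIORegion. */
struct demo_region {
    int fd;        /* backing file descriptor */
    size_t size;   /* region size in bytes */
    void *map;     /* NULL when the region is not mmapped */
};

/*
 * Try to mmap the region.  On failure, report why accesses may be slower
 * and leave map == NULL so callers use read/write instead.  Never fatal.
 */
static void demo_region_try_mmap(struct demo_region *r)
{
    void *p;

    assert(r != NULL);
    p = mmap(NULL, r->size, PROT_READ | PROT_WRITE, MAP_SHARED, r->fd, 0);
    if (p == MAP_FAILED) {
        fprintf(stderr, "mmap failed (%s), falling back to read/write\n",
                strerror(errno));
        r->map = NULL;
        return;
    }
    r->map = p;
}

/*
 * Read len bytes at offset off, through the mapping when one exists and
 * through pread() otherwise.  Returns bytes read or a negative errno.
 */
static ssize_t demo_region_read(struct demo_region *r, void *buf,
                                size_t len, off_t off)
{
    ssize_t n;

    if (off < 0 || (size_t)off > r->size || len > r->size - (size_t)off) {
        return -EINVAL;
    }
    if (r->map) {
        memcpy(buf, (char *)r->map + off, len);
        return (ssize_t)len;
    }
    n = pread(r->fd, buf, len, off);
    return n < 0 ? -errno : n;
}
```

The caller does not need to know which path was taken; the only
observable difference is the warning and, presumably, the speed.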

> >> +        }
> >> +    }
> >> +
> >> +    ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_MASK,
> >> +                                   VFIO_DEVICE_STATE_SAVING);
> >> +    if (ret) {
> >> +        error_report("%s: Failed to set state SAVING", vbasedev->name);
> >> +        return ret;
> >> +    }  
> > 
> > We seem to be lacking support in the callers for detecting if the
> > device is in an error state.  I'm not sure what our options are
> > though, maybe only a hw_error().
> >   
> 
> Returning error here fails migration process. And if device is in error 
> state, any application running inside VM using this device would fail.
> I think, there is no need to take any special action here by detecting 
> device error state.

If QEMU knows a device has failed, it seems like it would make sense to
stop the VM, otherwise we risk an essentially endless assortment of
ways that the user might notice the guest isn't behaving normally, some
maybe even causing the user to lose data.  Thanks,

Alex
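
For illustration only, a rough sketch of "stop the VM when the device is
known to have failed".  Everything here is hypothetical: the DEMO_*
bits are local constants assumed to mirror the device state bits of the
migration uAPI under review (error being the otherwise invalid
_SAVING | _RESUMING combination), and demo_vm stands in for whatever
mechanism QEMU would actually use to stop the guest.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Local stand-ins; values are an assumption about the uAPI state bits. */
#define DEMO_STATE_RUNNING   (1u << 0)
#define DEMO_STATE_SAVING    (1u << 1)
#define DEMO_STATE_RESUMING  (1u << 2)
#define DEMO_STATE_MASK      (DEMO_STATE_RUNNING | DEMO_STATE_SAVING | \
                              DEMO_STATE_RESUMING)

/* Hypothetical VM handle: just a flag saying whether the guest runs. */
struct demo_vm {
    bool running;
};

/* Error is signalled by SAVING and RESUMING set together, RUNNING clear. */
static bool demo_state_is_error(uint32_t state)
{
    return (state & DEMO_STATE_MASK) ==
           (DEMO_STATE_SAVING | DEMO_STATE_RESUMING);
}

/*
 * Check the device state after a failed transition.  If the device is in
 * the error state, stop the VM rather than let the guest keep running
 * against a dead device.  Returns true if the VM was stopped.
 */
static bool demo_check_device(struct demo_vm *vm, const char *name,
                              uint32_t state)
{
    assert(vm != NULL);
    if (!demo_state_is_error(state)) {
        return false;
    }
    fprintf(stderr, "%s: device in error state, stopping VM\n", name);
    vm->running = false;
    return true;
}
```

A caller would invoke demo_check_device() wherever a state change
fails, so a failed migration with a healthy device leaves the guest
running while a failed device does not.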
 
> >> +
> >> +    qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
> >> +
> >> +    ret = qemu_file_get_error(f);
> >> +    if (ret) {
> >> +        return ret;
> >> +    }
> >> +
> >> +    return 0;
> >> +}
> >> +
> >> +static void vfio_save_cleanup(void *opaque)
> >> +{
> >> +    VFIODevice *vbasedev = opaque;
> >> +    VFIOMigration *migration = vbasedev->migration;
> >> +
> >> +    if (migration->region.mmaps) {
> >> +        vfio_region_unmap(&migration->region);
> >> +    }
> >> +    trace_vfio_save_cleanup(vbasedev->name);
> >> +}
> >> +
> >> +static SaveVMHandlers savevm_vfio_handlers = {
> >> +    .save_setup = vfio_save_setup,
> >> +    .save_cleanup = vfio_save_cleanup,
> >> +};
> >> +
> >> +/* ---------------------------------------------------------------------- */
> >> +
> >>   static void vfio_vmstate_change(void *opaque, int running, RunState state)
> >>   {
> >>       VFIODevice *vbasedev = opaque;
> >> @@ -180,6 +253,7 @@ static int vfio_migration_init(VFIODevice *vbasedev,
> >>                                  struct vfio_region_info *info)
> >>   {
> >>       int ret;
> >> +    char id[256] = "";
> >>   
> >>       vbasedev->migration = g_new0(VFIOMigration, 1);
> >>   
> >> @@ -192,6 +266,24 @@ static int vfio_migration_init(VFIODevice *vbasedev,
> >>           return ret;
> >>       }
> >>   
> >> +    if (vbasedev->ops->vfio_get_object) {  
> > 
> > Nit, vfio_migration_region_init() would have failed already if this were
> > not available.  Perhaps do the test once at the start of this function
> > instead?  Thanks,
> >   
> 
> Ok, will do that.
> 
> Thanks,
> Kirti
> 
> 
> > Alex
> >   
> >> +        Object *obj = vbasedev->ops->vfio_get_object(vbasedev);
> >> +
> >> +        if (obj) {
> >> +            DeviceState *dev = DEVICE(obj);
> >> +            char *oid = vmstate_if_get_id(VMSTATE_IF(dev));
> >> +
> >> +            if (oid) {
> >> +                pstrcpy(id, sizeof(id), oid);
> >> +                pstrcat(id, sizeof(id), "/");
> >> +                g_free(oid);
> >> +            }
> >> +        }
> >> +    }
> >> +    pstrcat(id, sizeof(id), "vfio");
> >> +
> >> +    register_savevm_live(id, VMSTATE_INSTANCE_ID_ANY, 1, &savevm_vfio_handlers,
> >> +                         vbasedev);
> >>       vbasedev->vm_state = qemu_add_vm_change_state_handler(vfio_vmstate_change,
> >>                                                             vbasedev);
> >>       vbasedev->migration_state.notify = vfio_migration_state_notifier;
> >> diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
> >> index bd3d47b005cb..86c18def016e 100644
> >> --- a/hw/vfio/trace-events
> >> +++ b/hw/vfio/trace-events
> >> @@ -149,3 +149,5 @@ vfio_migration_probe(const char *name, uint32_t index) " (%s) Region %d"
> >>   vfio_migration_set_state(const char *name, uint32_t state) " (%s) state %d"
> >>   vfio_vmstate_change(const char *name, int running, const char *reason, uint32_t dev_state) " (%s) running %d reason %s device state %d"
> >>   vfio_migration_state_notifier(const char *name, const char *state) " (%s) state %s"
> >> +vfio_save_setup(const char *name) " (%s)"
> >> +vfio_save_cleanup(const char *name) " (%s)"  
> >   
> 


