kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: "Tian, Kevin" <kevin.tian@intel.com>
Cc: Yishai Hadas <yishaih@nvidia.com>,
	"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
	"bhelgaas@google.com" <bhelgaas@google.com>,
	"saeedm@nvidia.com" <saeedm@nvidia.com>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"kuba@kernel.org" <kuba@kernel.org>,
	"leonro@nvidia.com" <leonro@nvidia.com>,
	"kwankhede@nvidia.com" <kwankhede@nvidia.com>,
	"mgurtovoy@nvidia.com" <mgurtovoy@nvidia.com>,
	"maorg@nvidia.com" <maorg@nvidia.com>,
	"Raj, Ashok" <ashok.raj@intel.com>,
	"shameerali.kolothum.thodi@huawei.com" 
	<shameerali.kolothum.thodi@huawei.com>
Subject: Re: [PATCH V7 mlx5-next 08/15] vfio: Define device migration protocol v2
Date: Tue, 15 Feb 2022 11:33:43 -0400	[thread overview]
Message-ID: <20220215153343.GA1046125@nvidia.com> (raw)
In-Reply-To: <BN9PR11MB5276B8754BA82E94CF288F8C8C349@BN9PR11MB5276.namprd11.prod.outlook.com>

On Tue, Feb 15, 2022 at 08:04:27AM +0000, Tian, Kevin wrote:
> > From: Yishai Hadas <yishaih@nvidia.com>
> > Sent: Tuesday, February 8, 2022 1:22 AM
> > 
> > +static int vfio_ioctl_device_feature_migration(struct vfio_device *device,
> > +					       u32 flags, void __user *arg,
> > +					       size_t argsz)
> > +{
> > +	struct vfio_device_feature_migration mig = {
> > +		.flags = VFIO_MIGRATION_STOP_COPY,
> > +	};
> > +	int ret;
> > +
> > +	if (!device->ops->migration_set_state)
> > +		return -ENOTTY;
> 
> Miss a check on migration_get_state, as done in last function.

Yep

> > + * @migration_set_state: Optional callback to change the migration state for
> > + *         devices that support migration. The returned FD is used for data
> > + *         transfer according to the FSM definition. The driver is responsible
> > + *         to ensure that FD is isolated whenever the migration FSM leaves a
> > + *         data transfer state or before close_device() returns.
> 
> didn't understand the meaning of 'isolated' here.

It is not a good word. Lets say 'that FD reaches end of stream or
error whenever'
 
> > +#define VFIO_DEVICE_STATE_V1_STOP      (0)
> > +#define VFIO_DEVICE_STATE_V1_RUNNING   (1 << 0)
> > +#define VFIO_DEVICE_STATE_V1_SAVING    (1 << 1)
> > +#define VFIO_DEVICE_STATE_V1_RESUMING  (1 << 2)
> > +#define VFIO_DEVICE_STATE_MASK      (VFIO_DEVICE_STATE_V1_RUNNING
> > | \
> > +				     VFIO_DEVICE_STATE_V1_SAVING |  \
> > +				     VFIO_DEVICE_STATE_V1_RESUMING)
> 
> Does it make sense to also add 'V1' to MASK and also following macros
> given their names are general?

No, the point of this exercise is to avoid trouble for qemu - the
fewest changes we can get away with the better.

Once qemu is updated we'll delete this old stuff from the kernel.

> > +/*
> > + * Indicates the device can support the migration API. See enum
> 
> call it V2? Not necessary to add V2 in code but worthy of a clarification
> in comment.

We've only called it 'v2' for discussions.

If you think it is unclear lets say 'support the migration API through
VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE'

> 
> > + * vfio_device_mig_state for details. If present flags must be non-zero and
> > + * VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE is supported.
> > + *
> > + * VFIO_MIGRATION_STOP_COPY means that RUNNING, STOP, STOP_COPY
> > and
> > + * RESUMING are supported.
> > + */
> 
> Not aligned with other places where 5 states are mentioned. Better add
> ERROR here.

ERROR is not a state that is 'supported'. It could be clarified that
ERROR and RUNNING are always supported.

> 
> > + *
> > + * RUNNING -> STOP
> > + * STOP_COPY -> STOP
> > + *   While in STOP the device must stop the operation of the device. The
> > + *   device must not generate interrupts, DMA, or advance its internal
> > + *   state. When stopped the device and kernel migration driver must accept
> > + *   and respond to interaction to support external subsystems in the STOP
> > + *   state, for example PCI MSI-X and PCI config pace. Failure by the user to
> > + *   restrict device access while in STOP must not result in error conditions
> > + *   outside the user context (ex. host system faults).
> 
> Right above the STOP state is defined as:
> 
>        *  STOP - The device does not change the internal or external state
> 
> 'external state' I assume means P2P activities. For consistency it is clearer
> to also say something about external state in above paragraph.

No, STOP is defined to halt all DMA. I tidied it a bit like this:

 *   While in STOP the device must stop the operation of the device. The device
 *   must not generate interrupts, DMA, or any other change to external state.
 *   It must not change its internal state.


> > + *
> > + *   The STOP_COPY arc will terminate a data transfer session.
>
> remove 'will'

will is correct grammar. It could be 'arc terminates'

> > + *
> > + * STOP -> STOP_COPY
> > + *   This arc begin the process of saving the device state and will return a
> > + *   new data_fd.
> > + *
> > + *   While in the STOP_COPY state the device has the same behavior as STOP
> > + *   with the addition that the data transfers session continues to stream the
> > + *   migration state. End of stream on the FD indicates the entire device
> > + *   state has been transferred.
> > + *
> > + *   The user should take steps to restrict access to vfio device regions while
> > + *   the device is in STOP_COPY or risk corruption of the device migration
> > data
> > + *   stream.
> 
> Restricting access has been explained in the to-STOP arcs and it is stated 
> that while in STOP_COPY the device has the same behavior as STOP. So 
> I think the last paragraph is possibly not required.

It is not the same, the language in STOP is saying that the device
must tolerate external touches without breaking the kernel

This language is saying if external touches happen then the device is
free to corrupt the migration stream.

In both cases we expect good userspace to not have device
touches, the guidance here is for driver authors about what kind of
steps they need to take to protect against hostile userspace.

> > + * STOP -> RESUMING
> > + *   Entering the RESUMING state starts a process of restoring the device
> > + *   state and will return a new data_fd. The data stream fed into the
> > data_fd
> > + *   should be taken from the data transfer output of the saving group states
> 
> No definition of 'group state' (maybe introduced in a later patch?)

Yes, it was added in the P2P patch

We can avoid talking about saving group here entirely, it really just
means a single FD.

 *    The data stream fed into the data_fd should
 *   be taken from the data transfer output of a single FD during saving on a
 *   from a compatible device.

Thanks,
Jason

  reply	other threads:[~2022-02-15 15:40 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-07 17:22 [PATCH V7 mlx5-next 00/15] Add mlx5 live migration driver and v2 migration protocol Yishai Hadas
2022-02-07 17:22 ` [PATCH V7 mlx5-next 01/15] PCI/IOV: Add pci_iov_vf_id() to get VF index Yishai Hadas
2022-02-07 17:22 ` [PATCH V7 mlx5-next 02/15] net/mlx5: Reuse exported virtfn index function call Yishai Hadas
2022-02-07 17:22 ` [PATCH V7 mlx5-next 03/15] net/mlx5: Disable SRIOV before PF removal Yishai Hadas
2022-02-07 17:22 ` [PATCH V7 mlx5-next 04/15] PCI/IOV: Add pci_iov_get_pf_drvdata() to allow VF reaching the drvdata of a PF Yishai Hadas
2022-02-07 17:22 ` [PATCH V7 mlx5-next 05/15] net/mlx5: Expose APIs to get/put the mlx5 core device Yishai Hadas
2022-02-07 17:22 ` [PATCH V7 mlx5-next 06/15] net/mlx5: Introduce migration bits and structures Yishai Hadas
2022-02-07 17:22 ` [PATCH V7 mlx5-next 07/15] vfio: Have the core code decode the VFIO_DEVICE_FEATURE ioctl Yishai Hadas
2022-02-07 17:22 ` [PATCH V7 mlx5-next 08/15] vfio: Define device migration protocol v2 Yishai Hadas
2022-02-09  0:07   ` Alex Williamson
2022-02-09  2:36     ` Jason Gunthorpe
2022-02-15 10:41       ` Tian, Kevin
2022-02-15 16:04         ` Jason Gunthorpe
2022-02-15 23:32           ` Alex Williamson
2022-02-16  1:17             ` Jason Gunthorpe
2022-02-16  3:17           ` Tian, Kevin
2022-02-16 12:14             ` Jason Gunthorpe
2022-02-17  2:29               ` Tian, Kevin
2022-02-15 10:58       ` Tian, Kevin
2022-02-15 13:13         ` Jason Gunthorpe
2022-02-15  8:04   ` Tian, Kevin
2022-02-15 15:33     ` Jason Gunthorpe [this message]
2022-02-16  3:04       ` Tian, Kevin
2022-02-07 17:22 ` [PATCH V7 mlx5-next 09/15] vfio: Extend the device migration protocol with RUNNING_P2P Yishai Hadas
2022-02-15 10:18   ` Tian, Kevin
2022-02-15 15:56     ` Jason Gunthorpe
2022-02-16  2:52       ` Tian, Kevin
2022-02-16 12:11         ` Jason Gunthorpe
2022-02-07 17:22 ` [PATCH V7 mlx5-next 10/15] vfio: Remove migration protocol v1 documentation Yishai Hadas
2022-02-11 11:03   ` Cornelia Huck
2022-02-07 17:22 ` [PATCH V7 mlx5-next 11/15] vfio/mlx5: Expose migration commands over mlx5 device Yishai Hadas
2022-02-07 17:22 ` [PATCH V7 mlx5-next 12/15] vfio/mlx5: Implement vfio_pci driver for mlx5 devices Yishai Hadas
2022-02-09  0:07   ` Alex Williamson
2022-02-07 17:22 ` [PATCH V7 mlx5-next 13/15] vfio/pci: Expose vfio_pci_core_aer_err_detected() Yishai Hadas
2022-02-07 17:22 ` [PATCH V7 mlx5-next 14/15] vfio/mlx5: Use its own PCI reset_done error handler Yishai Hadas
2022-02-09  0:08   ` Alex Williamson
2022-02-09  2:39     ` Jason Gunthorpe
2022-02-10 16:48       ` Alex Williamson
2022-02-10 17:27         ` Jason Gunthorpe
2022-02-07 17:22 ` [PATCH V7 mlx5-next 15/15] vfio: Extend the device migration protocol with PRE_COPY Yishai Hadas
2022-02-17 17:15   ` Alex Williamson
2022-02-18  0:03     ` Jason Gunthorpe
2022-02-18  8:01   ` Tian, Kevin
2022-02-18 14:06     ` Jason Gunthorpe
2022-02-22  1:43       ` Tian, Kevin
2022-02-22 15:50         ` Jason Gunthorpe
2022-02-23  0:40           ` Tian, Kevin
2022-02-23  0:44             ` Jason Gunthorpe
2022-02-23  1:46               ` Tian, Kevin
2022-02-18  8:11 ` [PATCH V7 mlx5-next 00/15] Add mlx5 live migration driver and v2 migration protocol Tarun Gupta (SW-GPU)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220215153343.GA1046125@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=alex.williamson@redhat.com \
    --cc=ashok.raj@intel.com \
    --cc=bhelgaas@google.com \
    --cc=kevin.tian@intel.com \
    --cc=kuba@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=kwankhede@nvidia.com \
    --cc=leonro@nvidia.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=maorg@nvidia.com \
    --cc=mgurtovoy@nvidia.com \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@nvidia.com \
    --cc=shameerali.kolothum.thodi@huawei.com \
    --cc=yishaih@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).