linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alex Williamson <alex.williamson@redhat.com>
To: Yishai Hadas <yishaih@nvidia.com>
Cc: <bhelgaas@google.com>, <jgg@nvidia.com>, <saeedm@nvidia.com>,
	<linux-pci@vger.kernel.org>, <kvm@vger.kernel.org>,
	<netdev@vger.kernel.org>, <kuba@kernel.org>, <leonro@nvidia.com>,
	<kwankhede@nvidia.com>, <mgurtovoy@nvidia.com>,
	<maorg@nvidia.com>, <cohuck@redhat.com>, <ashok.raj@intel.com>,
	<kevin.tian@intel.com>, <shameerali.kolothum.thodi@huawei.com>
Subject: Re: [PATCH V8 mlx5-next 09/15] vfio: Define device migration protocol v2
Date: Tue, 22 Feb 2022 16:53:00 -0700	[thread overview]
Message-ID: <20220222165300.4a8dd044.alex.williamson@redhat.com> (raw)
In-Reply-To: <20220220095716.153757-10-yishaih@nvidia.com>

On Sun, 20 Feb 2022 11:57:10 +0200
Yishai Hadas <yishaih@nvidia.com> wrote:

> From: Jason Gunthorpe <jgg@nvidia.com>
> 
> Replace the existing region based migration protocol with an ioctl based
> protocol. The two protocols have the same general semantic behaviors, but
> the way the data is transported is changed.
> 
> This is the STOP_COPY portion of the new protocol, it defines the 5 states
> for basic stop and copy migration and the protocol to move the migration
> data in/out of the kernel.
> 
> Compared to the clarification of the v1 protocol Alex proposed:
> 
> https://lore.kernel.org/r/163909282574.728533.7460416142511440919.stgit@omen
> 
> This has a few deliberate functional differences:
> 
>  - ERROR arcs allow the device function to remain unchanged.
> 
>  - The protocol is not required to return to the original state on
>    transition failure. Instead userspace can execute an unwind back to
>    the original state, reset, or do something else without needing kernel
>    support. This simplifies the kernel design and should userspace choose
>    a policy like always reset, avoids doing useless work in the kernel
>    on error handling paths.
> 
>  - PRE_COPY is made optional, userspace must discover it before using it.
>    This reflects the fact that the majority of drivers we are aware of
>    right now will not implement PRE_COPY.
> 
>  - segmentation is not part of the data stream protocol, the receiver
>    does not have to reproduce the framing boundaries.

I'm not sure how to reconcile the statement above with:

	"The user must consider the migration data segments carried
	 over the FD to be opaque and non-fungible. During RESUMING, the
	 data segments must be written in the same order they came out
	 of the saving side FD."

This is subtly conflicting that it's not segmented, but segments must
be written in order.  We'll naturally have some segmentation due to
buffering in kernel and userspace, but I think referring to it as a
stream suggests that the user can cut and join segments arbitrarily so
long as byte order is preserved, right?  I suspect the commit log
comment is referring to the driver imposed segmentation and framing
relative to region offsets.

Maybe something like:

	"The user must consider the migration data stream carried over
	 the FD to be opaque and must preserve the byte order of the
	 stream.  The user is not required to preserve buffer
	 segmentation when writing the data stream during the RESUMING
	 operation."

This statement also gives me pause relative to Jason's comments
regarding async support:

> + * The kernel migration driver must fully transition the device to the new state
> + * value before the operation returns to the user.

The above statement certainly doesn't preclude asynchronous
availability of data on the stream FD, but it does demand that the
device state transition itself is synchronous and can cannot be
shortcut.  If the state transition itself exceeds migration SLAs, we're
in a pickle.  Thanks,

Alex


  parent reply	other threads:[~2022-02-22 23:53 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-20  9:57 [PATCH V8 mlx5-next 00/15] Add mlx5 live migration driver and v2 migration protocol Yishai Hadas
2022-02-20  9:57 ` [PATCH V8 mlx5-next 01/15] PCI/IOV: Add pci_iov_vf_id() to get VF index Yishai Hadas
2022-02-20  9:57 ` [PATCH V8 mlx5-next 02/15] net/mlx5: Reuse exported virtfn index function call Yishai Hadas
2022-02-20  9:57 ` [PATCH V8 mlx5-next 03/15] net/mlx5: Disable SRIOV before PF removal Yishai Hadas
2022-02-20  9:57 ` [PATCH V8 mlx5-next 04/15] PCI/IOV: Add pci_iov_get_pf_drvdata() to allow VF reaching the drvdata of a PF Yishai Hadas
2022-02-20  9:57 ` [PATCH V8 mlx5-next 05/15] net/mlx5: Expose APIs to get/put the mlx5 core device Yishai Hadas
2022-02-20  9:57 ` [PATCH V8 mlx5-next 06/15] net/mlx5: Introduce migration bits and structures Yishai Hadas
2022-02-23 19:09   ` Alex Williamson
2022-02-23 19:17     ` Jason Gunthorpe
2022-02-20  9:57 ` [PATCH V8 mlx5-next 07/15] net/mlx5: Add migration commands definitions Yishai Hadas
2022-02-20  9:57 ` [PATCH V8 mlx5-next 08/15] vfio: Have the core code decode the VFIO_DEVICE_FEATURE ioctl Yishai Hadas
2022-02-22 16:48   ` Cornelia Huck
2022-02-22 18:13     ` Jason Gunthorpe
2022-02-20  9:57 ` [PATCH V8 mlx5-next 09/15] vfio: Define device migration protocol v2 Yishai Hadas
2022-02-22  1:55   ` Tian, Kevin
2022-02-22 23:53   ` Alex Williamson [this message]
2022-02-23  0:21     ` Jason Gunthorpe
2022-02-23  1:09       ` Alex Williamson
2022-02-23  2:02         ` Tian, Kevin
2022-02-23  2:47         ` Jason Gunthorpe
2022-02-23 17:06   ` Cornelia Huck
2022-02-24  0:46     ` Jason Gunthorpe
2022-02-24 10:41       ` Cornelia Huck
2022-02-24 12:39         ` Jason Gunthorpe
2022-02-20  9:57 ` [PATCH V8 mlx5-next 10/15] vfio: Extend the device migration protocol with RUNNING_P2P Yishai Hadas
2022-02-22  2:00   ` Tian, Kevin
2022-02-23 17:42   ` Alex Williamson
2022-02-24  0:47     ` Jason Gunthorpe
2022-02-20  9:57 ` [PATCH V8 mlx5-next 11/15] vfio: Remove migration protocol v1 documentation Yishai Hadas
2022-02-20  9:57 ` [PATCH V8 mlx5-next 12/15] vfio/mlx5: Expose migration commands over mlx5 device Yishai Hadas
2022-02-20  9:57 ` [PATCH V8 mlx5-next 13/15] vfio/mlx5: Implement vfio_pci driver for mlx5 devices Yishai Hadas
2022-02-20  9:57 ` [PATCH V8 mlx5-next 14/15] vfio/pci: Expose vfio_pci_core_aer_err_detected() Yishai Hadas
2022-02-20  9:57 ` [PATCH V8 mlx5-next 15/15] vfio/mlx5: Use its own PCI reset_done error handler Yishai Hadas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220222165300.4a8dd044.alex.williamson@redhat.com \
    --to=alex.williamson@redhat.com \
    --cc=ashok.raj@intel.com \
    --cc=bhelgaas@google.com \
    --cc=cohuck@redhat.com \
    --cc=jgg@nvidia.com \
    --cc=kevin.tian@intel.com \
    --cc=kuba@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=kwankhede@nvidia.com \
    --cc=leonro@nvidia.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=maorg@nvidia.com \
    --cc=mgurtovoy@nvidia.com \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@nvidia.com \
    --cc=shameerali.kolothum.thodi@huawei.com \
    --cc=yishaih@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).