netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Alex Williamson <alex.williamson@redhat.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Cornelia Huck <cohuck@redhat.com>,
	Yishai Hadas <yishaih@nvidia.com>,
	bhelgaas@google.com, saeedm@nvidia.com,
	linux-pci@vger.kernel.org, kvm@vger.kernel.org,
	netdev@vger.kernel.org, kuba@kernel.org, leonro@nvidia.com,
	kwankhede@nvidia.com, mgurtovoy@nvidia.com, maorg@nvidia.com,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [PATCH V2 mlx5-next 12/14] vfio/mlx5: Implement vfio_pci driver for mlx5 devices
Date: Wed, 27 Oct 2021 13:05:20 -0600	[thread overview]
Message-ID: <20211027130520.33652a49.alex.williamson@redhat.com> (raw)
In-Reply-To: <20211026234300.GA2744544@nvidia.com>

On Tue, 26 Oct 2021 20:43:00 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, Oct 26, 2021 at 01:50:46PM -0600, Alex Williamson wrote:
> > On Tue, 26 Oct 2021 12:18:51 -0300
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >   
> > > On Tue, Oct 26, 2021 at 08:42:12AM -0600, Alex Williamson wrote:
> > >   
> > > > > This is also why I don't like it being so transparent as it is
> > > > > something userspace needs to care about - especially if the HW cannot
> > > > > support such a thing, if we intend to allow that.    
> > > > 
> > > > Userspace does need to care, but userspace's concern over this should
> > > > not be able to compromise the platform and therefore making VF
> > > > assignment more susceptible to fatal error conditions to comply with a
> > > > migration uAPI is troublesome for me.    
> > > 
> > > It is an interesting scenario.
> > > 
> > > I think it points that we are not implementing this fully properly.
> > > 
> > > The !RUNNING state should be like your reset efforts.
> > > 
> > > All access to the MMIO memories from userspace should be revoked
> > > during !RUNNING
> > > 
> > > All VMAs zap'd.
> > > 
> > > All IOMMU peer mappings invalidated.
> > > 
> > > The kernel should directly block userspace from causing a MMIO TLP
> > > before the device driver goes to !RUNNING.
> > > 
> > > Then the question of what the device does at this edge is not
> > > relevant as hostile userspace cannot trigger it.
> > > 
> > > The logical way to implement this is to key off running and
> > > block/unblock MMIO access when !RUNNING.
> > > 
> > > To me this strongly suggests that the extra bit is the correct way
> > > forward as the driver is much simpler to implement and understand if
> > > RUNNING directly controls the availability of MMIO instead of having
> > > an irregular case where !RUNNING still allows MMIO but only until a
> > > pending_bytes read.
> > > 
> > > Given the complexity of this can we move ahead with the current
> > > mlx5_vfio and Yishai&co can come with some followup proposal to split
> > > the freeze/queice and block MMIO?  
> > 
> > I know how much we want this driver in, but I'm surprised that you're
> > advocating to cut-and-run with an implementation where we've identified
> > a potentially significant gap with some hand waving that we'll resolve
> > it later.  
> 
> I don't view it as cut and run, but making reasonable quanta of
> progress with patch series of reviewable size and scope.
> 
> At a certain point we need to get the base level of functionality,
> that matches the currently defined ABI merged so we can talk about
> where the ABI needs to go.
> 
> If you are uncomfortable about this from a uABI stability perspective
> then mark the driver EXPERIMENTAL and do not merge any other migration
> drivers until the two details are fully sorted out.
> 
> As far as the actual issue, if you hadn't just discovered it now
> nobody would have known we have this gap - much like how the very
> similar reset issue was present in VFIO for so many years until you
> plugged it.

But the fact that we did discover it is hugely important.  We've
identified that the potential use case is significantly limited and
that userspace doesn't have a good mechanism to determine when to
expose that limitation to the user.  We're tossing around solutions
that involve extensions, if not changes to the uAPI.  It's Wednesday of
rc7.

I feel like we've already been burned by making one of these
"reasonable quanta of progress" to accept and mark experimental
decisions with where we stand between defining the uAPI in the kernel
and accepting an experimental implementation in QEMU.  Now we have
multiple closed driver implementations (none of which are contributing
to this discussion), but thankfully we're not committed to supporting
them because we have no open implementations.  I think we could get away
with ripping up the uAPI if we really needed to.  

> > Deciding at some point in the future to forcefully block device MMIO
> > access from userspace when the device stops running is clearly a user
> > visible change and therefore subject to the don't-break-userspace
> > clause.    
> 
> I don't think so, this was done for reset retroactively after
> all. Well behaved qmeu should have silenced all MMIO touches as part
> of the ABI contract anyhow.

That's not obvious to me and I think it conflates access to the device
and execution of the device.  If it's QEMU's responsibility to quiesce
access to the device anyway, why does the kernel need to impose this
restriction.  I'd think we'd generally only impose such a restriction
if the alternative allows the user to invoke bad behavior outside the
scope of their use of the device or consistency of the migration data.
It appears that any such behavior would be implementation specific here.

> The "don't-break-userspace" is not an absolute prohibition, Linus has
> been very clear this limitation is about direct, ideally demonstrable,
> breakage to actually deployed software.

And if we introduce an open driver that unblocks QEMU support to become
non-experimental, I think that's where we stand.

> > That might also indicate that "freeze" is only an implementation
> > specific requirement.  Thanks,  
> 
> It doesn't matter if a theoretical device can exist that doesn't need
> "freeze" - this device does, and so it is the lowest common
> denominator for the uAPI contract and userspace must abide by the
> restriction.

Sorry, "to the victor go the spoils" is not really how I strictly want
to define a uAPI contract with userspace.  If we're claiming that
userspace is responsible for quiescing devices and we're providing a
means for that to occur, and userspace is already responsible for
managing MMIO access, then the only reason the kernel would forcefully
impose such a restriction itself would be to protect the host and the
implementation of that would depend on whether this is expected to be a
universal or device specific limitation.  Thanks,

Alex


  reply	other threads:[~2021-10-27 19:05 UTC|newest]

Thread overview: 100+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-19 10:58 [PATCH V2 mlx5-next 00/14] Add mlx5 live migration driver Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 01/14] PCI/IOV: Add pci_iov_vf_id() to get VF index Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 02/14] net/mlx5: Reuse exported virtfn index function call Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 03/14] net/mlx5: Disable SRIOV before PF removal Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 04/14] PCI/IOV: Add pci_iov_get_pf_drvdata() to allow VF reaching the drvdata of a PF Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 05/14] net/mlx5: Expose APIs to get/put the mlx5 core device Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 06/14] vdpa/mlx5: Use mlx5_vf_get_core_dev() to get PF device Yishai Hadas
2021-10-19 11:16   ` Max Gurtovoy
2021-10-20  8:58     ` Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 07/14] vfio: Fix VFIO_DEVICE_STATE_SET_ERROR macro Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 08/14] vfio: Add a macro for VFIO_DEVICE_STATE_ERROR Yishai Hadas
2021-10-19 15:48   ` Alex Williamson
2021-10-19 15:50     ` Alex Williamson
2021-10-20  7:35       ` Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 09/14] vfio/pci_core: Make the region->release() function optional Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 10/14] net/mlx5: Introduce migration bits and structures Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 11/14] vfio/mlx5: Expose migration commands over mlx5 device Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 12/14] vfio/mlx5: Implement vfio_pci driver for mlx5 devices Yishai Hadas
2021-10-19 18:43   ` Alex Williamson
2021-10-19 19:23     ` Jason Gunthorpe
2021-10-19 20:58       ` Alex Williamson
2021-10-19 23:04         ` Jason Gunthorpe
2021-10-20  8:28           ` Yishai Hadas
2021-10-20 16:52             ` Alex Williamson
2021-10-20 18:59               ` Jason Gunthorpe
2021-10-20 21:07                 ` Alex Williamson
2021-10-21  9:34                   ` Cornelia Huck
2021-10-21 21:47                     ` Alex Williamson
2021-10-25 12:29                       ` Jason Gunthorpe
2021-10-25 14:28                         ` Alex Williamson
2021-10-25 14:56                           ` Jason Gunthorpe
2021-10-26 14:42                             ` Alex Williamson
2021-10-26 15:18                               ` Jason Gunthorpe
2021-10-26 19:50                                 ` Alex Williamson
2021-10-26 23:43                                   ` Jason Gunthorpe
2021-10-27 19:05                                     ` Alex Williamson [this message]
2021-10-27 19:23                                       ` Jason Gunthorpe
2021-10-28 15:08                                         ` Cornelia Huck
2021-10-29  0:26                                           ` Jason Gunthorpe
2021-10-29  7:35                                             ` Yishai Hadas
2021-10-28 15:30                                         ` Alex Williamson
2021-10-28 23:47                                           ` Jason Gunthorpe
2021-10-29  6:57                                             ` Cornelia Huck
2021-10-29  7:48                                               ` Yishai Hadas
2021-10-29 10:32                                             ` Shameerali Kolothum Thodi
2021-10-29 12:15                                               ` Jason Gunthorpe
2021-10-29 22:06                                             ` Alex Williamson
2021-11-01 17:25                                               ` Jason Gunthorpe
2021-11-02 11:19                                                 ` Shameerali Kolothum Thodi
2021-11-02 14:56                                                 ` Alex Williamson
2021-11-02 15:54                                                   ` Jason Gunthorpe
2021-11-02 16:22                                                     ` Alex Williamson
2021-11-02 16:36                                                       ` Jason Gunthorpe
2021-11-02 20:15                                                         ` Alex Williamson
2021-11-03 12:09                                                           ` Jason Gunthorpe
2021-11-03 15:44                                                             ` Alex Williamson
2021-11-03 16:10                                                               ` Jason Gunthorpe
2021-11-03 18:04                                                                 ` Alex Williamson
2021-11-04 11:19                                                                   ` Cornelia Huck
2021-11-05 16:53                                                                     ` Cornelia Huck
2021-11-16 16:59                                                                       ` Cornelia Huck
2021-11-05 13:24                                                                   ` Jason Gunthorpe
2021-11-05 15:31                                                                     ` Alex Williamson
2021-11-15 23:29                                                                       ` Jason Gunthorpe
2021-11-16 17:57                                                                         ` Alex Williamson
2021-11-16 19:25                                                                           ` Jason Gunthorpe
2021-11-16 21:10                                                                             ` Alex Williamson
2021-11-17  1:48                                                                               ` Jason Gunthorpe
2021-11-18 18:15                                                                                 ` Alex Williamson
2021-11-22 19:18                                                                                   ` Jason Gunthorpe
2021-11-08  8:53                                 ` Tian, Kevin
2021-11-08 12:35                                   ` Jason Gunthorpe
2021-11-09  0:58                                     ` Tian, Kevin
2021-11-09 12:45                                       ` Jason Gunthorpe
2021-10-25 16:34               ` Dr. David Alan Gilbert
2021-10-25 17:55                 ` Alex Williamson
2021-10-25 18:47                   ` Dr. David Alan Gilbert
2021-10-25 19:15                     ` Jason Gunthorpe
2021-10-26  8:40                       ` Dr. David Alan Gilbert
2021-10-26 12:13                         ` Jason Gunthorpe
2021-10-26 14:52                           ` Alex Williamson
2021-10-26 15:56                             ` Jason Gunthorpe
2021-10-26 14:29                     ` Alex Williamson
2021-10-26 14:51                       ` Dr. David Alan Gilbert
2021-10-26 15:25                         ` Jason Gunthorpe
2021-10-20  8:01     ` Yishai Hadas
2021-10-20 16:25       ` Jason Gunthorpe
2021-10-21 10:46         ` Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 13/14] vfio/pci: Expose vfio_pci_aer_err_detected() Yishai Hadas
2021-10-19 10:58 ` [PATCH V2 mlx5-next 14/14] vfio/mlx5: Use its own PCI reset_done error handler Yishai Hadas
2021-10-19 18:55   ` Alex Williamson
2021-10-19 19:10     ` Jason Gunthorpe
2021-10-20  8:46       ` Yishai Hadas
2021-10-20 16:46         ` Jason Gunthorpe
2021-10-20 17:45           ` Alex Williamson
2021-10-20 18:57             ` Jason Gunthorpe
2021-10-20 21:38               ` Alex Williamson
2021-10-21 10:39             ` Yishai Hadas
2021-11-17 16:42 ` vfio migration discussions (was: [PATCH V2 mlx5-next 00/14] Add mlx5 live migration driver) Cornelia Huck
2021-11-17 17:47   ` Jason Gunthorpe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211027130520.33652a49.alex.williamson@redhat.com \
    --to=alex.williamson@redhat.com \
    --cc=bhelgaas@google.com \
    --cc=cohuck@redhat.com \
    --cc=dgilbert@redhat.com \
    --cc=jgg@nvidia.com \
    --cc=kuba@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=kwankhede@nvidia.com \
    --cc=leonro@nvidia.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=maorg@nvidia.com \
    --cc=mgurtovoy@nvidia.com \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@nvidia.com \
    --cc=yishaih@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).