Linux IOMMU Development
From: Alex Williamson <alex.williamson@redhat.com>
To: "Tian, Kevin" <kevin.tian@intel.com>
Cc: "jean-philippe@linaro.org" <jean-philippe@linaro.org>,
	"Raj, Ashok" <ashok.raj@intel.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"Tian, Jun J" <jun.j.tian@intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>,
	Bjorn Helgaas <bhelgaas@google.com>,
	"Sun, Yi Y" <yi.y.sun@intel.com>, "Wu, Hao" <hao.wu@intel.com>
Subject: Re: [PATCH v1 2/2] vfio/pci: Emulate PASID/PRI capability for VFs
Date: Tue, 14 Apr 2020 18:36:02 -0600
Message-ID: <20200414183602.7de084b0@x1.home>
In-Reply-To: <AADFC41AFE54684AB9EE6CBC0274A5D19D81F946@SHSMSX104.ccr.corp.intel.com>

On Tue, 14 Apr 2020 23:57:33 +0000
"Tian, Kevin" <kevin.tian@intel.com> wrote:

> > From: Alex Williamson <alex.williamson@redhat.com>
> > Sent: Tuesday, April 14, 2020 11:24 PM
> > 
> > On Tue, 14 Apr 2020 03:42:42 +0000
> > "Tian, Kevin" <kevin.tian@intel.com> wrote:
> >   
> > > > From: Alex Williamson <alex.williamson@redhat.com>
> > > > Sent: Tuesday, April 14, 2020 11:29 AM
> > > >
> > > > On Tue, 14 Apr 2020 02:40:58 +0000
> > > > "Tian, Kevin" <kevin.tian@intel.com> wrote:
> > > >  
> > > > > > From: Alex Williamson <alex.williamson@redhat.com>
> > > > > > Sent: Tuesday, April 14, 2020 3:21 AM
> > > > > >
> > > > > > On Mon, 13 Apr 2020 08:05:33 +0000
> > > > > > "Tian, Kevin" <kevin.tian@intel.com> wrote:
> > > > > >  
> > > > > > > > From: Tian, Kevin
> > > > > > > > Sent: Monday, April 13, 2020 3:55 PM
> > > > > > > >  
> > > > > > > > > From: Raj, Ashok <ashok.raj@linux.intel.com>
> > > > > > > > > Sent: Monday, April 13, 2020 11:11 AM
> > > > > > > > >
> > > > > > > > > > On Wed, Apr 08, 2020 at 10:19:40AM -0600, Alex Williamson wrote:
> > > > > > > > > > On Tue, 7 Apr 2020 21:00:21 -0700
> > > > > > > > > > "Raj, Ashok" <ashok.raj@intel.com> wrote:
> > > > > > > > > >  
> > > > > > > > > > > Hi Alex
> > > > > > > > > > >
> > > > > > > > > > > + Bjorn  
> > > > > > > > > >
> > > > > > > > > >  + Don
> > > > > > > > > >  
> > > > > > > > > > > > FWIW I can't understand why PCI SIG went different ways with ATS,
> > > > > > > > > > > > where it's enumerated on PF and VF. But for PASID and PRI it's only
> > > > > > > > > > > > in PF.
> > > > > > > > > > >
> > > > > > > > > > > I'm checking with our internal SIG reps to followup on that.
> > > > > > > > > > >
> > > > > > > > > > > > On Tue, Apr 07, 2020 at 09:58:01AM -0600, Alex Williamson wrote:
> > > > > > > > > > > > > > Is there vendor guarantee that hidden registers will locate at the
> > > > > > > > > > > > > > same offset between PF and VF config space?
> > > > > > > > > > > > >
> > > > > > > > > > > > > I'm not sure if the spec really precludes hidden registers, but the
> > > > > > > > > > > > > fact that these registers are explicitly outside of the capability
> > > > > > > > > > > > > chain implies they're only intended for device specific use, so I'd
> > > > > > > > > > > > > say there are no guarantees about anything related to these
> > > > > > > > > > > > > registers.
> > > > > > > > > > >
> > > > > > > > > > > > As you had suggested in the other thread, we could consider
> > > > > > > > > > > > using the same offset as in PF, but even that is only a better
> > > > > > > > > > > > guess, still not reliable.
> > > > > > > > > > > >
> > > > > > > > > > > > The other option is to maybe extend driver ops in the PF to expose
> > > > > > > > > > > > where the offsets should be. Sort of adding the quirk in the
> > > > > > > > > > > > implementation.
> > > > > > > > > > > >
> > > > > > > > > > > > I'm not sure how prevalent PASID and PRI are in VF devices. If SIG is
> > > > > > > > > > > > resisting making VFs first-class citizens, we might ask them to add
> > > > > > > > > > > > some verbiage suggesting that the same offsets as in the PF be left
> > > > > > > > > > > > open to help emulation software.
> > > > > > > > > >
> > > > > > > > > > > Even if we know where to expose these capabilities on the VF, it's
> > > > > > > > > > > not clear to me how we can actually virtualize the capability itself.
> > > > > > > > > > > If the spec defines, for example, an enable bit as r/w then software
> > > > > > > > > > > that interacts with that register expects the bit is settable.
> > > > > > > > > > > There's no protocol for "try to set the bit and re-read it to see if
> > > > > > > > > > > the hardware accepted it".  Therefore a capability with a fixed
> > > > > > > > > > > enable bit representing the state of the PF, not settable by the VF,
> > > > > > > > > > > is disingenuous to the spec.
> > > > > > > > >
> > > > > > > > > > I think we are all in violent agreement. A lot of times the PCI spec
> > > > > > > > > > gets defined several years ahead of real products and no one
> > > > > > > > > > remembers the justification on why they restricted things the way
> > > > > > > > > > they did.
> > > > > > > > > >
> > > > > > > > > > Maybe someone's early product wasn't quite exposing these features to
> > > > > > > > > > the VF and hence the spec is bug compatible :-)
> > > > > > > > >  
> > > > > > > > > >
> > > > > > > > > > > If what we're trying to do is expose that PASID and PRI are enabled
> > > > > > > > > > > on the PF to a VF driver, maybe duplicating the PF capabilities on
> > > > > > > > > > > the VF without the ability to control it is not the right approach.
> > > > > > > > > > > Maybe we
> > > > > > > > >
> > > > > > > > > > As long as the capability enable is only provided when the PF has
> > > > > > > > > > enabled the feature, the hardware seems to do the right thing.
> > > > > > > > > >
> > > > > > > > > > Assume we expose PASID/PRI only when PF has enabled it. It will be the
> > > > > > > > > > case since the PF driver needs to exist, and IOMMU would have set the
> > > > > > > > > > PASID/PRI/ATS on PF.
> > > > > > > > > >
> > > > > > > > > > If the emulation is purely spoofing the capability, then once the
> > > > > > > > > > vIOMMU driver enables PASID, the context entries for the VF are
> > > > > > > > > > completely independent from the PF context entries.
> > > > > > > > > >
> > > > > > > > > > vIOMMU would enable PASID, and we just spoof the PASID capability.
> > > > > > > > > >
> > > > > > > > > > If vIOMMU or guest for some reason does disable_pasid(), then the
> > > > > > > > > > vIOMMU driver can disable PASID on the VF context entries. So for the
> > > > > > > > > > VF, although the capability is blanket enabled on PF, IOMMU guarantees
> > > > > > > > > > the transactions are blocked.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > In the interim, it seems like the intent of the virtual capability
> > > > > > > > > > can be honored via help from the IOMMU for the controlling aspect.
> > > > > > > > > >
> > > > > > > > > > Did I miss anything?
> > > > > > > >
> > > > > > > > > Above works for emulating the enable bit (under the assumption that
> > > > > > > > > PF driver won't disable PASID when VF is assigned). However, there
> > > > > > > > > are also "Execute permission enable" and "Privileged mode enable"
> > > > > > > > > bits in PASID control registers. I don't know how those bits could be
> > > > > > > > > cleanly emulated when the guest writes a value different from PF's...
> > > > > > > >
> > > > > > > > Sent too quick. The IOMMU also includes control bits for allowing/
> > > > > > > > blocking execute requests and supervisor requests. We can rely on
> > > > > > > > IOMMU to block those requests to emulate the disabled cases of
> > > > > > > > all three control bits in the PASID cap.
> > > > > >
> > > > > >
> > > > > > So if the emulation of the PASID capability takes into account the
> > > > > > IOMMU configuration to back that emulation, shouldn't we do that
> > > > > > emulation in the hypervisor, ie. QEMU, rather than the kernel vfio
> > > > > > layer?  Thanks,
> > > > > >
> > > > > > Alex  
> > > > >
> > > > > > We need to enforce it in the physical IOMMU, to ensure that even if
> > > > > > the VF sends requests which violate the guest expectation, those
> > > > > > requests are always blocked by the IOMMU. Kernel vfio identifies
> > > > > > such a need when emulating the PASID cap and then forwards the
> > > > > > request to the host IOMMU driver.
> > > >
> > > > Implementing this in the kernel would be necessary if we needed to
> > > > protect from the guest device doing something bad to the host or
> > > > other devices.  Making sure the physical IOMMU is configured to meet
> > > > guest expectations doesn't sound like it necessarily falls into that
> > > > category.  We do that on a regular basis to program the DMA mappings.
> > > > Tell me more about why the hypervisor can't handle this piece of
> > > > guest/host synchronization on top of all the other things it
> > > > synchronizes to make a VM.  Thanks,
> > > >  
> > >
> > > I care more about "execution permission" and "privileged mode".
> > > It might be dangerous when the guest disallows the VF from sending  
> > 
> > "Dangerous" how?  We're generally ok with the user managing their own
> > consistency, it's when the user can affect other users/devices that we
> > require vfio in the kernel to actively manage something.  There's a very
> > different scope to the vfio-pci kernel module implementing a fake
> > capability and trying to make it behave indistinguishably from the real
> > capability versus a userspace driver piecing together an emulation
> > that's good enough for their purposes.  Thanks,
> >  
> 
> How could emulation fix this gap when the VF DMAs don't go through
> the vIOMMU? What you explained all makes sense before talking about
> the emulation of PASID capability, i.e. vfio only cares about isolation
> between assigned devices. However now vfio exposes a capability
> which is shared by PF/VF while pure software emulation may break
> the guest expectation, and now the only viable mitigation is to get
> help from the physical IOMMU. Then why can't vfio include such
> mitigation in its emulation of the PASID capability?

DMA never actually goes "through" the vIOMMU.  I'm not suggesting that
vfio doesn't participate somehow, but I don't know that emulating a
capability that doesn't exist and involves policy should be done in the
kernel, versus providing userspace with an interface to control what
they need to implement that emulation.  Thanks,

Alex
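
For illustration only, a minimal sketch in C of the idea debated above: an
emulated PASID control register keeps r/w semantics for the guest, and the
emulation reports which PF-enabled features the guest left disabled so the
caller could ask the IOMMU to block them. The bit values follow the PASID
control register layout in the PCIe spec; struct vpasid and
vpasid_ctrl_write() are hypothetical names, not vfio or QEMU code, and the
sketch assumes all three bits are writable (it ignores the "supported" bits
in the PASID capability register).

```c
#include <stdint.h>

/* PASID control register bits (PCIe PASID extended capability). */
#define PASID_CTRL_ENABLE 0x0001u /* PASID enable */
#define PASID_CTRL_EXEC   0x0002u /* Execute permission enable */
#define PASID_CTRL_PRIV   0x0004u /* Privileged mode enable */
#define PASID_CTRL_MASK   (PASID_CTRL_ENABLE | PASID_CTRL_EXEC | PASID_CTRL_PRIV)

/* Hypothetical per-VF emulation state (names are assumptions). */
struct vpasid {
    uint16_t pf_ctrl;   /* control bits actually enabled on the PF */
    uint16_t virt_ctrl; /* value the guest reads back from the VF */
};

/*
 * Emulate a guest write to the virtual PASID control register.
 * The written bits are stored so a later read returns what the guest
 * wrote. The return value is the set of features that are enabled on
 * the PF but cleared by the guest; these cannot be turned off in the
 * shared PF capability, so something else (for example the IOMMU)
 * would have to block the corresponding requests.
 */
uint16_t vpasid_ctrl_write(struct vpasid *v, uint16_t val)
{
    v->virt_ctrl = (uint16_t)(val & PASID_CTRL_MASK);
    return (uint16_t)(v->pf_ctrl & ~v->virt_ctrl & PASID_CTRL_MASK);
}
```

Whether the caller of such a helper lives in the kernel or in a userspace
hypervisor is exactly the policy question raised in this thread; the sketch
takes no position on it.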
