From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andrew Cooper Subject: Re: [PATCH] VT-d/ATS: correct and clean up dev_invalidate_iotlb() Date: Tue, 24 Jun 2014 17:09:43 +0100 Message-ID: <53A9A2C7.6010708@citrix.com> References: <53A9A68C020000780001CDD4@mail.emea.novell.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============2284531592336607604==" Return-path: Received: from mail6.bemta4.messagelabs.com ([85.158.143.247]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1WzTI4-0007fh-JN for xen-devel@lists.xenproject.org; Tue, 24 Jun 2014 16:09:48 +0000 In-Reply-To: <53A9A68C020000780001CDD4@mail.emea.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich , xen-devel Cc: Yang Z Zhang List-Id: xen-devel@lists.xenproject.org --===============2284531592336607604== Content-Type: multipart/alternative; boundary="------------090303060001030508000000" --------------090303060001030508000000 Content-Type: text/plain; charset="ISO-8859-1" Content-Transfer-Encoding: 7bit On 24/06/14 15:25, Jan Beulich wrote: > While this was intended to only do cleanup (replace the two bogus > "ret |= " constructs, and a simple formatting correction), this now > also > - fixes the bit manipulations for size_order > 0 > a) correct an off-by-one in the use of size_order for shifting (till > now double the requested size got invalidated) > b) in fact setting bit 12 and up if necessary (without which too > small a region might have got invalidated) > c) making them capable of dealing with regions of 4Gb size and up > - corrects the return value handling, such that a later iteration's > success won't clear an earlier iteration's error indication > - uses PCI_BDF2() instead of open coding it > - bail immediately on bad passed in invalidation type, rather than > repeatedly printing the same message for each ATS-capable device, at > once also no longer hiding that failure from the caller > > Signed-off-by: Jan Beulich > --- > Note that despite not having ATS capable hardware and hence not being > able to actually test the changes, I'm still certain changed the code > gets closer to what the spec mandates than the original one. > > --- a/xen/drivers/passthrough/vtd/x86/ats.c > +++ b/xen/drivers/passthrough/vtd/x86/ats.c > @@ -110,21 +110,23 @@ int dev_invalidate_iotlb(struct iommu *i > u64 addr, unsigned int size_order, u64 type) > { > struct pci_ats_dev *pdev; > - int sbit, ret = 0; > - u16 sid; > + int ret = 0; > > if ( !ecap_dev_iotlb(iommu->ecap) ) > return ret; > > list_for_each_entry( pdev, &ats_devices, list ) > { > - sid = (pdev->bus << 8) | pdev->devfn; > + u16 sid = PCI_BDF2(pdev->bus, pdev->devfn); > + bool_t sbit; > + int rc = 0; > > /* Only invalidate devices that belong to this IOMMU */ > if ( pdev->iommu != iommu ) > continue; > > - switch ( type ) { > + switch ( type ) > + { > case DMA_TLB_DSI_FLUSH: > if ( !device_in_domain(iommu, pdev, did) ) > break; > @@ -133,32 +135,37 @@ int dev_invalidate_iotlb(struct iommu *i > /* invalidate all translations: sbit=1,bit_63=0,bit[62:12]=1 */ > sbit = 1; > addr = (~0 << PAGE_SHIFT_4K) & 0x7FFFFFFFFFFFFFFF; > - ret |= qinval_device_iotlb(iommu, pdev->ats_queue_depth, > - sid, sbit, addr); > + rc = qinval_device_iotlb(iommu, pdev->ats_queue_depth, > + sid, sbit, addr); > break; > case DMA_TLB_PSI_FLUSH: > if ( !device_in_domain(iommu, pdev, did) ) > break; > > - addr &= ~0 << (PAGE_SHIFT + size_order); > - > /* if size <= 4K, set sbit = 0, else set sbit = 1 */ > sbit = size_order ? 1 : 0; > > /* clear lower bits */ > - addr &= (~0 << (PAGE_SHIFT + size_order)); > + addr &= ~0 << PAGE_SHIFT_4K; Doesn't this need to be ~0ULL as addr is u64? ~Andrew > > /* if sbit == 1, zero out size_order bit and set lower bits to 1 */ > if ( sbit ) > - addr &= (~0 & ~(1 << (PAGE_SHIFT + size_order))); > + { > + addr &= ~((u64)PAGE_SIZE_4K << (size_order - 1)); > + addr |= (((u64)1 << (size_order - 1)) - 1) << PAGE_SHIFT_4K; > + } > > - ret |= qinval_device_iotlb(iommu, pdev->ats_queue_depth, > - sid, sbit, addr); > + rc = qinval_device_iotlb(iommu, pdev->ats_queue_depth, > + sid, sbit, addr); > break; > default: > dprintk(XENLOG_WARNING VTDPREFIX, "invalid vt-d flush type\n"); > - break; > + return -EOPNOTSUPP; > } > + > + if ( !ret ) > + ret = rc; > } > + > return ret; > } > > > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel --------------090303060001030508000000 Content-Type: text/html; charset="ISO-8859-1" Content-Transfer-Encoding: 7bit
On 24/06/14 15:25, Jan Beulich wrote:
While this was intended to only do cleanup (replace the two bogus
"ret |= " constructs, and a simple formatting correction), this now
also
- fixes the bit manipulations for size_order > 0
  a) correct an off-by-one in the use of size_order for shifting (till
     now double the requested size got invalidated)
  b) in fact setting bit 12 and up if necessary (without which too
     small a region might have got invalidated)
  c) making them capable of dealing with regions of 4Gb size and up
- corrects the return value handling, such that a later iteration's
  success won't clear an earlier iteration's error indication
- uses PCI_BDF2() instead of open coding it
- bail immediately on bad passed in invalidation type, rather than
  repeatedly printing the same message for each ATS-capable device, at
  once also no longer hiding that failure from the caller

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
Note that despite not having ATS capable hardware and hence not being
able to actually test the changes, I'm still certain changed the code
gets closer to what the spec mandates than the original one.

--- a/xen/drivers/passthrough/vtd/x86/ats.c
+++ b/xen/drivers/passthrough/vtd/x86/ats.c
@@ -110,21 +110,23 @@ int dev_invalidate_iotlb(struct iommu *i
     u64 addr, unsigned int size_order, u64 type)
 {
     struct pci_ats_dev *pdev;
-    int sbit, ret = 0;
-    u16 sid;
+    int ret = 0;
 
     if ( !ecap_dev_iotlb(iommu->ecap) )
         return ret;
 
     list_for_each_entry( pdev, &ats_devices, list )
     {
-        sid = (pdev->bus << 8) | pdev->devfn;
+        u16 sid = PCI_BDF2(pdev->bus, pdev->devfn);
+        bool_t sbit;
+        int rc = 0;
 
         /* Only invalidate devices that belong to this IOMMU */
         if ( pdev->iommu != iommu )
             continue;
 
-        switch ( type ) {
+        switch ( type )
+        {
         case DMA_TLB_DSI_FLUSH:
             if ( !device_in_domain(iommu, pdev, did) )
                 break;
@@ -133,32 +135,37 @@ int dev_invalidate_iotlb(struct iommu *i
             /* invalidate all translations: sbit=1,bit_63=0,bit[62:12]=1 */
             sbit = 1;
             addr = (~0 << PAGE_SHIFT_4K) & 0x7FFFFFFFFFFFFFFF;
-            ret |= qinval_device_iotlb(iommu, pdev->ats_queue_depth,
-                                       sid, sbit, addr);
+            rc = qinval_device_iotlb(iommu, pdev->ats_queue_depth,
+                                     sid, sbit, addr);
             break;
         case DMA_TLB_PSI_FLUSH:
             if ( !device_in_domain(iommu, pdev, did) )
                 break;
 
-            addr &= ~0 << (PAGE_SHIFT + size_order);
-
             /* if size <= 4K, set sbit = 0, else set sbit = 1 */
             sbit = size_order ? 1 : 0;
 
             /* clear lower bits */
-            addr &= (~0 << (PAGE_SHIFT + size_order));
+            addr &= ~0 << PAGE_SHIFT_4K;

Doesn't this need to be ~0ULL as addr is u64?

~Andrew

 
             /* if sbit == 1, zero out size_order bit and set lower bits to 1 */
             if ( sbit )
-                addr &= (~0  & ~(1 << (PAGE_SHIFT + size_order)));
+            {
+                addr &= ~((u64)PAGE_SIZE_4K << (size_order - 1));
+                addr |= (((u64)1 << (size_order - 1)) - 1) << PAGE_SHIFT_4K;
+            }
 
-            ret |= qinval_device_iotlb(iommu, pdev->ats_queue_depth,
-                                       sid, sbit, addr);
+            rc = qinval_device_iotlb(iommu, pdev->ats_queue_depth,
+                                     sid, sbit, addr);
             break;
         default:
             dprintk(XENLOG_WARNING VTDPREFIX, "invalid vt-d flush type\n");
-            break;
+            return -EOPNOTSUPP;
         }
+
+        if ( !ret )
+            ret = rc;
     }
+
     return ret;
 }





_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

--------------090303060001030508000000-- --===============2284531592336607604== Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Disposition: inline _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel --===============2284531592336607604==--