xen-devel.lists.xenproject.org archive mirror
* [PATCH 0/3] x86: further P2M adjustments
@ 2015-09-21 13:57 Jan Beulich
  2015-09-21 14:02 ` [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates Jan Beulich
                   ` (4 more replies)
  0 siblings, 5 replies; 24+ messages in thread
From: Jan Beulich @ 2015-09-21 13:57 UTC (permalink / raw)
  To: xen-devel
  Cc: George Dunlap, Andrew Cooper, Kevin Tian, Keir Fraser,
	Jun Nakajima

1: p2m: tighten conditions of IOMMU mapping updates
2: p2m-pt: use pre-calculated IOMMU flags
3: p2m-ept: adjust some types in ept_set_entry()

Signed-off-by: Jan Beulich <jbeulich@suse.com>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates
  2015-09-21 13:57 [PATCH 0/3] x86: further P2M adjustments Jan Beulich
@ 2015-09-21 14:02 ` Jan Beulich
  2015-09-22 14:15   ` Jan Beulich
                     ` (3 more replies)
  2015-09-21 14:03 ` [PATCH 2/3] x86/p2m-pt: use pre-calculated IOMMU flags Jan Beulich
                   ` (3 subsequent siblings)
  4 siblings, 4 replies; 24+ messages in thread
From: Jan Beulich @ 2015-09-21 14:02 UTC (permalink / raw)
  To: xen-devel
  Cc: George Dunlap, Andrew Cooper, Kevin Tian, Keir Fraser,
	Jun Nakajima

[-- Attachment #1: Type: text/plain, Size: 8625 bytes --]

In the EPT case, permission changes should also result in updates or
TLB flushes.

In the NPT case, the old MFN does not depend on the new entry being
valid (but solely on the old one), and the need to update or TLB-flush
likewise depends on permission changes.

In the shadow mode case, iommu_hap_pt_share should be ignored.

Furthermore, in the NPT/shadow case, old intermediate page tables must
be freed only after the IOMMU-side updates/flushes have been carried
out.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
In addition to the fixes here it looks to me as if both EPT and
NPT/shadow code lack invalidation of IOMMU side paging structure
caches, i.e. further changes may be needed. Am I overlooking something?

--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -668,6 +668,7 @@ ept_set_entry(struct p2m_domain *p2m, un
     uint8_t ipat = 0;
     int need_modify_vtd_table = 1;
     int vtd_pte_present = 0;
+    unsigned int iommu_flags = p2m_get_iommu_flags(p2mt);
     enum { sync_off, sync_on, sync_check } needs_sync = sync_check;
     ept_entry_t old_entry = { .epte = 0 };
     ept_entry_t new_entry = { .epte = 0 };
@@ -798,8 +799,9 @@ ept_set_entry(struct p2m_domain *p2m, un
         new_entry.mfn = mfn_x(mfn);
 
         /* Safe to read-then-write because we hold the p2m lock */
-        if ( ept_entry->mfn == new_entry.mfn )
-             need_modify_vtd_table = 0;
+        if ( ept_entry->mfn == new_entry.mfn &&
+             p2m_get_iommu_flags(ept_entry->sa_p2mt) == iommu_flags )
+            need_modify_vtd_table = 0;
 
         ept_p2m_type_to_flags(p2m, &new_entry, p2mt, p2ma);
     }
@@ -830,11 +832,9 @@ out:
             iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);
         else
         {
-            unsigned int flags = p2m_get_iommu_flags(p2mt);
-
-            if ( flags != 0 )
+            if ( iommu_flags )
                 for ( i = 0; i < (1 << order); i++ )
-                    iommu_map_page(d, gfn + i, mfn_x(mfn) + i, flags);
+                    iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
             else
                 for ( i = 0; i < (1 << order); i++ )
                     iommu_unmap_page(d, gfn + i);
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -488,12 +488,13 @@ p2m_pt_set_entry(struct p2m_domain *p2m,
     void *table;
     unsigned long i, gfn_remainder = gfn;
     l1_pgentry_t *p2m_entry;
-    l1_pgentry_t entry_content;
+    l1_pgentry_t entry_content, old_entry = l1e_empty();
     l2_pgentry_t l2e_content;
     l3_pgentry_t l3e_content;
     int rc;
     unsigned int iommu_pte_flags = p2m_get_iommu_flags(p2mt);
-    unsigned long old_mfn = 0;
+    unsigned int old_flags = 0;
+    unsigned long old_mfn = INVALID_MFN;
 
     ASSERT(sve != 0);
 
@@ -538,17 +539,20 @@ p2m_pt_set_entry(struct p2m_domain *p2m,
      */
     if ( page_order == PAGE_ORDER_1G )
     {
-        l1_pgentry_t old_entry = l1e_empty();
         p2m_entry = p2m_find_entry(table, &gfn_remainder, gfn,
                                    L3_PAGETABLE_SHIFT - PAGE_SHIFT,
                                    L3_PAGETABLE_ENTRIES);
         ASSERT(p2m_entry);
-        if ( (l1e_get_flags(*p2m_entry) & _PAGE_PRESENT) &&
-             !(l1e_get_flags(*p2m_entry) & _PAGE_PSE) )
+        old_flags = l1e_get_flags(*p2m_entry);
+        if ( old_flags & _PAGE_PRESENT )
         {
-            /* We're replacing a non-SP page with a superpage.  Make sure to
-             * handle freeing the table properly. */
-            old_entry = *p2m_entry;
+            if ( old_flags & _PAGE_PSE )
+                old_mfn = l1e_get_pfn(*p2m_entry);
+            else
+            {
+                old_entry = *p2m_entry;
+                old_flags = ~0U;
+            }
         }
 
         ASSERT(!mfn_valid(mfn) || p2mt != p2m_mmio_direct);
@@ -559,17 +563,10 @@ p2m_pt_set_entry(struct p2m_domain *p2m,
         entry_content.l1 = l3e_content.l3;
 
         if ( entry_content.l1 != 0 )
-        {
             p2m_add_iommu_flags(&entry_content, 0, iommu_pte_flags);
-            old_mfn = l1e_get_pfn(*p2m_entry);
-        }
 
         p2m->write_p2m_entry(p2m, gfn, p2m_entry, entry_content, 3);
         /* NB: paging_write_p2m_entry() handles tlb flushes properly */
-
-        /* Free old intermediate tables if necessary */
-        if ( l1e_get_flags(old_entry) & _PAGE_PRESENT )
-            p2m_free_entry(p2m, &old_entry, page_order);
     }
     else 
     {
@@ -591,7 +588,9 @@ p2m_pt_set_entry(struct p2m_domain *p2m,
         p2m_entry = p2m_find_entry(table, &gfn_remainder, gfn,
                                    0, L1_PAGETABLE_ENTRIES);
         ASSERT(p2m_entry);
-        
+        old_flags = l1e_get_flags(*p2m_entry);
+        old_mfn = l1e_get_pfn(*p2m_entry);
+
         if ( mfn_valid(mfn) || (p2mt == p2m_mmio_direct)
                             || p2m_is_paging(p2mt) )
             entry_content = p2m_l1e_from_pfn(mfn_x(mfn),
@@ -600,29 +599,28 @@ p2m_pt_set_entry(struct p2m_domain *p2m,
             entry_content = l1e_empty();
 
         if ( entry_content.l1 != 0 )
-        {
             p2m_add_iommu_flags(&entry_content, 0, iommu_pte_flags);
-            old_mfn = l1e_get_pfn(*p2m_entry);
-        }
+
         /* level 1 entry */
         p2m->write_p2m_entry(p2m, gfn, p2m_entry, entry_content, 1);
         /* NB: paging_write_p2m_entry() handles tlb flushes properly */
     }
     else if ( page_order == PAGE_ORDER_2M )
     {
-        l1_pgentry_t old_entry = l1e_empty();
         p2m_entry = p2m_find_entry(table, &gfn_remainder, gfn,
                                    L2_PAGETABLE_SHIFT - PAGE_SHIFT,
                                    L2_PAGETABLE_ENTRIES);
         ASSERT(p2m_entry);
-        
-        /* FIXME: Deal with 4k replaced by 2meg pages */
-        if ( (l1e_get_flags(*p2m_entry) & _PAGE_PRESENT) &&
-             !(l1e_get_flags(*p2m_entry) & _PAGE_PSE) )
-        {
-            /* We're replacing a non-SP page with a superpage.  Make sure to
-             * handle freeing the table properly. */
-            old_entry = *p2m_entry;
+        old_flags = l1e_get_flags(*p2m_entry);
+        if ( old_flags & _PAGE_PRESENT )
+        {
+            if ( old_flags & _PAGE_PSE )
+                old_mfn = l1e_get_pfn(*p2m_entry);
+            else
+            {
+                old_entry = *p2m_entry;
+                old_flags = ~0U;
+            }
         }
         
         ASSERT(!mfn_valid(mfn) || p2mt != p2m_mmio_direct);
@@ -636,17 +634,10 @@ p2m_pt_set_entry(struct p2m_domain *p2m,
         entry_content.l1 = l2e_content.l2;
 
         if ( entry_content.l1 != 0 )
-        {
             p2m_add_iommu_flags(&entry_content, 0, iommu_pte_flags);
-            old_mfn = l1e_get_pfn(*p2m_entry);
-        }
 
         p2m->write_p2m_entry(p2m, gfn, p2m_entry, entry_content, 2);
         /* NB: paging_write_p2m_entry() handles tlb flushes properly */
-
-        /* Free old intermediate tables if necessary */
-        if ( l1e_get_flags(old_entry) & _PAGE_PRESENT )
-            p2m_free_entry(p2m, &old_entry, page_order);
     }
 
     /* Track the highest gfn for which we have ever had a valid mapping */
@@ -654,11 +645,14 @@ p2m_pt_set_entry(struct p2m_domain *p2m,
          && (gfn + (1UL << page_order) - 1 > p2m->max_mapped_pfn) )
         p2m->max_mapped_pfn = gfn + (1UL << page_order) - 1;
 
-    if ( iommu_enabled && need_iommu(p2m->domain) )
+    if ( iommu_enabled && need_iommu(p2m->domain) &&
+         (p2m_get_iommu_flags(p2m_flags_to_type(old_flags)) !=
+          iommu_pte_flags ||
+          old_mfn != mfn_x(mfn)) )
     {
-        if ( iommu_hap_pt_share )
+        if ( iommu_use_hap_pt(p2m->domain) )
         {
-            if ( old_mfn && (old_mfn != mfn_x(mfn)) )
+            if ( old_flags & _PAGE_PRESENT )
                 amd_iommu_flush_pages(p2m->domain, gfn, page_order);
         }
         else
@@ -674,6 +668,14 @@ p2m_pt_set_entry(struct p2m_domain *p2m,
         }
     }
 
+    /*
+     * Free old intermediate tables if necessary.  This has to be the
+     * last thing we do, after removal from the IOMMU tables, so as to
+     * avoid a potential use-after-free.
+     */
+    if ( l1e_get_flags(old_entry) & _PAGE_PRESENT )
+        p2m_free_entry(p2m, &old_entry, page_order);
+
  out:
     unmap_domain_page(table);
     return rc;



[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


* [PATCH 2/3] x86/p2m-pt: use pre-calculated IOMMU flags
  2015-09-21 13:57 [PATCH 0/3] x86: further P2M adjustments Jan Beulich
  2015-09-21 14:02 ` [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates Jan Beulich
@ 2015-09-21 14:03 ` Jan Beulich
  2015-09-28 16:42   ` George Dunlap
  2015-10-01 16:31   ` Wei Liu
  2015-09-21 14:03 ` [PATCH 3/3] x86/p2m-ept: adjust some types in ept_set_entry() Jan Beulich
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 24+ messages in thread
From: Jan Beulich @ 2015-09-21 14:03 UTC (permalink / raw)
  To: xen-devel; +Cc: George Dunlap, Andrew Cooper, Keir Fraser

[-- Attachment #1: Type: text/plain, Size: 1249 bytes --]

... instead of recalculating them.

While at it, clean up the formatting of the touched code and drop a
stray loop-local variable that shadowed a function-scope one.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -655,17 +655,13 @@ p2m_pt_set_entry(struct p2m_domain *p2m,
             if ( old_flags & _PAGE_PRESENT )
                 amd_iommu_flush_pages(p2m->domain, gfn, page_order);
         }
+        else if ( iommu_pte_flags )
+            for ( i = 0; i < (1UL << page_order); i++ )
+                iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i,
+                               iommu_pte_flags);
         else
-        {
-            unsigned int flags = p2m_get_iommu_flags(p2mt);
-
-            if ( flags != 0 )
-                for ( i = 0; i < (1UL << page_order); i++ )
-                    iommu_map_page(p2m->domain, gfn+i, mfn_x(mfn)+i, flags);
-            else
-                for ( int i = 0; i < (1UL << page_order); i++ )
-                    iommu_unmap_page(p2m->domain, gfn+i);
-        }
+            for ( i = 0; i < (1UL << page_order); i++ )
+                iommu_unmap_page(p2m->domain, gfn + i);
     }
 
     /*






* [PATCH 3/3] x86/p2m-ept: adjust some types in ept_set_entry()
  2015-09-21 13:57 [PATCH 0/3] x86: further P2M adjustments Jan Beulich
  2015-09-21 14:02 ` [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates Jan Beulich
  2015-09-21 14:03 ` [PATCH 2/3] x86/p2m-pt: use pre-calculated IOMMU flags Jan Beulich
@ 2015-09-21 14:03 ` Jan Beulich
  2015-09-28  8:56   ` Tian, Kevin
  2015-09-28 16:44   ` George Dunlap
  2015-09-25  6:58 ` [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates Jan Beulich
  2015-09-28 11:33 ` [PATCH 0/3] x86: further P2M adjustments Jan Beulich
  4 siblings, 2 replies; 24+ messages in thread
From: Jan Beulich @ 2015-09-21 14:03 UTC (permalink / raw)
  To: xen-devel
  Cc: George Dunlap, Andrew Cooper, Kevin Tian, Keir Fraser,
	Jun Nakajima

[-- Attachment #1: Type: text/plain, Size: 840 bytes --]

Use unsigned and bool_t as appropriate.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -662,12 +662,12 @@ ept_set_entry(struct p2m_domain *p2m, un
 {
     ept_entry_t *table, *ept_entry = NULL;
     unsigned long gfn_remainder = gfn;
-    int i, target = order / EPT_TABLE_ORDER;
+    unsigned int i, target = order / EPT_TABLE_ORDER;
     int ret, rc = 0;
     bool_t direct_mmio = (p2mt == p2m_mmio_direct);
     uint8_t ipat = 0;
-    int need_modify_vtd_table = 1;
-    int vtd_pte_present = 0;
+    bool_t need_modify_vtd_table = 1;
+    bool_t vtd_pte_present = 0;
     unsigned int iommu_flags = p2m_get_iommu_flags(p2mt);
     enum { sync_off, sync_on, sync_check } needs_sync = sync_check;
     ept_entry_t old_entry = { .epte = 0 };






* Re: [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates
  2015-09-21 14:02 ` [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates Jan Beulich
@ 2015-09-22 14:15   ` Jan Beulich
  2015-09-23 14:00     ` Wei Liu
  2015-09-28  8:55   ` Tian, Kevin
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2015-09-22 14:15 UTC (permalink / raw)
  To: Wei Liu
  Cc: Kevin Tian, Keir Fraser, George Dunlap, Andrew Cooper,
	Jun Nakajima, xen-devel

>>> On 21.09.15 at 16:02, <JBeulich@suse.com> wrote:
> In the EPT case permission changes should also result in updates or
> TLB flushes.
> 
> In the NPT case the old MFN does not depend on the new entry being
> valid (but solely on the old one), and the need to update or TLB-flush
> again also depends on permission changes.
> 
> In the shadow mode case, iommu_hap_pt_share should be ignored.
> 
> Furthermore in the NPT/shadow case old intermediate page tables must
> be freed only after IOMMU side updates/flushes have got carried out.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Wei,

I'm sorry, I forgot to Cc you on the patch and state that I think this
should be considered for 4.6.

Jan

> @@ -674,6 +668,14 @@ p2m_pt_set_entry(struct p2m_domain *p2m,
>          }
>      }
>  
> +    /*
> +     * Free old intermediate tables if necessary.  This has to be the
> +     * last thing we do, after removal from the IOMMU tables, so as to
> +     * avoid a potential use-after-free.
> +     */
> +    if ( l1e_get_flags(old_entry) & _PAGE_PRESENT )
> +        p2m_free_entry(p2m, &old_entry, page_order);
> +
>   out:
>      unmap_domain_page(table);
>      return rc;
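
[Editorial note: the tightened condition in the hunk above — only touch the IOMMU when the IOMMU-relevant permissions or the backing MFN actually changed — can be condensed into a small stand-alone model. This is a sketch with made-up stand-ins, not the real Xen helpers: `old_iommu_flags` stands in for `p2m_get_iommu_flags(p2m_flags_to_type(old_flags))`.]

```c
#include <stdbool.h>

#define INVALID_MFN (~0UL)

/*
 * Simplified model of the tightened check: an IOMMU-side mapping
 * update (or TLB flush) is needed only when the IOMMU-relevant
 * permission flags or the backing MFN actually changed.
 */
bool iommu_update_needed(unsigned int old_iommu_flags,
                         unsigned long old_mfn,
                         unsigned int new_iommu_flags,
                         unsigned long new_mfn)
{
    return old_iommu_flags != new_iommu_flags || old_mfn != new_mfn;
}
```

With this model, re-writing an entry with identical MFN and permissions is a no-op on the IOMMU side, while a permission-only change (the case the patch starts catching) or an MFN change triggers an update.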

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates
  2015-09-22 14:15   ` Jan Beulich
@ 2015-09-23 14:00     ` Wei Liu
  2015-09-29 10:47       ` Jan Beulich
  0 siblings, 1 reply; 24+ messages in thread
From: Wei Liu @ 2015-09-23 14:00 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, Wei Liu, George Dunlap, Andrew Cooper, Jun Nakajima,
	xen-devel, Keir Fraser

On Tue, Sep 22, 2015 at 08:15:39AM -0600, Jan Beulich wrote:
> >>> On 21.09.15 at 16:02, <JBeulich@suse.com> wrote:
> > In the EPT case permission changes should also result in updates or
> > TLB flushes.
> > 
> > In the NPT case the old MFN does not depend on the new entry being
> > valid (but solely on the old one), and the need to update or TLB-flush
> > again also depends on permission changes.
> > 
> > In the shadow mode case, iommu_hap_pt_share should be ignored.
> > 
> > Furthermore in the NPT/shadow case old intermediate page tables must
> > be freed only after IOMMU side updates/flushes have got carried out.
> > 
> > Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Wei,
> 
> I'm sorry, I forgot to Cc you on the patch and state that I think this
> should be considered for 4.6.
> 

While I think having this issue fixed for 4.6 would be nice, I would
like to ...

> Jan
> 
> > ---
> > In addition to the fixes here it looks to me as if both EPT and
> > NPT/shadow code lack invalidation of IOMMU side paging structure
> > caches, i.e. further changes may be needed. Am I overlooking something?

have this question answered. Having a "fix" that doesn't actually fix
the issue is not very useful...

Wei.

* Re: [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates
  2015-09-21 13:57 [PATCH 0/3] x86: further P2M adjustments Jan Beulich
                   ` (2 preceding siblings ...)
  2015-09-21 14:03 ` [PATCH 3/3] x86/p2m-ept: adjust some types in ept_set_entry() Jan Beulich
@ 2015-09-25  6:58 ` Jan Beulich
  2015-09-28 11:33 ` [PATCH 0/3] x86: further P2M adjustments Jan Beulich
  4 siblings, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2015-09-25  6:58 UTC (permalink / raw)
  To: Aravind Gopalakrishnan, suravee.suthikulpanit, Kevin Tian,
	Yang Z Zhang
  Cc: Keir Fraser, George Dunlap, Andrew Cooper, Jun Nakajima,
	xen-devel, Wei Liu

>>> On 21.09.15 at 16:02,  wrote:
> In the EPT case permission changes should also result in updates or
> TLB flushes.
> 
> In the NPT case the old MFN does not depend on the new entry being
> valid (but solely on the old one), and the need to update or TLB-flush
> again also depends on permission changes.
> 
> In the shadow mode case, iommu_hap_pt_share should be ignored.
> 
> Furthermore in the NPT/shadow case old intermediate page tables must
> be freed only after IOMMU side updates/flushes have got carried out.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> In addition to the fixes here it looks to me as if both EPT and
> NPT/shadow code lack invalidation of IOMMU side paging structure
> caches, i.e. further changes may be needed. Am I overlooking something?

Vendor-specific IOMMU maintainers,

not having Cc-ed you since the change doesn't explicitly touch
relevant code, I think I should make it explicit that we're expecting
input from you on this: Do current IOMMU implementations not do
any paging structure caching? Or else, what am I overlooking that
makes things work despite the apparently missing flushes? I'm
willing to try to do the necessary work, but we do need your input
also in the light of determining whether what is done here is enough
for current hardware (and hence for 4.6), with the remaining issue
just being a (for now) theoretical one.

Thanks, Jan

* Re: [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates
  2015-09-21 14:02 ` [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates Jan Beulich
  2015-09-22 14:15   ` Jan Beulich
@ 2015-09-28  8:55   ` Tian, Kevin
  2015-09-28  9:06     ` Jan Beulich
  2015-09-28 16:32   ` George Dunlap
  2015-09-29  7:58   ` Jan Beulich
  3 siblings, 1 reply; 24+ messages in thread
From: Tian, Kevin @ 2015-09-28  8:55 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: George Dunlap, Andrew Cooper, Keir Fraser, Nakajima, Jun

> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Monday, September 21, 2015 10:03 PM
> 
> In the EPT case permission changes should also result in updates or
> TLB flushes.
> 
> In the NPT case the old MFN does not depend on the new entry being
> valid (but solely on the old one), and the need to update or TLB-flush
> again also depends on permission changes.
> 
> In the shadow mode case, iommu_hap_pt_share should be ignored.
> 
> Furthermore in the NPT/shadow case old intermediate page tables must
> be freed only after IOMMU side updates/flushes have got carried out.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Kevin Tian <kevin.tian@intel.com>

Thanks
Kevin

* Re: [PATCH 3/3] x86/p2m-ept: adjust some types in ept_set_entry()
  2015-09-21 14:03 ` [PATCH 3/3] x86/p2m-ept: adjust some types in ept_set_entry() Jan Beulich
@ 2015-09-28  8:56   ` Tian, Kevin
  2015-09-28 16:44   ` George Dunlap
  1 sibling, 0 replies; 24+ messages in thread
From: Tian, Kevin @ 2015-09-28  8:56 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: George Dunlap, Andrew Cooper, Keir Fraser, Nakajima, Jun

> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Monday, September 21, 2015 10:04 PM
> 
> Use unsigned and bool_t as appropriate.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Kevin Tian <kevin.tian@intel.com>

* Re: [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates
  2015-09-28  8:55   ` Tian, Kevin
@ 2015-09-28  9:06     ` Jan Beulich
  2015-09-28 15:12       ` Jan Beulich
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2015-09-28  9:06 UTC (permalink / raw)
  To: Kevin Tian
  Cc: George Dunlap, Andrew Cooper, Keir Fraser, Jun Nakajima,
	xen-devel

>>> On 28.09.15 at 10:55, <kevin.tian@intel.com> wrote:
>>  From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Monday, September 21, 2015 10:03 PM
>> 
>> In the EPT case permission changes should also result in updates or
>> TLB flushes.
>> 
>> In the NPT case the old MFN does not depend on the new entry being
>> valid (but solely on the old one), and the need to update or TLB-flush
>> again also depends on permission changes.
>> 
>> In the shadow mode case, iommu_hap_pt_share should be ignored.
>> 
>> Furthermore in the NPT/shadow case old intermediate page tables must
>> be freed only after IOMMU side updates/flushes have got carried out.
>> 
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Acked-by: Kevin Tian <kevin.tian@intel.com>

Thanks, but quite a bit more important would have been a reply
to this

>>In addition to the fixes here it looks to me as if both EPT and
>>NPT/shadow code lack invalidation of IOMMU side paging structure
>>caches, i.e. further changes may be needed. Am I overlooking something?

as it may require the patch to be extended (and hence the ack to
be dropped again).

Thanks, Jan

* Re: [PATCH 0/3] x86: further P2M adjustments
  2015-09-21 13:57 [PATCH 0/3] x86: further P2M adjustments Jan Beulich
                   ` (3 preceding siblings ...)
  2015-09-25  6:58 ` [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates Jan Beulich
@ 2015-09-28 11:33 ` Jan Beulich
  4 siblings, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2015-09-28 11:33 UTC (permalink / raw)
  To: George Dunlap
  Cc: Andrew Cooper, Kevin Tian, Keir Fraser, Jun Nakajima, xen-devel

>>> On 21.09.15 at 15:57, <JBeulich@suse.com> wrote:
> 1: p2m: tighten conditions of IOMMU mapping updates
> 2: p2m-pt: use pre-calculated IOMMU flags
> 3: p2m-ept: adjust some types in ept_set_entry()
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

George,

any chance I could get an ack or otherwise on this series? Despite
the open question raised in patch 1 I think rather than rev-ing that
patch we should follow up with another one, should further
adjustments indeed turn out necessary.

Thanks, Jan

* Re: [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates
  2015-09-28  9:06     ` Jan Beulich
@ 2015-09-28 15:12       ` Jan Beulich
  0 siblings, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2015-09-28 15:12 UTC (permalink / raw)
  To: Kevin Tian, Yang Z Zhang
  Cc: George Dunlap, Andrew Cooper, Keir Fraser, Jun Nakajima,
	xen-devel

>>> On 28.09.15 at 11:06, <JBeulich@suse.com> wrote:
>>>> On 28.09.15 at 10:55, <kevin.tian@intel.com> wrote:
>> Acked-by: Kevin Tian <kevin.tian@intel.com>
> 
> Thanks, but quite a bit more important would have been a reply
> to this
> 
>>>In addition to the fixes here it looks to me as if both EPT and
>>>NPT/shadow code lack invalidation of IOMMU side paging structure
>>>caches, i.e. further changes may be needed. Am I overlooking something?
> 
> as it may require the patch to be extended (and hence the ack to
> be dropped again).

And btw here is a first (incomplete as you can see) take at
addressing the issue (Yang, I realize I forgot to Cc you on the
original submission):

TBD: iommu_pte_flush() includes a cache flush for the passed in PTE, which
     would need to be carried out for all entries which got updated and
     which the IOMMU may have used (i.e. ept_invalidate_emt_range() is okay,
     but resolve_misconfig() and ept_set_entry() aren't).

--- unstable.orig/xen/arch/x86/mm/p2m-ept.c
+++ unstable/xen/arch/x86/mm/p2m-ept.c
@@ -452,6 +452,10 @@ static int ept_invalidate_emt_range(stru
         wrc = atomic_write_ept_entry(&table[index], split_ept_entry, i);
         ASSERT(wrc == 0);
 
+        if ( iommu_hap_pt_share && need_iommu(p2m->domain) )
+            iommu_pte_flush(p2m->domain, first_gfn, &table[index].epte,
+                            i * EPT_TABLE_ORDER, 1);
+
         for ( ; i > target; --i )
             if ( !ept_next_level(p2m, 1, &table, &gfn_remainder, i) )
                 break;
@@ -496,9 +500,9 @@ static int ept_invalidate_emt_range(stru
 static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
 {
     struct ept_data *ept = &p2m->ept;
-    unsigned int level = ept_get_wl(ept);
+    unsigned int level = ept_get_wl(ept), flush_order = 0;
     unsigned long mfn = ept_get_asr(ept);
-    ept_entry_t *epte;
+    ept_entry_t *epte, *flush_epte = NULL;
     int wrc, rc = 0;
 
     if ( !mfn )
@@ -575,6 +579,11 @@ static int resolve_misconfig(struct p2m_
                 {
                     if ( ept_split_super_page(p2m, &e, level, level - 1) )
                     {
+                        if ( !flush_epte )
+                        {
+                            flush_epte = &epte[i];
+                            flush_order = level * EPT_TABLE_ORDER;
+                        }
                         wrc = atomic_write_ept_entry(&epte[i], e, level);
                         ASSERT(wrc == 0);
                         unmap_domain_page(epte);
@@ -619,6 +628,10 @@ static int resolve_misconfig(struct p2m_
     }
 
     unmap_domain_page(epte);
+
+    if ( iommu_hap_pt_share && flush_epte && need_iommu(p2m->domain) )
+        iommu_pte_flush(p2m->domain, gfn, &flush_epte->epte, flush_order, 1);
+
     if ( rc )
     {
         struct vcpu *v;
@@ -668,7 +681,7 @@ ept_set_entry(struct p2m_domain *p2m, un
     uint8_t ipat = 0;
     bool_t need_modify_vtd_table = 1;
     bool_t vtd_pte_present = 0;
-    unsigned int iommu_flags = p2m_get_iommu_flags(p2mt);
+    unsigned int iommu_flags = p2m_get_iommu_flags(p2mt), flush_order = 0;
     enum { sync_off, sync_on, sync_check } needs_sync = sync_check;
     ept_entry_t old_entry = { .epte = 0 };
     ept_entry_t new_entry = { .epte = 0 };
@@ -763,6 +776,8 @@ ept_set_entry(struct p2m_domain *p2m, un
             goto out;
         }
 
+        flush_order = i * EPT_TABLE_ORDER;
+
         /* now install the newly split ept sub-tree */
         /* NB: please make sure domian is paused and no in-fly VT-d DMA. */
         rc = atomic_write_ept_entry(ept_entry, split_ept_entry, i);
@@ -800,7 +815,8 @@ ept_set_entry(struct p2m_domain *p2m, un
 
         /* Safe to read-then-write because we hold the p2m lock */
         if ( ept_entry->mfn == new_entry.mfn &&
-             p2m_get_iommu_flags(ept_entry->sa_p2mt) == iommu_flags )
+             p2m_get_iommu_flags(ept_entry->sa_p2mt) == iommu_flags &&
+             (!iommu_hap_pt_share || flush_order <= order) )
             need_modify_vtd_table = 0;
 
         ept_p2m_type_to_flags(p2m, &new_entry, p2mt, p2ma);
@@ -829,7 +845,14 @@ out:
          need_modify_vtd_table )
     {
         if ( iommu_hap_pt_share )
-            iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);
+        {
+            if ( flush_order > order )
+                vtd_pte_present = 1;
+            else
+                flush_order = order;
+            iommu_pte_flush(d, gfn, &ept_entry->epte, flush_order,
+                            vtd_pte_present);
+        }
         else
         {
             if ( iommu_flags )

Jan

* Re: [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates
  2015-09-21 14:02 ` [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates Jan Beulich
  2015-09-22 14:15   ` Jan Beulich
  2015-09-28  8:55   ` Tian, Kevin
@ 2015-09-28 16:32   ` George Dunlap
  2015-09-28 16:40     ` George Dunlap
  2015-09-29  7:34     ` Jan Beulich
  2015-09-29  7:58   ` Jan Beulich
  3 siblings, 2 replies; 24+ messages in thread
From: George Dunlap @ 2015-09-28 16:32 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: George Dunlap, Andrew Cooper, Kevin Tian, Keir Fraser,
	Jun Nakajima

On 21/09/15 15:02, Jan Beulich wrote:
> In the EPT case permission changes should also result in updates or
> TLB flushes.
> 
> In the NPT case the old MFN does not depend on the new entry being
> valid (but solely on the old one), and the need to update or TLB-flush
> again also depends on permission changes.
> 
> In the shadow mode case, iommu_hap_pt_share should be ignored.
> 
> Furthermore in the NPT/shadow case old intermediate page tables must
> be freed only after IOMMU side updates/flushes have got carried out.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

So first of all, having all these changes in the same patch almost
certainly slowed down the review process a bit.

> ---
> In addition to the fixes here it looks to me as if both EPT and
> NPT/shadow code lack invalidation of IOMMU side paging structure
> caches, i.e. further changes may be needed. Am I overlooking something?
> 
> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -668,6 +668,7 @@ ept_set_entry(struct p2m_domain *p2m, un
>      uint8_t ipat = 0;
>      int need_modify_vtd_table = 1;
>      int vtd_pte_present = 0;
> +    unsigned int iommu_flags = p2m_get_iommu_flags(p2mt);
>      enum { sync_off, sync_on, sync_check } needs_sync = sync_check;
>      ept_entry_t old_entry = { .epte = 0 };
>      ept_entry_t new_entry = { .epte = 0 };
> @@ -798,8 +799,9 @@ ept_set_entry(struct p2m_domain *p2m, un
>          new_entry.mfn = mfn_x(mfn);
>  
>          /* Safe to read-then-write because we hold the p2m lock */
> -        if ( ept_entry->mfn == new_entry.mfn )
> -             need_modify_vtd_table = 0;
> +        if ( ept_entry->mfn == new_entry.mfn &&
> +             p2m_get_iommu_flags(ept_entry->sa_p2mt) == iommu_flags )
> +            need_modify_vtd_table = 0;
>  
>          ept_p2m_type_to_flags(p2m, &new_entry, p2mt, p2ma);
>      }
> @@ -830,11 +832,9 @@ out:
>              iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);
>          else
>          {
> -            unsigned int flags = p2m_get_iommu_flags(p2mt);
> -
> -            if ( flags != 0 )
> +            if ( iommu_flags )
>                  for ( i = 0; i < (1 << order); i++ )
> -                    iommu_map_page(d, gfn + i, mfn_x(mfn) + i, flags);
> +                    iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
>              else
>                  for ( i = 0; i < (1 << order); i++ )
>                      iommu_unmap_page(d, gfn + i);

EPT changes:

Reviewed-by: George Dunlap <george.dunlap@citrix.com>

And now...

> --- a/xen/arch/x86/mm/p2m-pt.c
> +++ b/xen/arch/x86/mm/p2m-pt.c
> @@ -488,12 +488,13 @@ p2m_pt_set_entry(struct p2m_domain *p2m,
>      void *table;
>      unsigned long i, gfn_remainder = gfn;
>      l1_pgentry_t *p2m_entry;
> -    l1_pgentry_t entry_content;
> +    l1_pgentry_t entry_content, old_entry = l1e_empty();
>      l2_pgentry_t l2e_content;
>      l3_pgentry_t l3e_content;
>      int rc;
>      unsigned int iommu_pte_flags = p2m_get_iommu_flags(p2mt);
> -    unsigned long old_mfn = 0;
> +    unsigned int old_flags = 0;
> +    unsigned long old_mfn = INVALID_MFN;

I think this could use at least a comment, and perhaps some better
variable naming here explaining what setting or not setting each of
these different variables means.

Trying to sort out what the effect of setting old_flags to ~0 is
particularly convoluted.

Inferring from the code:

1) Setting old_entry() to some value will cause those values to be
unmapped.  This will only be set for 1G and 2M entries, if the existing
entry is both present and not set as a superpage.  This is pretty
straightforward and looks correct.

2) old_mfn and old_flags are primarily used to control whether an IOMMU
flush happens.

3) old_mfn
 - old_mfn begins at INVALID_MFN, and is set only if the entry itself is
leaf entry.
 - a flush will happen if old_mfn != mfn
 - The effect of this will be to force a flush if replacing a leaf with
a leaf but a different mfn, or if replacing an intermediate table with a
leaf *if* the leaf is not INVALID_MFN

4) old_flags
 - old_flags will be the actual flags if replacing a leaf with a leaf,
or ~0 if replacing an intermediate table with a leaf.
 - a flush will happen if
p2m_get_iommu_flags(p2m_flags_to_type(old_flags)) != iommu_pte_flags.
 - p2m_flags_to_type
   - returns p2m_invalid if flags == 0.  This should never happen here
   - returns the type bits from the flags otherwise.
   - So in the case of old_flags being the actual flags, you get the
actual type of the old entry
   - in the case of old_flags being ~0, you get 0x7f, which is undefined
 - p2m_get_iommu_flags() will return 0 for unknown types.
 - so the effect of the above will be to force a flush if replacing a
leaf with a leaf of different iommu permisions; or, to force a flush if
replacing an intermediate table with any permissions with a leaf that
has at least some iommu permissions.

Did I get the logic there right?

It *looks* to me, therefore, like this won't handle flushes properly in
the case of replacing an intermediate table with an empty leaf entry:
- old_mfn will be INVALID_MFN, matching mfn
- the result of the function call around old_flags will be 0, matching
iommu_pte_flags for the new entry (0)
- so no flush will happen
- but because you're removing permissions, a flush actually should happen.

Did I miss something?

 -George
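
[Editorial note: the scenario described above — an intermediate table replaced by an empty leaf producing no flush — can be reproduced with a toy model of the decision. The helpers below are simplified stand-ins for `p2m_flags_to_type()` / `p2m_get_iommu_flags()`, with an invented flag encoding, so treat this as a sketch rather than the actual Xen logic.]

```c
#include <stdbool.h>

#define INVALID_MFN (~0UL)

/*
 * Toy stand-in: old_flags == ~0 is the "replaced an intermediate
 * table" marker; it maps to an undefined p2m type, for which no IOMMU
 * permissions are reported.  For leaf entries, pretend the low two
 * bits are the IOMMU read/write permissions.
 */
unsigned int model_old_iommu_flags(unsigned int old_flags)
{
    if ( old_flags == ~0U )
        return 0;              /* undefined type -> no IOMMU permissions */
    return old_flags & 3;
}

/* The flush condition as inferred from the patch. */
bool model_flush_decision(unsigned int old_flags, unsigned long old_mfn,
                          unsigned int new_iommu_flags,
                          unsigned long new_mfn)
{
    return model_old_iommu_flags(old_flags) != new_iommu_flags ||
           old_mfn != new_mfn;
}
```

With these stand-ins, replacing an intermediate table with an empty leaf gives old_flags == ~0, old_mfn == INVALID_MFN, new_iommu_flags == 0 and new_mfn == INVALID_MFN, so model_flush_decision() returns false — i.e. no flush, matching the gap described above.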

* Re: [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates
  2015-09-28 16:32   ` George Dunlap
@ 2015-09-28 16:40     ` George Dunlap
  2015-09-29  7:34     ` Jan Beulich
  1 sibling, 0 replies; 24+ messages in thread
From: George Dunlap @ 2015-09-28 16:40 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: George Dunlap, Andrew Cooper, Kevin Tian, Keir Fraser,
	Jun Nakajima

On 28/09/15 17:32, George Dunlap wrote:
> On 21/09/15 15:02, Jan Beulich wrote:
>> In the EPT case permission changes should also result in updates or
>> TLB flushes.
>>
>> In the NPT case the old MFN does not depend on the new entry being
>> valid (but solely on the old one), and the need to update or TLB-flush
>> again also depends on permission changes.
>>
>> In the shadow mode case, iommu_hap_pt_share should be ignored.
>>
>> Furthermore in the NPT/shadow case old intermediate page tables must
>> be freed only after IOMMU side updates/flushes have got carried out.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> So first of all, having all these changes in the same patch almost
> certainly slowed down the review process a bit.
> 
>> ---
>> In addition to the fixes here it looks to me as if both EPT and
>> NPT/shadow code lack invalidation of IOMMU side paging structure
>> caches, i.e. further changes may be needed. Am I overlooking something?
>>
>> --- a/xen/arch/x86/mm/p2m-ept.c
>> +++ b/xen/arch/x86/mm/p2m-ept.c
>> @@ -668,6 +668,7 @@ ept_set_entry(struct p2m_domain *p2m, un
>>      uint8_t ipat = 0;
>>      int need_modify_vtd_table = 1;
>>      int vtd_pte_present = 0;
>> +    unsigned int iommu_flags = p2m_get_iommu_flags(p2mt);
>>      enum { sync_off, sync_on, sync_check } needs_sync = sync_check;
>>      ept_entry_t old_entry = { .epte = 0 };
>>      ept_entry_t new_entry = { .epte = 0 };
>> @@ -798,8 +799,9 @@ ept_set_entry(struct p2m_domain *p2m, un
>>          new_entry.mfn = mfn_x(mfn);
>>  
>>          /* Safe to read-then-write because we hold the p2m lock */
>> -        if ( ept_entry->mfn == new_entry.mfn )
>> -             need_modify_vtd_table = 0;
>> +        if ( ept_entry->mfn == new_entry.mfn &&
>> +             p2m_get_iommu_flags(ept_entry->sa_p2mt) == iommu_flags )
>> +            need_modify_vtd_table = 0;
>>  
>>          ept_p2m_type_to_flags(p2m, &new_entry, p2mt, p2ma);
>>      }
>> @@ -830,11 +832,9 @@ out:
>>              iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);
>>          else
>>          {
>> -            unsigned int flags = p2m_get_iommu_flags(p2mt);
>> -
>> -            if ( flags != 0 )
>> +            if ( iommu_flags )
>>                  for ( i = 0; i < (1 << order); i++ )
>> -                    iommu_map_page(d, gfn + i, mfn_x(mfn) + i, flags);
>> +                    iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
>>              else
>>                  for ( i = 0; i < (1 << order); i++ )
>>                      iommu_unmap_page(d, gfn + i);
> 
> EPT changes:
> 
> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
> 
> And now...
> 
>> --- a/xen/arch/x86/mm/p2m-pt.c
>> +++ b/xen/arch/x86/mm/p2m-pt.c
>> @@ -488,12 +488,13 @@ p2m_pt_set_entry(struct p2m_domain *p2m,
>>      void *table;
>>      unsigned long i, gfn_remainder = gfn;
>>      l1_pgentry_t *p2m_entry;
>> -    l1_pgentry_t entry_content;
>> +    l1_pgentry_t entry_content, old_entry = l1e_empty();
>>      l2_pgentry_t l2e_content;
>>      l3_pgentry_t l3e_content;
>>      int rc;
>>      unsigned int iommu_pte_flags = p2m_get_iommu_flags(p2mt);
>> -    unsigned long old_mfn = 0;
>> +    unsigned int old_flags = 0;
>> +    unsigned long old_mfn = INVALID_MFN;
> 
> I think this could use at least a comment, and perhaps some better
> variable naming here explaining what setting or not setting each of
> these different variables means.
> 
> Trying to sort out what the effect of setting old_flags to ~0 is
> particularly convoluted.
> 
> Inferring from the code:
> 
> 1) Setting old_entry() to some value will cause those values to be
> unmapped.  This will only be set for 1G and 2M entries, if the existing
> entry is both present and not set as a superpage.  This is pretty
> straightforward and looks correct.
> 
> 2) old_mfn and old_flags are primarily used to control whether an IOMMU
> flush happens.
> 
> 3) old_mfn
>  - old_mfn begins at INVALID_MFN, and is set only if the entry itself is
> leaf entry.
>  - a flush will happen if old_mfn != mfn
>  - The effect of this will be to force a flush if replacing a leaf with
> a leaf but a different mfn, or if replacing an intermediate table with a
> leaf *if* the leaf is not INVALID_MFN
> 
> 4) old_flags
>  - old_flags will be the actual flags if replacing a leaf with a leaf,
> or ~0 if replacing an intermediate table with a leaf.
>  - a flush will happen if
> p2m_get_iommu_flags(p2m_flags_to_type(old_flags)) != iommu_pte_flags.
>  - p2m_flags_to_type
>    - returns p2m_invalid if flags == 0.  This should never happen here
>    - returns the type bits from the flags otherwise.
>    - So in the case of old_flags being the actual flags, you get the
> actual type of the old entry
>    - in the case of old_flags being ~0, you get 0x7f, which is undefined
>  - p2m_get_iommu_flags() will return 0 for unknown types.
>  - so the effect of the above will be to force a flush if replacing a
> leaf with a leaf of different iommu permissions; or, to force a flush if
> replacing an intermediate table with any permissions with a leaf that
> has at least some iommu permissions.
> 
> Did I get the logic there right?
> 
> It *looks* to me, therefore, like this won't handle flushes properly in
> the case of replacing an intermediate table with an empty leaf entry:
> - old_mfn will be INVALID_MFN, matching mfn
> - the result of the function call around old_flags will be 0, matching
> iommu_pte_flags for the new entry (0)
> - so no flush will happen
> - but because you're removing permissions, a flush actually should happen.
> 
> Did I miss something?

Regardless whether I did or not, this code is very difficult to reason
about, even after you've figured out what old_mfn and old_flags are
meant to be used for.  It seems like maintaining a boolean like
"needs_flush" would be clearer and easier to reason about.

I might rename "old_entry" to "intermediate_table" as well, with a
comment saying "/* Intermediate table to free if we're replacing it with
a superpage */" or something like that.

 -George
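
[Editorial note: a sketch of the restructuring suggested above — an assumption about the intent, not code from any posted patch. The idea is to record the old entry's state explicitly and derive a needs_flush boolean from it, which also covers replacing an intermediate table with an empty leaf.]

```c
#include <stdbool.h>

#define INVALID_MFN (~0UL)

/* Explicit record of what the entry looked like before the write. */
struct old_state {
    bool present;              /* old entry had _PAGE_PRESENT */
    bool was_table;            /* old entry pointed at an intermediate table */
    unsigned long mfn;         /* old leaf MFN, INVALID_MFN otherwise */
    unsigned int iommu_flags;  /* IOMMU permissions of the old leaf, else 0 */
};

bool needs_flush(const struct old_state *old,
                 unsigned long new_mfn, unsigned int new_iommu_flags)
{
    if ( !old->present )
        /* Nothing was mapped: flush only if a mapping is being added. */
        return new_iommu_flags != 0;
    if ( old->was_table )
        /*
         * An intermediate table went away: always flush, even when the
         * replacement is an empty leaf (permissions were removed).
         */
        return true;
    /* Leaf replaced by leaf: flush on MFN or permission change. */
    return old->mfn != new_mfn || old->iommu_flags != new_iommu_flags;
}
```

The explicit was_table case is what the old_flags = ~0 trick encodes implicitly, but here the intermediate-table-to-empty-leaf replacement unambiguously yields a flush.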

* Re: [PATCH 2/3] x86/p2m-pt: use pre-calculated IOMMU flags
  2015-09-21 14:03 ` [PATCH 2/3] x86/p2m-pt: use pre-calculated IOMMU flags Jan Beulich
@ 2015-09-28 16:42   ` George Dunlap
  2015-10-01 16:31   ` Wei Liu
  1 sibling, 0 replies; 24+ messages in thread
From: George Dunlap @ 2015-09-28 16:42 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: George Dunlap, Andrew Cooper, Keir Fraser

On 21/09/15 15:03, Jan Beulich wrote:
> ... instead of recalculating them.
> 
> At once clean up formatting of the touched code and drop a stray loop
> local variable shadowing a function scope one.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: George Dunlap <george.dunlap@citrix.com>

* Re: [PATCH 3/3] x86/p2m-ept: adjust some types in ept_set_entry()
  2015-09-21 14:03 ` [PATCH 3/3] x86/p2m-ept: adjust some types in ept_set_entry() Jan Beulich
  2015-09-28  8:56   ` Tian, Kevin
@ 2015-09-28 16:44   ` George Dunlap
  1 sibling, 0 replies; 24+ messages in thread
From: George Dunlap @ 2015-09-28 16:44 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: George Dunlap, Andrew Cooper, Kevin Tian, Keir Fraser,
	Jun Nakajima

On 21/09/15 15:03, Jan Beulich wrote:
> Use unsigned and bool_t as appropriate.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: George Dunlap <george.dunlap@citrix.com>

* Re: [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates
  2015-09-28 16:32   ` George Dunlap
  2015-09-28 16:40     ` George Dunlap
@ 2015-09-29  7:34     ` Jan Beulich
  1 sibling, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2015-09-29  7:34 UTC (permalink / raw)
  To: George Dunlap
  Cc: Kevin Tian, Keir Fraser, George Dunlap, Andrew Cooper,
	Jun Nakajima, xen-devel

>>> On 28.09.15 at 18:32, <george.dunlap@citrix.com> wrote:
> On 21/09/15 15:02, Jan Beulich wrote:
>> In the EPT case permission changes should also result in updates or
>> TLB flushes.
>> 
>> In the NPT case the old MFN does not depend on the new entry being
>> valid (but solely on the old one), and the need to update or TLB-flush
>> again also depends on permission changes.
>> 
>> In the shadow mode case, iommu_hap_pt_share should be ignored.
>> 
>> Furthermore in the NPT/shadow case old intermediate page tables must
>> be freed only after IOMMU side updates/flushes have got carried out.
>> 
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> So first of all, having all these changes in the same patch almost
> certainly slowed down the review process a bit.

Okay, I'll try to split things up ...

>> --- a/xen/arch/x86/mm/p2m-ept.c
>> +++ b/xen/arch/x86/mm/p2m-ept.c
>> @@ -668,6 +668,7 @@ ept_set_entry(struct p2m_domain *p2m, un
>>      uint8_t ipat = 0;
>>      int need_modify_vtd_table = 1;
>>      int vtd_pte_present = 0;
>> +    unsigned int iommu_flags = p2m_get_iommu_flags(p2mt);
>>      enum { sync_off, sync_on, sync_check } needs_sync = sync_check;
>>      ept_entry_t old_entry = { .epte = 0 };
>>      ept_entry_t new_entry = { .epte = 0 };
>> @@ -798,8 +799,9 @@ ept_set_entry(struct p2m_domain *p2m, un
>>          new_entry.mfn = mfn_x(mfn);
>>  
>>          /* Safe to read-then-write because we hold the p2m lock */
>> -        if ( ept_entry->mfn == new_entry.mfn )
>> -             need_modify_vtd_table = 0;
>> +        if ( ept_entry->mfn == new_entry.mfn &&
>> +             p2m_get_iommu_flags(ept_entry->sa_p2mt) == iommu_flags )
>> +            need_modify_vtd_table = 0;
>>  
>>          ept_p2m_type_to_flags(p2m, &new_entry, p2mt, p2ma);
>>      }
>> @@ -830,11 +832,9 @@ out:
>>              iommu_pte_flush(d, gfn, &ept_entry->epte, order, vtd_pte_present);
>>          else
>>          {
>> -            unsigned int flags = p2m_get_iommu_flags(p2mt);
>> -
>> -            if ( flags != 0 )
>> +            if ( iommu_flags )
>>                  for ( i = 0; i < (1 << order); i++ )
>> -                    iommu_map_page(d, gfn + i, mfn_x(mfn) + i, flags);
>> +                    iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
>>              else
>>                  for ( i = 0; i < (1 << order); i++ )
>>                      iommu_unmap_page(d, gfn + i);
> 
> EPT changes:
> 
> Reviewed-by: George Dunlap <george.dunlap@citrix.com>

... and commit the EPT side.

>> --- a/xen/arch/x86/mm/p2m-pt.c
>> +++ b/xen/arch/x86/mm/p2m-pt.c
>> @@ -488,12 +488,13 @@ p2m_pt_set_entry(struct p2m_domain *p2m,
>>      void *table;
>>      unsigned long i, gfn_remainder = gfn;
>>      l1_pgentry_t *p2m_entry;
>> -    l1_pgentry_t entry_content;
>> +    l1_pgentry_t entry_content, old_entry = l1e_empty();
>>      l2_pgentry_t l2e_content;
>>      l3_pgentry_t l3e_content;
>>      int rc;
>>      unsigned int iommu_pte_flags = p2m_get_iommu_flags(p2mt);
>> -    unsigned long old_mfn = 0;
>> +    unsigned int old_flags = 0;
>> +    unsigned long old_mfn = INVALID_MFN;
> 
> I think this could use at least a comment, and perhaps some better
> variable naming here explaining what setting or not setting each of
> these different variables means.
> 
> Trying to sort out what the effect of setting old_flags to ~0 is
> particularly convoluted.
> 
> Inferring from the code:
> 
> 1) Setting old_entry to some value will cause those values to be
> unmapped.  This will only be set for 1G and 2M entries, if the existing
> entry is both present and not set as a superpage.  This is pretty
> straightforward and looks correct.
> 
> 2) old_mfn and old_flags are primarily used to control whether an IOMMU
> flush happens.
> 
> 3) old_mfn
>  - old_mfn begins at INVALID_MFN, and is set only if the entry itself is
> leaf entry.
>  - a flush will happen if old_mfn != mfn
>  - The effect of this will be to force a flush if replacing a leaf with
> a leaf but a different mfn, or if replacing an intermediate table with a
> leaf *if* the leaf is not INVALID_MFN
> 
> 4) old_flags
>  - old_flags will be the actual flags if replacing a leaf with a leaf,
> or ~0 if replacing an intermediate table with a leaf.
>  - a flush will happen if
> p2m_get_iommu_flags(p2m_flags_to_type(old_flags)) != iommu_pte_flags.
>  - p2m_flags_to_type
>    - returns p2m_invalid if flags == 0.  This should never happen here
>    - returns the type bits from the flags otherwise.
>    - So in the case of old_flags being the actual flags, you get the
> actual type of the old entry
>    - in the case of old_flags being ~0, you get 0x7f, which is undefined
>  - p2m_get_iommu_flags() will return 0 for unknown types.
>  - so the effect of the above will be to force a flush if replacing a
> leaf with a leaf of different iommu permissions; or, to force a flush if
> replacing an intermediate table with any permissions with a leaf that
> has at least some iommu permissions.
> 
> Did I get the logic there right?
> 
> It *looks* to me, therefore, like this won't handle flushes properly in
> the case of replacing an intermediate table with an empty leaf entry:
> - old_mfn will be INVALID_MFN, matching mfn
> - the result of the function call around old_flags will be 0, matching
> iommu_pte_flags for the new entry (0)
> - so no flush will happen
> - but because you're removing permissions, a flush actually should happen.
> 
> Did I miss something?

I think you're right; using p2m_flags_to_type(~0) was not a good
idea (and I simply didn't spot that implication of changing the
initializer from 0 to ~0, as using 0 would have other issues).

But, with things getting quite complicated here, I wonder whether
we shouldn't consider an easier route as alternative: There's
currently no use for all the IOMMU side interaction in p2m-pt.c
(since on NPT iommu_hap_pt_share gets forced to false), and
hence rather than trying to get dead code right we could as well
rip it out. Which would then leave only the page table freeing fix
to be carried.

Jan


* Re: [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates
  2015-09-21 14:02 ` [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates Jan Beulich
                     ` (2 preceding siblings ...)
  2015-09-28 16:32   ` George Dunlap
@ 2015-09-29  7:58   ` Jan Beulich
  2015-09-30  0:02     ` Jiang, Yunhong
  3 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2015-09-29  7:58 UTC (permalink / raw)
  To: xen-devel
  Cc: George Dunlap, Andrew Cooper, Kevin Tian, Keir Fraser,
	Jun Nakajima

>>> On 21.09.15 at 16:02, <JBeulich@suse.com> wrote:
> In the EPT case permission changes should also result in updates or
> TLB flushes.
> 
> In the NPT case the old MFN does not depend on the new entry being
> valid (but solely on the old one), and the need to update or TLB-flush
> again also depends on permission changes.
> 
> In the shadow mode case, iommu_hap_pt_share should be ignored.
> 
> Furthermore in the NPT/shadow case old intermediate page tables must
> be freed only after IOMMU side updates/flushes have got carried out.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> In addition to the fixes here it looks to me as if both EPT and
> NPT/shadow code lack invalidation of IOMMU side paging structure
> caches, i.e. further changes may be needed. Am I overlooking something?

Okay, I think I finally figured it myself (with the help of the partial
patch presented yesterday): There is no need for more flushing
in the current model, where we only ever split large pages or fully
replace sub-hierarchies with large pages (accompanied by a suitable
flush already). More flushing would only be needed if we were to
add code to re-establish large pages if an intermediate page table
re-gained a state where this would be possible (quite desirable
in a situation where log-dirty mode got temporarily enabled on a
domain, but I'm not sure how common an operation that would be).
When splitting large pages, all the hardware can have cached are
- leaf entries the size of the page being split
- leaf entries of smaller size sub-pages (in case large pages don't
  get entered into the TLB)
- intermediate entries from above the leaf entry being split (which
  don't get changed)
Leaf entries already get flushed correctly, and since the involved
intermediate entries don't get changed, no flush is needed there.

Jan


* Re: [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates
  2015-09-23 14:00     ` Wei Liu
@ 2015-09-29 10:47       ` Jan Beulich
  2015-09-29 11:28         ` Wei Liu
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2015-09-29 10:47 UTC (permalink / raw)
  To: Wei Liu
  Cc: Kevin Tian, Keir Fraser, George Dunlap, Andrew Cooper,
	Jun Nakajima, xen-devel

>>> On 23.09.15 at 16:00, <wei.liu2@citrix.com> wrote:
> On Tue, Sep 22, 2015 at 08:15:39AM -0600, Jan Beulich wrote:
>> >>> On 21.09.15 at 16:02, <JBeulich@suse.com> wrote:
>> > In the EPT case permission changes should also result in updates or
>> > TLB flushes.
>> > 
>> > In the NPT case the old MFN does not depend on the new entry being
>> > valid (but solely on the old one), and the need to update or TLB-flush
>> > again also depends on permission changes.
>> > 
>> > In the shadow mode case, iommu_hap_pt_share should be ignored.
>> > 
>> > Furthermore in the NPT/shadow case old intermediate page tables must
>> > be freed only after IOMMU side updates/flushes have got carried out.
>> > 
>> > Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> 
>> Wei,
>> 
>> I'm sorry, I forgot to Cc you on the patch and state that I think this
>> should be considered for 4.6.
> 
> While I think having this issue fixed for 4.6 would be nice,  I would
> like to ...
> 
>> > ---
>> > In addition to the fixes here it looks to me as if both EPT and
>> > NPT/shadow code lack invalidation of IOMMU side paging structure
>> > caches, i.e. further changes may be needed. Am I overlooking something?
> 
> have this question answered. Having a "fix" that doesn't actually fix
> the issue is not very useful...

So I suppose you saw my reasoning of there not actually being any
further flushing necessary right now. Following the discussion with
George I split off and applied the EPT side of this, which I'd like to
see go into 4.6 too. The NPT/shadow side will need a little more work
(including further splitting the original patch), partly depending on
a decision on whether to keep the dead page table sharing code in
p2m-pt.c.

I did already split out the intermediate page table freeing patch.
The only other bug is the incomplete checking of iommu_hap_pt_share
in shadow mode, everything else would be cleanup only no matter
what route we go. The only thing is that this broken check would
go away altogether if we decided to drop the pt-share logic, but I
suppose for backporting purposes it would be worthwhile to have
that fix separated regardless of further direction.

Jan


* Re: [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates
  2015-09-29 10:47       ` Jan Beulich
@ 2015-09-29 11:28         ` Wei Liu
  0 siblings, 0 replies; 24+ messages in thread
From: Wei Liu @ 2015-09-29 11:28 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, Wei Liu, George Dunlap, Andrew Cooper, Jun Nakajima,
	xen-devel, Keir Fraser

On Tue, Sep 29, 2015 at 04:47:00AM -0600, Jan Beulich wrote:
> >>> On 23.09.15 at 16:00, <wei.liu2@citrix.com> wrote:
> > On Tue, Sep 22, 2015 at 08:15:39AM -0600, Jan Beulich wrote:
> >> >>> On 21.09.15 at 16:02, <JBeulich@suse.com> wrote:
> >> > In the EPT case permission changes should also result in updates or
> >> > TLB flushes.
> >> > 
> >> > In the NPT case the old MFN does not depend on the new entry being
> >> > valid (but solely on the old one), and the need to update or TLB-flush
> >> > again also depends on permission changes.
> >> > 
> >> > In the shadow mode case, iommu_hap_pt_share should be ignored.
> >> > 
> >> > Furthermore in the NPT/shadow case old intermediate page tables must
> >> > be freed only after IOMMU side updates/flushes have got carried out.
> >> > 
> >> > Signed-off-by: Jan Beulich <jbeulich@suse.com>
> >> 
> >> Wei,
> >> 
> >> I'm sorry, I forgot to Cc you on the patch and state that I think this
> >> should be considered for 4.6.
> > 
> > While I think having this issue fixed for 4.6 would be nice,  I would
> > like to ...
> > 
> >> > ---
> >> > In addition to the fixes here it looks to me as if both EPT and
> >> > NPT/shadow code lack invalidation of IOMMU side paging structure
> >> > caches, i.e. further changes may be needed. Am I overlooking something?
> > 
> > have this question answered. Having a "fix" that doesn't actually fix
> > the issue is not very useful...
> 
> So I suppose you saw my reasoning of there not actually being any
> further flushing necessary right now. Following the discussion with

Yes, I saw it. I'm convinced.

> George I split off and applied the EPT side of this, which I'd like to
> see go into 4.6 too. The NPT/shadow side will need a little more work

EPT side:

Release-acked-by: Wei Liu <wei.liu2@citrix.com>

> (including further splitting the original patch), partly depending on
> a decision on whether to keep the dead page table sharing code in
> p2m-pt.c.
> 
> I did already split out the intermediate page table freeing patch.
> The only other bug is the incomplete checking of iommu_hap_pt_share
> in shadow mode, everything else would be cleanup only no matter
> what route we go. The only thing is that this broken check would
> go away altogether if we decided to drop the pt-share logic, but I
> suppose for backporting purposes it would be worthwhile to have
> that fix separated regardless of further direction.
> 

I will reply separately to your other two patches.

Wei.

> Jan


* Re: [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates
  2015-09-29  7:58   ` Jan Beulich
@ 2015-09-30  0:02     ` Jiang, Yunhong
  2015-09-30  9:58       ` Jan Beulich
  0 siblings, 1 reply; 24+ messages in thread
From: Jiang, Yunhong @ 2015-09-30  0:02 UTC (permalink / raw)
  To: Jan Beulich, xen-devel
  Cc: George Dunlap, Andrew Cooper, Tian, Kevin, Keir Fraser,
	Nakajima, Jun



> -----Original Message-----
> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> bounces@lists.xen.org] On Behalf Of Jan Beulich
> Sent: Tuesday, September 29, 2015 12:59 AM
> To: xen-devel
> Cc: George Dunlap; Andrew Cooper; Tian, Kevin; Keir Fraser; Nakajima, Jun
> Subject: Re: [Xen-devel] [PATCH 1/3] x86/p2m: tighten conditions of IOMMU
> mapping updates
> 
> >>> On 21.09.15 at 16:02, <JBeulich@suse.com> wrote:
> > In the EPT case permission changes should also result in updates or
> > TLB flushes.
> >
> > In the NPT case the old MFN does not depend on the new entry being
> > valid (but solely on the old one), and the need to update or TLB-flush
> > again also depends on permission changes.
> >
> > In the shadow mode case, iommu_hap_pt_share should be ignored.
> >
> > Furthermore in the NPT/shadow case old intermediate page tables must
> > be freed only after IOMMU side updates/flushes have got carried out.
> >
> > Signed-off-by: Jan Beulich <jbeulich@suse.com>
> > ---
> > In addition to the fixes here it looks to me as if both EPT and
> > NPT/shadow code lack invalidation of IOMMU side paging structure
> > caches, i.e. further changes may be needed. Am I overlooking something?
> 
> Okay, I think I finally figured it myself (with the help of the partial
> patch presented yesterday): There is no need for more flushing
> in the current model, where we only ever split large pages or fully
> replace sub-hierarchies with large pages (accompanied by a suitable
> flush already). More flushing would only be needed if we were to
> add code to re-establish large pages if an intermediate page table
> re-gained a state where this would be possible (quite desirable
> in a situation where log-dirty mode got temporarily enabled on a
> domain, but I'm not sure how common an operation that would be).
> When splitting large pages, all the hardware can have cached are
> - leaf entries the size of the page being split

I'm not sure if there is an offset-1 issue. 

Basically I think we depend on the "Invalidation Hint" being cleared so that all the paging structure caches are cleared. That should work if no split happens.
However, in the split situation, I suspect that the entry that was split is not covered by the ADDR/AM combination, since the AM, based on iommu_flush_iotlb_psi(), is in fact the "order" parameter for ept_set_entry(). I'm not sure if I missed anything.

* Address (ADDR): For page-selective-within-domain invalidations, the Address field indicates the
starting second-level page address of the mappings that needs to be invalidated. Hardware
ignores bits 127:(64+N), where N is the maximum guest address width (MGAW) supported. This
field is ignored by hardware for global and domain-selective invalidations.
* Address Mask (AM): For page-selective-within-domain invalidations, the Address Mask specifies
the number of contiguous second-level pages that needs to be invalidated. The encoding for the
AM field is same as the AM field encoding in the Invalidate Address Register (see
Section 10.4.8.2). When invalidating a large-page translation, software must use the appropriate
Address Mask value (0 for 4KByte page, 9 for 2-MByte page, and 18 for 1-GByte page).

Thanks
--jyh

> - leaf entries of smaller size sub-pages (in case large pages don't
>   get entered into the TLB)
> - intermediate entries from above the leaf entry being split (which
>   don't get changed)
> Leaf entries already get flushed correctly, and since the involved
> intermediate entries don't get changed, no flush is needed there.
> 
> Jan
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel


* Re: [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates
  2015-09-30  0:02     ` Jiang, Yunhong
@ 2015-09-30  9:58       ` Jan Beulich
  2015-10-02 16:37         ` Jiang, Yunhong
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2015-09-30  9:58 UTC (permalink / raw)
  To: Yunhong Jiang
  Cc: Kevin Tian, Keir Fraser, George Dunlap, Andrew Cooper,
	Jun Nakajima, xen-devel

>>> On 30.09.15 at 02:02, <yunhong.jiang@intel.com> wrote:

> 
>> -----Original Message-----
>> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
>> bounces@lists.xen.org] On Behalf Of Jan Beulich
>> Sent: Tuesday, September 29, 2015 12:59 AM
>> To: xen-devel
>> Cc: George Dunlap; Andrew Cooper; Tian, Kevin; Keir Fraser; Nakajima, Jun
>> Subject: Re: [Xen-devel] [PATCH 1/3] x86/p2m: tighten conditions of IOMMU
>> mapping updates
>> 
>> >>> On 21.09.15 at 16:02, <JBeulich@suse.com> wrote:
>> > In the EPT case permission changes should also result in updates or
>> > TLB flushes.
>> >
>> > In the NPT case the old MFN does not depend on the new entry being
>> > valid (but solely on the old one), and the need to update or TLB-flush
>> > again also depends on permission changes.
>> >
>> > In the shadow mode case, iommu_hap_pt_share should be ignored.
>> >
>> > Furthermore in the NPT/shadow case old intermediate page tables must
>> > be freed only after IOMMU side updates/flushes have got carried out.
>> >
>> > Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> > ---
>> > In addition to the fixes here it looks to me as if both EPT and
>> > NPT/shadow code lack invalidation of IOMMU side paging structure
>> > caches, i.e. further changes may be needed. Am I overlooking something?
>> 
>> Okay, I think I finally figured it myself (with the help of the partial
>> patch presented yesterday): There is no need for more flushing
>> in the current model, where we only ever split large pages or fully
>> replace sub-hierarchies with large pages (accompanied by a suitable
>> flush already). More flushing would only be needed if we were to
>> add code to re-establish large pages if an intermediate page table
>> re-gained a state where this would be possible (quite desirable
>> in a situation where log-dirty mode got temporarily enabled on a
>> domain, but I'm not sure how common an operation that would be).
>> When splitting large pages, all the hardware can have cached are
>> - leaf entries the size of the page being split
> 
> I'm not sure if there is an offset-1 issue. 

offset-1 ?

> Basically I think we depends on the "Invalidation Hint" being cleared so 
> that all the paging structure cache are cleared. That should work if there is 
> no split happen.
> However, in the split situation, I suspect that the entry that was split is 
> not covered by the ADDR/AM combination since the AM, based on  
> iommu_flush_iotlb_psi(), is in fact the "order" parameter for 
> ept_set_entry(). I'm not sure if I missed anything. 

IH is always clear in Xen right now. And the address covers both -
leaf entries as well as intermediate ones. Yet as explained before
we don't even care about intermediate ones in the split case.

Jan


* Re: [PATCH 2/3] x86/p2m-pt: use pre-calculated IOMMU flags
  2015-09-21 14:03 ` [PATCH 2/3] x86/p2m-pt: use pre-calculated IOMMU flags Jan Beulich
  2015-09-28 16:42   ` George Dunlap
@ 2015-10-01 16:31   ` Wei Liu
  1 sibling, 0 replies; 24+ messages in thread
From: Wei Liu @ 2015-10-01 16:31 UTC (permalink / raw)
  To: Jan Beulich
  Cc: George Dunlap, xen-devel, Keir Fraser, wei.liu2, Andrew Cooper

Andrew pointed me to this one and suggested it go into 4.6 because it's a
prerequisite for a bug fix patch.

On Mon, Sep 21, 2015 at 08:03:22AM -0600, Jan Beulich wrote:
> ... instead of recalculating them.
> 
> At once clean up formatting of the touched code and drop a stray loop
> local variable shadowing a function scope one.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 

Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>

> --- a/xen/arch/x86/mm/p2m-pt.c
> +++ b/xen/arch/x86/mm/p2m-pt.c
> @@ -655,17 +655,13 @@ p2m_pt_set_entry(struct p2m_domain *p2m,
>              if ( old_flags & _PAGE_PRESENT )
>                  amd_iommu_flush_pages(p2m->domain, gfn, page_order);
>          }
> +        else if ( iommu_pte_flags )
> +            for ( i = 0; i < (1UL << page_order); i++ )
> +                iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i,
> +                               iommu_pte_flags);
>          else
> -        {
> -            unsigned int flags = p2m_get_iommu_flags(p2mt);
> -
> -            if ( flags != 0 )
> -                for ( i = 0; i < (1UL << page_order); i++ )
> -                    iommu_map_page(p2m->domain, gfn+i, mfn_x(mfn)+i, flags);
> -            else
> -                for ( int i = 0; i < (1UL << page_order); i++ )
> -                    iommu_unmap_page(p2m->domain, gfn+i);
> -        }
> +            for ( i = 0; i < (1UL << page_order); i++ )
> +                iommu_unmap_page(p2m->domain, gfn + i);
>      }
>  
>      /*
> 
> 
> 


* Re: [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates
  2015-09-30  9:58       ` Jan Beulich
@ 2015-10-02 16:37         ` Jiang, Yunhong
  0 siblings, 0 replies; 24+ messages in thread
From: Jiang, Yunhong @ 2015-10-02 16:37 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, Keir Fraser, George Dunlap, Andrew Cooper,
	Nakajima, Jun, xen-devel



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Wednesday, September 30, 2015 2:58 AM
> To: Jiang, Yunhong
> Cc: Andrew Cooper; George Dunlap; Nakajima, Jun; Tian, Kevin; xen-devel;
> Keir Fraser
> Subject: RE: [Xen-devel] [PATCH 1/3] x86/p2m: tighten conditions of IOMMU
> mapping updates
> 
> >>> On 30.09.15 at 02:02, <yunhong.jiang@intel.com> wrote:
> 
> >
> >> -----Original Message-----
> >> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> >> bounces@lists.xen.org] On Behalf Of Jan Beulich
> >> Sent: Tuesday, September 29, 2015 12:59 AM
> >> To: xen-devel
> >> Cc: George Dunlap; Andrew Cooper; Tian, Kevin; Keir Fraser; Nakajima, Jun
> >> Subject: Re: [Xen-devel] [PATCH 1/3] x86/p2m: tighten conditions of IOMMU
> >> mapping updates
> >>
> >> >>> On 21.09.15 at 16:02, <JBeulich@suse.com> wrote:
> >> > In the EPT case permission changes should also result in updates or
> >> > TLB flushes.
> >> >
> >> > In the NPT case the old MFN does not depend on the new entry being
> >> > valid (but solely on the old one), and the need to update or TLB-flush
> >> > again also depends on permission changes.
> >> >
> >> > In the shadow mode case, iommu_hap_pt_share should be ignored.
> >> >
> >> > Furthermore in the NPT/shadow case old intermediate page tables must
> >> > be freed only after IOMMU side updates/flushes have got carried out.
> >> >
> >> > Signed-off-by: Jan Beulich <jbeulich@suse.com>
> >> > ---
> >> > In addition to the fixes here it looks to me as if both EPT and
> >> > NPT/shadow code lack invalidation of IOMMU side paging structure
> >> > caches, i.e. further changes may be needed. Am I overlooking something?
> >>
> >> Okay, I think I finally figured it myself (with the help of the partial
> >> patch presented yesterday): There is no need for more flushing
> >> in the current model, where we only ever split large pages or fully
> >> replace sub-hierarchies with large pages (accompanied by a suitable
> >> flush already). More flushing would only be needed if we were to
> >> add code to re-establish large pages if an intermediate page table
> >> re-gained a state where this would be possible (quite desirable
> >> in a situation where log-dirty mode got temporarily enabled on a
> >> domain, but I'm not sure how common an operation that would be).
> >> When splitting large pages, all the hardware can have cached are
> >> - leaf entries the size of the page being split
> >
> > I'm not sure if there is an offset-1 issue.
> 
> offset-1 ?
> 
> > Basically I think we depends on the "Invalidation Hint" being cleared so
> > that all the paging structure cache are cleared. That should work if there is
> > no split happen.
> > However, in the split situation, I suspect that the entry that was split is
> > not covered by the ADDR/AM combination since the AM, based on
> > iommu_flush_iotlb_psi(), is in fact the "order" parameter for
> > ept_set_entry(). I'm not sure if I missed anything.
> 
> IH is always clear in Xen right now. And the address covers both -
> leaf entries as well as intermediate ones. Yet as explained before
> we don't even care about intermediate ones in the split case.

Yes, that's it. So, as you stated before, there is no need for a paging structure flush.

--jyh

> 
> Jan


end of thread, other threads:[~2015-10-02 16:37 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-21 13:57 [PATCH 0/3] x86: further P2M adjustments Jan Beulich
2015-09-21 14:02 ` [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates Jan Beulich
2015-09-22 14:15   ` Jan Beulich
2015-09-23 14:00     ` Wei Liu
2015-09-29 10:47       ` Jan Beulich
2015-09-29 11:28         ` Wei Liu
2015-09-28  8:55   ` Tian, Kevin
2015-09-28  9:06     ` Jan Beulich
2015-09-28 15:12       ` Jan Beulich
2015-09-28 16:32   ` George Dunlap
2015-09-28 16:40     ` George Dunlap
2015-09-29  7:34     ` Jan Beulich
2015-09-29  7:58   ` Jan Beulich
2015-09-30  0:02     ` Jiang, Yunhong
2015-09-30  9:58       ` Jan Beulich
2015-10-02 16:37         ` Jiang, Yunhong
2015-09-21 14:03 ` [PATCH 2/3] x86/p2m-pt: use pre-calculated IOMMU flags Jan Beulich
2015-09-28 16:42   ` George Dunlap
2015-10-01 16:31   ` Wei Liu
2015-09-21 14:03 ` [PATCH 3/3] x86/p2m-ept: adjust some types in ept_set_entry() Jan Beulich
2015-09-28  8:56   ` Tian, Kevin
2015-09-28 16:44   ` George Dunlap
2015-09-25  6:58 ` [PATCH 1/3] x86/p2m: tighten conditions of IOMMU mapping updates Jan Beulich
2015-09-28 11:33 ` [PATCH 0/3] x86: further P2M adjustments Jan Beulich
