[PATCH v6 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.

xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed

From: Yu Zhang <yu.c.zhang@linux.intel.com>
To: xen-devel@lists.xen.org
Cc: Kevin Tian <kevin.tian@intel.com>,
	Jun Nakajima <jun.nakajima@intel.com>,
	George Dunlap <george.dunlap@eu.citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Paul Durrant <paul.durrant@citrix.com>,
	zhiyuan.lv@intel.com, Jan Beulich <jbeulich@suse.com>
Subject: [PATCH v6 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps.
Date: Fri,  2 Sep 2016 18:47:20 +0800	[thread overview]
Message-ID: <1472813240-11011-5-git-send-email-yu.c.zhang@linux.intel.com> (raw)
In-Reply-To: <1472813240-11011-1-git-send-email-yu.c.zhang@linux.intel.com>

This patch resets p2m_ioreq_server entries back to p2m_ram_rw,
after an ioreq server has unmapped. The resync is done both
asynchronously with the current p2m_change_entry_type_global()
interface, and synchronously by iterating the p2m table. The
synchronous resetting is necessary because we need to guarantee
the p2m table is clean before another ioreq server is mapped.
And since the sweeping of p2m table could be time consuming,
it is done with hypercall continuation. Asynchronous approach
is also taken so that p2m_ioreq_server entries can also be reset
when the HVM is scheduled between hypercall continuations.

This patch also disallows live migration, when there's still any
outstanding p2m_ioreq_server entry left. The core reason is our
current implementation of p2m_change_entry_type_global() can not
tell the state of p2m_ioreq_server entries(can not decide if an
entry is to be emulated or to be resynced).

Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
---
Cc: Paul Durrant <paul.durrant@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>

changes in v2: 
  - Move the calculation of ioreq server page entry_cout into 
    p2m_change_type_one() so that we do not need a seperate lock.
    Note: entry_count is also calculated in resolve_misconfig()/
    do_recalc(), fortunately callers of both routines have p2m
    lock protected already.
  - Simplify logic in hvmop_set_mem_type().
  - Introduce routine p2m_finish_type_change() to walk the p2m 
    table and do the p2m reset.
---
 xen/arch/x86/hvm/hvm.c    | 38 +++++++++++++++++++++++++++++++++---
 xen/arch/x86/mm/hap/hap.c |  9 +++++++++
 xen/arch/x86/mm/p2m-ept.c |  6 +++++-
 xen/arch/x86/mm/p2m-pt.c  | 10 ++++++++--
 xen/arch/x86/mm/p2m.c     | 49 +++++++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/p2m.h |  9 ++++++++-
 6 files changed, 114 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 9b419d2..200c661 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -5522,11 +5522,13 @@ static int hvmop_set_mem_type(
 }
 
 static int hvmop_map_mem_type_to_ioreq_server(
-    XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop)
+    XEN_GUEST_HANDLE_PARAM(xen_hvm_map_mem_type_to_ioreq_server_t) uop,
+    unsigned long *iter)
 {
     xen_hvm_map_mem_type_to_ioreq_server_t op;
     struct domain *d;
     int rc;
+    unsigned long gfn = *iter;
 
     if ( copy_from_guest(&op, uop, 1) )
         return -EFAULT;
@@ -5551,7 +5553,35 @@ static int hvmop_map_mem_type_to_ioreq_server(
     if ( rc != 0 )
         goto out;
 
-    rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, op.flags);
+    if ( gfn == 0 )
+        rc = hvm_map_mem_type_to_ioreq_server(d, op.id, op.type, op.flags);
+
+    /*
+     * Iterate p2m table when an ioreq server unmaps from p2m_ioreq_server,
+     * and reset the remaining p2m_ioreq_server entries back to p2m_ram_rw.
+     */
+    if ( op.flags == 0 && rc == 0 )
+    {
+        struct p2m_domain *p2m = p2m_get_hostp2m(d);
+
+        while ( read_atomic(&p2m->ioreq.entry_count) &&
+                gfn <= p2m->max_mapped_pfn )
+        {
+            unsigned long gfn_end = gfn + HVMOP_op_mask;
+
+            p2m_finish_type_change(d, gfn, gfn_end,
+                                   p2m_ioreq_server, p2m_ram_rw);
+
+            /* Check for continuation if it's not the last iteration. */
+            if ( ++gfn_end <= p2m->max_mapped_pfn &&
+                hypercall_preempt_check() )
+            {
+                rc = -ERESTART;
+                *iter = gfn_end;
+                goto out;
+            }
+        }
+    }
 
  out:
     rcu_unlock_domain(d);
@@ -5570,6 +5600,7 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
         break;
     case HVMOP_modified_memory:
     case HVMOP_set_mem_type:
+    case HVMOP_map_mem_type_to_ioreq_server:
         mask = HVMOP_op_mask;
         break;
     }
@@ -5599,7 +5630,8 @@ long do_hvm_op(unsigned long op, XEN_GUEST_HANDLE_PARAM(void) arg)
 
     case HVMOP_map_mem_type_to_ioreq_server:
         rc = hvmop_map_mem_type_to_ioreq_server(
-            guest_handle_cast(arg, xen_hvm_map_mem_type_to_ioreq_server_t));
+            guest_handle_cast(arg, xen_hvm_map_mem_type_to_ioreq_server_t),
+            &start_iter);
         break;
 
     case HVMOP_set_ioreq_server_state:
diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
index 3218fa2..5df2d62 100644
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -190,6 +190,15 @@ out:
  */
 static int hap_enable_log_dirty(struct domain *d, bool_t log_global)
 {
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+
+    /*
+    * Refuse to turn on global log-dirty mode if
+    * there's outstanding p2m_ioreq_server pages.
+    */
+    if ( log_global && read_atomic(&p2m->ioreq.entry_count) )
+        return -EBUSY;
+
     /* turn on PG_log_dirty bit in paging mode */
     paging_lock(d);
     d->arch.paging.mode |= PG_log_dirty;
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index 700420a..6679ae7 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -545,6 +545,9 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
                     e.ipat = ipat;
                     if ( e.recalc && p2m_is_changeable(e.sa_p2mt) )
                     {
+                         if ( e.sa_p2mt == p2m_ioreq_server )
+                             p2m->ioreq.entry_count--;
+
                          e.sa_p2mt = p2m_is_logdirty_range(p2m, gfn + i, gfn + i)
                                      ? p2m_ram_logdirty : p2m_ram_rw;
                          ept_p2m_type_to_flags(p2m, &e, e.sa_p2mt, e.access);
@@ -965,7 +968,8 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
     if ( is_epte_valid(ept_entry) )
     {
         if ( (recalc || ept_entry->recalc) &&
-             p2m_is_changeable(ept_entry->sa_p2mt) )
+             p2m_is_changeable(ept_entry->sa_p2mt) &&
+             (ept_entry->sa_p2mt != p2m_ioreq_server) )
             *t = p2m_is_logdirty_range(p2m, gfn, gfn) ? p2m_ram_logdirty
                                                       : p2m_ram_rw;
         else
diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 46a56fa..7f31c0e 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -439,11 +439,13 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
          needs_recalc(l1, *pent) )
     {
         l1_pgentry_t e = *pent;
+        p2m_type_t p2mt_old;
 
         if ( !valid_recalc(l1, e) )
             P2M_DEBUG("bogus recalc leaf at d%d:%lx:%u\n",
                       p2m->domain->domain_id, gfn, level);
-        if ( p2m_is_changeable(p2m_flags_to_type(l1e_get_flags(e))) )
+        p2mt_old = p2m_flags_to_type(l1e_get_flags(e));
+        if ( p2m_is_changeable(p2mt_old) )
         {
             unsigned long mask = ~0UL << (level * PAGETABLE_ORDER);
             p2m_type_t p2mt = p2m_is_logdirty_range(p2m, gfn & mask, gfn | ~mask)
@@ -463,6 +465,10 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
                      mfn &= ~(_PAGE_PSE_PAT >> PAGE_SHIFT);
                 flags |= _PAGE_PSE;
             }
+
+            if ( p2mt_old == p2m_ioreq_server )
+                p2m->ioreq.entry_count--;
+
             e = l1e_from_pfn(mfn, flags);
             p2m_add_iommu_flags(&e, level,
                                 (p2mt == p2m_ram_rw)
@@ -729,7 +735,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
 static inline p2m_type_t recalc_type(bool_t recalc, p2m_type_t t,
                                      struct p2m_domain *p2m, unsigned long gfn)
 {
-    if ( !recalc || !p2m_is_changeable(t) )
+    if ( !recalc || !p2m_is_changeable(t) || (t == p2m_ioreq_server) )
         return t;
     return p2m_is_logdirty_range(p2m, gfn, gfn) ? p2m_ram_logdirty
                                                 : p2m_ram_rw;
diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
index 6e4cb1f..6581a70 100644
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -313,6 +313,9 @@ int p2m_set_ioreq_server(struct domain *d,
 
         p2m->ioreq.server = NULL;
         p2m->ioreq.flags = 0;
+
+        if ( read_atomic(&p2m->ioreq.entry_count) )
+            p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
     }
     else
     {
@@ -957,6 +960,23 @@ int p2m_change_type_one(struct domain *d, unsigned long gfn,
                          p2m->default_access)
          : -EBUSY;
 
+    if ( !rc )
+    {
+        switch ( nt )
+        {
+        case p2m_ram_rw:
+            if ( ot == p2m_ioreq_server )
+                p2m->ioreq.entry_count--;
+            break;
+        case p2m_ioreq_server:
+            if ( ot == p2m_ram_rw )
+                p2m->ioreq.entry_count++;
+            break;
+        default:
+            break;
+        }
+    }
+
     gfn_unlock(p2m, gfn, 0);
 
     return rc;
@@ -1021,6 +1041,35 @@ void p2m_change_type_range(struct domain *d,
     p2m_unlock(p2m);
 }
 
+/* Synchronously modify the p2m type of a range of gfns from ot to nt. */
+void p2m_finish_type_change(struct domain *d,
+                           unsigned long start, unsigned long end,
+                           p2m_type_t ot, p2m_type_t nt)
+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    p2m_type_t t;
+    unsigned long gfn = start;
+
+    ASSERT(start <= end);
+    ASSERT(ot != nt);
+    ASSERT(p2m_is_changeable(ot) && p2m_is_changeable(nt));
+
+    p2m_lock(p2m);
+
+    end = (end > p2m->max_mapped_pfn) ? p2m->max_mapped_pfn : end;
+    while ( gfn <= end )
+    {
+        get_gfn_query_unlocked(d, gfn, &t);
+
+        if ( t == ot )
+            p2m_change_type_one(d, gfn, t, nt);
+
+        gfn++;
+    }
+
+    p2m_unlock(p2m);
+}
+
 /*
  * Returns:
  *    0              for success
diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
index 4924c4b..bf9adca 100644
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -120,7 +120,8 @@ typedef unsigned int p2m_query_t;
 
 /* Types that can be subject to bulk transitions. */
 #define P2M_CHANGEABLE_TYPES (p2m_to_mask(p2m_ram_rw) \
-                              | p2m_to_mask(p2m_ram_logdirty) )
+                              | p2m_to_mask(p2m_ram_logdirty) \
+                              | p2m_to_mask(p2m_ioreq_server))
 
 #define P2M_POD_TYPES (p2m_to_mask(p2m_populate_on_demand))
 
@@ -349,6 +350,7 @@ struct p2m_domain {
          * are to be emulated by an ioreq server.
          */
         unsigned int flags;
+        unsigned int entry_count;
     } ioreq;
 };
 
@@ -604,6 +606,11 @@ void p2m_change_type_range(struct domain *d,
 int p2m_change_type_one(struct domain *d, unsigned long gfn,
                         p2m_type_t ot, p2m_type_t nt);
 
+/* Synchronously change types across a range of p2m entries (start ... end) */
+void p2m_finish_type_change(struct domain *d,
+                           unsigned long start, unsigned long end,
+                           p2m_type_t ot, p2m_type_t nt);
+
 /* Report a change affecting memory types. */
 void p2m_memory_type_changed(struct domain *d);
 
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

next prev parent reply	other threads:[~2016-09-02 10:47 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-02 10:47 [PATCH v6 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
2016-09-02 10:47 ` [PATCH v6 1/4] x86/ioreq server: Add HVMOP to map guest ram with p2m_ioreq_server to an ioreq server Yu Zhang
2016-09-05 13:31   ` Jan Beulich
2016-09-05 17:20     ` George Dunlap
2016-09-06  7:58       ` Jan Beulich
2016-09-06  8:03         ` Paul Durrant
2016-09-06  8:13           ` Jan Beulich
2016-09-06 10:00             ` Yu Zhang
2016-09-09  5:55     ` Yu Zhang
2016-09-09  8:09       ` Jan Beulich
2016-09-09  8:59         ` Yu Zhang
2016-09-05 17:23   ` George Dunlap
     [not found]   ` <57D24730.2050904@linux.intel.com>
2016-09-09  5:51     ` Yu Zhang
2016-09-21 13:04       ` George Dunlap
2016-09-22  9:12         ` Yu Zhang
2016-09-22 11:32           ` George Dunlap
2016-09-22 16:02             ` Yu Zhang
2016-09-23 10:35               ` George Dunlap
2016-09-26  6:57                 ` Yu Zhang
2016-09-26  6:58           ` Yu Zhang
2016-09-02 10:47 ` [PATCH v6 2/4] x86/ioreq server: Release the p2m lock after mmio is handled Yu Zhang
2016-09-05 13:49   ` Jan Beulich
     [not found]   ` <57D24782.6010701@linux.intel.com>
2016-09-09  5:56     ` Yu Zhang
2016-09-02 10:47 ` [PATCH v6 3/4] x86/ioreq server: Handle read-modify-write cases for p2m_ioreq_server pages Yu Zhang
2016-09-05 14:10   ` Jan Beulich
     [not found]   ` <57D247F6.9010503@linux.intel.com>
2016-09-09  6:21     ` Yu Zhang
2016-09-09  8:12       ` Jan Beulich
2016-09-02 10:47 ` Yu Zhang [this message]
2016-09-05 14:47   ` [PATCH v6 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps Jan Beulich
     [not found]   ` <57D24813.2090903@linux.intel.com>
2016-09-09  7:24     ` Yu Zhang
2016-09-09  8:20       ` Jan Beulich
2016-09-09  9:24         ` Yu Zhang
2016-09-09  9:44           ` Jan Beulich
2016-09-09  9:56             ` Yu Zhang
2016-09-09 10:09               ` Jan Beulich
2016-09-09 10:01                 ` Yu Zhang
2016-09-20  2:57                 ` Yu Zhang
2016-09-22 18:06                   ` George Dunlap
2016-09-23  1:31                     ` Yu Zhang
2016-09-06 10:57 ` [PATCH v6 0/4] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:9b419d2 dfblob:200c661 dfblob:3218fa2 dfblob:5df2d62
dfblob:700420a dfblob:6679ae7 dfblob:46a56fa dfblob:7f31c0e
dfblob:6e4cb1f dfblob:6581a70 dfblob:4924c4b dfblob:bf9adca )
 OR (
bs:"[PATCH v6 4/4] x86/ioreq server: Reset outstanding p2m_ioreq_server entries when an ioreq server unmaps." )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1472813240-11011-5-git-send-email-yu.c.zhang@linux.intel.com \
    --to=yu.c.zhang@linux.intel.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=george.dunlap@eu.citrix.com \
    --cc=jbeulich@suse.com \
    --cc=jun.nakajima@intel.com \
    --cc=kevin.tian@intel.com \
    --cc=paul.durrant@citrix.com \
    --cc=xen-devel@lists.xen.org \
    --cc=zhiyuan.lv@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).