From: Paul Durrant <Paul.Durrant@citrix.com>
To: 'Yu Zhang' <yu.c.zhang@linux.intel.com>,
"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Cc: Kevin Tian <kevin.tian@intel.com>,
Jan Beulich <jbeulich@suse.com>,
Andrew Cooper <Andrew.Cooper3@citrix.com>,
George Dunlap <George.Dunlap@citrix.com>,
"zhiyuan.lv@intel.com" <zhiyuan.lv@intel.com>,
Jun Nakajima <jun.nakajima@intel.com>
Subject: Re: [PATCH v9 4/5] x86/ioreq server: Asynchronously reset outstanding p2m_ioreq_server entries.
Date: Tue, 21 Mar 2017 10:05:29 +0000 [thread overview]
Message-ID: <95784bba51a24ffea9a07da56776c192@AMSPEX02CL03.citrite.net>
In-Reply-To: <1490064773-26751-5-git-send-email-yu.c.zhang@linux.intel.com>
> -----Original Message-----
> From: Yu Zhang [mailto:yu.c.zhang@linux.intel.com]
> Sent: 21 March 2017 02:53
> To: xen-devel@lists.xen.org
> Cc: zhiyuan.lv@intel.com; Paul Durrant <Paul.Durrant@citrix.com>; Jan
> Beulich <jbeulich@suse.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; George Dunlap
> <George.Dunlap@citrix.com>; Jun Nakajima <jun.nakajima@intel.com>;
> Kevin Tian <kevin.tian@intel.com>
> Subject: [PATCH v9 4/5] x86/ioreq server: Asynchronously reset outstanding
> p2m_ioreq_server entries.
>
> After an ioreq server has been unmapped, the remaining p2m_ioreq_server
> entries need to be reset back to p2m_ram_rw. This patch does so
> asynchronously, using the existing p2m_change_entry_type_global()
> interface.
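
To make the "asynchronously" part concrete, here is a simplified sketch of
the flow (names as in the patch below; not verbatim code from it):

  /* Sketch: reset triggered when an ioreq server unmaps (flags == 0). */
  static void reset_outstanding_ioreq_entries(struct domain *d)
  {
      struct p2m_domain *p2m = p2m_get_hostp2m(d);

      /* Skip the walk entirely if no p2m_ioreq_server entries remain. */
      if ( !read_atomic(&p2m->ioreq.entry_count) )
          return;

      /*
       * Nothing is rewritten immediately: the p2m is merely marked for
       * recalculation, and each p2m_ioreq_server entry is converted back
       * to p2m_ram_rw lazily, the next time it is walked (see
       * resolve_misconfig() for EPT and do_recalc() for p2m-pt).
       */
      p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
  }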
>
> This patch also disallows live migration when there are still
> outstanding p2m_ioreq_server entries left. The core reason is that our
> current implementation of p2m_change_entry_type_global() cannot tell
> the state of p2m_ioreq_server entries (it cannot decide whether an
> entry is to be emulated or to be resynced).
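
The restriction boils down to a single gate in hap_enable_log_dirty()
(sketch, matching the hunk below):

  /* Refuse global log-dirty mode (and hence live migration) while
   * outstanding p2m_ioreq_server entries exist. */
  if ( log_global && read_atomic(&p2m->ioreq.entry_count) )
      return -EBUSY;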
>
> Note: a new field, entry_count, is introduced in struct p2m_domain to
> record the number of p2m_ioreq_server p2m page table entries. One
> property of these entries is that they only point to 4K-sized page
> frames, because all p2m_ioreq_server entries originate from p2m_ram_rw
> ones in p2m_change_type_one(). We therefore do not need to worry about
> counting 2M/1G-sized pages.
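
The counting itself is a simple switch in p2m_change_type_one() (sketch;
the recalc paths in p2m-ept.c/p2m-pt.c decrement the counter likewise):

  switch ( nt )
  {
  case p2m_ram_rw:
      if ( ot == p2m_ioreq_server )   /* entry reset back to RAM */
          p2m->ioreq.entry_count--;
      break;
  case p2m_ioreq_server:
      ASSERT(ot == p2m_ram_rw);       /* entries only come from 4K RAM */
      p2m->ioreq.entry_count++;
      break;
  default:
      break;
  }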
>
> Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
> ---
> Cc: Paul Durrant <paul.durrant@citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: George Dunlap <george.dunlap@eu.citrix.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
>
> changes in v4:
> - According to comments from Jan: use ASSERT() instead of 'if'
> condition in p2m_change_type_one().
> - According to comments from Jan: commit message changed to mention
> that p2m_ioreq_server entries are all based on 4K-sized pages.
>
> changes in v3:
> - Move the synchronous resetting logic into patch 5.
> - According to comments from Jan: introduce p2m_check_changeable()
> to clarify the p2m type change code.
> - According to comments from George: take locks in the same order
> to avoid deadlock, and call p2m_change_entry_type_global() after the
> unmap of the ioreq server has finished.
>
> changes in v2:
> - Move the calculation of the ioreq server page entry_count into
> p2m_change_type_one() so that we do not need a separate lock.
> Note: entry_count is also updated in resolve_misconfig()/
> do_recalc(); fortunately, callers of both routines already hold
> the p2m lock.
> - Simplify logic in hvmop_set_mem_type().
> - Introduce routine p2m_finish_type_change() to walk the p2m
> table and do the p2m reset.
> ---
> xen/arch/x86/hvm/ioreq.c | 8 ++++++++
> xen/arch/x86/mm/hap/hap.c | 9 +++++++++
> xen/arch/x86/mm/p2m-ept.c | 8 +++++++-
> xen/arch/x86/mm/p2m-pt.c | 13 +++++++++++--
> xen/arch/x86/mm/p2m.c | 20 ++++++++++++++++++++
> xen/include/asm-x86/p2m.h | 9 ++++++++-
> 6 files changed, 63 insertions(+), 4 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
> index 746799f..102c6c2 100644
> --- a/xen/arch/x86/hvm/ioreq.c
> +++ b/xen/arch/x86/hvm/ioreq.c
> @@ -949,6 +949,14 @@ int hvm_map_mem_type_to_ioreq_server(struct domain *d, ioservid_t id,
>
> spin_unlock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
>
> + if ( rc == 0 && flags == 0 )
> + {
> + struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +
> + if ( read_atomic(&p2m->ioreq.entry_count) )
> + p2m_change_entry_type_global(d, p2m_ioreq_server, p2m_ram_rw);
> + }
> +
> return rc;
> }
>
> diff --git a/xen/arch/x86/mm/hap/hap.c b/xen/arch/x86/mm/hap/hap.c
> index a57b385..6ec950a 100644
> --- a/xen/arch/x86/mm/hap/hap.c
> +++ b/xen/arch/x86/mm/hap/hap.c
> @@ -187,6 +187,15 @@ out:
> */
> static int hap_enable_log_dirty(struct domain *d, bool_t log_global)
> {
> + struct p2m_domain *p2m = p2m_get_hostp2m(d);
> +
> + /*
> + * Refuse to turn on global log-dirty mode if
> + * there's outstanding p2m_ioreq_server pages.
> + */
> + if ( log_global && read_atomic(&p2m->ioreq.entry_count) )
> + return -EBUSY;
> +
> /* turn on PG_log_dirty bit in paging mode */
> paging_lock(d);
> d->arch.paging.mode |= PG_log_dirty;
> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> index cc1eb21..1df3d09 100644
> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -544,6 +544,12 @@ static int resolve_misconfig(struct p2m_domain *p2m, unsigned long gfn)
> e.ipat = ipat;
> if ( e.recalc && p2m_is_changeable(e.sa_p2mt) )
> {
> + if ( e.sa_p2mt == p2m_ioreq_server )
> + {
> + p2m->ioreq.entry_count--;
> + ASSERT(p2m->ioreq.entry_count >= 0);
> + }
> +
> e.sa_p2mt = p2m_is_logdirty_range(p2m, gfn + i, gfn + i)
> ? p2m_ram_logdirty : p2m_ram_rw;
> ept_p2m_type_to_flags(p2m, &e, e.sa_p2mt, e.access);
> @@ -965,7 +971,7 @@ static mfn_t ept_get_entry(struct p2m_domain *p2m,
> if ( is_epte_valid(ept_entry) )
> {
> if ( (recalc || ept_entry->recalc) &&
> - p2m_is_changeable(ept_entry->sa_p2mt) )
> + p2m_check_changeable(ept_entry->sa_p2mt) )
> *t = p2m_is_logdirty_range(p2m, gfn, gfn) ? p2m_ram_logdirty
> : p2m_ram_rw;
> else
> diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
> index f6c45ec..169de75 100644
> --- a/xen/arch/x86/mm/p2m-pt.c
> +++ b/xen/arch/x86/mm/p2m-pt.c
> @@ -436,11 +436,13 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
> needs_recalc(l1, *pent) )
> {
> l1_pgentry_t e = *pent;
> + p2m_type_t p2mt_old;
>
> if ( !valid_recalc(l1, e) )
> P2M_DEBUG("bogus recalc leaf at d%d:%lx:%u\n",
> p2m->domain->domain_id, gfn, level);
> - if ( p2m_is_changeable(p2m_flags_to_type(l1e_get_flags(e))) )
> + p2mt_old = p2m_flags_to_type(l1e_get_flags(e));
> + if ( p2m_is_changeable(p2mt_old) )
> {
> unsigned long mask = ~0UL << (level * PAGETABLE_ORDER);
> p2m_type_t p2mt = p2m_is_logdirty_range(p2m, gfn & mask, gfn | ~mask)
> @@ -460,6 +462,13 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
> mfn &= ~(_PAGE_PSE_PAT >> PAGE_SHIFT);
> flags |= _PAGE_PSE;
> }
> +
> + if ( p2mt_old == p2m_ioreq_server )
> + {
> + p2m->ioreq.entry_count--;
> + ASSERT(p2m->ioreq.entry_count >= 0);
> + }
> +
> e = l1e_from_pfn(mfn, flags);
> p2m_add_iommu_flags(&e, level,
> (p2mt == p2m_ram_rw)
> @@ -729,7 +738,7 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long gfn, mfn_t mfn,
> static inline p2m_type_t recalc_type(bool_t recalc, p2m_type_t t,
> struct p2m_domain *p2m, unsigned long gfn)
> {
> - if ( !recalc || !p2m_is_changeable(t) )
> + if ( !recalc || !p2m_check_changeable(t) )
> return t;
> return p2m_is_logdirty_range(p2m, gfn, gfn) ? p2m_ram_logdirty
> : p2m_ram_rw;
> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> index dd4e477..e3e54f1 100644
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -954,6 +954,26 @@ int p2m_change_type_one(struct domain *d, unsigned long gfn,
> p2m->default_access)
> : -EBUSY;
>
> + if ( !rc )
> + {
> + switch ( nt )
> + {
> + case p2m_ram_rw:
> + if ( ot == p2m_ioreq_server )
> + {
> + p2m->ioreq.entry_count--;
> + ASSERT(p2m->ioreq.entry_count >= 0);
> + }
> + break;
> + case p2m_ioreq_server:
> + ASSERT(ot == p2m_ram_rw);
> + p2m->ioreq.entry_count++;
> + break;
> + default:
> + break;
> + }
> + }
> +
> gfn_unlock(p2m, gfn, 0);
>
> return rc;
> diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h
> index 3786680..395f125 100644
> --- a/xen/include/asm-x86/p2m.h
> +++ b/xen/include/asm-x86/p2m.h
> @@ -120,7 +120,10 @@ typedef unsigned int p2m_query_t;
>
> /* Types that can be subject to bulk transitions. */
> #define P2M_CHANGEABLE_TYPES (p2m_to_mask(p2m_ram_rw) \
> - | p2m_to_mask(p2m_ram_logdirty) )
> + | p2m_to_mask(p2m_ram_logdirty) \
> + | p2m_to_mask(p2m_ioreq_server) )
> +
> +#define P2M_IOREQ_TYPES (p2m_to_mask(p2m_ioreq_server))
>
> #define P2M_POD_TYPES (p2m_to_mask(p2m_populate_on_demand))
>
> @@ -157,6 +160,7 @@ typedef unsigned int p2m_query_t;
> #define p2m_is_readonly(_t) (p2m_to_mask(_t) & P2M_RO_TYPES)
> #define p2m_is_discard_write(_t) (p2m_to_mask(_t) & P2M_DISCARD_WRITE_TYPES)
> #define p2m_is_changeable(_t) (p2m_to_mask(_t) & P2M_CHANGEABLE_TYPES)
> +#define p2m_is_ioreq(_t) (p2m_to_mask(_t) & P2M_IOREQ_TYPES)
> #define p2m_is_pod(_t) (p2m_to_mask(_t) & P2M_POD_TYPES)
> #define p2m_is_grant(_t) (p2m_to_mask(_t) & P2M_GRANT_TYPES)
> /* Grant types are *not* considered valid, because they can be
> @@ -178,6 +182,8 @@ typedef unsigned int p2m_query_t;
>
> #define p2m_allows_invalid_mfn(t) (p2m_to_mask(t) & P2M_INVALID_MFN_TYPES)
>
> #define p2m_check_changeable(t) (p2m_is_changeable(t) && !p2m_is_ioreq(t))
> +
> typedef enum {
> p2m_host,
> p2m_nested,
> @@ -349,6 +355,7 @@ struct p2m_domain {
> * are to be emulated by an ioreq server.
> */
> unsigned int flags;
> + long entry_count;
> } ioreq;
> };
>
> --
> 1.9.1
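
For completeness, a rough caller-side view of when this path triggers,
assuming the libxendevicemodel wrapper introduced earlier in this series
(names illustrative, error handling omitted):

  #include <xendevicemodel.h>   /* assumed DMOP wrapper library */

  /* Illustrative only: claim, and later release, write emulation of
   * HVMMEM_ioreq_server pages for ioreq server 'id' in domain 'domid'. */
  void toggle_ioreq_mem_type(domid_t domid, ioservid_t id)
  {
      xendevicemodel_handle *dmod = xendevicemodel_open(NULL, 0);

      /* Map: guest writes to p2m_ioreq_server pages are now emulated. */
      xendevicemodel_map_mem_type_to_ioreq_server(
          dmod, domid, id, HVMMEM_ioreq_server,
          XEN_DMOP_IOREQ_MEM_ACCESS_WRITE);

      /* ... device emulation runs ... */

      /* Unmap (flags == 0): this is where the hunk added to
       * hvm_map_mem_type_to_ioreq_server() above kicks off the
       * asynchronous reset of any outstanding entries. */
      xendevicemodel_map_mem_type_to_ioreq_server(
          dmod, domid, id, HVMMEM_ioreq_server, 0);

      xendevicemodel_close(dmod);
  }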
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
Thread overview: 42+ messages
2017-03-21 2:52 [PATCH v9 0/5] x86/ioreq server: Introduce HVMMEM_ioreq_server mem type Yu Zhang
2017-03-21 2:52 ` [PATCH v9 1/5] x86/ioreq server: Release the p2m lock after mmio is handled Yu Zhang
2017-03-29 13:39 ` George Dunlap
2017-03-29 13:50 ` Jan Beulich
2017-03-21 2:52 ` [PATCH v9 2/5] x86/ioreq server: Add DMOP to map guest ram with p2m_ioreq_server to an ioreq server Yu Zhang
2017-03-22 7:49 ` Tian, Kevin
2017-03-22 10:12 ` Yu Zhang
2017-03-24 9:26 ` Tian, Kevin
2017-03-24 12:34 ` Yu Zhang
2017-03-22 14:21 ` Jan Beulich
2017-03-23 3:23 ` Yu Zhang
2017-03-23 8:57 ` Jan Beulich
2017-03-24 9:05 ` Yu Zhang
2017-03-24 10:19 ` Jan Beulich
2017-03-24 12:35 ` Yu Zhang
2017-03-24 13:09 ` Jan Beulich
2017-03-21 2:52 ` [PATCH v9 3/5] x86/ioreq server: Handle read-modify-write cases for p2m_ioreq_server pages Yu Zhang
2017-03-22 14:22 ` Jan Beulich
2017-03-21 2:52 ` [PATCH v9 4/5] x86/ioreq server: Asynchronously reset outstanding p2m_ioreq_server entries Yu Zhang
2017-03-21 10:05 ` Paul Durrant [this message]
2017-03-22 8:10 ` Tian, Kevin
2017-03-22 10:12 ` Yu Zhang
2017-03-24 9:37 ` Tian, Kevin
2017-03-24 12:45 ` Yu Zhang
2017-03-22 14:29 ` Jan Beulich
2017-03-23 3:23 ` Yu Zhang
2017-03-23 9:00 ` Jan Beulich
2017-03-24 9:05 ` Yu Zhang
2017-03-24 10:37 ` Jan Beulich
2017-03-24 12:36 ` Yu Zhang
2017-03-21 2:52 ` [PATCH v9 5/5] x86/ioreq server: Synchronously reset outstanding p2m_ioreq_server entries when an ioreq server unmaps Yu Zhang
2017-03-21 10:00 ` Paul Durrant
2017-03-21 11:15 ` Yu Zhang
2017-03-21 13:49 ` Paul Durrant
2017-03-21 14:14 ` Yu Zhang
2017-03-22 8:28 ` Tian, Kevin
2017-03-22 8:54 ` Jan Beulich
2017-03-22 9:02 ` Tian, Kevin
2017-03-22 14:39 ` Jan Beulich
2017-03-23 3:23 ` Yu Zhang
2017-03-23 9:02 ` Jan Beulich
2017-03-24 9:05 ` Yu Zhang