From mboxrd@z Thu Jan 1 00:00:00 1970 From: George Dunlap Subject: Re: [PATCHv6 2/2] x86/ept: defer the invalidation until the p2m lock is released Date: Tue, 22 Dec 2015 12:23:00 +0000 Message-ID: <567940A4.6040900@citrix.com> References: <1450446634-8762-1-git-send-email-david.vrabel@citrix.com> <1450446634-8762-3-git-send-email-david.vrabel@citrix.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Received: from mail6.bemta3.messagelabs.com ([195.245.230.39]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1aBLy8-0008GN-F6 for xen-devel@lists.xenproject.org; Tue, 22 Dec 2015 12:23:08 +0000 In-Reply-To: <1450446634-8762-3-git-send-email-david.vrabel@citrix.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: David Vrabel , xen-devel@lists.xenproject.org Cc: Kevin Tian , Jun Nakajima , George Dunlap , Andrew Cooper , Tim Deegan , Jan Beulich List-Id: xen-devel@lists.xenproject.org On 18/12/15 13:50, David Vrabel wrote: > Holding the p2m lock while calling ept_sync_domain() is very expensive > since it does a on_selected_cpus() call. IPIs on many socket machines > can be very slows and on_selected_cpus() is serialized. > > It is safe to defer the invalidate until the p2m lock is released > except for two cases: > > 1. When freeing a page table page (since partial translations may be > cached). > 2. When reclaiming a zero page as part of PoD. > > For these cases, add p2m_tlb_flush_sync() calls which will immediately > perform the invalidate before the page is freed or reclaimed. There are at least two other places in the PoD code where the "remove -> check -> add to cache -> unlock" pattern exist; and it looks to me like there are other places where races might occur (e.g., p2m_paging_evict(), which does remove -> scrub -> put -> unlock; p2m_altp2m_propagate_change(), which does remove -> put -> unlock). Part of me wonders whether, rather than making callers that need it remember to do a flush, it might be better to explicitly pass in P2M_FLUSH or P2M_CAN_DEFER when calling p2m_set_entry, just to make people think about the fact that the p2m change may not actually take effect until later. Any thoughts on that? Comments on the current approach inline. > diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c > index c094320..43c7f1b 100644 > --- a/xen/arch/x86/mm/p2m-ept.c > +++ b/xen/arch/x86/mm/p2m-ept.c > @@ -263,6 +263,7 @@ static void ept_free_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry, int l > unmap_domain_page(epte); > } > > + p2m_tlb_flush_sync(p2m); > p2m_free_ptp(p2m, mfn_to_page(ept_entry->mfn)); It's probably worth a comment here pointing out that even if this function is called several times (e.g., if you replace a load of 4k entries with a 1G entry), the actual flush will only happen the first time. > +static void ept_flush_and_unlock(struct p2m_domain *p2m, bool_t unlock) > +{ > + p2m->need_flush = 0; > + if ( unlock ) > + mm_write_unlock(&p2m->lock); > + ept_sync_domain_mask(p2m, p2m->domain->domain_dirty_cpumask); > } Having a function called "flush_and_unlock", with a boolean as to whether to unlock or not, just seems a bit wonky. Wouldn't it make more sense to have the hook just named "flush_sync()", and move the unlocking out in the generic p2m code (where you already have the check for need_flush)? > diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h > index fa46dd9..9c394c2 100644 > --- a/xen/include/asm-x86/p2m.h > +++ b/xen/include/asm-x86/p2m.h > @@ -261,6 +261,10 @@ struct p2m_domain { > unsigned long gfn, l1_pgentry_t *p, > l1_pgentry_t new, unsigned int level); > long (*audit_p2m)(struct p2m_domain *p2m); > + void (*flush_and_unlock)(struct p2m_domain *p2m, bool_t unlock); > + > + unsigned int defer_flush; > + bool_t need_flush; It's probably worth a comment that at the moment calling flush_and_unlock() is gated on need_flush; so it's OK not to implement flush_and_unlock() as long as you never set need_flush. Thanks, -George