From mboxrd@z Thu Jan 1 00:00:00 1970 From: David Vrabel Subject: Re: [PATCHv6 2/2] x86/ept: defer the invalidation until the p2m lock is released Date: Tue, 22 Dec 2015 14:20:01 +0000 Message-ID: <56795C11.4050204@citrix.com> References: <1450446634-8762-1-git-send-email-david.vrabel@citrix.com> <1450446634-8762-3-git-send-email-david.vrabel@citrix.com> <567940A4.6040900@citrix.com> <567957A8.6020702@citrix.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Received: from mail6.bemta4.messagelabs.com ([85.158.143.247]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1aBNnL-0006yc-51 for xen-devel@lists.xenproject.org; Tue, 22 Dec 2015 14:20:07 +0000 In-Reply-To: <567957A8.6020702@citrix.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Andrew Cooper , George Dunlap , David Vrabel , xen-devel@lists.xenproject.org Cc: George Dunlap , Kevin Tian , Tim Deegan , Jun Nakajima , Jan Beulich List-Id: xen-devel@lists.xenproject.org On 22/12/15 14:01, Andrew Cooper wrote: > On 22/12/15 12:23, George Dunlap wrote: >> On 18/12/15 13:50, David Vrabel wrote: >>> Holding the p2m lock while calling ept_sync_domain() is very expensive >>> since it does a on_selected_cpus() call. IPIs on many socket machines >>> can be very slows and on_selected_cpus() is serialized. >>> >>> It is safe to defer the invalidate until the p2m lock is released >>> except for two cases: >>> >>> 1. When freeing a page table page (since partial translations may be >>> cached). >>> 2. When reclaiming a zero page as part of PoD. >>> >>> For these cases, add p2m_tlb_flush_sync() calls which will immediately >>> perform the invalidate before the page is freed or reclaimed. >> There are at least two other places in the PoD code where the "remove -> >> check -> add to cache -> unlock" pattern exist; and it looks to me like >> there are other places where races might occur (e.g., >> p2m_paging_evict(), which does remove -> scrub -> put -> unlock; >> p2m_altp2m_propagate_change(), which does remove -> put -> unlock). >> >> Part of me wonders whether, rather than making callers that need it >> remember to do a flush, it might be better to explicitly pass in >> P2M_FLUSH or P2M_CAN_DEFER when calling p2m_set_entry, just to make >> people think about the fact that the p2m change may not actually take >> effect until later. Any thoughts on that? >> >> Comments on the current approach inline. >> >>> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c >>> index c094320..43c7f1b 100644 >>> --- a/xen/arch/x86/mm/p2m-ept.c >>> +++ b/xen/arch/x86/mm/p2m-ept.c >>> @@ -263,6 +263,7 @@ static void ept_free_entry(struct p2m_domain *p2m, ept_entry_t *ept_entry, int l >>> unmap_domain_page(epte); >>> } >>> >>> + p2m_tlb_flush_sync(p2m); >>> p2m_free_ptp(p2m, mfn_to_page(ept_entry->mfn)); >> It's probably worth a comment here pointing out that even if this >> function is called several times (e.g., if you replace a load of 4k >> entries with a 1G entry), the actual flush will only happen the first time. >> >>> +static void ept_flush_and_unlock(struct p2m_domain *p2m, bool_t unlock) >>> +{ >>> + p2m->need_flush = 0; >>> + if ( unlock ) >>> + mm_write_unlock(&p2m->lock); >>> + ept_sync_domain_mask(p2m, p2m->domain->domain_dirty_cpumask); >>> } >> Having a function called "flush_and_unlock", with a boolean as to >> whether to unlock or not, just seems a bit wonky. >> >> Wouldn't it make more sense to have the hook just named "flush_sync()", >> and move the unlocking out in the generic p2m code (where you already >> have the check for need_flush)? >> >>> diff --git a/xen/include/asm-x86/p2m.h b/xen/include/asm-x86/p2m.h >>> index fa46dd9..9c394c2 100644 >>> --- a/xen/include/asm-x86/p2m.h >>> +++ b/xen/include/asm-x86/p2m.h >>> @@ -261,6 +261,10 @@ struct p2m_domain { >>> unsigned long gfn, l1_pgentry_t *p, >>> l1_pgentry_t new, unsigned int level); >>> long (*audit_p2m)(struct p2m_domain *p2m); >>> + void (*flush_and_unlock)(struct p2m_domain *p2m, bool_t unlock); >>> + >>> + unsigned int defer_flush; >>> + bool_t need_flush; >> It's probably worth a comment that at the moment calling >> flush_and_unlock() is gated on need_flush; so it's OK not to implement >> flush_and_unlock() as long as you never set need_flush. > > This is just one small accident (in code elsewhere) away from a code > injection vulnerability. > > Either we should require that all function pointers are filled in, or > BUG() if the pointer is missing when we attempt to use it. Jan asked for the call to be conditional on need_flush and to not test flush_and_unlock. David