Re: [PATCH v4 06/13] nEPT: Add EPT tables support to paging_tmpl.h

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Gleb Natapov <gleb@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org,
	Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>,
	Jun Nakajima <jun.nakajima@Intel.com>,
	Yang Zhang <yang.z.zhang@Intel.com>
Subject: Re: [PATCH v4 06/13] nEPT: Add EPT tables support to paging_tmpl.h
Date: Tue, 30 Jul 2013 17:22:34 +0300	[thread overview]
Message-ID: <20130730142232.GE25505@redhat.com> (raw)
In-Reply-To: <51F7ADF2.8020506@redhat.com>

On Tue, Jul 30, 2013 at 02:13:38PM +0200, Paolo Bonzini wrote:
> Il 30/07/2013 13:56, Gleb Natapov ha scritto:
> >>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> >>> index 4c4274d..b5273c3 100644
> >>> --- a/arch/x86/kvm/mmu.c
> >>> +++ b/arch/x86/kvm/mmu.c
> >>> @@ -3494,6 +3494,11 @@ static inline bool is_last_gpte(struct kvm_mmu *mmu, unsigned level, unsigned gp
> >>>  	return mmu->last_pte_bitmap & (1 << index);
> >>>  }
> >>>  
> >>> +#define PTTYPE_EPT 18 /* arbitrary */
> >>> +#define PTTYPE PTTYPE_EPT
> >>> +#include "paging_tmpl.h"
> >>> +#undef PTTYPE
> >>> +
> >>>  #define PTTYPE 64
> >>>  #include "paging_tmpl.h"
> >>>  #undef PTTYPE
> >>> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
> >>> index 7581395..e38b3c0 100644
> >>> --- a/arch/x86/kvm/paging_tmpl.h
> >>> +++ b/arch/x86/kvm/paging_tmpl.h
> >>> @@ -58,6 +58,21 @@
> >>>  	#define PT_GUEST_DIRTY_SHIFT PT_DIRTY_SHIFT
> >>>  	#define PT_GUEST_ACCESSED_SHIFT PT_ACCESSED_SHIFT
> >>>  	#define CMPXCHG cmpxchg
> >>> +#elif PTTYPE == PTTYPE_EPT
> >>> +	#define pt_element_t u64
> >>> +	#define guest_walker guest_walkerEPT
> >>> +	#define FNAME(name) ept_##name
> >>> +	#define PT_BASE_ADDR_MASK PT64_BASE_ADDR_MASK
> >>> +	#define PT_LVL_ADDR_MASK(lvl) PT64_LVL_ADDR_MASK(lvl)
> >>> +	#define PT_LVL_OFFSET_MASK(lvl) PT64_LVL_OFFSET_MASK(lvl)
> >>> +	#define PT_INDEX(addr, level) PT64_INDEX(addr, level)
> >>> +	#define PT_LEVEL_BITS PT64_LEVEL_BITS
> >>> +	#define PT_GUEST_ACCESSED_MASK 0
> >>> +	#define PT_GUEST_DIRTY_MASK 0
> >>> +	#define PT_GUEST_DIRTY_SHIFT 0
> >>> +	#define PT_GUEST_ACCESSED_SHIFT 0
> >>> +	#define CMPXCHG cmpxchg64
> >>> +	#define PT_MAX_FULL_LEVELS 4
> >>>  #else
> >>>  	#error Invalid PTTYPE value
> >>>  #endif
> >>
> >> Please add a
> >>
> >> BUILD_BUG_ON(!!PT_GUEST_ACCESSED_MASK != !!PT_GUEST_DIRTY_MASK);
> >> #if PT_GUEST_ACCESSED_MASK
> >> BUILD_BUG_ON(PT_GUEST_DIRTY_SHIFT <= PT_GUEST_ACCESSED_SHIFT);
> >> BUILD_BUG_ON(PT_GUEST_DIRTY_SHIFT <= PT_WRITABLE_SHIFT);
> >> BUILD_BUG_ON(PT_GUEST_DIRTY_MASK != (1 << PT_GUEST_DIRTY_SHIFT));
> >> BUILD_BUG_ON(PT_GUEST_ACCESSED_MASK != (1 << PT_GUEST_ACCESSED_SHIFT));
> >
> > This will not work if I define _SHIFT to be 8/9 now.
> 
> They will because I put them under an appropriate "#if". :)
> 
True, missed that.

> OTOH if you define _SHIFT to be 8/9 you can move the #if so that it only
> covers the last two checks.
> 
> > But we do not use
> > BUILD_BUG_ON() to check values from the same define "namespace". It is
> > implied that they are correct and when they change all "namespace"
> > remains to be consistent. If you look at BUILD_BUG_ON() that we have
> > (and this patch adds) they are from the form:
> >   PT_WRITABLE_MASK != ACC_WRITE_MASK
> >   ACC_WRITE_MASK != VMX_EPT_WRITABLE_MASK
> >   VMX_EPT_READABLE_MASK != PT_PRESENT_MASK
> >   VMX_EPT_WRITABLE_MASK != PT_WRITABLE_MASK
> > 
> > i.e they compare value from different "namespaces".
> 
> Yes, all these BUILD_BUG_ONs make sense.
> 
> But I think BUILD_BUG_ON() is more generically for consistency checks
> and enforcing invariants that the code expects.  Our invariants are:
> 
> * A/D bits are enabled or disabled in pairs
> 
> * dirty is the left of accessed and writable
> 
> * masks should either be zero or match the corresponding shifts
> 
> The alternative to BUILD_BUG_ON would be a comment that explains the
> invariants, but there's no need to use English if C can do it better! :)
> 
OK, will add this in separate patch.

> >>>  	pt_element_t pte;
> >>>  	pt_element_t __user *uninitialized_var(ptep_user);
> >>>  	gfn_t table_gfn;
> >>> @@ -322,7 +351,9 @@ retry_walk:
> >>>  		accessed_dirty &= pte >>
> >>>  			(PT_GUEST_DIRTY_SHIFT - PT_GUEST_ACCESSED_SHIFT);
> >>
> >> This shift is one of two things that bugged me.  I dislike including
> >> "wrong" code just because it is dead.  Perhaps you can #define the
> > Meanwhile I changed comment above this to be:
> >            /*
> >             * On a write fault, fold the dirty bit into accessed_dirty.
> >             * For modes without A/D bits support accessed_dirty will be
> >             * always clear.
> >             */
> 
> Good.
> 
> >> shifts to 8 and 9 already now, even if the masks stay 0?
> >>
> > Currently I do not see any problem with that, but we will have to be careful
> > that *_SHIFT values will not leak into a code where it could matter.
> 
> They would leak with PT_GUEST_*_SHIFT == 0 too, I think?  (And with
> worse effects, because they would use bit 0 and/or do shifts with
> negative counts).
> 
> Perhaps PT_GUEST_*_SHIFT can instead be defined to a function like
> __using_nonexistent_pte_bit(), so that compilation fails unless all such
> references are optimized away.  See arch/x86/include/asm/cmpxchg.h for
> an example:
> 
> /*
>  * Non-existant functions to indicate usage errors at link time
>  * (or compile-time if the compiler implements __compiletime_error().
>  */
> extern void __xchg_wrong_size(void)
>         __compiletime_error("Bad argument size for xchg");
> 
Nice! But not all of them are optimized in the correct stage. The one
in accessed_dirty calculation is reachable, but result is thrown away.
If I define *_SHIFT to be a function the code cannot be optimized since
compiler thinks that function call has side effects. Adding pure function
attribute solves that though, so I'll go with that.
 
--
			Gleb.

next prev parent reply	other threads:[~2013-07-30 14:22 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-07-25 10:59 [PATCH v4 00/13] Nested EPT Gleb Natapov
2013-07-25 10:59 ` [PATCH v4 01/13] nEPT: Support LOAD_IA32_EFER entry/exit controls for L1 Gleb Natapov
2013-07-29  8:32   ` Paolo Bonzini
2013-07-29 13:12     ` Gleb Natapov
2013-07-29 14:13       ` Paolo Bonzini
2013-07-25 10:59 ` [PATCH v4 02/13] nEPT: Fix cr3 handling in nested exit and entry Gleb Natapov
2013-07-25 10:59 ` [PATCH v4 03/13] nEPT: Fix wrong test in kvm_set_cr3 Gleb Natapov
2013-07-29  8:36   ` Paolo Bonzini
2013-07-29 10:43     ` Gleb Natapov
2013-07-31  8:02   ` Xiao Guangrong
2013-07-25 10:59 ` [PATCH v4 04/13] nEPT: Move common code to paging_tmpl.h Gleb Natapov
2013-07-31  8:02   ` Xiao Guangrong
2013-07-31  8:36     ` Gleb Natapov
2013-07-25 10:59 ` [PATCH v4 05/13] nEPT: make guest's A/D bits depends on guest's paging mode Gleb Natapov
2013-07-25 10:59 ` [PATCH v4 06/13] nEPT: Add EPT tables support to paging_tmpl.h Gleb Natapov
2013-07-29  9:48   ` Paolo Bonzini
2013-07-29 11:33     ` Gleb Natapov
2013-07-29 11:55       ` Paolo Bonzini
2013-07-29 12:24         ` Gleb Natapov
2013-07-29 13:19           ` Paolo Bonzini
2013-07-29 13:27             ` Gleb Natapov
2013-07-29 14:15               ` Paolo Bonzini
2013-07-29 16:14                 ` Gleb Natapov
2013-07-29 16:28                   ` Paolo Bonzini
2013-07-29 16:43                     ` Gleb Natapov
2013-07-29 17:06                       ` Paolo Bonzini
2013-07-29 17:11                         ` Gleb Natapov
2013-07-30 10:03   ` Paolo Bonzini
2013-07-30 11:56     ` Gleb Natapov
2013-07-30 12:13       ` Paolo Bonzini
2013-07-30 14:22         ` Gleb Natapov [this message]
2013-07-30 14:36           ` Gleb Natapov
2013-07-25 10:59 ` [PATCH v4 07/13] nEPT: Redefine EPT-specific link_shadow_page() Gleb Natapov
2013-07-25 10:59 ` [PATCH v4 08/13] nEPT: Nested INVEPT Gleb Natapov
2013-07-25 10:59 ` [PATCH v4 09/13] nEPT: Add nEPT violation/misconfigration support Gleb Natapov
2013-07-29  8:59   ` Paolo Bonzini
2013-07-29 10:52     ` Gleb Natapov
2013-07-29 10:59       ` Paolo Bonzini
2013-07-29 11:43         ` Gleb Natapov
2013-07-29 12:05           ` Paolo Bonzini
2013-07-29 12:34             ` Gleb Natapov
2013-07-29 13:11               ` Paolo Bonzini
2013-07-29 13:20                 ` Gleb Natapov
2013-07-29 14:12                   ` Paolo Bonzini
2013-07-29 16:24                     ` Gleb Natapov
2013-07-29 16:36                       ` Paolo Bonzini
2013-07-29 16:54                         ` Gleb Natapov
2013-07-25 10:59 ` [PATCH v4 10/13] nEPT: MMU context for nested EPT Gleb Natapov
2013-07-25 10:59 ` [PATCH v4 11/13] nEPT: Advertise EPT to L1 Gleb Natapov
2013-07-29  9:21   ` Paolo Bonzini
2013-07-29 11:11     ` Gleb Natapov
2013-07-29 11:33       ` Paolo Bonzini
2013-07-29 11:35         ` Gleb Natapov
2013-07-25 11:00 ` [PATCH v4 12/13] nEPT: Some additional comments Gleb Natapov
2013-07-25 11:00 ` [PATCH v4 13/13] nEPT: Miscelleneous cleanups Gleb Natapov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130730142232.GE25505@redhat.com \
    --to=gleb@redhat.com \
    --cc=jun.nakajima@Intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=xiaoguangrong@linux.vnet.ibm.com \
    --cc=yang.z.zhang@Intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.