Re: [PATCH v5 10/14] nEPT: Add nEPT violation/misconfigration support


From: Gleb Natapov <gleb@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>,
	kvm@vger.kernel.org, Jun Nakajima <jun.nakajima@intel.com>,
	Yang Zhang <yang.z.zhang@intel.com>
Subject: Re: [PATCH v5 10/14] nEPT: Add nEPT violation/misconfigration support
Date: Thu, 1 Aug 2013 14:47:58 +0300
Message-ID: <20130801114758.GD6042@redhat.com>
In-Reply-To: <20130801111911.GB5245@mail.corp.redhat.com>

On Thu, Aug 01, 2013 at 01:19:11PM +0200, Paolo Bonzini wrote:
>  On Aug 01 2013, Gleb Natapov wrote:
> > On Thu, Aug 01, 2013 at 04:31:31PM +0800, Xiao Guangrong wrote:
> > > On 07/31/2013 10:48 PM, Gleb Natapov wrote:
> > > > From: Yang Zhang <yang.z.zhang@Intel.com>
> > > > 
> > > > Inject nEPT fault to L1 guest. This patch is original from Xinhao.
> > > > 
> > > > Signed-off-by: Jun Nakajima <jun.nakajima@intel.com>
> > > > Signed-off-by: Xinhao Xu <xinhao.xu@intel.com>
> > > > Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
> > > > Signed-off-by: Gleb Natapov <gleb@redhat.com>
> > > > ---
> > > >  arch/x86/include/asm/kvm_host.h |    4 ++++
> > > >  arch/x86/kvm/mmu.c              |   34 ++++++++++++++++++++++++++++++++++
> > > >  arch/x86/kvm/paging_tmpl.h      |   28 ++++++++++++++++++++++++----
> > > >  arch/x86/kvm/vmx.c              |   17 +++++++++++++++++
> > > >  4 files changed, 79 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > > > index 531f47c..58a17c0 100644
> > > > --- a/arch/x86/include/asm/kvm_host.h
> > > > +++ b/arch/x86/include/asm/kvm_host.h
> > > > @@ -286,6 +286,7 @@ struct kvm_mmu {
> > > >  	u64 *pae_root;
> > > >  	u64 *lm_root;
> > > >  	u64 rsvd_bits_mask[2][4];
> > > > +	u64 bad_mt_xwr;
> > > > 
> > > >  	/*
> > > >  	 * Bitmap: bit set = last pte in walk
> > > > @@ -512,6 +513,9 @@ struct kvm_vcpu_arch {
> > > >  	 * instruction.
> > > >  	 */
> > > >  	bool write_fault_to_shadow_pgtable;
> > > > +
> > > > +	/* set at EPT violation at this point */
> > > > +	unsigned long exit_qualification;
> > > >  };
> > > > 
> > > >  struct kvm_lpage_info {
> > > > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> > > > index 3df3ac3..58ae9db 100644
> > > > --- a/arch/x86/kvm/mmu.c
> > > > +++ b/arch/x86/kvm/mmu.c
> > > > @@ -3521,6 +3521,8 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
> > > >  	int maxphyaddr = cpuid_maxphyaddr(vcpu);
> > > >  	u64 exb_bit_rsvd = 0;
> > > > 
> > > > +	context->bad_mt_xwr = 0;
> > > > +
> > > >  	if (!context->nx)
> > > >  		exb_bit_rsvd = rsvd_bits(63, 63);
> > > >  	switch (context->root_level) {
> > > > @@ -3576,6 +3578,38 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
> > > >  	}
> > > >  }
> > > > 
> > > > +static void reset_rsvds_bits_mask_ept(struct kvm_vcpu *vcpu,
> > > > +		struct kvm_mmu *context, bool execonly)
> > > > +{
> > > > +	int maxphyaddr = cpuid_maxphyaddr(vcpu);
> > > > +	int pte;
> > > > +
> > > > +	context->rsvd_bits_mask[0][3] =
> > > > +		rsvd_bits(maxphyaddr, 51) | rsvd_bits(3, 7);
> > > > +	context->rsvd_bits_mask[0][2] =
> > > > +		rsvd_bits(maxphyaddr, 51) | rsvd_bits(3, 6);
> > > > +	context->rsvd_bits_mask[0][1] =
> > > > +		rsvd_bits(maxphyaddr, 51) | rsvd_bits(3, 6);
> > > > +	context->rsvd_bits_mask[0][0] = rsvd_bits(maxphyaddr, 51);
> > > > +
> > > > +	/* large page */
> > > > +	context->rsvd_bits_mask[1][3] = context->rsvd_bits_mask[0][3];
> > > > +	context->rsvd_bits_mask[1][2] =
> > > > +		rsvd_bits(maxphyaddr, 51) | rsvd_bits(12, 29);
> > > > +	context->rsvd_bits_mask[1][1] =
> > > > +		rsvd_bits(maxphyaddr, 51) | rsvd_bits(12, 20);
> > > > +	context->rsvd_bits_mask[1][0] = context->rsvd_bits_mask[0][0];
> > > > +	
> > > > +	for (pte = 0; pte < 64; pte++) {
> > > > +		int rwx_bits = pte & 7;
> > > > +		int mt = pte >> 3;
> > > > +		if (mt == 0x2 || mt == 0x3 || mt == 0x7 ||
> > > > +				rwx_bits == 0x2 || rwx_bits == 0x6 ||
> > > > +				(rwx_bits == 0x4 && !execonly))
> > > > +			context->bad_mt_xwr |= (1ull << pte);
> > > > +	}
> > > > +}
> > > > +
> > > >  static void update_permission_bitmask(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu)
> > > >  {
> > > >  	unsigned bit, byte, pfec;
> > > > diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
> > > > index 0d25351..ed6773e 100644
> > > > --- a/arch/x86/kvm/paging_tmpl.h
> > > > +++ b/arch/x86/kvm/paging_tmpl.h
> > > > @@ -127,12 +127,13 @@ static inline void FNAME(protect_clean_gpte)(unsigned *access, unsigned gpte)
> > > >  	*access &= mask;
> > > >  }
> > > > 
> > > > -static bool FNAME(is_rsvd_bits_set)(struct kvm_mmu *mmu, u64 gpte, int level)
> > > > +static bool inline FNAME(is_rsvd_bits_set)(struct kvm_mmu *mmu, u64 gpte,
> > > > +		int level)
> > > 
> > > Not sure this explicit "inline" is needed... Gcc always inlines the small and
> > > static functions.
> > 
> > Paolo asked for it, but now I see that I did it in the wrong patch. I do
> > not care much personally either way; I agree with you though, the compiler
> > will inline it anyway.
> 
> The point here was that if we use "||" below (or multiple "if"s as I
> suggested in my review), we really want to inline the function to ensure
> that the branches here are merged with the one in the caller's "if()".
> 
> With the "|" there is not much effect.
> 
Even with if(), do you really think there is a chance the function will not be
inlined? I see that much, much bigger functions are inlined.

> > > >  {
> > > > -	int bit7;
> > > > +	int bit7 = (gpte >> 7) & 1, low5 = gpte & 0x3f;
> 
> Low 6, actually.
Well, 5. This is a bug, should be 0x1f. Good catch :)
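
To make the width question concrete, here is a minimal standalone sketch of how the bitmap from the quoted hunk is built and looked up. It is not the kernel code: build_bad_mt_xwr(), pte_is_bad() and check_index_width() are hypothetical names, and the 0x3f mask is an assumption derived only from the loop in the patch, which walks 64 values (three XWR bits in bits 2:0 plus three memory-type bits in bits 5:3).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Bit N of the result is set when the low six PTE bits equal N and that
 * combination is invalid. The conditions mirror the loop in the quoted hunk.
 */
uint64_t build_bad_mt_xwr(bool execonly)
{
	uint64_t bad = 0;
	int pte;

	for (pte = 0; pte < 64; pte++) {
		int rwx_bits = pte & 7;	/* bits 2:0 */
		int mt = pte >> 3;	/* bits 5:3 */

		if (mt == 0x2 || mt == 0x3 || mt == 0x7 ||
		    rwx_bits == 0x2 || rwx_bits == 0x6 ||
		    (rwx_bits == 0x4 && !execonly))
			bad |= 1ull << pte;
	}
	return bad;
}

/* Look the PTE up with all six index bits (assumed mask: 0x3f). */
bool pte_is_bad(uint64_t bad_mt_xwr, uint64_t gpte)
{
	return (bad_mt_xwr >> (gpte & 0x3f)) & 1;
}

void check_index_width(void)
{
	uint64_t bad = build_bad_mt_xwr(false);
	uint64_t wb_rwx = (6 << 3) | 7;	/* mt = 6, rwx = 7: index 55 */

	/* With six index bits this entry is accepted. */
	assert(!pte_is_bad(bad, wb_rwx));
	/* Truncated to five bits it aliases to index 23 (mt = 2), which is set. */
	assert(pte_is_bad(bad, wb_rwx & 0x1f));
}
```

If the sketch matches the intent of the patch, a five-bit index would fold memory types 4..7 onto 0..3, so the mask width has to agree with the range the loop fills in.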

> 
> > > > -	bit7 = (gpte >> 7) & 1;
> > > > -	return (gpte & mmu->rsvd_bits_mask[bit7][level-1]) != 0;
> > > > +	return ((gpte & mmu->rsvd_bits_mask[bit7][level-1]) != 0) |
> > > > +		((mmu->bad_mt_xwr & (1ull << low5)) != 0);
> 
> If you really want to optimize this thing to avoid branches, you can
> also change it to
> 
>    return ((gpte & mmu->rsvd_bits_mask[bit7][level-1]) |
> 		(mmu->bad_mt_xwr & (1ull << low5))) != 0;
> 
Why not drop second != 0 then?
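
One reading of the question, sketched outside the kernel with hypothetical function names: as long as the function returns bool, the explicit comparison can be dropped entirely, because conversion to bool yields true for any nonzero value. The last function is a counter-example that assumes a narrower integer return type, where the same shortcut would lose high bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Form used in the patch: two comparisons joined with a bitwise OR. */
bool rsvd_two_tests(uint64_t a, uint64_t b)
{
	return (a != 0) | (b != 0);
}

/* Form suggested in the review: OR the masks, compare once. */
bool rsvd_one_test(uint64_t a, uint64_t b)
{
	return (a | b) != 0;
}

/* No explicit comparison: relies on the implicit conversion to bool. */
bool rsvd_no_test(uint64_t a, uint64_t b)
{
	return a | b;
}

/* Counter-example: a 32-bit return type silently drops bits 63:32. */
unsigned int rsvd_truncated(uint64_t a, uint64_t b)
{
	return (unsigned int)(a | b);
}

void check_forms(void)
{
	uint64_t high = 1ull << 51;	/* a reserved high address bit */

	assert(rsvd_two_tests(high, 0));
	assert(rsvd_one_test(high, 0));
	assert(rsvd_no_test(high, 0));
	assert(rsvd_truncated(high, 0) == 0);	/* the set bit is lost */
}
```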

> and consider adding the bits to bad_mt_xwr to detect non-presence
> at the same time as reserved bits (as non-presence is also tested
> near both callers of is_rsvd_bits_set).
We need to distinguish between the two of them at at least one call site.

> 
> On the other hand, it starts to look like useless complication not
> backed by any benchmark (and probably unmeasurable anyway).  Neither of
> the conditions is particularly easy to compute, and they are probably
> more expensive than a well-predicted branch.  Thus, in this case I
> would prefer to have clearer code and just use two "if"s, the second
> of which can be guarded to be done only for EPT (it doesn't have to
> be an #ifdef, no?).

return a | b

is less clear than:

if (a)
 return true;
if (b)
 return true;

?
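
For comparison, a sketch of the two-"if" variant with the second test guarded for EPT only, as the review suggests. Everything here is an assumption for illustration: struct mmu_sketch is a trimmed stand-in for struct kvm_mmu, the is_ept flag is not a field in the patch, and the 0x3f mask follows the 64-entry loop that fills bad_mt_xwr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct kvm_mmu. */
struct mmu_sketch {
	uint64_t rsvd_bits_mask[2][4];
	uint64_t bad_mt_xwr;
	bool is_ept;	/* assumption: not part of the patch */
};

bool is_rsvd_bits_set_sketch(const struct mmu_sketch *mmu, uint64_t gpte,
			     int level)
{
	int bit7 = (gpte >> 7) & 1;

	/* First test: reserved bits for this level and page size. */
	if (gpte & mmu->rsvd_bits_mask[bit7][level - 1])
		return true;
	/* Second test: invalid memory type / XWR combination, EPT only. */
	if (mmu->is_ept && (mmu->bad_mt_xwr & (1ull << (gpte & 0x3f))))
		return true;
	return false;
}

void check_two_ifs(void)
{
	/* Only index 2 (write-only, mt = 0) is marked bad in this toy setup. */
	struct mmu_sketch mmu = { .bad_mt_xwr = 1ull << 2, .is_ept = true };

	mmu.rsvd_bits_mask[0][0] = 1ull << 51;

	assert(is_rsvd_bits_set_sketch(&mmu, 1ull << 51, 1));
	assert(is_rsvd_bits_set_sketch(&mmu, 0x2, 1));
	assert(!is_rsvd_bits_set_sketch(&mmu, 0x7, 1));

	/* Without the EPT flag the bitmap is never consulted. */
	mmu.is_ept = false;
	assert(!is_rsvd_bits_set_sketch(&mmu, 0x2, 1));
}
```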

> 
> If you do not want to do that, now that this is checked also on non-EPT
> PTEs we should rename it to something else like bad_low_six_bits?
> Regular page tables have no MT and XWR bits.
> 
I think the current name better describes the purpose of the field. It
shows that for non-EPT it is irrelevant, but if we use it to detect
non-present PTEs for regular page tables too, the rename makes perfect
sense. Let's rename it then.

--
			Gleb.

