xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Tim Deegan <tim@xen.org>
To: xiantao.zhang@intel.com
Cc: keir@xen.org, xen-devel@lists.xensource.com,
	eddie.dong@intel.com, JBeulich@suse.com, jun.nakajima@intel.com
Subject: Re: [PATCH 03/11] nEPT: Implement guest ept's walker
Date: Thu, 13 Dec 2012 15:41:40 +0000	[thread overview]
Message-ID: <20121213154140.GI75286@ocelot.phlegethon.org> (raw)
In-Reply-To: <1355162243-11857-4-git-send-email-xiantao.zhang@intel.com>

At 01:57 +0800 on 11 Dec (1355191035), xiantao.zhang@intel.com wrote:
> From: Zhang Xiantao <xiantao.zhang@intel.com>
> 
> Implment guest EPT PT walker, some logic is based on shadow's
> ia32e PT walker. During the PT walking, if the target pages are
> not in memory, use RETRY mechanism and get a chance to let the
> target page back.
> 
> Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>

The design looks pretty good.  A few comments below on code details --
I think the only big one is that the ept walker shouldn't force eptes
into 'normal' pte types just so it can reuse the walk_t struct.

> @@ -88,10 +88,11 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, int set_dirty)
>  
>  /* If the map is non-NULL, we leave this function having 
>   * acquired an extra ref on mfn_to_page(*mfn) */
> -static inline void *map_domain_gfn(struct p2m_domain *p2m,
> +void *map_domain_gfn(struct p2m_domain *p2m,
>                                     gfn_t gfn, 
>                                     mfn_t *mfn,
>                                     p2m_type_t *p2mt,
> +                                   p2m_query_t *q,

I think this should just be a plain p2m_query_t and not a pointer to
one; the code below only dereferences the pointer to read it.  That will
save you having a variable just to hold 'P2M_ALLOC | P2M_UNSHARE' in a
few places below.

> --- /dev/null
> +++ b/xen/arch/x86/mm/hap/nested_ept.c
> +/* For EPT's walker reserved bits and EMT check  */
> +#define EPT_MUST_RSV_BITS (((1ull << PADDR_BITS) -1) & \
> +                     ~((1ull << paddr_bits) - 1))
> +
> +
> +#define EPT_EMT_WB  6
> +#define EPT_EMT_UC  0

These two definitions should be in vmx.h along with the other
architectural constants for EPTEs.

> +
> +#define NEPT_VPID_CAP_BITS 0
> +
> +#define NEPT_1G_ENTRY_FLAG (1 << 11)
> +#define NEPT_2M_ENTRY_FLAG (1 << 10)
> +#define NEPT_4K_ENTRY_FLAG (1 << 9)
> +
> +/* Always expose 1G and 2M capability to guest, 
> + so don't need additional check */
> +bool_t nept_sp_entry(uint64_t entry)
> +{
> +    return !!(entry & EPTE_SUPER_PAGE_MASK);
> +}
> +
> +static bool_t nept_rsv_bits_check(uint64_t entry, uint32_t level)
> +{
> +    uint64_t rsv_bits = EPT_MUST_RSV_BITS;
> +
> +    switch ( level ){
> +        case 1:
> +            break;
> +        case 2 ... 3:
> +            if (nept_sp_entry(entry))
> +                rsv_bits |=  ((1ull << (9 * (level -1 ))) -1) << PAGE_SHIFT;
> +            else
> +                rsv_bits |= 0xfull << 3;

Please use EPTE_EMT_MASK rather than open-coding it.

> +            break;
> +        case 4:
> +        rsv_bits |= 0xf8;

Again, please use EPTE_EMT_MASK | EPTE_IGMT_MASK | EPTE_SUPER_PAGE_MASK.

> +        break;
> +        default:
> +            printk("Unsupported EPT paging level: %d\n", level);
> +    }
> +    if ( ((entry & rsv_bits) ^ rsv_bits) == rsv_bits )
> +        return 0;

This XOR is useful in the normal walker because we care about _which_
bits are wrong.  Here, you can just return !(entry & rsv_bits) for the
same result.

> +    return 1;
> +}
> +
> +/* EMT checking*/
> +static bool_t nept_emt_bits_check(uint64_t entry, uint32_t level)
> +{
> +    ept_entry_t e;
> +    e.epte = entry;
> +    if ( e.sp || level == 1 ) {
> +        if ( e.emt == 2 || e.emt == 3 || e.emt == 7 )
> +            return 1;

Please define more of the EPT_EMT_* constants for these values and use
them.

> +    }
> +    return 0;
> +}
> +
> +static bool_t nept_rwx_bits_check(uint64_t entry) {
> +    /*write only or write/execute only*/
> +    uint8_t rwx_bits = entry & 0x7;
> +
> +    if ( rwx_bits == 2 || rwx_bits == 6)
> +        return 1;
> +    if ( rwx_bits == 4 && !(NEPT_VPID_CAP_BITS &
> +                        VMX_EPT_EXEC_ONLY_SUPPORTED))

Please pass the entry as an ept_entry_t and check the named r, w and x
fields rather than using magic numbers. 

> +        return 1;
> +    return 0;
> +}
> +
> +/* nept's misconfiguration check */
> +static bool_t nept_misconfiguration_check(uint64_t entry, uint32_t level)
> +{
> +    return (nept_rsv_bits_check(entry, level) ||
> +                nept_emt_bits_check(entry, level) ||
> +                nept_rwx_bits_check(entry));
> +}
> +
> +static bool_t nept_present_check(uint64_t entry)
> +{
> +    if (entry & 0x7)

Again, please pass an ept_entry_t and check the r/w/x fields. 

> +        return 1;
> +    return 0;
> +}
> +
> +uint64_t nept_get_ept_vpid_cap(void)
> +{
> +    /*TODO: exposed ept and vpid features*/

This TODO comment doesn't get removed later in the series.  Is returning
0 here always OK?

> +    return NEPT_VPID_CAP_BITS;
> +}
> +
> +static uint32_t
> +nept_walk_tables(struct vcpu *v, unsigned long l2ga, walk_t *gw)
> +{
> +    p2m_type_t p2mt;
> +    uint32_t rc = 0, ret = 0, gflags;
> +    struct domain *d = v->domain;
> +    struct p2m_domain *p2m = d->arch.p2m;
> +    gfn_t base_gfn = _gfn(nhvm_vcpu_p2m_base(v) >> PAGE_SHIFT);
> +    p2m_query_t qt = P2M_ALLOC;
> +
> +    guest_l1e_t *l1p = NULL;
> +    guest_l2e_t *l2p = NULL;
> +    guest_l3e_t *l3p = NULL;
> +    guest_l4e_t *l4p = NULL;

These aren't realy guest_l*es, so I think you should use ept_entry_t *
to point to them.  While you're at it, why not define an equivalent
ept_walk_t struct that uses the ept-specific types instead of putting
EPT entries in a normal walk_t?

Also, unlike the normal guest walker, you don't need to hold these maps
open for writing A/D bits, so you could just use a single pointer and
unmap as you go.

> +    sp = nept_sp_entry(gw->l3e.l3);
> +    /* Super 1G entry */
> +    if ( sp )
> +    {
> +        /* Generate a fake l1 table entry so callers don't all 
> +         * have to understand superpages. */

You only have one caller for this function, and it does understand
superpages -- it explicitly check for them.  So I think you can avoid
this part altogether (likewise for 2M superpages) and just DTRT in the
caller.

Cheers,

Tim.

> +        gfn_t start = guest_l3e_get_gfn(gw->l3e);
> +
> +        /* Increment the pfn by the right number of 4k pages. */
> +        start = _gfn((gfn_x(start) & ~GUEST_L3_GFN_MASK) +
> +                     ((l2ga >> PAGE_SHIFT) & GUEST_L3_GFN_MASK));
> +        gflags = (gw->l3e.l3 & 0x7f) | NEPT_1G_ENTRY_FLAG;
> +        gw->l1e = guest_l1e_from_gfn(start, gflags);
> +        gw->l2mfn = gw->l1mfn = _mfn(INVALID_MFN);
> +        goto done;
> +    }
> +

  reply	other threads:[~2012-12-13 15:41 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-12-10 17:57 [PATCH 00/11] Add virtual EPT support Xen xiantao.zhang
2012-12-10 17:57 ` [PATCH 01/11] nestedhap: Change hostcr3 and p2m->cr3 to meaningful words xiantao.zhang
2012-12-13 14:52   ` Tim Deegan
2012-12-10 17:57 ` [PATCH 02/11] nestedhap: Change nested p2m's walker to vendor-specific xiantao.zhang
2012-12-13 14:52   ` Tim Deegan
2012-12-10 17:57 ` [PATCH 03/11] nEPT: Implement guest ept's walker xiantao.zhang
2012-12-13 15:41   ` Tim Deegan [this message]
2012-12-10 17:57 ` [PATCH 04/11] nEPT: Do further permission check for sucessful translation xiantao.zhang
2012-12-13 15:47   ` Tim Deegan
2012-12-10 17:57 ` [PATCH 05/11] EPT: Make ept data structure or operations neutral xiantao.zhang
2012-12-13 16:04   ` Tim Deegan
2012-12-17  8:57     ` Zhang, Xiantao
2012-12-17  9:56       ` Jan Beulich
2012-12-10 17:57 ` [PATCH 06/11] nEPT: Try to enable EPT paging for L2 guest xiantao.zhang
2012-12-13 16:16   ` Tim Deegan
2012-12-10 17:57 ` [PATCH 07/11] nEPT: Sync PDPTR fields if L2 guest in PAE paging mode xiantao.zhang
2012-12-13 16:17   ` Tim Deegan
2012-12-10 17:57 ` [PATCH 08/11] nEPT: Use minimal permission for nested p2m xiantao.zhang
2012-12-13 16:43   ` Tim Deegan
2012-12-10 17:57 ` [PATCH 09/11] nEPT: handle invept instruction from L1 VMM xiantao.zhang
2012-12-13 16:56   ` Tim Deegan
2012-12-10 17:57 ` [PATCH 10/11] nEPT: expost EPT capablity to " xiantao.zhang
2012-12-13 17:03   ` Tim Deegan
2012-12-10 17:57 ` [PATCH 11/11] nVMX: Expose VPID capability to nested VMM xiantao.zhang
2012-12-13 17:15   ` Tim Deegan
2012-12-13  0:31 ` [PATCH 00/11] Add virtual EPT support Xen Zhang, Xiantao
2012-12-13 10:25   ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121213154140.GI75286@ocelot.phlegethon.org \
    --to=tim@xen.org \
    --cc=JBeulich@suse.com \
    --cc=eddie.dong@intel.com \
    --cc=jun.nakajima@intel.com \
    --cc=keir@xen.org \
    --cc=xen-devel@lists.xensource.com \
    --cc=xiantao.zhang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).