From: Tim Deegan <tim@xen.org>
To: xiantao.zhang@intel.com
Cc: keir@xen.org, xen-devel@lists.xensource.com,
eddie.dong@intel.com, JBeulich@suse.com, jun.nakajima@intel.com
Subject: Re: [PATCH 03/11] nEPT: Implement guest ept's walker
Date: Thu, 13 Dec 2012 15:41:40 +0000 [thread overview]
Message-ID: <20121213154140.GI75286@ocelot.phlegethon.org> (raw)
In-Reply-To: <1355162243-11857-4-git-send-email-xiantao.zhang@intel.com>
At 01:57 +0800 on 11 Dec (1355191035), xiantao.zhang@intel.com wrote:
> From: Zhang Xiantao <xiantao.zhang@intel.com>
>
> Implment guest EPT PT walker, some logic is based on shadow's
> ia32e PT walker. During the PT walking, if the target pages are
> not in memory, use RETRY mechanism and get a chance to let the
> target page back.
>
> Signed-off-by: Zhang Xiantao <xiantao.zhang@intel.com>
The design looks pretty good. A few comments below on code details --
I think the only big one is that the ept walker shouldn't force eptes
into 'normal' pte types just so it can reuse the walk_t struct.
> @@ -88,10 +88,11 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, int set_dirty)
>
> /* If the map is non-NULL, we leave this function having
> * acquired an extra ref on mfn_to_page(*mfn) */
> -static inline void *map_domain_gfn(struct p2m_domain *p2m,
> +void *map_domain_gfn(struct p2m_domain *p2m,
> gfn_t gfn,
> mfn_t *mfn,
> p2m_type_t *p2mt,
> + p2m_query_t *q,
I think this should just be a plain p2m_query_t and not a pointer to
one; the code below only dereferences the pointer to read it. That will
save you having a variable just to hold 'P2M_ALLOC | P2M_UNSHARE' in a
few places below.
> --- /dev/null
> +++ b/xen/arch/x86/mm/hap/nested_ept.c
> +/* For EPT's walker reserved bits and EMT check */
> +#define EPT_MUST_RSV_BITS (((1ull << PADDR_BITS) -1) & \
> + ~((1ull << paddr_bits) - 1))
> +
> +
> +#define EPT_EMT_WB 6
> +#define EPT_EMT_UC 0
These two definitions should be in vmx.h along with the other
architectural constants for EPTEs.
> +
> +#define NEPT_VPID_CAP_BITS 0
> +
> +#define NEPT_1G_ENTRY_FLAG (1 << 11)
> +#define NEPT_2M_ENTRY_FLAG (1 << 10)
> +#define NEPT_4K_ENTRY_FLAG (1 << 9)
> +
> +/* Always expose 1G and 2M capability to guest,
> + so don't need additional check */
> +bool_t nept_sp_entry(uint64_t entry)
> +{
> + return !!(entry & EPTE_SUPER_PAGE_MASK);
> +}
> +
> +static bool_t nept_rsv_bits_check(uint64_t entry, uint32_t level)
> +{
> + uint64_t rsv_bits = EPT_MUST_RSV_BITS;
> +
> + switch ( level ){
> + case 1:
> + break;
> + case 2 ... 3:
> + if (nept_sp_entry(entry))
> + rsv_bits |= ((1ull << (9 * (level -1 ))) -1) << PAGE_SHIFT;
> + else
> + rsv_bits |= 0xfull << 3;
Please use EPTE_EMT_MASK rather than open-coding it.
> + break;
> + case 4:
> + rsv_bits |= 0xf8;
Again, please use EPTE_EMT_MASK | EPTE_IGMT_MASK | EPTE_SUPER_PAGE_MASK.
> + break;
> + default:
> + printk("Unsupported EPT paging level: %d\n", level);
> + }
> + if ( ((entry & rsv_bits) ^ rsv_bits) == rsv_bits )
> + return 0;
This XOR is useful in the normal walker because we care about _which_
bits are wrong. Here, you can just return !(entry & rsv_bits) for the
same result.
> + return 1;
> +}
> +
> +/* EMT checking*/
> +static bool_t nept_emt_bits_check(uint64_t entry, uint32_t level)
> +{
> + ept_entry_t e;
> + e.epte = entry;
> + if ( e.sp || level == 1 ) {
> + if ( e.emt == 2 || e.emt == 3 || e.emt == 7 )
> + return 1;
Please define more of the EPT_EMT_* constants for these values and use
them.
> + }
> + return 0;
> +}
> +
> +static bool_t nept_rwx_bits_check(uint64_t entry) {
> + /*write only or write/execute only*/
> + uint8_t rwx_bits = entry & 0x7;
> +
> + if ( rwx_bits == 2 || rwx_bits == 6)
> + return 1;
> + if ( rwx_bits == 4 && !(NEPT_VPID_CAP_BITS &
> + VMX_EPT_EXEC_ONLY_SUPPORTED))
Please pass the entry as an ept_entry_t and check the named r, w and x
fields rather than using magic numbers.
> + return 1;
> + return 0;
> +}
> +
> +/* nept's misconfiguration check */
> +static bool_t nept_misconfiguration_check(uint64_t entry, uint32_t level)
> +{
> + return (nept_rsv_bits_check(entry, level) ||
> + nept_emt_bits_check(entry, level) ||
> + nept_rwx_bits_check(entry));
> +}
> +
> +static bool_t nept_present_check(uint64_t entry)
> +{
> + if (entry & 0x7)
Again, please pass an ept_entry_t and check the r/w/x fields.
> + return 1;
> + return 0;
> +}
> +
> +uint64_t nept_get_ept_vpid_cap(void)
> +{
> + /*TODO: exposed ept and vpid features*/
This TODO comment doesn't get removed later in the series. Is returning
0 here always OK?
> + return NEPT_VPID_CAP_BITS;
> +}
> +
> +static uint32_t
> +nept_walk_tables(struct vcpu *v, unsigned long l2ga, walk_t *gw)
> +{
> + p2m_type_t p2mt;
> + uint32_t rc = 0, ret = 0, gflags;
> + struct domain *d = v->domain;
> + struct p2m_domain *p2m = d->arch.p2m;
> + gfn_t base_gfn = _gfn(nhvm_vcpu_p2m_base(v) >> PAGE_SHIFT);
> + p2m_query_t qt = P2M_ALLOC;
> +
> + guest_l1e_t *l1p = NULL;
> + guest_l2e_t *l2p = NULL;
> + guest_l3e_t *l3p = NULL;
> + guest_l4e_t *l4p = NULL;
These aren't realy guest_l*es, so I think you should use ept_entry_t *
to point to them. While you're at it, why not define an equivalent
ept_walk_t struct that uses the ept-specific types instead of putting
EPT entries in a normal walk_t?
Also, unlike the normal guest walker, you don't need to hold these maps
open for writing A/D bits, so you could just use a single pointer and
unmap as you go.
> + sp = nept_sp_entry(gw->l3e.l3);
> + /* Super 1G entry */
> + if ( sp )
> + {
> + /* Generate a fake l1 table entry so callers don't all
> + * have to understand superpages. */
You only have one caller for this function, and it does understand
superpages -- it explicitly check for them. So I think you can avoid
this part altogether (likewise for 2M superpages) and just DTRT in the
caller.
Cheers,
Tim.
> + gfn_t start = guest_l3e_get_gfn(gw->l3e);
> +
> + /* Increment the pfn by the right number of 4k pages. */
> + start = _gfn((gfn_x(start) & ~GUEST_L3_GFN_MASK) +
> + ((l2ga >> PAGE_SHIFT) & GUEST_L3_GFN_MASK));
> + gflags = (gw->l3e.l3 & 0x7f) | NEPT_1G_ENTRY_FLAG;
> + gw->l1e = guest_l1e_from_gfn(start, gflags);
> + gw->l2mfn = gw->l1mfn = _mfn(INVALID_MFN);
> + goto done;
> + }
> +
next prev parent reply other threads:[~2012-12-13 15:41 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-12-10 17:57 [PATCH 00/11] Add virtual EPT support Xen xiantao.zhang
2012-12-10 17:57 ` [PATCH 01/11] nestedhap: Change hostcr3 and p2m->cr3 to meaningful words xiantao.zhang
2012-12-13 14:52 ` Tim Deegan
2012-12-10 17:57 ` [PATCH 02/11] nestedhap: Change nested p2m's walker to vendor-specific xiantao.zhang
2012-12-13 14:52 ` Tim Deegan
2012-12-10 17:57 ` [PATCH 03/11] nEPT: Implement guest ept's walker xiantao.zhang
2012-12-13 15:41 ` Tim Deegan [this message]
2012-12-10 17:57 ` [PATCH 04/11] nEPT: Do further permission check for sucessful translation xiantao.zhang
2012-12-13 15:47 ` Tim Deegan
2012-12-10 17:57 ` [PATCH 05/11] EPT: Make ept data structure or operations neutral xiantao.zhang
2012-12-13 16:04 ` Tim Deegan
2012-12-17 8:57 ` Zhang, Xiantao
2012-12-17 9:56 ` Jan Beulich
2012-12-10 17:57 ` [PATCH 06/11] nEPT: Try to enable EPT paging for L2 guest xiantao.zhang
2012-12-13 16:16 ` Tim Deegan
2012-12-10 17:57 ` [PATCH 07/11] nEPT: Sync PDPTR fields if L2 guest in PAE paging mode xiantao.zhang
2012-12-13 16:17 ` Tim Deegan
2012-12-10 17:57 ` [PATCH 08/11] nEPT: Use minimal permission for nested p2m xiantao.zhang
2012-12-13 16:43 ` Tim Deegan
2012-12-10 17:57 ` [PATCH 09/11] nEPT: handle invept instruction from L1 VMM xiantao.zhang
2012-12-13 16:56 ` Tim Deegan
2012-12-10 17:57 ` [PATCH 10/11] nEPT: expost EPT capablity to " xiantao.zhang
2012-12-13 17:03 ` Tim Deegan
2012-12-10 17:57 ` [PATCH 11/11] nVMX: Expose VPID capability to nested VMM xiantao.zhang
2012-12-13 17:15 ` Tim Deegan
2012-12-13 0:31 ` [PATCH 00/11] Add virtual EPT support Xen Zhang, Xiantao
2012-12-13 10:25 ` Jan Beulich
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20121213154140.GI75286@ocelot.phlegethon.org \
--to=tim@xen.org \
--cc=JBeulich@suse.com \
--cc=eddie.dong@intel.com \
--cc=jun.nakajima@intel.com \
--cc=keir@xen.org \
--cc=xen-devel@lists.xensource.com \
--cc=xiantao.zhang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).