* [PATCH V7 1/5] x86/hvm: pkeys, disable pkeys for guests in non-paging mode
2016-01-27 8:30 [PATCH V7 0/5] x86/hvm: pkeys, add memory protection-key support Huaitong Han
@ 2016-01-27 8:30 ` Huaitong Han
2016-01-29 16:27 ` Jan Beulich
2016-01-27 8:30 ` [PATCH V7 2/5] x86/hvm: pkeys, add pkeys support for guest_walk_tables Huaitong Han
` (3 subsequent siblings)
4 siblings, 1 reply; 7+ messages in thread
From: Huaitong Han @ 2016-01-27 8:30 UTC (permalink / raw)
To: jbeulich, andrew.cooper3, george.dunlap, tim, keir
Cc: Huaitong Han, xen-devel
Changes in v7:
no changes.
----
This patch disables pkeys for guests in non-paging mode. However, Xen always
uses paging mode to emulate guest non-paging mode; to emulate this behavior,
pkeys need to be manually disabled when the guest switches to non-paging mode.
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
xen/arch/x86/hvm/vmx/vmx.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 04dde83..a0d51cb 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1368,12 +1368,13 @@ static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr)
if ( !hvm_paging_enabled(v) )
{
/*
- * SMEP/SMAP is disabled if CPU is in non-paging mode in hardware.
- * However Xen always uses paging mode to emulate guest non-paging
- * mode. To emulate this behavior, SMEP/SMAP needs to be manually
- * disabled when guest VCPU is in non-paging mode.
+ * SMEP/SMAP/PKU is disabled if CPU is in non-paging mode in
+ * hardware. However Xen always uses paging mode to emulate guest
+ * non-paging mode. To emulate this behavior, SMEP/SMAP/PKU needs
+ * to be manually disabled when guest VCPU is in non-paging mode.
*/
- v->arch.hvm_vcpu.hw_cr[4] &= ~(X86_CR4_SMEP | X86_CR4_SMAP);
+ v->arch.hvm_vcpu.hw_cr[4] &=
+ ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE);
}
__vmwrite(GUEST_CR4, v->arch.hvm_vcpu.hw_cr[4]);
break;
--
2.4.3
* Re: [PATCH V7 1/5] x86/hvm: pkeys, disable pkeys for guests in non-paging mode
2016-01-27 8:30 ` [PATCH V7 1/5] x86/hvm: pkeys, disable pkeys for guests in non-paging mode Huaitong Han
@ 2016-01-29 16:27 ` Jan Beulich
0 siblings, 0 replies; 7+ messages in thread
From: Jan Beulich @ 2016-01-29 16:27 UTC (permalink / raw)
To: Huaitong Han; +Cc: george.dunlap, andrew.cooper3, tim, keir, xen-devel
>>> On 27.01.16 at 09:30, <huaitong.han@intel.com> wrote:
> Changes in v7:
> no changes.
> ----
>
> This patch disables pkeys for guests in non-paging mode. However, Xen always
> uses paging mode to emulate guest non-paging mode; to emulate this behavior,
> pkeys need to be manually disabled when the guest switches to non-paging mode.
>
> Signed-off-by: Huaitong Han <huaitong.han@intel.com>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
The "Changes in" belongs here, the way it's done now will require
extra work while committing. Hence this needs to be resent in
proper shape.
Jan
* [PATCH V7 2/5] x86/hvm: pkeys, add pkeys support for guest_walk_tables
2016-01-27 8:30 [PATCH V7 0/5] x86/hvm: pkeys, add memory protection-key support Huaitong Han
2016-01-27 8:30 ` [PATCH V7 1/5] x86/hvm: pkeys, disable pkeys for guests in non-paging mode Huaitong Han
@ 2016-01-27 8:30 ` Huaitong Han
2016-01-27 8:30 ` [PATCH V7 3/5] x86/hvm: pkeys, add xstate support for pkeys Huaitong Han
` (2 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Huaitong Han @ 2016-01-27 8:30 UTC (permalink / raw)
To: jbeulich, andrew.cooper3, george.dunlap, tim, keir
Cc: Huaitong Han, xen-devel
Changes in v7:
*Add static for pkey_fault.
*Add a comment for page present check and adjust indentation.
*Init pkru_ad and pkru_wd.
*Delete the outer parentheses in l3e_get_pkey.
*The first parameter of read_pkru_* now uses the uint32_t type.
----
Protection keys define a new 4-bit protection key field (PKEY) in bits 62:59 of
leaf entries of the page tables.
The PKRU register is 32 bits wide: it holds 16 domains with 2 attribute bits
per domain. For each i (0 ≤ i ≤ 15), PKRU[2i] is the access-disable bit for
protection key i (ADi) and PKRU[2i+1] is the write-disable bit for protection
key i (WDi). The PKEY field is an index into these domains.
A fault is considered a PKU violation if all of the following conditions are
true:
1. CR4.PKE=1.
2. EFER.LMA=1.
3. The page is present with no reserved-bit violations.
4. The access is not an instruction fetch.
5. The access is to a user page.
6. PKRU.AD=1, or
   the access is a data write and PKRU.WD=1,
   and either CR0.WP=1 or it is a user access.
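For illustration, a minimal stand-alone C sketch of how the AD/WD bits of a
given protection key are extracted from PKRU and combined into the violation
check above (the helper names here are hypothetical, not the Xen internals):

    #include <stdbool.h>
    #include <stdint.h>

    #define PKRU_ATTRS 2                     /* AD + WD bits per key */

    /* ADi = PKRU[2i] */
    static bool pkru_access_disabled(uint32_t pkru, unsigned int pkey)
    {
        return (pkru >> (pkey * PKRU_ATTRS)) & 1;
    }

    /* WDi = PKRU[2i+1] */
    static bool pkru_write_disabled(uint32_t pkru, unsigned int pkey)
    {
        return (pkru >> (pkey * PKRU_ATTRS + 1)) & 1;
    }

    /* Condition 6 only; conditions 1-5 are assumed to hold already. */
    static bool pkey_violation(uint32_t pkru, unsigned int pkey,
                               bool data_write, bool cr0_wp, bool user_access)
    {
        return pkru_access_disabled(pkru, pkey) ||
               (pkru_write_disabled(pkru, pkey) && data_write &&
                (cr0_wp || user_access));
    }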
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
---
xen/arch/x86/mm/guest_walk.c | 54 +++++++++++++++++++++++++++++++++++++++
xen/arch/x86/mm/hap/guest_walk.c | 3 +++
xen/include/asm-x86/guest_pt.h | 12 +++++++++
xen/include/asm-x86/hvm/hvm.h | 2 ++
xen/include/asm-x86/page.h | 5 ++++
xen/include/asm-x86/processor.h | 40 +++++++++++++++++++++++++++++
xen/include/asm-x86/x86_64/page.h | 12 +++++++++
7 files changed, 128 insertions(+)
diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c
index 18d1acf..5e1111b 100644
--- a/xen/arch/x86/mm/guest_walk.c
+++ b/xen/arch/x86/mm/guest_walk.c
@@ -90,6 +90,54 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, int set_dirty)
return 0;
}
+#if GUEST_PAGING_LEVELS >= 4
+static bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec,
+                         uint32_t pte_flags, uint32_t pte_pkey)
+{
+    uint32_t pkru = 0;
+    bool_t pkru_ad = 0, pkru_wd = 0;
+
+    /* When page isn't present, PKEY isn't checked. */
+    if ( !(pfec & PFEC_page_present) || is_pv_vcpu(vcpu) )
+        return 0;
+
+    /*
+     * PKU: additional mechanism by which the paging controls
+     * access to user-mode addresses based on the value in the
+     * PKRU register. A fault is considered as a PKU violation if all
+     * of the following conditions are true:
+     * 1.CR4_PKE=1.
+     * 2.EFER_LMA=1.
+     * 3.Page is present with no reserved bit violations.
+     * 4.The access is not an instruction fetch.
+     * 5.The access is to a user page.
+     * 6.PKRU.AD=1 or
+     *     the access is a data write and PKRU.WD=1 and
+     *     either CR0.WP=1 or it is a user access.
+     */
+    if ( !hvm_pku_enabled(vcpu) ||
+         !hvm_long_mode_enabled(vcpu) ||
+         /* The present bit is guaranteed by the caller. */
+         (pfec & PFEC_reserved_bit) ||
+         (pfec & PFEC_insn_fetch) ||
+         !(pte_flags & _PAGE_USER) )
+        return 0;
+
+    pkru = read_pkru();
+    if ( unlikely(pkru) )
+    {
+        pkru_ad = read_pkru_ad(pkru, pte_pkey);
+        pkru_wd = read_pkru_wd(pkru, pte_pkey);
+        /* Condition 6 */
+        if ( pkru_ad || (pkru_wd && (pfec & PFEC_write_access) &&
+                         (hvm_wp_enabled(vcpu) || (pfec & PFEC_user_mode))))
+            return 1;
+    }
+
+    return 0;
+}
+#endif
+
/* Walk the guest pagetables, after the manner of a hardware walker. */
/* Because the walk is essentially random, it can cause a deadlock
* warning in the p2m locking code. Highly unlikely this is an actual
@@ -107,6 +155,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
guest_l3e_t *l3p = NULL;
guest_l4e_t *l4p;
#endif
+ unsigned int pkey;
uint32_t gflags, mflags, iflags, rc = 0;
bool_t smep = 0, smap = 0;
bool_t pse1G = 0, pse2M = 0;
@@ -190,6 +239,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
goto out;
/* Get the l3e and check its flags*/
gw->l3e = l3p[guest_l3_table_offset(va)];
+ pkey = guest_l3e_get_pkey(gw->l3e);
gflags = guest_l3e_get_flags(gw->l3e) ^ iflags;
if ( !(gflags & _PAGE_PRESENT) ) {
rc |= _PAGE_PRESENT;
@@ -261,6 +311,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
#endif /* All levels... */
+ pkey = guest_l2e_get_pkey(gw->l2e);
gflags = guest_l2e_get_flags(gw->l2e) ^ iflags;
if ( !(gflags & _PAGE_PRESENT) ) {
rc |= _PAGE_PRESENT;
@@ -324,6 +375,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
if(l1p == NULL)
goto out;
gw->l1e = l1p[guest_l1_table_offset(va)];
+ pkey = guest_l1e_get_pkey(gw->l1e);
gflags = guest_l1e_get_flags(gw->l1e) ^ iflags;
if ( !(gflags & _PAGE_PRESENT) ) {
rc |= _PAGE_PRESENT;
@@ -334,6 +386,8 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
#if GUEST_PAGING_LEVELS >= 4 /* 64-bit only... */
set_ad:
+ if ( pkey_fault(v, pfec, gflags, pkey) )
+ rc |= _PAGE_PKEY_BITS;
#endif
/* Now re-invert the user-mode requirement for SMEP and SMAP */
if ( smep || smap )
diff --git a/xen/arch/x86/mm/hap/guest_walk.c b/xen/arch/x86/mm/hap/guest_walk.c
index 11c1b35..49d0328 100644
--- a/xen/arch/x86/mm/hap/guest_walk.c
+++ b/xen/arch/x86/mm/hap/guest_walk.c
@@ -130,6 +130,9 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PAGING_LEVELS)(
if ( missing & _PAGE_INVALID_BITS )
pfec[0] |= PFEC_reserved_bit;
+ if ( missing & _PAGE_PKEY_BITS )
+ pfec[0] |= PFEC_prot_key;
+
if ( missing & _PAGE_PAGED )
pfec[0] = PFEC_page_paged;
diff --git a/xen/include/asm-x86/guest_pt.h b/xen/include/asm-x86/guest_pt.h
index 3447973..eb29e62 100644
--- a/xen/include/asm-x86/guest_pt.h
+++ b/xen/include/asm-x86/guest_pt.h
@@ -81,6 +81,11 @@ static inline u32 guest_l1e_get_flags(guest_l1e_t gl1e)
static inline u32 guest_l2e_get_flags(guest_l2e_t gl2e)
{ return gl2e.l2 & 0xfff; }
+static inline u32 guest_l1e_get_pkey(guest_l1e_t gl1e)
+{ return 0; }
+static inline u32 guest_l2e_get_pkey(guest_l2e_t gl2e)
+{ return 0; }
+
static inline guest_l1e_t guest_l1e_from_gfn(gfn_t gfn, u32 flags)
{ return (guest_l1e_t) { (gfn_x(gfn) << PAGE_SHIFT) | flags }; }
static inline guest_l2e_t guest_l2e_from_gfn(gfn_t gfn, u32 flags)
@@ -154,6 +159,13 @@ static inline u32 guest_l4e_get_flags(guest_l4e_t gl4e)
{ return l4e_get_flags(gl4e); }
#endif
+static inline u32 guest_l1e_get_pkey(guest_l1e_t gl1e)
+{ return l1e_get_pkey(gl1e); }
+static inline u32 guest_l2e_get_pkey(guest_l2e_t gl2e)
+{ return l2e_get_pkey(gl2e); }
+static inline u32 guest_l3e_get_pkey(guest_l3e_t gl3e)
+{ return l3e_get_pkey(gl3e); }
+
static inline guest_l1e_t guest_l1e_from_gfn(gfn_t gfn, u32 flags)
{ return l1e_from_pfn(gfn_x(gfn), flags); }
static inline guest_l2e_t guest_l2e_from_gfn(gfn_t gfn, u32 flags)
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index a87224b..731dd44 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -277,6 +277,8 @@ int hvm_girq_dest_2_vcpu_id(struct domain *d, uint8_t dest, uint8_t dest_mode);
(hvm_paging_enabled(v) && ((v)->arch.hvm_vcpu.guest_cr[4] & X86_CR4_SMAP))
#define hvm_nx_enabled(v) \
(!!((v)->arch.hvm_vcpu.guest_efer & EFER_NX))
+#define hvm_pku_enabled(v) \
+ (hvm_paging_enabled(v) && ((v)->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE))
/* Can we use superpages in the HAP p2m table? */
#define hap_has_1gb (!!(hvm_funcs.hap_capabilities & HVM_HAP_SUPERPAGE_1GB))
diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h
index a095a93..9202f3d 100644
--- a/xen/include/asm-x86/page.h
+++ b/xen/include/asm-x86/page.h
@@ -93,6 +93,11 @@
#define l3e_get_flags(x) (get_pte_flags((x).l3))
#define l4e_get_flags(x) (get_pte_flags((x).l4))
+/* Get pte pkeys (unsigned int). */
+#define l1e_get_pkey(x) get_pte_pkey((x).l1)
+#define l2e_get_pkey(x) get_pte_pkey((x).l2)
+#define l3e_get_pkey(x) get_pte_pkey((x).l3)
+
/* Construct an empty pte. */
#define l1e_empty() ((l1_pgentry_t) { 0 })
#define l2e_empty() ((l2_pgentry_t) { 0 })
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index 26ba141..9799dd3 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -374,6 +374,46 @@ static always_inline void clear_in_cr4 (unsigned long mask)
write_cr4(read_cr4() & ~mask);
}
+static inline unsigned int read_pkru(void)
+{
+    unsigned int pkru;
+    unsigned long cr4 = read_cr4();
+
+    /*
+     * _PAGE_PKEY_BITS have a conflict with _PAGE_GNTTAB used by PV guests,
+     * so that X86_CR4_PKE is disabled on hypervisor. To use RDPKRU, CR4.PKE
+     * gets temporarily enabled.
+     */
+    write_cr4(cr4 | X86_CR4_PKE);
+    asm volatile (".byte 0x0f,0x01,0xee"
+                  : "=a" (pkru) : "c" (0) : "dx");
+    write_cr4(cr4);
+
+    return pkru;
+}
+
+/* Macros for PKRU domain */
+#define PKRU_READ  (0)
+#define PKRU_WRITE (1)
+#define PKRU_ATTRS (2)
+
+/*
+ * PKRU defines 32 bits, there are 16 domains and 2 attribute bits per
+ * domain in pkru, pkeys is index to a defined domain, so the value of
+ * pte_pkeys * PKRU_ATTRS + R/W is offset of a defined domain attribute.
+ */
+static inline bool_t read_pkru_ad(uint32_t pkru, unsigned int pkey)
+{
+    ASSERT(pkey < 16);
+    return (pkru >> (pkey * PKRU_ATTRS + PKRU_READ)) & 1;
+}
+
+static inline bool_t read_pkru_wd(uint32_t pkru, unsigned int pkey)
+{
+    ASSERT(pkey < 16);
+    return (pkru >> (pkey * PKRU_ATTRS + PKRU_WRITE)) & 1;
+}
+
/*
* NSC/Cyrix CPU configuration register indexes
*/
diff --git a/xen/include/asm-x86/x86_64/page.h b/xen/include/asm-x86/x86_64/page.h
index 19ab4d0..86abb94 100644
--- a/xen/include/asm-x86/x86_64/page.h
+++ b/xen/include/asm-x86/x86_64/page.h
@@ -134,6 +134,18 @@ typedef l4_pgentry_t root_pgentry_t;
#define get_pte_flags(x) (((int)((x) >> 40) & ~0xFFF) | ((int)(x) & 0xFFF))
#define put_pte_flags(x) (((intpte_t)((x) & ~0xFFF) << 40) | ((x) & 0xFFF))
+/*
+ * Protection keys define a new 4-bit protection key field
+ * (PKEY) in bits 62:59 of leaf entries of the page tables.
+ * This corresponds to bit 22:19 of a 24-bit flags.
+ *
+ * Notice: Bit 22 is used by _PAGE_GNTTAB which is visible to PV guests,
+ * so Protection keys must be disabled on PV guests.
+ */
+#define _PAGE_PKEY_BITS (0x780000) /* Protection Keys, 22:19 */
+
+#define get_pte_pkey(x) (MASK_EXTR(get_pte_flags(x), _PAGE_PKEY_BITS))
+
/* Bit 23 of a 24-bit flag mask. This corresponds to bit 63 of a pte.*/
#define _PAGE_NX_BIT (1U<<23)
--
2.4.3
* [PATCH V7 3/5] x86/hvm: pkeys, add xstate support for pkeys
2016-01-27 8:30 [PATCH V7 0/5] x86/hvm: pkeys, add memory protection-key support Huaitong Han
2016-01-27 8:30 ` [PATCH V7 1/5] x86/hvm: pkeys, disable pkeys for guests in non-paging mode Huaitong Han
2016-01-27 8:30 ` [PATCH V7 2/5] x86/hvm: pkeys, add pkeys support for guest_walk_tables Huaitong Han
@ 2016-01-27 8:30 ` Huaitong Han
2016-01-27 8:30 ` [PATCH V7 4/5] xen/mm: Clean up pfec handling in gva_to_gfn Huaitong Han
2016-01-27 8:30 ` [PATCH V7 5/5] x86/hvm: pkeys, add pkeys support for cpuid handling Huaitong Han
4 siblings, 0 replies; 7+ messages in thread
From: Huaitong Han @ 2016-01-27 8:30 UTC (permalink / raw)
To: jbeulich, andrew.cooper3, george.dunlap, tim, keir
Cc: Huaitong Han, xen-devel
Changes in v7:
*Use EOPNOTSUPP instead of EINVAL as the return value for the is_pv_vcpu condition.
---
The XSAVE feature set can operate on PKRU state only if the feature set is
enabled (CR4.OSXSAVE = 1) and has been configured to manage PKRU state
(XCR0[9] = 1). In addition, XCR0.PKRU is disabled for PV mode, since the PKU
feature is not enabled for PV guests.
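For illustration, a guest-side sketch (a hypothetical helper, not part of the
patch, assuming GCC/clang's <cpuid.h>) of checking that the OS has enabled
XSAVE and that XCR0[9] covers PKRU state before relying on XSAVE-managed
protection-key state:

    #include <cpuid.h>
    #include <stdbool.h>
    #include <stdint.h>

    #define XSTATE_PKRU (1ULL << 9)

    static bool xsave_manages_pkru(void)
    {
        unsigned int eax, ebx, ecx, edx;
        uint32_t lo, hi;

        /* CPUID.1:ECX.OSXSAVE (bit 27) mirrors CR4.OSXSAVE. */
        if ( !__get_cpuid(1, &eax, &ebx, &ecx, &edx) || !(ecx & (1u << 27)) )
            return false;

        /* XGETBV with ECX=0 reads XCR0. */
        asm volatile ( "xgetbv" : "=a" (lo), "=d" (hi) : "c" (0) );
        return (((uint64_t)hi << 32) | lo) & XSTATE_PKRU;
    }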
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
xen/arch/x86/xstate.c | 4 ++++
xen/include/asm-x86/xstate.h | 4 +++-
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index 4e87ab3..50d9e48 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -579,6 +579,10 @@ int handle_xsetbv(u32 index, u64 new_bv)
if ( (new_bv & ~xfeature_mask) || !valid_xcr0(new_bv) )
return -EINVAL;
+    /* XCR0.PKRU is disabled on PV mode. */
+    if ( is_pv_vcpu(curr) && (new_bv & XSTATE_PKRU) )
+        return -EOPNOTSUPP;
+
if ( !set_xcr0(new_bv) )
return -EFAULT;
diff --git a/xen/include/asm-x86/xstate.h b/xen/include/asm-x86/xstate.h
index 12d939b..f7c41ba 100644
--- a/xen/include/asm-x86/xstate.h
+++ b/xen/include/asm-x86/xstate.h
@@ -34,13 +34,15 @@
#define XSTATE_OPMASK (1ULL << 5)
#define XSTATE_ZMM (1ULL << 6)
#define XSTATE_HI_ZMM (1ULL << 7)
+#define XSTATE_PKRU (1ULL << 9)
#define XSTATE_LWP (1ULL << 62) /* AMD lightweight profiling */
#define XSTATE_FP_SSE (XSTATE_FP | XSTATE_SSE)
#define XCNTXT_MASK (XSTATE_FP | XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK | \
XSTATE_ZMM | XSTATE_HI_ZMM | XSTATE_NONLAZY)
#define XSTATE_ALL (~(1ULL << 63))
-#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR)
+#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR | \
+ XSTATE_PKRU)
#define XSTATE_LAZY (XSTATE_ALL & ~XSTATE_NONLAZY)
#define XSTATE_COMPACTION_ENABLED (1ULL << 63)
--
2.4.3
* [PATCH V7 4/5] xen/mm: Clean up pfec handling in gva_to_gfn
2016-01-27 8:30 [PATCH V7 0/5] x86/hvm: pkeys, add memory protection-key support Huaitong Han
` (2 preceding siblings ...)
2016-01-27 8:30 ` [PATCH V7 3/5] x86/hvm: pkeys, add xstate support for pkeys Huaitong Han
@ 2016-01-27 8:30 ` Huaitong Han
2016-01-27 8:30 ` [PATCH V7 5/5] x86/hvm: pkeys, add pkeys support for cpuid handling Huaitong Han
4 siblings, 0 replies; 7+ messages in thread
From: Huaitong Han @ 2016-01-27 8:30 UTC (permalink / raw)
To: jbeulich, andrew.cooper3, george.dunlap, tim, keir
Cc: Huaitong Han, George Dunlap, xen-devel
From: George Dunlap <george.dunlap@citrix.com>
Changes in v7:
*Update SDM chapter comments.
*Add hvm_vcpu check in sh_gva_to_gfn.
---
At the moment, the pfec argument to gva_to_gfn has two functions:
* To inform guest_walk what kind of access is happening
* As a value to pass back into the guest in the event of a fault.
Unfortunately this is not quite treated consistently: the hvm_fetch_*
functions will "pre-clear" the PFEC_insn_fetch flag before calling
gva_to_gfn, meaning guest_walk doesn't actually know whether a given
access is an instruction fetch or not. This works now, but will cause
issues when pkeys are introduced, since guest_walk will need to know
whether an access is an instruction fetch even if it doesn't return
PFEC_insn_fetch.
Fix this by cleanly separating the in and out functionality
of the pfec argument:
1. Always pass in the access type to gva_to_gfn
2. Filter out inappropriate access flags before returning from gva_to_gfn.
(The PFEC_insn_fetch flag should only be passed to the guest if either NX or
SMEP is enabled. See Intel 64 Developer's Manual, Volume 3, Chapter Paging,
PAGE-FAULT EXCEPTIONS)
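For illustration, a small stand-alone sketch (hypothetical names, not the Xen
code) of the intended split: the walker always receives the real access type,
and PFEC_insn_fetch is only reflected back to the guest when NX or SMEP is
enabled:

    #include <stdbool.h>
    #include <stdint.h>

    #define PFEC_insn_fetch (1u << 4)       /* I/D bit of the error code */

    static uint32_t pfec_to_report(uint32_t pfec, bool nx_enabled,
                                   bool smep_enabled)
    {
        /*
         * The full pfec (including PFEC_insn_fetch) was passed into the
         * walker; filter the flag out of what the guest sees unless NX
         * or SMEP is enabled, per the SDM page-fault error-code rules.
         */
        if ( !nx_enabled && !smep_enabled )
            pfec &= ~PFEC_insn_fetch;

        return pfec;
    }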
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
xen/arch/x86/hvm/hvm.c | 8 ++------
xen/arch/x86/mm/hap/guest_walk.c | 10 +++++++++-
xen/arch/x86/mm/shadow/multi.c | 6 ++++++
3 files changed, 17 insertions(+), 7 deletions(-)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 674feea..5ec2ae1 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4438,11 +4438,9 @@ enum hvm_copy_result hvm_copy_from_guest_virt(
enum hvm_copy_result hvm_fetch_from_guest_virt(
void *buf, unsigned long vaddr, int size, uint32_t pfec)
{
- if ( hvm_nx_enabled(current) || hvm_smep_enabled(current) )
- pfec |= PFEC_insn_fetch;
return __hvm_copy(buf, vaddr, size,
HVMCOPY_from_guest | HVMCOPY_fault | HVMCOPY_virt,
- PFEC_page_present | pfec);
+ PFEC_page_present | PFEC_insn_fetch | pfec);
}
enum hvm_copy_result hvm_copy_to_guest_virt_nofault(
@@ -4464,11 +4462,9 @@ enum hvm_copy_result hvm_copy_from_guest_virt_nofault(
enum hvm_copy_result hvm_fetch_from_guest_virt_nofault(
void *buf, unsigned long vaddr, int size, uint32_t pfec)
{
- if ( hvm_nx_enabled(current) || hvm_smep_enabled(current) )
- pfec |= PFEC_insn_fetch;
return __hvm_copy(buf, vaddr, size,
HVMCOPY_from_guest | HVMCOPY_no_fault | HVMCOPY_virt,
- PFEC_page_present | pfec);
+ PFEC_page_present | PFEC_insn_fetch | pfec);
}
unsigned long copy_to_user_hvm(void *to, const void *from, unsigned int len)
diff --git a/xen/arch/x86/mm/hap/guest_walk.c b/xen/arch/x86/mm/hap/guest_walk.c
index 49d0328..d2716f9 100644
--- a/xen/arch/x86/mm/hap/guest_walk.c
+++ b/xen/arch/x86/mm/hap/guest_walk.c
@@ -82,7 +82,7 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PAGING_LEVELS)(
if ( !top_page )
{
pfec[0] &= ~PFEC_page_present;
- return INVALID_GFN;
+ goto out_tweak_pfec;
}
top_mfn = _mfn(page_to_mfn(top_page));
@@ -139,6 +139,14 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PAGING_LEVELS)(
if ( missing & _PAGE_SHARED )
pfec[0] = PFEC_page_shared;
+ out_tweak_pfec:
+    /*
+     * SDM Intel 64 Volume 3, Chapter Paging, PAGE-FAULT EXCEPTIONS:
+     * The PFEC_insn_fetch flag is set only when NX or SMEP are enabled.
+     */
+    if ( !hvm_nx_enabled(v) && !hvm_smep_enabled(v) )
+        pfec[0] &= ~PFEC_insn_fetch;
+
return INVALID_GFN;
}
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 162c06f..d42597c 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3669,6 +3669,12 @@ sh_gva_to_gfn(struct vcpu *v, struct p2m_domain *p2m,
pfec[0] &= ~PFEC_page_present;
if ( missing & _PAGE_INVALID_BITS )
pfec[0] |= PFEC_reserved_bit;
+        /*
+         * SDM Intel 64 Volume 3, Chapter Paging, PAGE-FAULT EXCEPTIONS:
+         * The PFEC_insn_fetch flag is set only when NX or SMEP are enabled.
+         */
+        if ( is_hvm_vcpu(v) && !hvm_nx_enabled(v) && !hvm_smep_enabled(v) )
+            pfec[0] &= ~PFEC_insn_fetch;
return INVALID_GFN;
}
gfn = guest_walk_to_gfn(&gw);
--
2.4.3
* [PATCH V7 5/5] x86/hvm: pkeys, add pkeys support for cpuid handling
2016-01-27 8:30 [PATCH V7 0/5] x86/hvm: pkeys, add memory protection-key support Huaitong Han
` (3 preceding siblings ...)
2016-01-27 8:30 ` [PATCH V7 4/5] xen/mm: Clean up pfec handling in gva_to_gfn Huaitong Han
@ 2016-01-27 8:30 ` Huaitong Han
4 siblings, 0 replies; 7+ messages in thread
From: Huaitong Han @ 2016-01-27 8:30 UTC (permalink / raw)
To: jbeulich, andrew.cooper3, george.dunlap, tim, keir
Cc: Huaitong Han, xen-devel
Changes in v7:
*Rebase in the latest tree.
*Add a comment for cpu_has_xsave adjustment.
*Adjust indentation.
---
This patch adds pkeys support for cpuid handling.
Pkeys hardware support is CPUID.7.0.ECX[3]:PKU; software support is
CPUID.7.0.ECX[4]:OSPKE, which reflects whether CR4.PKE is set.
X86_FEATURE_OSXSAVE depends on the guest's X86_FEATURE_XSAVE, but the
cpu_has_xsave check reflects the hypervisor's X86_FEATURE_XSAVE, so that is
fixed too.
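For illustration, a guest-side sketch (hypothetical helpers, not part of the
patch, assuming GCC/clang's <cpuid.h>) of how a guest would observe these two
bits in CPUID leaf 7, sub-leaf 0:

    #include <cpuid.h>
    #include <stdbool.h>

    /* CPUID.7.0:ECX.PKU (bit 3) - protection keys supported in hardware. */
    static bool cpu_has_pku(void)
    {
        unsigned int eax, ebx, ecx, edx;

        return __get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) &&
               (ecx & (1u << 3));
    }

    /* CPUID.7.0:ECX.OSPKE (bit 4) - set only while CR4.PKE is enabled. */
    static bool os_enabled_pke(void)
    {
        unsigned int eax, ebx, ecx, edx;

        return __get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) &&
               (ecx & (1u << 4));
    }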
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
---
tools/libxc/xc_cpufeature.h | 3 +++
tools/libxc/xc_cpuid_x86.c | 6 ++++--
xen/arch/x86/hvm/hvm.c | 18 +++++++++++++-----
3 files changed, 20 insertions(+), 7 deletions(-)
diff --git a/tools/libxc/xc_cpufeature.h b/tools/libxc/xc_cpufeature.h
index ee53679..866cf0b 100644
--- a/tools/libxc/xc_cpufeature.h
+++ b/tools/libxc/xc_cpufeature.h
@@ -144,4 +144,7 @@
#define X86_FEATURE_CLFLUSHOPT 23 /* CLFLUSHOPT instruction */
#define X86_FEATURE_CLWB 24 /* CLWB instruction */
+/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx) */
+#define X86_FEATURE_PKU 3
+
#endif /* __LIBXC_CPUFEATURE_H */
diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index c142595..5408dd0 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -430,9 +430,11 @@ static void xc_cpuid_hvm_policy(xc_interface *xch,
bitmaskof(X86_FEATURE_PCOMMIT) |
bitmaskof(X86_FEATURE_CLWB) |
bitmaskof(X86_FEATURE_CLFLUSHOPT));
+ regs[2] &= bitmaskof(X86_FEATURE_PKU);
} else
- regs[1] = 0;
- regs[0] = regs[2] = regs[3] = 0;
+ regs[1] = regs[2] = 0;
+
+ regs[0] = regs[3] = 0;
break;
case 0x0000000d:
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 5ec2ae1..1389173 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4572,7 +4572,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
__clear_bit(X86_FEATURE_APIC & 31, edx);
/* Fix up OSXSAVE. */
- if ( cpu_has_xsave )
+ if ( *ecx & cpufeat_mask(X86_FEATURE_XSAVE) )
*ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE) ?
cpufeat_mask(X86_FEATURE_OSXSAVE) : 0;
@@ -4593,16 +4593,24 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
if ( !cpu_has_smap )
*ebx &= ~cpufeat_mask(X86_FEATURE_SMAP);
- /* Don't expose MPX to hvm when VMX support is not available */
+ /* Don't expose MPX to hvm when VMX support is not available. */
if ( !(vmx_vmexit_control & VM_EXIT_CLEAR_BNDCFGS) ||
!(vmx_vmentry_control & VM_ENTRY_LOAD_BNDCFGS) )
*ebx &= ~cpufeat_mask(X86_FEATURE_MPX);
- /* Don't expose INVPCID to non-hap hvm. */
if ( !hap_enabled(d) )
- *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
+ {
+ /* Don't expose INVPCID to non-hap hvm. */
+ *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
+ /* X86_FEATURE_PKU is not yet implemented for shadow paging. */
+ *ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
+ }
+
+ if ( (*ecx & cpufeat_mask(X86_FEATURE_PKU)) &&
+ (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE) )
+ *ecx |= cpufeat_mask(X86_FEATURE_OSPKE);
- /* Don't expose PCOMMIT to hvm when VMX support is not available */
+ /* Don't expose PCOMMIT to hvm when VMX support is not available. */
if ( !cpu_has_vmx_pcommit )
*ebx &= ~cpufeat_mask(X86_FEATURE_PCOMMIT);
}
--
2.4.3