* [PATCH V5 1/6] x86/hvm: pkeys, add the flag to enable Memory Protection Keys
2015-12-22 10:30 [PATCH V5 0/6] x86/hvm: pkeys, add memory protection-key support Huaitong Han
@ 2015-12-22 10:30 ` Huaitong Han
2015-12-22 12:43 ` Jan Beulich
2015-12-22 10:30 ` [PATCH V5 2/6] x86/hvm: pkeys, add pkeys support when setting CR4 Huaitong Han
` (4 subsequent siblings)
5 siblings, 1 reply; 15+ messages in thread
From: Huaitong Han @ 2015-12-22 10:30 UTC (permalink / raw)
To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
wei.liu2, keir
Cc: Huaitong Han, xen-devel
This patch adds the flag("pku") to enable Memory Protection Keys, and updates
the markdown.
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
docs/misc/xen-command-line.markdown | 10 ++++++++++
xen/arch/x86/cpu/common.c | 10 +++++++++-
xen/include/asm-x86/cpufeature.h | 6 +++++-
3 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index c103894..36ecf80 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1177,6 +1177,16 @@ This option can be specified more than once (up to 8 times at present).
### ple\_window
> `= <integer>`
+### pku
+> `= <boolean>`
+
+> Default: `true`
+
+Flag to enable Memory Protection Keys.
+
+The protection-key feature provides an additional mechanism by which IA-32e
+paging controls access to usermode addresses.
+
### psr (Intel)
> `= List of ( cmt:<boolean> | rmid_max:<integer> | cat:<boolean> | cos_max:<integer> | cdp:<boolean> )`
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 310ec85..a018855 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -22,6 +22,10 @@ boolean_param("xsave", use_xsave);
bool_t opt_arat = 1;
boolean_param("arat", opt_arat);
+/* pku: Flag to enable Memory Protection Keys (default on). */
+static bool_t opt_pku = 1;
+boolean_param("pku", opt_pku);
+
unsigned int opt_cpuid_mask_ecx = ~0u;
integer_param("cpuid_mask_ecx", opt_cpuid_mask_ecx);
unsigned int opt_cpuid_mask_edx = ~0u;
@@ -270,7 +274,8 @@ static void __cpuinit generic_identify(struct cpuinfo_x86 *c)
if ( c->cpuid_level >= 0x00000007 )
cpuid_count(0x00000007, 0, &tmp,
&c->x86_capability[cpufeat_word(X86_FEATURE_FSGSBASE)],
- &tmp, &tmp);
+ &c->x86_capability[cpufeat_word(X86_FEATURE_PKU)],
+ &tmp);
}
/*
@@ -323,6 +328,9 @@ void __cpuinit identify_cpu(struct cpuinfo_x86 *c)
if ( cpu_has_xsave )
xstate_init(c);
+ if ( !opt_pku )
+ setup_clear_cpu_cap(X86_FEATURE_PKU);
+
/*
* The vendor-specific functions might have changed features. Now
* we do "generic changes."
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index af127cf..ef96514 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -11,7 +11,7 @@
#include <xen/const.h>
-#define NCAPINTS 8 /* N 32-bit words worth of info */
+#define NCAPINTS 9 /* N 32-bit words worth of info */
/* Intel-defined CPU features, CPUID level 0x00000001 (edx), word 0 */
#define X86_FEATURE_FPU (0*32+ 0) /* Onboard FPU */
@@ -163,6 +163,10 @@
#define X86_FEATURE_ADX (7*32+19) /* ADCX, ADOX instructions */
#define X86_FEATURE_SMAP (7*32+20) /* Supervisor Mode Access Prevention */
+/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx), word 8 */
+#define X86_FEATURE_PKU (8*32+ 3) /* Protection Keys for Userspace */
+#define X86_FEATURE_OSPKE (8*32+ 4) /* OS Protection Keys Enable */
+
#define cpufeat_word(idx) ((idx) / 32)
#define cpufeat_bit(idx) ((idx) % 32)
#define cpufeat_mask(idx) (_AC(1, U) << cpufeat_bit(idx))
--
2.4.3
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH V5 1/6] x86/hvm: pkeys, add the flag to enable Memory Protection Keys
2015-12-22 10:30 ` [PATCH V5 1/6] x86/hvm: pkeys, add the flag to enable Memory Protection Keys Huaitong Han
@ 2015-12-22 12:43 ` Jan Beulich
0 siblings, 0 replies; 15+ messages in thread
From: Jan Beulich @ 2015-12-22 12:43 UTC (permalink / raw)
To: Huaitong Han
Cc: kevin.tian, wei.liu2, ian.campbell, stefano.stabellini,
george.dunlap, andrew.cooper3, ian.jackson, xen-devel, eddie.dong,
jun.nakajima, keir
>>> On 22.12.15 at 11:30, <huaitong.han@intel.com> wrote:
> This patch adds the flag("pku") to enable Memory Protection Keys, and updates
> the markdown.
>
> Signed-off-by: Huaitong Han <huaitong.han@intel.com>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Please avoid resending patches which got applied already.
Jan
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH V5 2/6] x86/hvm: pkeys, add pkeys support when setting CR4
2015-12-22 10:30 [PATCH V5 0/6] x86/hvm: pkeys, add memory protection-key support Huaitong Han
2015-12-22 10:30 ` [PATCH V5 1/6] x86/hvm: pkeys, add the flag to enable Memory Protection Keys Huaitong Han
@ 2015-12-22 10:30 ` Huaitong Han
2015-12-22 10:30 ` [PATCH V5 3/6] x86/hvm: pkeys, disable pkeys for guests in non-paging mode Huaitong Han
` (3 subsequent siblings)
5 siblings, 0 replies; 15+ messages in thread
From: Huaitong Han @ 2015-12-22 10:30 UTC (permalink / raw)
To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
wei.liu2, keir
Cc: Huaitong Han, xen-devel
CR4.PKE(bit 22) enables support for the RDPKRU/WRPKRU instructions to access PKRU and
the protection keys check (a page fault trigger).
This patch adds X86_CR4_PKE to hvm_cr4_guest_reserved_bits so that CR4 check works
before setting.
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
xen/arch/x86/hvm/hvm.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index db0aeba..59916ed 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1924,6 +1924,7 @@ static unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v,
leaf1_edx = boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_VME)];
leaf1_ecx = boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_PCID)];
leaf7_0_ebx = boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_FSGSBASE)];
+ leaf7_0_ecx = boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_PKU)];
}
return ~(unsigned long)
@@ -1959,7 +1960,9 @@ static unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v,
(leaf7_0_ebx & cpufeat_mask(X86_FEATURE_SMEP) ?
X86_CR4_SMEP : 0) |
(leaf7_0_ebx & cpufeat_mask(X86_FEATURE_SMAP) ?
- X86_CR4_SMAP : 0));
+ X86_CR4_SMAP : 0) |
+ (leaf7_0_ecx & cpufeat_mask(X86_FEATURE_PKU) ?
+ X86_CR4_PKE : 0));
}
static int hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
--
2.4.3
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH V5 3/6] x86/hvm: pkeys, disable pkeys for guests in non-paging mode
2015-12-22 10:30 [PATCH V5 0/6] x86/hvm: pkeys, add memory protection-key support Huaitong Han
2015-12-22 10:30 ` [PATCH V5 1/6] x86/hvm: pkeys, add the flag to enable Memory Protection Keys Huaitong Han
2015-12-22 10:30 ` [PATCH V5 2/6] x86/hvm: pkeys, add pkeys support when setting CR4 Huaitong Han
@ 2015-12-22 10:30 ` Huaitong Han
2015-12-22 10:30 ` [PATCH V5 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables Huaitong Han
` (2 subsequent siblings)
5 siblings, 0 replies; 15+ messages in thread
From: Huaitong Han @ 2015-12-22 10:30 UTC (permalink / raw)
To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
wei.liu2, keir
Cc: Huaitong Han, xen-devel
This patch disables pkeys for guest in non-paging mode, However XEN always uses
paging mode to emulate guest non-paging mode, To emulate this behavior, pkeys
needs to be manually disabled when guest switches to non-paging mode.
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
xen/arch/x86/hvm/vmx/vmx.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 2581e97..7123912 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1380,12 +1380,13 @@ static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr)
if ( !hvm_paging_enabled(v) )
{
/*
- * SMEP/SMAP is disabled if CPU is in non-paging mode in hardware.
- * However Xen always uses paging mode to emulate guest non-paging
- * mode. To emulate this behavior, SMEP/SMAP needs to be manually
- * disabled when guest VCPU is in non-paging mode.
+ * SMEP/SMAP/PKU is disabled if CPU is in non-paging mode in
+ * hardware. However Xen always uses paging mode to emulate guest
+ * non-paging mode. To emulate this behavior, SMEP/SMAP/PKU needs
+ * to be manually disabled when guest VCPU is in non-paging mode.
*/
- v->arch.hvm_vcpu.hw_cr[4] &= ~(X86_CR4_SMEP | X86_CR4_SMAP);
+ v->arch.hvm_vcpu.hw_cr[4] &=
+ ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE);
}
__vmwrite(GUEST_CR4, v->arch.hvm_vcpu.hw_cr[4]);
break;
--
2.4.3
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH V5 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables
2015-12-22 10:30 [PATCH V5 0/6] x86/hvm: pkeys, add memory protection-key support Huaitong Han
` (2 preceding siblings ...)
2015-12-22 10:30 ` [PATCH V5 3/6] x86/hvm: pkeys, disable pkeys for guests in non-paging mode Huaitong Han
@ 2015-12-22 10:30 ` Huaitong Han
2015-12-22 11:17 ` Andrew Cooper
` (3 more replies)
2015-12-22 10:30 ` [PATCH V5 5/6] x86/hvm: pkeys, add xstate support for pkeys Huaitong Han
2015-12-22 10:30 ` [PATCH V5 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling Huaitong Han
5 siblings, 4 replies; 15+ messages in thread
From: Huaitong Han @ 2015-12-22 10:30 UTC (permalink / raw)
To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
wei.liu2, keir
Cc: Huaitong Han, xen-devel
Protection keys define a new 4-bit protection key field(PKEY) in bits 62:59 of
leaf entries of the page tables.
PKRU register defines 32 bits, there are 16 domains and 2 attribute bits per
domain in pkru, for each i (0 ≤ i ≤ 15), PKRU[2i] is the access-disable bit for
protection key i (ADi); PKRU[2i+1] is the write-disable bit for protection key
i (WDi). PKEY is index to a defined domain.
A fault is considered as a PKU violation if all of the following conditions are
ture:
1.CR4_PKE=1.
2.EFER_LMA=1.
3.Page is present with no reserved bit violations.
4.The access is not an instruction fetch.
5.The access is to a user page.
6.PKRU.AD=1
or The access is a data write and PKRU.WD=1
and either CR0.WP=1 or it is a user access.
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
---
xen/arch/x86/mm/guest_walk.c | 64 +++++++++++++++++++++++++++++++++++++++
xen/arch/x86/mm/hap/guest_walk.c | 3 ++
xen/include/asm-x86/guest_pt.h | 12 ++++++++
xen/include/asm-x86/hvm/hvm.h | 2 ++
xen/include/asm-x86/page.h | 5 +++
xen/include/asm-x86/processor.h | 39 ++++++++++++++++++++++++
xen/include/asm-x86/x86_64/page.h | 12 ++++++++
7 files changed, 137 insertions(+)
diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c
index 18d1acf..9cdd607 100644
--- a/xen/arch/x86/mm/guest_walk.c
+++ b/xen/arch/x86/mm/guest_walk.c
@@ -90,6 +90,57 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, int set_dirty)
return 0;
}
+extern bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec,
+ uint32_t pte_flags, uint32_t pte_pkey);
+#if GUEST_PAGING_LEVELS == CONFIG_PAGING_LEVELS
+bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec,
+ uint32_t pte_flags, uint32_t pte_pkey)
+{
+ unsigned int pkru = 0;
+ bool_t pkru_ad, pkru_wd;
+
+ bool_t pf = !!(pfec & PFEC_page_present);
+ bool_t uf = !!(pfec & PFEC_user_mode);
+ bool_t wf = !!(pfec & PFEC_write_access);
+ bool_t ff = !!(pfec & PFEC_insn_fetch);
+ bool_t rsvdf = !!(pfec & PFEC_reserved_bit);
+
+ /* When page isn't present, PKEY isn't checked. */
+ if ( !pf || is_pv_vcpu(vcpu) )
+ return 0;
+
+ /*
+ * PKU: additional mechanism by which the paging controls
+ * access to user-mode addresses based on the value in the
+ * PKRU register. A fault is considered as a PKU violation if all
+ * of the following conditions are ture:
+ * 1.CR4_PKE=1.
+ * 2.EFER_LMA=1.
+ * 3.page is present with no reserved bit violations.
+ * 4.the access is not an instruction fetch.
+ * 5.the access is to a user page.
+ * 6.PKRU.AD=1
+ * or The access is a data write and PKRU.WD=1
+ * and either CR0.WP=1 or it is a user access.
+ */
+ if ( !hvm_pku_enabled(vcpu) || !hvm_long_mode_enabled(vcpu) ||
+ rsvdf || ff || !(pte_flags & _PAGE_USER) )
+ return 0;
+
+ pkru = read_pkru();
+ if ( unlikely(pkru) )
+ {
+ pkru_ad = read_pkru_ad(pkru, pte_pkey);
+ pkru_wd = read_pkru_wd(pkru, pte_pkey);
+ /* Condition 6 */
+ if ( pkru_ad || (pkru_wd && wf && (hvm_wp_enabled(vcpu) || uf)))
+ return 1;
+ }
+
+ return 0;
+}
+#endif
+
/* Walk the guest pagetables, after the manner of a hardware walker. */
/* Because the walk is essentially random, it can cause a deadlock
* warning in the p2m locking code. Highly unlikely this is an actual
@@ -107,6 +158,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
guest_l3e_t *l3p = NULL;
guest_l4e_t *l4p;
#endif
+ unsigned int pkey;
uint32_t gflags, mflags, iflags, rc = 0;
bool_t smep = 0, smap = 0;
bool_t pse1G = 0, pse2M = 0;
@@ -190,6 +242,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
goto out;
/* Get the l3e and check its flags*/
gw->l3e = l3p[guest_l3_table_offset(va)];
+ pkey = guest_l3e_get_pkey(gw->l3e);
gflags = guest_l3e_get_flags(gw->l3e) ^ iflags;
if ( !(gflags & _PAGE_PRESENT) ) {
rc |= _PAGE_PRESENT;
@@ -199,6 +252,9 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
pse1G = (gflags & _PAGE_PSE) && guest_supports_1G_superpages(v);
+ if ( pse1G && pkey_fault(v, pfec, gflags, pkey) )
+ rc |= _PAGE_PKEY_BITS;
+
if ( pse1G )
{
/* Generate a fake l1 table entry so callers don't all
@@ -270,6 +326,10 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
pse2M = (gflags & _PAGE_PSE) && guest_supports_superpages(v);
+ pkey = guest_l2e_get_pkey(gw->l2e);
+ if ( pse2M && pkey_fault(v, pfec, gflags, pkey) )
+ rc |= _PAGE_PKEY_BITS;
+
if ( pse2M )
{
/* Special case: this guest VA is in a PSE superpage, so there's
@@ -330,6 +390,10 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
goto out;
}
rc |= ((gflags & mflags) ^ mflags);
+
+ pkey = guest_l1e_get_pkey(gw->l1e);
+ if ( pkey_fault(v, pfec, gflags, pkey) )
+ rc |= _PAGE_PKEY_BITS;
}
#if GUEST_PAGING_LEVELS >= 4 /* 64-bit only... */
diff --git a/xen/arch/x86/mm/hap/guest_walk.c b/xen/arch/x86/mm/hap/guest_walk.c
index 11c1b35..49d0328 100644
--- a/xen/arch/x86/mm/hap/guest_walk.c
+++ b/xen/arch/x86/mm/hap/guest_walk.c
@@ -130,6 +130,9 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PAGING_LEVELS)(
if ( missing & _PAGE_INVALID_BITS )
pfec[0] |= PFEC_reserved_bit;
+ if ( missing & _PAGE_PKEY_BITS )
+ pfec[0] |= PFEC_prot_key;
+
if ( missing & _PAGE_PAGED )
pfec[0] = PFEC_page_paged;
diff --git a/xen/include/asm-x86/guest_pt.h b/xen/include/asm-x86/guest_pt.h
index 3447973..eb29e62 100644
--- a/xen/include/asm-x86/guest_pt.h
+++ b/xen/include/asm-x86/guest_pt.h
@@ -81,6 +81,11 @@ static inline u32 guest_l1e_get_flags(guest_l1e_t gl1e)
static inline u32 guest_l2e_get_flags(guest_l2e_t gl2e)
{ return gl2e.l2 & 0xfff; }
+static inline u32 guest_l1e_get_pkey(guest_l1e_t gl1e)
+{ return 0; }
+static inline u32 guest_l2e_get_pkey(guest_l2e_t gl2e)
+{ return 0; }
+
static inline guest_l1e_t guest_l1e_from_gfn(gfn_t gfn, u32 flags)
{ return (guest_l1e_t) { (gfn_x(gfn) << PAGE_SHIFT) | flags }; }
static inline guest_l2e_t guest_l2e_from_gfn(gfn_t gfn, u32 flags)
@@ -154,6 +159,13 @@ static inline u32 guest_l4e_get_flags(guest_l4e_t gl4e)
{ return l4e_get_flags(gl4e); }
#endif
+static inline u32 guest_l1e_get_pkey(guest_l1e_t gl1e)
+{ return l1e_get_pkey(gl1e); }
+static inline u32 guest_l2e_get_pkey(guest_l2e_t gl2e)
+{ return l2e_get_pkey(gl2e); }
+static inline u32 guest_l3e_get_pkey(guest_l3e_t gl3e)
+{ return l3e_get_pkey(gl3e); }
+
static inline guest_l1e_t guest_l1e_from_gfn(gfn_t gfn, u32 flags)
{ return l1e_from_pfn(gfn_x(gfn), flags); }
static inline guest_l2e_t guest_l2e_from_gfn(gfn_t gfn, u32 flags)
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index f80e143..79b3421 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -275,6 +275,8 @@ int hvm_girq_dest_2_vcpu_id(struct domain *d, uint8_t dest, uint8_t dest_mode);
(hvm_paging_enabled(v) && ((v)->arch.hvm_vcpu.guest_cr[4] & X86_CR4_SMAP))
#define hvm_nx_enabled(v) \
(!!((v)->arch.hvm_vcpu.guest_efer & EFER_NX))
+#define hvm_pku_enabled(v) \
+ (hvm_paging_enabled(v) && ((v)->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE))
/* Can we use superpages in the HAP p2m table? */
#define hap_has_1gb (!!(hvm_funcs.hap_capabilities & HVM_HAP_SUPERPAGE_1GB))
diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h
index a095a93..d69d91d 100644
--- a/xen/include/asm-x86/page.h
+++ b/xen/include/asm-x86/page.h
@@ -93,6 +93,11 @@
#define l3e_get_flags(x) (get_pte_flags((x).l3))
#define l4e_get_flags(x) (get_pte_flags((x).l4))
+/* Get pte pkeys (unsigned int). */
+#define l1e_get_pkey(x) (get_pte_pkey((x).l1))
+#define l2e_get_pkey(x) (get_pte_pkey((x).l2))
+#define l3e_get_pkey(x) (get_pte_pkey((x).l3))
+
/* Construct an empty pte. */
#define l1e_empty() ((l1_pgentry_t) { 0 })
#define l2e_empty() ((l2_pgentry_t) { 0 })
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index 3f8411f..3f251aa 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -373,6 +373,45 @@ static always_inline void clear_in_cr4 (unsigned long mask)
write_cr4(read_cr4() & ~mask);
}
+static inline unsigned int read_pkru(void)
+{
+ unsigned int pkru;
+
+ /*
+ * _PAGE_PKEY_BITS have a conflict with _PAGE_GNTTAB used by PV guests,
+ * so that X86_CR4_PKE is disable on hypervisor, RDPKRU instruction can
+ * be used with temporarily setting CR4.PKE.
+ */
+ set_in_cr4(X86_CR4_PKE);
+ asm volatile (".byte 0x0f,0x01,0xee"
+ : "=a" (pkru) : "c" (0) : "dx");
+ clear_in_cr4(X86_CR4_PKE);
+
+ return pkru;
+}
+
+/* Macros for PKRU domain */
+#define PKRU_READ (0)
+#define PKRU_WRITE (1)
+#define PKRU_ATTRS (2)
+
+/*
+ * PKRU defines 32 bits, there are 16 domains and 2 attribute bits per
+ * domain in pkru, pkeys is index to a defined domain, so the value of
+ * pte_pkeys * PKRU_ATTRS + R/W is offset of a defined domain attribute.
+ */
+static inline bool_t read_pkru_ad(unsigned int pkru, unsigned int pkey)
+{
+ ASSERT(pkey < 16);
+ return (pkru >> (pkey * PKRU_ATTRS + PKRU_READ)) & 1;
+}
+
+static inline bool_t read_pkru_wd(unsigned int pkru, unsigned int pkey)
+{
+ ASSERT(pkey < 16);
+ return (pkru >> (pkey * PKRU_ATTRS + PKRU_WRITE)) & 1;
+}
+
/*
* NSC/Cyrix CPU configuration register indexes
*/
diff --git a/xen/include/asm-x86/x86_64/page.h b/xen/include/asm-x86/x86_64/page.h
index 19ab4d0..86abb94 100644
--- a/xen/include/asm-x86/x86_64/page.h
+++ b/xen/include/asm-x86/x86_64/page.h
@@ -134,6 +134,18 @@ typedef l4_pgentry_t root_pgentry_t;
#define get_pte_flags(x) (((int)((x) >> 40) & ~0xFFF) | ((int)(x) & 0xFFF))
#define put_pte_flags(x) (((intpte_t)((x) & ~0xFFF) << 40) | ((x) & 0xFFF))
+/*
+ * Protection keys define a new 4-bit protection key field
+ * (PKEY) in bits 62:59 of leaf entries of the page tables.
+ * This corresponds to bit 22:19 of a 24-bit flags.
+ *
+ * Notice: Bit 22 is used by _PAGE_GNTTAB which is visible to PV guests,
+ * so Protection keys must be disabled on PV guests.
+ */
+#define _PAGE_PKEY_BITS (0x780000) /* Protection Keys, 22:19 */
+
+#define get_pte_pkey(x) (MASK_EXTR(get_pte_flags(x), _PAGE_PKEY_BITS))
+
/* Bit 23 of a 24-bit flag mask. This corresponds to bit 63 of a pte.*/
#define _PAGE_NX_BIT (1U<<23)
--
2.4.3
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH V5 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables
2015-12-22 10:30 ` [PATCH V5 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables Huaitong Han
@ 2015-12-22 11:17 ` Andrew Cooper
2015-12-22 13:02 ` Jan Beulich
` (2 subsequent siblings)
3 siblings, 0 replies; 15+ messages in thread
From: Andrew Cooper @ 2015-12-22 11:17 UTC (permalink / raw)
To: Huaitong Han, jbeulich, jun.nakajima, eddie.dong, kevin.tian,
george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
wei.liu2, keir
Cc: xen-devel
On 22/12/15 10:30, Huaitong Han wrote:
> --- a/xen/include/asm-x86/processor.h
> +++ b/xen/include/asm-x86/processor.h
> @@ -373,6 +373,45 @@ static always_inline void clear_in_cr4 (unsigned long mask)
> write_cr4(read_cr4() & ~mask);
> }
>
> +static inline unsigned int read_pkru(void)
> +{
> + unsigned int pkru;
> +
> + /*
> + * _PAGE_PKEY_BITS have a conflict with _PAGE_GNTTAB used by PV guests,
> + * so that X86_CR4_PKE is disable on hypervisor, RDPKRU instruction can
> + * be used with temporarily setting CR4.PKE.
> + */
> + set_in_cr4(X86_CR4_PKE);
> + asm volatile (".byte 0x0f,0x01,0xee"
> + : "=a" (pkru) : "c" (0) : "dx");
> + clear_in_cr4(X86_CR4_PKE);
You can't use set/clear_in_cr4 here, as it modifies the global
mmu_cr4_features variable.
I would recommend
unsigned long cr4 = read_cr4();
write_cr4(cr4 | X86_CR4_PKE);
...
write_cr4(cr4);
instead.
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH V5 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables
2015-12-22 10:30 ` [PATCH V5 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables Huaitong Han
2015-12-22 11:17 ` Andrew Cooper
@ 2015-12-22 13:02 ` Jan Beulich
2015-12-22 15:23 ` George Dunlap
2015-12-22 15:39 ` George Dunlap
2015-12-22 16:38 ` George Dunlap
3 siblings, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2015-12-22 13:02 UTC (permalink / raw)
To: Huaitong Han
Cc: kevin.tian, wei.liu2, ian.campbell, stefano.stabellini,
george.dunlap, andrew.cooper3, ian.jackson, xen-devel,
jun.nakajima, keir
>>> On 22.12.15 at 11:30, <huaitong.han@intel.com> wrote:
I dislike having to repeat this: Please trim your Cc lists.
> --- a/xen/arch/x86/mm/guest_walk.c
> +++ b/xen/arch/x86/mm/guest_walk.c
> @@ -90,6 +90,57 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, int set_dirty)
> return 0;
> }
>
> +extern bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec,
> + uint32_t pte_flags, uint32_t pte_pkey);
> +#if GUEST_PAGING_LEVELS == CONFIG_PAGING_LEVELS
> +bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec,
> + uint32_t pte_flags, uint32_t pte_pkey)
> +{
See my comments on the previous version. Please avoid sending new
versions without having addressed all comments on the previous one
(verbally or by code changes). Having done the suggested change
just partially (by removing the #ifdef-s from the call sites) you now
do the key check universally, and things remain correct just because
of the long mode check in the middle of the function.
> + unsigned int pkru = 0;
> + bool_t pkru_ad, pkru_wd;
> +
> + bool_t pf = !!(pfec & PFEC_page_present);
There's still this stray blank line above (and I continue to wonder
whether you really need all these boolean variables many of which
get used just once).
> + bool_t uf = !!(pfec & PFEC_user_mode);
> + bool_t wf = !!(pfec & PFEC_write_access);
> + bool_t ff = !!(pfec & PFEC_insn_fetch);
> + bool_t rsvdf = !!(pfec & PFEC_reserved_bit);
> +
> + /* When page isn't present, PKEY isn't checked. */
> + if ( !pf || is_pv_vcpu(vcpu) )
> + return 0;
> +
> + /*
> + * PKU: additional mechanism by which the paging controls
> + * access to user-mode addresses based on the value in the
> + * PKRU register. A fault is considered as a PKU violation if all
> + * of the following conditions are ture:
> + * 1.CR4_PKE=1.
> + * 2.EFER_LMA=1.
> + * 3.page is present with no reserved bit violations.
> + * 4.the access is not an instruction fetch.
> + * 5.the access is to a user page.
> + * 6.PKRU.AD=1
> + * or The access is a data write and PKRU.WD=1
> + * and either CR0.WP=1 or it is a user access.
> + */
> + if ( !hvm_pku_enabled(vcpu) || !hvm_long_mode_enabled(vcpu) ||
> + rsvdf || ff || !(pte_flags & _PAGE_USER) )
Indentation (I think this also was broken already in the previous
version).
Jan
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH V5 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables
2015-12-22 13:02 ` Jan Beulich
@ 2015-12-22 15:23 ` George Dunlap
0 siblings, 0 replies; 15+ messages in thread
From: George Dunlap @ 2015-12-22 15:23 UTC (permalink / raw)
To: Jan Beulich, Huaitong Han
Cc: kevin.tian, wei.liu2, ian.campbell, stefano.stabellini,
george.dunlap, andrew.cooper3, ian.jackson, xen-devel,
jun.nakajima, keir
On 22/12/15 13:02, Jan Beulich wrote:
>>>> On 22.12.15 at 11:30, <huaitong.han@intel.com> wrote:
>
> I dislike having to repeat this: Please trim your Cc lists.
>
>> --- a/xen/arch/x86/mm/guest_walk.c
>> +++ b/xen/arch/x86/mm/guest_walk.c
>> @@ -90,6 +90,57 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, int set_dirty)
>> return 0;
>> }
>>
>> +extern bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec,
>> + uint32_t pte_flags, uint32_t pte_pkey);
>> +#if GUEST_PAGING_LEVELS == CONFIG_PAGING_LEVELS
>> +bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec,
>> + uint32_t pte_flags, uint32_t pte_pkey)
>> +{
>
> See my comments on the previous version. Please avoid sending new
> versions without having addressed all comments on the previous one
> (verbally or by code changes). Having done the suggested change
> just partially (by removing the #ifdef-s from the call sites) you now
> do the key check universally, and things remain correct just because
> of the long mode check in the middle of the function.
>
>> + unsigned int pkru = 0;
>> + bool_t pkru_ad, pkru_wd;
>> +
>> + bool_t pf = !!(pfec & PFEC_page_present);
>
> There's still this stray blank line above (and I continue to wonder
> whether you really need all these boolean variables many of which
> get used just once).
I suspect the "stray blank line" was added for readability.
But I agree that I'd prefer not to use local boolean variables, and just
to put the flag checking inline.
>
>> + bool_t uf = !!(pfec & PFEC_user_mode);
>> + bool_t wf = !!(pfec & PFEC_write_access);
>> + bool_t ff = !!(pfec & PFEC_insn_fetch);
>> + bool_t rsvdf = !!(pfec & PFEC_reserved_bit);
>> +
>> + /* When page isn't present, PKEY isn't checked. */
>> + if ( !pf || is_pv_vcpu(vcpu) )
>> + return 0;
>> +
>> + /*
>> + * PKU: additional mechanism by which the paging controls
>> + * access to user-mode addresses based on the value in the
>> + * PKRU register. A fault is considered as a PKU violation if all
>> + * of the following conditions are ture:
*true
>> + * 1.CR4_PKE=1.
>> + * 2.EFER_LMA=1.
>> + * 3.page is present with no reserved bit violations.
>> + * 4.the access is not an instruction fetch.
>> + * 5.the access is to a user page.
>> + * 6.PKRU.AD=1
>> + * or The access is a data write and PKRU.WD=1
>> + * and either CR0.WP=1 or it is a user access.
>> + */
>> + if ( !hvm_pku_enabled(vcpu) || !hvm_long_mode_enabled(vcpu) ||
>> + rsvdf || ff || !(pte_flags & _PAGE_USER) )
And I think you might as well make this one line per condition,
something like this:
if ( is_pv_vcpu(vcpu) ||
!hvm_pku_enabled(vcpu) ||
!hvm_long_mode_enabled(vcpu)
!(pfec & PFEC_page_present) ||
(pfec & (PFEC_insn_fetch|PFEC_reserved_bit)) ||
!(pte_flags & _PAGE_USER) )
return 0;
-George
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH V5 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables
2015-12-22 10:30 ` [PATCH V5 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables Huaitong Han
2015-12-22 11:17 ` Andrew Cooper
2015-12-22 13:02 ` Jan Beulich
@ 2015-12-22 15:39 ` George Dunlap
2015-12-22 16:38 ` George Dunlap
3 siblings, 0 replies; 15+ messages in thread
From: George Dunlap @ 2015-12-22 15:39 UTC (permalink / raw)
To: Huaitong Han, jbeulich, andrew.cooper3, jun.nakajima, eddie.dong,
kevin.tian, george.dunlap, ian.jackson, stefano.stabellini,
ian.campbell, wei.liu2, keir
Cc: xen-devel
On 22/12/15 10:30, Huaitong Han wrote:
> Protection keys define a new 4-bit protection key field(PKEY) in bits 62:59 of
> leaf entries of the page tables.
>
> PKRU register defines 32 bits, there are 16 domains and 2 attribute bits per
> domain in pkru, for each i (0 ≤ i ≤ 15), PKRU[2i] is the access-disable bit for
> protection key i (ADi); PKRU[2i+1] is the write-disable bit for protection key
> i (WDi). PKEY is index to a defined domain.
>
> A fault is considered as a PKU violation if all of the following conditions are
> ture:
> 1.CR4_PKE=1.
> 2.EFER_LMA=1.
> 3.Page is present with no reserved bit violations.
> 4.The access is not an instruction fetch.
> 5.The access is to a user page.
> 6.PKRU.AD=1
> or The access is a data write and PKRU.WD=1
> and either CR0.WP=1 or it is a user access.
Thanks, I think this architecture is much better. More comments inline.
> diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c
> index 18d1acf..9cdd607 100644
> --- a/xen/arch/x86/mm/guest_walk.c
> +++ b/xen/arch/x86/mm/guest_walk.c
> @@ -90,6 +90,57 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, int set_dirty)
> return 0;
> }
>
> +extern bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec,
> + uint32_t pte_flags, uint32_t pte_pkey);
> +#if GUEST_PAGING_LEVELS == CONFIG_PAGING_LEVELS
> +bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec,
> + uint32_t pte_flags, uint32_t pte_pkey)
> +{
> + unsigned int pkru = 0;
> + bool_t pkru_ad, pkru_wd;
> +
> + bool_t pf = !!(pfec & PFEC_page_present);
> + bool_t uf = !!(pfec & PFEC_user_mode);
> + bool_t wf = !!(pfec & PFEC_write_access);
> + bool_t ff = !!(pfec & PFEC_insn_fetch);
> + bool_t rsvdf = !!(pfec & PFEC_reserved_bit);
> +
> + /* When page isn't present, PKEY isn't checked. */
> + if ( !pf || is_pv_vcpu(vcpu) )
> + return 0;
> +
> + /*
> + * PKU: additional mechanism by which the paging controls
> + * access to user-mode addresses based on the value in the
> + * PKRU register. A fault is considered as a PKU violation if all
> + * of the following conditions are ture:
> + * 1.CR4_PKE=1.
> + * 2.EFER_LMA=1.
> + * 3.page is present with no reserved bit violations.
> + * 4.the access is not an instruction fetch.
> + * 5.the access is to a user page.
> + * 6.PKRU.AD=1
> + * or The access is a data write and PKRU.WD=1
> + * and either CR0.WP=1 or it is a user access.
> + */
> + if ( !hvm_pku_enabled(vcpu) || !hvm_long_mode_enabled(vcpu) ||
> + rsvdf || ff || !(pte_flags & _PAGE_USER) )
> + return 0;
See my comment in response to Jan's comment.
> @@ -199,6 +252,9 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
>
> pse1G = (gflags & _PAGE_PSE) && guest_supports_1G_superpages(v);
>
> + if ( pse1G && pkey_fault(v, pfec, gflags, pkey) )
> + rc |= _PAGE_PKEY_BITS;
> +
> if ( pse1G )
So it looks like you're only doing a pkey check on the 'leaf' ptes, is
that right?
In which case we have a lot of duplication here -- duplicate pse1G /
pse2M checks, and exact copies of the call to pkey_fault(). Would it
make sense to move the pkey_fault() check down below the set_ad label?
-George
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [PATCH V5 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables
2015-12-22 10:30 ` [PATCH V5 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables Huaitong Han
` (2 preceding siblings ...)
2015-12-22 15:39 ` George Dunlap
@ 2015-12-22 16:38 ` George Dunlap
3 siblings, 0 replies; 15+ messages in thread
From: George Dunlap @ 2015-12-22 16:38 UTC (permalink / raw)
To: Huaitong Han, jbeulich, andrew.cooper3, jun.nakajima, eddie.dong,
kevin.tian, george.dunlap, ian.jackson, stefano.stabellini,
ian.campbell, wei.liu2, keir
Cc: xen-devel
[-- Attachment #1: Type: text/plain, Size: 1623 bytes --]
On 22/12/15 10:30, Huaitong Han wrote:
> Protection keys define a new 4-bit protection key field(PKEY) in bits 62:59 of
> leaf entries of the page tables.
>
> PKRU register defines 32 bits, there are 16 domains and 2 attribute bits per
> domain in pkru, for each i (0 ≤ i ≤ 15), PKRU[2i] is the access-disable bit for
> protection key i (ADi); PKRU[2i+1] is the write-disable bit for protection key
> i (WDi). PKEY is index to a defined domain.
>
> A fault is considered as a PKU violation if all of the following conditions are
> ture:
> 1.CR4_PKE=1.
> 2.EFER_LMA=1.
> 3.Page is present with no reserved bit violations.
> 4.The access is not an instruction fetch.
> 5.The access is to a user page.
> 6.PKRU.AD=1
> or The access is a data write and PKRU.WD=1
> and either CR0.WP=1 or it is a user access.
One comment you didn't address from v3, however:
"At the moment PFEC_insn_fetch is only set in
hvm_fetch_from_guest_virt() if hvm_nx_enabled() or hvm_smep_enabled()
are true. Which means that if you *don't* have nx or smep enabled, then
the patch series as is will fault on instruction fetches when it
shouldn't. (I don't remember anyone mentioning nx or smep being enabled
as a prerequisite for pkeys.)"
I think realistically the only way to address this is to start making
the clean separation between "pfec in" and "pfec out" I mentioned in the
previous discussion.
I've coded up the attached patch, but only compile-tested it. Can you
give it a look to see if you think it is correct, test it, include it in
your next patch series?
Thanks,
-George
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-xen-mm-Clean-up-pfec-handling-in-gva_to_gfn.patch --]
[-- Type: text/x-patch; name="0001-xen-mm-Clean-up-pfec-handling-in-gva_to_gfn.patch", Size: 4512 bytes --]
From b8ec8e14c670215587a4e74c3aaec8dab6f26a8c Mon Sep 17 00:00:00 2001
From: George Dunlap <george.dunlap@eu.citrix.com>
Date: Tue, 22 Dec 2015 16:17:22 +0000
Subject: [PATCH] xen/mm: Clean up pfec handling in gva_to_gfn
At the moment, the pfec argument to gva_to_gfn has two functions:
* To inform guest_walk what kind of access is happenind
* As a value to pass back into the guest in the event of a fault.
Unfortunately this is not quite treated consistently: the hvm_fetch_*
function will "pre-clear" the PFEC_insn_fetch flag before calling
gva_to_gfn; meaning guest_walk doesn't actually know whether a given
access is an instruction fetch or not. This works now, but will cause
issues when pkeys are introduced, since guest_walk will need to know
whether an access is an instruction fetch even if it doesn't return
PFEC_insn_fetch.
Fix this by making a clean separation for in and out functionalities
of the pfec argument:
1. Always pass in the access type to gva_to_gfn
2. Filter out inappropriate access flags before returning from gva_to_gfn.
(The PFEC_insn_fetch flag should only be passed to the guest if either NX or
SMEP is enabled. See Intel 64 Developer's Manual, Volume 3, Section 4.7.)
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
xen/arch/x86/hvm/hvm.c | 8 ++------
xen/arch/x86/mm/hap/guest_walk.c | 10 +++++++++-
xen/arch/x86/mm/shadow/multi.c | 6 ++++++
3 files changed, 17 insertions(+), 7 deletions(-)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 08cef1f..aa1f64f 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4423,11 +4423,9 @@ enum hvm_copy_result hvm_copy_from_guest_virt(
enum hvm_copy_result hvm_fetch_from_guest_virt(
void *buf, unsigned long vaddr, int size, uint32_t pfec)
{
- if ( hvm_nx_enabled(current) || hvm_smep_enabled(current) )
- pfec |= PFEC_insn_fetch;
return __hvm_copy(buf, vaddr, size,
HVMCOPY_from_guest | HVMCOPY_fault | HVMCOPY_virt,
- PFEC_page_present | pfec);
+ PFEC_page_present | PFEC_insn_fetch | pfec);
}
enum hvm_copy_result hvm_copy_to_guest_virt_nofault(
@@ -4449,11 +4447,9 @@ enum hvm_copy_result hvm_copy_from_guest_virt_nofault(
enum hvm_copy_result hvm_fetch_from_guest_virt_nofault(
void *buf, unsigned long vaddr, int size, uint32_t pfec)
{
- if ( hvm_nx_enabled(current) || hvm_smep_enabled(current) )
- pfec |= PFEC_insn_fetch;
return __hvm_copy(buf, vaddr, size,
HVMCOPY_from_guest | HVMCOPY_no_fault | HVMCOPY_virt,
- PFEC_page_present | pfec);
+ PFEC_page_present | PFEC_insn_fetch | pfec);
}
unsigned long copy_to_user_hvm(void *to, const void *from, unsigned int len)
diff --git a/xen/arch/x86/mm/hap/guest_walk.c b/xen/arch/x86/mm/hap/guest_walk.c
index 11c1b35..2dce111 100644
--- a/xen/arch/x86/mm/hap/guest_walk.c
+++ b/xen/arch/x86/mm/hap/guest_walk.c
@@ -82,7 +82,7 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PAGING_LEVELS)(
if ( !top_page )
{
pfec[0] &= ~PFEC_page_present;
- return INVALID_GFN;
+ goto out_tweak_pfec;
}
top_mfn = _mfn(page_to_mfn(top_page));
@@ -136,6 +136,14 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PAGING_LEVELS)(
if ( missing & _PAGE_SHARED )
pfec[0] = PFEC_page_shared;
+out_tweak_pfec:
+ /*
+ * Intel 64 Volume 3, Section 4.7: The PFEC_insn_fetch flag is set
+ * only when NX or SMEP are enabled.
+ */
+ if ( !hvm_nx_enabled(v) && !hvm_smep_enabled(v) )
+ pfec[0] &= ~PFEC_insn_fetch;
+
return INVALID_GFN;
}
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 58f7e72..3844c2d 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3668,6 +3668,12 @@ sh_gva_to_gfn(struct vcpu *v, struct p2m_domain *p2m,
pfec[0] &= ~PFEC_page_present;
if ( missing & _PAGE_INVALID_BITS )
pfec[0] |= PFEC_reserved_bit;
+ /*
+ * Intel 64 Volume 3, Section 4.7: The PFEC_insn_fetch flag is
+ * set only when NX or SMEP are enabled.
+ */
+ if ( !hvm_nx_enabled(v) && !hvm_smep_enabled(v) )
+ pfec[0] &= ~PFEC_insn_fetch;
return INVALID_GFN;
}
gfn = guest_walk_to_gfn(&gw);
--
2.1.4
[-- Attachment #3: Type: text/plain, Size: 126 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH V5 5/6] x86/hvm: pkeys, add xstate support for pkeys
2015-12-22 10:30 [PATCH V5 0/6] x86/hvm: pkeys, add memory protection-key support Huaitong Han
` (3 preceding siblings ...)
2015-12-22 10:30 ` [PATCH V5 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables Huaitong Han
@ 2015-12-22 10:30 ` Huaitong Han
2015-12-22 13:03 ` Jan Beulich
2015-12-22 10:30 ` [PATCH V5 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling Huaitong Han
5 siblings, 1 reply; 15+ messages in thread
From: Huaitong Han @ 2015-12-22 10:30 UTC (permalink / raw)
To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
wei.liu2, keir
Cc: Huaitong Han, xen-devel
This patch adds xstate support for pkeys.
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
xen/arch/x86/xstate.c | 3 ++-
xen/include/asm-x86/xstate.h | 4 +++-
2 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index b65da38..baa4b58 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -487,7 +487,8 @@ void xstate_init(struct cpuinfo_x86 *c)
* Set CR4_OSXSAVE and run "cpuid" to get xsave_cntxt_size.
*/
set_in_cr4(X86_CR4_OSXSAVE);
- if ( !set_xcr0(feature_mask) )
+ /* PKU is disabled on PV mode. */
+ if ( !set_xcr0(feature_mask & ~XSTATE_PKRU) )
BUG();
if ( bsp )
diff --git a/xen/include/asm-x86/xstate.h b/xen/include/asm-x86/xstate.h
index 12d939b..f7c41ba 100644
--- a/xen/include/asm-x86/xstate.h
+++ b/xen/include/asm-x86/xstate.h
@@ -34,13 +34,15 @@
#define XSTATE_OPMASK (1ULL << 5)
#define XSTATE_ZMM (1ULL << 6)
#define XSTATE_HI_ZMM (1ULL << 7)
+#define XSTATE_PKRU (1ULL << 9)
#define XSTATE_LWP (1ULL << 62) /* AMD lightweight profiling */
#define XSTATE_FP_SSE (XSTATE_FP | XSTATE_SSE)
#define XCNTXT_MASK (XSTATE_FP | XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK | \
XSTATE_ZMM | XSTATE_HI_ZMM | XSTATE_NONLAZY)
#define XSTATE_ALL (~(1ULL << 63))
-#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR)
+#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR | \
+ XSTATE_PKRU)
#define XSTATE_LAZY (XSTATE_ALL & ~XSTATE_NONLAZY)
#define XSTATE_COMPACTION_ENABLED (1ULL << 63)
--
2.4.3
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH V5 5/6] x86/hvm: pkeys, add xstate support for pkeys
2015-12-22 10:30 ` [PATCH V5 5/6] x86/hvm: pkeys, add xstate support for pkeys Huaitong Han
@ 2015-12-22 13:03 ` Jan Beulich
0 siblings, 0 replies; 15+ messages in thread
From: Jan Beulich @ 2015-12-22 13:03 UTC (permalink / raw)
To: Huaitong Han
Cc: kevin.tian, wei.liu2, ian.campbell, stefano.stabellini,
george.dunlap, andrew.cooper3, ian.jackson, xen-devel,
jun.nakajima, keir
>>> On 22.12.15 at 11:30, <huaitong.han@intel.com> wrote:
> --- a/xen/include/asm-x86/xstate.h
> +++ b/xen/include/asm-x86/xstate.h
> @@ -34,13 +34,15 @@
> #define XSTATE_OPMASK (1ULL << 5)
> #define XSTATE_ZMM (1ULL << 6)
> #define XSTATE_HI_ZMM (1ULL << 7)
> +#define XSTATE_PKRU (1ULL << 9)
> #define XSTATE_LWP (1ULL << 62) /* AMD lightweight profiling */
> #define XSTATE_FP_SSE (XSTATE_FP | XSTATE_SSE)
> #define XCNTXT_MASK (XSTATE_FP | XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK | \
> XSTATE_ZMM | XSTATE_HI_ZMM | XSTATE_NONLAZY)
>
> #define XSTATE_ALL (~(1ULL << 63))
> -#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR)
> +#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR | \
> + XSTATE_PKRU)
Same thing here - I don't recall any reply to my inquiry regarding this
change.
Jan
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH V5 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling
2015-12-22 10:30 [PATCH V5 0/6] x86/hvm: pkeys, add memory protection-key support Huaitong Han
` (4 preceding siblings ...)
2015-12-22 10:30 ` [PATCH V5 5/6] x86/hvm: pkeys, add xstate support for pkeys Huaitong Han
@ 2015-12-22 10:30 ` Huaitong Han
2015-12-22 13:06 ` Jan Beulich
5 siblings, 1 reply; 15+ messages in thread
From: Huaitong Han @ 2015-12-22 10:30 UTC (permalink / raw)
To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
wei.liu2, keir
Cc: Huaitong Han, xen-devel
This patch adds pkeys support for cpuid handing.
Pkeys hardware support is CPUID.7.0.ECX[3]:PKU. software support is
CPUID.7.0.ECX[4]:OSPKE and it reflects the support setting of CR4.PKE.
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
---
tools/libxc/xc_cpufeature.h | 2 ++
tools/libxc/xc_cpuid_x86.c | 6 ++++--
xen/arch/x86/hvm/hvm.c | 36 +++++++++++++++++++++++-------------
3 files changed, 29 insertions(+), 15 deletions(-)
diff --git a/tools/libxc/xc_cpufeature.h b/tools/libxc/xc_cpufeature.h
index c3ddc80..f6a9778 100644
--- a/tools/libxc/xc_cpufeature.h
+++ b/tools/libxc/xc_cpufeature.h
@@ -141,5 +141,7 @@
#define X86_FEATURE_ADX 19 /* ADCX, ADOX instructions */
#define X86_FEATURE_SMAP 20 /* Supervisor Mode Access Protection */
+/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx) */
+#define X86_FEATURE_PKU 3
#endif /* __LIBXC_CPUFEATURE_H */
diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index 8882c01..1ce979b 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -427,9 +427,11 @@ static void xc_cpuid_hvm_policy(xc_interface *xch,
bitmaskof(X86_FEATURE_ADX) |
bitmaskof(X86_FEATURE_SMAP) |
bitmaskof(X86_FEATURE_FSGSBASE));
+ regs[2] &= bitmaskof(X86_FEATURE_PKU);
} else
- regs[1] = 0;
- regs[0] = regs[2] = regs[3] = 0;
+ regs[1] = regs[2] = 0;
+
+ regs[0] = regs[3] = 0;
break;
case 0x0000000d:
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 59916ed..076313b 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4572,7 +4572,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
__clear_bit(X86_FEATURE_APIC & 31, edx);
/* Fix up OSXSAVE. */
- if ( cpu_has_xsave )
+ if ( *ecx & cpufeat_mask(X86_FEATURE_XSAVE) )
*ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE) ?
cpufeat_mask(X86_FEATURE_OSXSAVE) : 0;
@@ -4585,21 +4585,31 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
*edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
break;
case 0x7:
- if ( (count == 0) && !cpu_has_smep )
- *ebx &= ~cpufeat_mask(X86_FEATURE_SMEP);
+ if ( count == 0 )
+ {
+ if ( !cpu_has_smep )
+ *ebx &= ~cpufeat_mask(X86_FEATURE_SMEP);
- if ( (count == 0) && !cpu_has_smap )
- *ebx &= ~cpufeat_mask(X86_FEATURE_SMAP);
+ if ( !cpu_has_smap )
+ *ebx &= ~cpufeat_mask(X86_FEATURE_SMAP);
- /* Don't expose MPX to hvm when VMX support is not available */
- if ( (count == 0) &&
- (!(vmx_vmexit_control & VM_EXIT_CLEAR_BNDCFGS) ||
- !(vmx_vmentry_control & VM_ENTRY_LOAD_BNDCFGS)) )
- *ebx &= ~cpufeat_mask(X86_FEATURE_MPX);
+ /* Don't expose MPX to hvm when VMX support is not available */
+ if (!(vmx_vmexit_control & VM_EXIT_CLEAR_BNDCFGS) ||
+ !(vmx_vmentry_control & VM_ENTRY_LOAD_BNDCFGS))
+ *ebx &= ~cpufeat_mask(X86_FEATURE_MPX);
- /* Don't expose INVPCID to non-hap hvm. */
- if ( (count == 0) && !hap_enabled(d) )
- *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
+ if ( !hap_enabled(d) )
+ {
+ /* Don't expose INVPCID to non-hap hvm. */
+ *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
+ /* X86_FEATURE_PKU is not yet implemented for shadow paging. */
+ *ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
+ }
+
+ if ( *ecx & cpufeat_mask(X86_FEATURE_PKU))
+ *ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE) ?
+ cpufeat_mask(X86_FEATURE_OSPKE) : 0;
+ }
break;
case 0xb:
/* Fix the x2APIC identifier. */
--
2.4.3
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH V5 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling
2015-12-22 10:30 ` [PATCH V5 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling Huaitong Han
@ 2015-12-22 13:06 ` Jan Beulich
0 siblings, 0 replies; 15+ messages in thread
From: Jan Beulich @ 2015-12-22 13:06 UTC (permalink / raw)
To: Huaitong Han
Cc: kevin.tian, wei.liu2, ian.campbell, stefano.stabellini,
george.dunlap, andrew.cooper3, ian.jackson, xen-devel,
jun.nakajima, keir
>>> On 22.12.15 at 11:30, <huaitong.han@intel.com> wrote:
> This patch adds pkeys support for cpuid handing.
>
> Pkeys hardware support is CPUID.7.0.ECX[3]:PKU. software support is
> CPUID.7.0.ECX[4]:OSPKE and it reflects the support setting of CR4.PKE.
>
> Signed-off-by: Huaitong Han <huaitong.han@intel.com>
> ---
> tools/libxc/xc_cpufeature.h | 2 ++
> tools/libxc/xc_cpuid_x86.c | 6 ++++--
> xen/arch/x86/hvm/hvm.c | 36 +++++++++++++++++++++++-------------
> 3 files changed, 29 insertions(+), 15 deletions(-)
>
> diff --git a/tools/libxc/xc_cpufeature.h b/tools/libxc/xc_cpufeature.h
> index c3ddc80..f6a9778 100644
> --- a/tools/libxc/xc_cpufeature.h
> +++ b/tools/libxc/xc_cpufeature.h
> @@ -141,5 +141,7 @@
> #define X86_FEATURE_ADX 19 /* ADCX, ADOX instructions */
> #define X86_FEATURE_SMAP 20 /* Supervisor Mode Access Protection */
>
> +/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx) */
> +#define X86_FEATURE_PKU 3
>
> #endif /* __LIBXC_CPUFEATURE_H */
> diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
> index 8882c01..1ce979b 100644
> --- a/tools/libxc/xc_cpuid_x86.c
> +++ b/tools/libxc/xc_cpuid_x86.c
> @@ -427,9 +427,11 @@ static void xc_cpuid_hvm_policy(xc_interface *xch,
> bitmaskof(X86_FEATURE_ADX) |
> bitmaskof(X86_FEATURE_SMAP) |
> bitmaskof(X86_FEATURE_FSGSBASE));
> + regs[2] &= bitmaskof(X86_FEATURE_PKU);
> } else
> - regs[1] = 0;
> - regs[0] = regs[2] = regs[3] = 0;
> + regs[1] = regs[2] = 0;
> +
> + regs[0] = regs[3] = 0;
> break;
>
> case 0x0000000d:
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 59916ed..076313b 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -4572,7 +4572,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax,
> unsigned int *ebx,
> __clear_bit(X86_FEATURE_APIC & 31, edx);
>
> /* Fix up OSXSAVE. */
> - if ( cpu_has_xsave )
> + if ( *ecx & cpufeat_mask(X86_FEATURE_XSAVE) )
> *ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE) ?
> cpufeat_mask(X86_FEATURE_OSXSAVE) : 0;
>
> @@ -4585,21 +4585,31 @@ void hvm_cpuid(unsigned int input, unsigned int *eax,
> unsigned int *ebx,
> *edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
> break;
> case 0x7:
> - if ( (count == 0) && !cpu_has_smep )
> - *ebx &= ~cpufeat_mask(X86_FEATURE_SMEP);
> + if ( count == 0 )
> + {
> + if ( !cpu_has_smep )
> + *ebx &= ~cpufeat_mask(X86_FEATURE_SMEP);
>
> - if ( (count == 0) && !cpu_has_smap )
> - *ebx &= ~cpufeat_mask(X86_FEATURE_SMAP);
> + if ( !cpu_has_smap )
> + *ebx &= ~cpufeat_mask(X86_FEATURE_SMAP);
>
> - /* Don't expose MPX to hvm when VMX support is not available */
> - if ( (count == 0) &&
> - (!(vmx_vmexit_control & VM_EXIT_CLEAR_BNDCFGS) ||
> - !(vmx_vmentry_control & VM_ENTRY_LOAD_BNDCFGS)) )
> - *ebx &= ~cpufeat_mask(X86_FEATURE_MPX);
> + /* Don't expose MPX to hvm when VMX support is not available */
Please fix the comment style if you touch this anyway.
> + if (!(vmx_vmexit_control & VM_EXIT_CLEAR_BNDCFGS) ||
> + !(vmx_vmentry_control & VM_ENTRY_LOAD_BNDCFGS))
> + *ebx &= ~cpufeat_mask(X86_FEATURE_MPX);
>
> - /* Don't expose INVPCID to non-hap hvm. */
> - if ( (count == 0) && !hap_enabled(d) )
> - *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
> + if ( !hap_enabled(d) )
> + {
> + /* Don't expose INVPCID to non-hap hvm. */
> + *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
> + /* X86_FEATURE_PKU is not yet implemented for shadow paging. */
> + *ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
> + }
> +
> + if ( *ecx & cpufeat_mask(X86_FEATURE_PKU))
> + *ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE) ?
> + cpufeat_mask(X86_FEATURE_OSPKE) : 0;
Bogus use of ?: considering that you or in zero in its "false" case.
Just extend the if() condition.
Jan
^ permalink raw reply [flat|nested] 15+ messages in thread