* [PATCH V6 1/5] x86/hvm: pkeys, disable pkeys for guests in non-paging mode
2016-01-19 7:30 [PATCH V6 0/5] x86/hvm: pkeys, add memory protection-key support Huaitong Han
@ 2016-01-19 7:30 ` Huaitong Han
2016-01-19 7:30 ` [PATCH V6 2/5] x86/hvm: pkeys, add pkeys support for guest_walk_tables Huaitong Han
` (4 subsequent siblings)
5 siblings, 0 replies; 19+ messages in thread
From: Huaitong Han @ 2016-01-19 7:30 UTC (permalink / raw)
To: jbeulich, andrew.cooper3, george.dunlap, wei.liu2, keir
Cc: Huaitong Han, xen-devel
This patch disables pkeys for guests in non-paging mode. Hardware disables
pkeys when the CPU is in non-paging mode, but Xen always uses paging mode to
emulate guest non-paging mode. To emulate this behaviour, pkeys need to be
manually disabled when the guest switches to non-paging mode.
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
xen/arch/x86/hvm/vmx/vmx.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index b918b8a..9a8cfb5 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1368,12 +1368,13 @@ static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr)
if ( !hvm_paging_enabled(v) )
{
/*
- * SMEP/SMAP is disabled if CPU is in non-paging mode in hardware.
- * However Xen always uses paging mode to emulate guest non-paging
- * mode. To emulate this behavior, SMEP/SMAP needs to be manually
- * disabled when guest VCPU is in non-paging mode.
+ * SMEP/SMAP/PKU is disabled if CPU is in non-paging mode in
+ * hardware. However Xen always uses paging mode to emulate guest
+ * non-paging mode. To emulate this behavior, SMEP/SMAP/PKU needs
+ * to be manually disabled when guest VCPU is in non-paging mode.
*/
- v->arch.hvm_vcpu.hw_cr[4] &= ~(X86_CR4_SMEP | X86_CR4_SMAP);
+ v->arch.hvm_vcpu.hw_cr[4] &=
+ ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE);
}
__vmwrite(GUEST_CR4, v->arch.hvm_vcpu.hw_cr[4]);
break;
--
2.4.3
* [PATCH V6 2/5] x86/hvm: pkeys, add pkeys support for guest_walk_tables
2016-01-19 7:30 [PATCH V6 0/5] x86/hvm: pkeys, add memory protection-key support Huaitong Han
2016-01-19 7:30 ` [PATCH V6 1/5] x86/hvm: pkeys, disable pkeys for guests in non-paging mode Huaitong Han
@ 2016-01-19 7:30 ` Huaitong Han
2016-01-25 15:46 ` Jan Beulich
2016-01-19 7:30 ` [PATCH V6 3/5] x86/hvm: pkeys, add xstate support for pkeys Huaitong Han
` (3 subsequent siblings)
5 siblings, 1 reply; 19+ messages in thread
From: Huaitong Han @ 2016-01-19 7:30 UTC (permalink / raw)
To: jbeulich, andrew.cooper3, george.dunlap, wei.liu2, keir
Cc: Huaitong Han, xen-devel
Protection keys define a new 4-bit protection key field (PKEY) in bits 62:59
of leaf entries of the page tables.
The PKRU register is 32 bits wide, holding 16 domains with 2 attribute bits
per domain: for each i (0 ≤ i ≤ 15), PKRU[2i] is the access-disable bit for
protection key i (ADi) and PKRU[2i+1] is the write-disable bit for protection
key i (WDi). The PKEY field is an index into these domains (see the decoding
sketch after the list below).
A fault is considered a PKU violation if all of the following conditions are
true:
1. CR4.PKE = 1.
2. EFER.LMA = 1.
3. The page is present with no reserved bit violations.
4. The access is not an instruction fetch.
5. The access is to a user page.
6. PKRU.AD = 1, or
   the access is a data write and PKRU.WD = 1, and
   either CR0.WP = 1 or it is a user access.
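As an illustration of this bit layout, the AD/WD bits for a given key can be
decoded along the following lines (a minimal sketch; the helper names are
illustrative only, the patch itself adds read_pkru_ad()/read_pkru_wd() for
this purpose):

/* Sketch: decode the per-key attribute bits from a PKRU value. */
static inline int pkru_access_disable(uint32_t pkru, unsigned int pkey)
{
    return (pkru >> (2 * pkey)) & 1;        /* ADi */
}

static inline int pkru_write_disable(uint32_t pkru, unsigned int pkey)
{
    return (pkru >> (2 * pkey + 1)) & 1;    /* WDi */
}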
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
---
xen/arch/x86/mm/guest_walk.c | 53 +++++++++++++++++++++++++++++++++++++++
xen/arch/x86/mm/hap/guest_walk.c | 3 +++
xen/include/asm-x86/guest_pt.h | 12 +++++++++
xen/include/asm-x86/hvm/hvm.h | 2 ++
xen/include/asm-x86/page.h | 5 ++++
xen/include/asm-x86/processor.h | 40 +++++++++++++++++++++++++++++
xen/include/asm-x86/x86_64/page.h | 12 +++++++++
7 files changed, 127 insertions(+)
diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c
index 18d1acf..dfee43f 100644
--- a/xen/arch/x86/mm/guest_walk.c
+++ b/xen/arch/x86/mm/guest_walk.c
@@ -90,6 +90,53 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, int set_dirty)
return 0;
}
+#if GUEST_PAGING_LEVELS >= 4
+bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec,
+ uint32_t pte_flags, uint32_t pte_pkey)
+{
+ unsigned int pkru = 0;
+ bool_t pkru_ad, pkru_wd;
+
+ /* When page isn't present, PKEY isn't checked. */
+ if ( !(pfec & PFEC_page_present) || is_pv_vcpu(vcpu) )
+ return 0;
+
+ /*
+ * PKU: additional mechanism by which the paging controls
+ * access to user-mode addresses based on the value in the
+ * PKRU register. A fault is considered as a PKU violation if all
+ * of the following conditions are true:
+ * 1.CR4_PKE=1.
+ * 2.EFER_LMA=1.
+ * 3.Page is present with no reserved bit violations.
+ * 4.The access is not an instruction fetch.
+ * 5.The access is to a user page.
+ * 6.PKRU.AD=1 or
+ * the access is a data write and PKRU.WD=1 and
+ * either CR0.WP=1 or it is a user access.
+ */
+ if ( !hvm_pku_enabled(vcpu) ||
+ !hvm_long_mode_enabled(vcpu) ||
+ (pfec & PFEC_reserved_bit) ||
+ (pfec & PFEC_insn_fetch) ||
+ !(pte_flags & _PAGE_USER) )
+ return 0;
+
+ pkru = read_pkru();
+ if ( unlikely(pkru) )
+ {
+ pkru_ad = read_pkru_ad(pkru, pte_pkey);
+ pkru_wd = read_pkru_wd(pkru, pte_pkey);
+ /* Condition 6 */
+ if ( pkru_ad || (pkru_wd && (pfec & PFEC_write_access) &&
+ (hvm_wp_enabled(vcpu) || (pfec & PFEC_user_mode))))
+ return 1;
+ }
+
+ return 0;
+}
+#endif
+
/* Walk the guest pagetables, after the manner of a hardware walker. */
/* Because the walk is essentially random, it can cause a deadlock
* warning in the p2m locking code. Highly unlikely this is an actual
@@ -107,6 +154,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
guest_l3e_t *l3p = NULL;
guest_l4e_t *l4p;
#endif
+ unsigned int pkey;
uint32_t gflags, mflags, iflags, rc = 0;
bool_t smep = 0, smap = 0;
bool_t pse1G = 0, pse2M = 0;
@@ -190,6 +238,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
goto out;
/* Get the l3e and check its flags*/
gw->l3e = l3p[guest_l3_table_offset(va)];
+ pkey = guest_l3e_get_pkey(gw->l3e);
gflags = guest_l3e_get_flags(gw->l3e) ^ iflags;
if ( !(gflags & _PAGE_PRESENT) ) {
rc |= _PAGE_PRESENT;
@@ -261,6 +310,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
#endif /* All levels... */
+ pkey = guest_l2e_get_pkey(gw->l2e);
gflags = guest_l2e_get_flags(gw->l2e) ^ iflags;
if ( !(gflags & _PAGE_PRESENT) ) {
rc |= _PAGE_PRESENT;
@@ -324,6 +374,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
if(l1p == NULL)
goto out;
gw->l1e = l1p[guest_l1_table_offset(va)];
+ pkey = guest_l1e_get_pkey(gw->l1e);
gflags = guest_l1e_get_flags(gw->l1e) ^ iflags;
if ( !(gflags & _PAGE_PRESENT) ) {
rc |= _PAGE_PRESENT;
@@ -334,6 +385,8 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
#if GUEST_PAGING_LEVELS >= 4 /* 64-bit only... */
set_ad:
+ if ( pkey_fault(v, pfec, gflags, pkey) )
+ rc |= _PAGE_PKEY_BITS;
#endif
/* Now re-invert the user-mode requirement for SMEP and SMAP */
if ( smep || smap )
diff --git a/xen/arch/x86/mm/hap/guest_walk.c b/xen/arch/x86/mm/hap/guest_walk.c
index 11c1b35..49d0328 100644
--- a/xen/arch/x86/mm/hap/guest_walk.c
+++ b/xen/arch/x86/mm/hap/guest_walk.c
@@ -130,6 +130,9 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PAGING_LEVELS)(
if ( missing & _PAGE_INVALID_BITS )
pfec[0] |= PFEC_reserved_bit;
+ if ( missing & _PAGE_PKEY_BITS )
+ pfec[0] |= PFEC_prot_key;
+
if ( missing & _PAGE_PAGED )
pfec[0] = PFEC_page_paged;
diff --git a/xen/include/asm-x86/guest_pt.h b/xen/include/asm-x86/guest_pt.h
index 3447973..eb29e62 100644
--- a/xen/include/asm-x86/guest_pt.h
+++ b/xen/include/asm-x86/guest_pt.h
@@ -81,6 +81,11 @@ static inline u32 guest_l1e_get_flags(guest_l1e_t gl1e)
static inline u32 guest_l2e_get_flags(guest_l2e_t gl2e)
{ return gl2e.l2 & 0xfff; }
+static inline u32 guest_l1e_get_pkey(guest_l1e_t gl1e)
+{ return 0; }
+static inline u32 guest_l2e_get_pkey(guest_l2e_t gl2e)
+{ return 0; }
+
static inline guest_l1e_t guest_l1e_from_gfn(gfn_t gfn, u32 flags)
{ return (guest_l1e_t) { (gfn_x(gfn) << PAGE_SHIFT) | flags }; }
static inline guest_l2e_t guest_l2e_from_gfn(gfn_t gfn, u32 flags)
@@ -154,6 +159,13 @@ static inline u32 guest_l4e_get_flags(guest_l4e_t gl4e)
{ return l4e_get_flags(gl4e); }
#endif
+static inline u32 guest_l1e_get_pkey(guest_l1e_t gl1e)
+{ return l1e_get_pkey(gl1e); }
+static inline u32 guest_l2e_get_pkey(guest_l2e_t gl2e)
+{ return l2e_get_pkey(gl2e); }
+static inline u32 guest_l3e_get_pkey(guest_l3e_t gl3e)
+{ return l3e_get_pkey(gl3e); }
+
static inline guest_l1e_t guest_l1e_from_gfn(gfn_t gfn, u32 flags)
{ return l1e_from_pfn(gfn_x(gfn), flags); }
static inline guest_l2e_t guest_l2e_from_gfn(gfn_t gfn, u32 flags)
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index b9d893d..2aee292 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -275,6 +275,8 @@ int hvm_girq_dest_2_vcpu_id(struct domain *d, uint8_t dest, uint8_t dest_mode);
(hvm_paging_enabled(v) && ((v)->arch.hvm_vcpu.guest_cr[4] & X86_CR4_SMAP))
#define hvm_nx_enabled(v) \
(!!((v)->arch.hvm_vcpu.guest_efer & EFER_NX))
+#define hvm_pku_enabled(v) \
+ (hvm_paging_enabled(v) && ((v)->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE))
/* Can we use superpages in the HAP p2m table? */
#define hap_has_1gb (!!(hvm_funcs.hap_capabilities & HVM_HAP_SUPERPAGE_1GB))
diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h
index a095a93..d69d91d 100644
--- a/xen/include/asm-x86/page.h
+++ b/xen/include/asm-x86/page.h
@@ -93,6 +93,11 @@
#define l3e_get_flags(x) (get_pte_flags((x).l3))
#define l4e_get_flags(x) (get_pte_flags((x).l4))
+/* Get pte pkeys (unsigned int). */
+#define l1e_get_pkey(x) (get_pte_pkey((x).l1))
+#define l2e_get_pkey(x) (get_pte_pkey((x).l2))
+#define l3e_get_pkey(x) (get_pte_pkey((x).l3))
+
/* Construct an empty pte. */
#define l1e_empty() ((l1_pgentry_t) { 0 })
#define l2e_empty() ((l2_pgentry_t) { 0 })
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index 26ba141..0b4e7ca 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -374,6 +374,46 @@ static always_inline void clear_in_cr4 (unsigned long mask)
write_cr4(read_cr4() & ~mask);
}
+static inline unsigned int read_pkru(void)
+{
+ unsigned int pkru;
+ unsigned long cr4 = read_cr4();
+
+ /*
+ * _PAGE_PKEY_BITS have a conflict with _PAGE_GNTTAB used by PV guests,
+ * so that X86_CR4_PKE is disable on hypervisor, RDPKRU instruction can
+ * be used with temporarily setting CR4.PKE.
+ */
+ write_cr4(cr4 | X86_CR4_PKE);
+ asm volatile (".byte 0x0f,0x01,0xee"
+ : "=a" (pkru) : "c" (0) : "dx");
+ write_cr4(cr4);
+
+ return pkru;
+}
+
+/* Macros for PKRU domain */
+#define PKRU_READ (0)
+#define PKRU_WRITE (1)
+#define PKRU_ATTRS (2)
+
+/*
+ * PKRU defines 32 bits, there are 16 domains and 2 attribute bits per
+ * domain in pkru, pkeys is index to a defined domain, so the value of
+ * pte_pkeys * PKRU_ATTRS + R/W is offset of a defined domain attribute.
+ */
+static inline bool_t read_pkru_ad(unsigned int pkru, unsigned int pkey)
+{
+ ASSERT(pkey < 16);
+ return (pkru >> (pkey * PKRU_ATTRS + PKRU_READ)) & 1;
+}
+
+static inline bool_t read_pkru_wd(unsigned int pkru, unsigned int pkey)
+{
+ ASSERT(pkey < 16);
+ return (pkru >> (pkey * PKRU_ATTRS + PKRU_WRITE)) & 1;
+}
+
/*
* NSC/Cyrix CPU configuration register indexes
*/
diff --git a/xen/include/asm-x86/x86_64/page.h b/xen/include/asm-x86/x86_64/page.h
index 19ab4d0..86abb94 100644
--- a/xen/include/asm-x86/x86_64/page.h
+++ b/xen/include/asm-x86/x86_64/page.h
@@ -134,6 +134,18 @@ typedef l4_pgentry_t root_pgentry_t;
#define get_pte_flags(x) (((int)((x) >> 40) & ~0xFFF) | ((int)(x) & 0xFFF))
#define put_pte_flags(x) (((intpte_t)((x) & ~0xFFF) << 40) | ((x) & 0xFFF))
+/*
+ * Protection keys define a new 4-bit protection key field
+ * (PKEY) in bits 62:59 of leaf entries of the page tables.
+ * This corresponds to bit 22:19 of a 24-bit flags.
+ *
+ * Notice: Bit 22 is used by _PAGE_GNTTAB which is visible to PV guests,
+ * so Protection keys must be disabled on PV guests.
+ */
+#define _PAGE_PKEY_BITS (0x780000) /* Protection Keys, 22:19 */
+
+#define get_pte_pkey(x) (MASK_EXTR(get_pte_flags(x), _PAGE_PKEY_BITS))
+
/* Bit 23 of a 24-bit flag mask. This corresponds to bit 63 of a pte.*/
#define _PAGE_NX_BIT (1U<<23)
--
2.4.3
* Re: [PATCH V6 2/5] x86/hvm: pkeys, add pkeys support for guest_walk_tables
2016-01-19 7:30 ` [PATCH V6 2/5] x86/hvm: pkeys, add pkeys support for guest_walk_tables Huaitong Han
@ 2016-01-25 15:46 ` Jan Beulich
2016-01-27 3:18 ` Han, Huaitong
0 siblings, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2016-01-25 15:46 UTC (permalink / raw)
To: Huaitong Han; +Cc: george.dunlap, andrew.cooper3, keir, wei.liu2, xen-devel
>>> On 19.01.16 at 08:30, <huaitong.han@intel.com> wrote:
> Signed-off-by: Huaitong Han <huaitong.han@intel.com>
> ---
Please get used to putting per-patch info here on what changed from the
previous revision.
> --- a/xen/arch/x86/mm/guest_walk.c
> +++ b/xen/arch/x86/mm/guest_walk.c
> @@ -90,6 +90,53 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, int set_dirty)
> return 0;
> }
>
> +#if GUEST_PAGING_LEVELS >= 4
> +bool_t pkey_fault(struct vcpu *vcpu, uint32_t pfec,
static, especially now that you use >= in the #if.
> + uint32_t pte_flags, uint32_t pte_pkey)
> +{
> + unsigned int pkru = 0;
> + bool_t pkru_ad, pkru_wd;
> +
> + /* When page isn't present, PKEY isn't checked. */
> + if ( !(pfec & PFEC_page_present) || is_pv_vcpu(vcpu) )
> + return 0;
> +
> + /*
> + * PKU: additional mechanism by which the paging controls
> + * access to user-mode addresses based on the value in the
> + * PKRU register. A fault is considered as a PKU violation if all
> + * of the following conditions are true:
> + * 1.CR4_PKE=1.
> + * 2.EFER_LMA=1.
> + * 3.Page is present with no reserved bit violations.
> + * 4.The access is not an instruction fetch.
> + * 5.The access is to a user page.
> + * 6.PKRU.AD=1 or
> + * the access is a data write and PKRU.WD=1 and
> + * either CR0.WP=1 or it is a user access.
> + */
> + if ( !hvm_pku_enabled(vcpu) ||
> + !hvm_long_mode_enabled(vcpu) ||
> + (pfec & PFEC_reserved_bit) ||
This is only one half of 3 - please add a comment saying that the
other half is already being guaranteed by the caller.
> + (pfec & PFEC_insn_fetch) ||
> + !(pte_flags & _PAGE_USER) )
Also for the whole if() - indentation.
> + return 0;
> +
> + pkru = read_pkru();
> + if ( unlikely(pkru) )
> + {
> + pkru_ad = read_pkru_ad(pkru, pte_pkey);
> + pkru_wd = read_pkru_wd(pkru, pte_pkey);
Please make these the declarations (with initializers) of those
variables.
> --- a/xen/include/asm-x86/page.h
> +++ b/xen/include/asm-x86/page.h
> @@ -93,6 +93,11 @@
> #define l3e_get_flags(x) (get_pte_flags((x).l3))
> #define l4e_get_flags(x) (get_pte_flags((x).l4))
>
> +/* Get pte pkeys (unsigned int). */
> +#define l1e_get_pkey(x) (get_pte_pkey((x).l1))
> +#define l2e_get_pkey(x) (get_pte_pkey((x).l2))
> +#define l3e_get_pkey(x) (get_pte_pkey((x).l3))
Despite realizing that other code around here does so too, please
don't make things worse and omit unnecessary parentheses (like
the outer ones here).
> --- a/xen/include/asm-x86/processor.h
> +++ b/xen/include/asm-x86/processor.h
> @@ -374,6 +374,46 @@ static always_inline void clear_in_cr4 (unsigned long mask)
> write_cr4(read_cr4() & ~mask);
> }
>
> +static inline unsigned int read_pkru(void)
> +{
> + unsigned int pkru;
> + unsigned long cr4 = read_cr4();
> +
> + /*
> + * _PAGE_PKEY_BITS have a conflict with _PAGE_GNTTAB used by PV guests,
> + * so that X86_CR4_PKE is disable on hypervisor, RDPKRU instruction can
> + * be used with temporarily setting CR4.PKE.
> + */
... is disabled on hypervisor. To use RDPKRU, CR4.PKE gets
temporarily enabled.
> + write_cr4(cr4 | X86_CR4_PKE);
> + asm volatile (".byte 0x0f,0x01,0xee"
> + : "=a" (pkru) : "c" (0) : "dx");
> + write_cr4(cr4);
I think you will want to abstract out the actual writing of CR4 from
write_cr4(), as updating this_cpu(cr4) back and forth is quite
pointless here.
> +/*
> + * PKRU defines 32 bits, there are 16 domains and 2 attribute bits per
> + * domain in pkru, pkeys is index to a defined domain, so the value of
> + * pte_pkeys * PKRU_ATTRS + R/W is offset of a defined domain attribute.
> + */
> +static inline bool_t read_pkru_ad(unsigned int pkru, unsigned int pkey)
> +{
> + ASSERT(pkey < 16);
> + return (pkru >> (pkey * PKRU_ATTRS + PKRU_READ)) & 1;
> +}
> +
> +static inline bool_t read_pkru_wd(unsigned int pkru, unsigned int pkey)
> +{
> + ASSERT(pkey < 16);
> + return (pkru >> (pkey * PKRU_ATTRS + PKRU_WRITE)) & 1;
> +}
I think in both cases the first parameter should be uint32_t, even if
this is a benign change (serving documentation purposes though).
Jan
* Re: [PATCH V6 2/5] x86/hvm: pkeys, add pkeys support for guest_walk_tables
2016-01-25 15:46 ` Jan Beulich
@ 2016-01-27 3:18 ` Han, Huaitong
2016-01-27 10:20 ` Jan Beulich
0 siblings, 1 reply; 19+ messages in thread
From: Han, Huaitong @ 2016-01-27 3:18 UTC (permalink / raw)
To: JBeulich@suse.com
Cc: george.dunlap@eu.citrix.com, andrew.cooper3@citrix.com,
keir@xen.org, wei.liu2@citrix.com, xen-devel@lists.xen.org
On Mon, 2016-01-25 at 08:46 -0700, Jan Beulich wrote:
> > > > On 19.01.16 at 08:30, <huaitong.han@intel.com> wrote:
>
>
> > + write_cr4(cr4 | X86_CR4_PKE);
> > + asm volatile (".byte 0x0f,0x01,0xee"
> > + : "=a" (pkru) : "c" (0) : "dx");
> > + write_cr4(cr4);
>
> I think you will want to abstract out the actual writing of CR4 from
> write_cr4(), as updating this_cpu(cr4) back and forth is quite
> pointless here.
>
Updating this_cpu(cr4) back and forth is pointless, but using
"asm volatile ( "mov %0,%%cr4" : : "r" (val) )" directly here is also
bad code style, as there are two places you can update cr4 directly.
* Re: [PATCH V6 2/5] x86/hvm: pkeys, add pkeys support for guest_walk_tables
2016-01-27 3:18 ` Han, Huaitong
@ 2016-01-27 10:20 ` Jan Beulich
0 siblings, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2016-01-27 10:20 UTC (permalink / raw)
To: Huaitong Han
Cc: george.dunlap@eu.citrix.com, andrew.cooper3@citrix.com,
keir@xen.org, wei.liu2@citrix.com, xen-devel@lists.xen.org
>>> On 27.01.16 at 04:18, <huaitong.han@intel.com> wrote:
> On Mon, 2016-01-25 at 08:46 -0700, Jan Beulich wrote:
>
>> > > > On 19.01.16 at 08:30, <huaitong.han@intel.com> wrote:
>>
>>
>> > + write_cr4(cr4 | X86_CR4_PKE);
>> > + asm volatile (".byte 0x0f,0x01,0xee"
>> > + : "=a" (pkru) : "c" (0) : "dx");
>> > + write_cr4(cr4);
>>
>> I think you will want to abstract out the actual writing of CR4 from
>> write_cr4(), as updating this_cpu(cr4) back and forth is quite
>> pointless here.
>>
> Updating this_cpu(cr4) back and forth is pointless, but using
> "asm volatile ( "mov %0,%%cr4" : : "r" (val) )" directly here is also
> bad code style, as there are two places you can update cr4 directly.
Not sure what two places you refer to, but note that I specifically
said "abstract out", not "use inline assembly directly".
Jan
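A minimal sketch of the abstraction being asked for, assuming a helper that
writes CR4 without updating the cached this_cpu(cr4) value (the helper name
here is illustrative only):

/* Sketch: raw CR4 write that bypasses the this_cpu(cr4) cache. */
static inline void raw_write_cr4(unsigned long val)
{
    asm volatile ( "mov %0, %%cr4" : : "r" (val) );
}

static inline unsigned int read_pkru(void)
{
    unsigned int pkru;
    unsigned long cr4 = read_cr4();

    /* Temporarily enable CR4.PKE so that RDPKRU can be executed. */
    raw_write_cr4(cr4 | X86_CR4_PKE);
    asm volatile ( ".byte 0x0f,0x01,0xee" : "=a" (pkru) : "c" (0) : "dx" );
    raw_write_cr4(cr4);

    return pkru;
}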
* [PATCH V6 3/5] x86/hvm: pkeys, add xstate support for pkeys
2016-01-19 7:30 [PATCH V6 0/5] x86/hvm: pkeys, add memory protection-key support Huaitong Han
2016-01-19 7:30 ` [PATCH V6 1/5] x86/hvm: pkeys, disable pkeys for guests in non-paging mode Huaitong Han
2016-01-19 7:30 ` [PATCH V6 2/5] x86/hvm: pkeys, add pkeys support for guest_walk_tables Huaitong Han
@ 2016-01-19 7:30 ` Huaitong Han
2016-01-25 15:48 ` Jan Beulich
2016-01-19 7:30 ` [PATCH V6 4/5] xen/mm: Clean up pfec handling in gva_to_gfn Huaitong Han
` (2 subsequent siblings)
5 siblings, 1 reply; 19+ messages in thread
From: Huaitong Han @ 2016-01-19 7:30 UTC (permalink / raw)
To: jbeulich, andrew.cooper3, george.dunlap, wei.liu2, keir
Cc: Huaitong Han, xen-devel
The XSAVE feature set can operate on PKRU state only if the feature set is
enabled (CR4.OSXSAVE = 1) and has been configured to manage PKRU state
(XCR0[9] = 1). Since the PKU feature is not enabled for PV guests, setting
XCR0.PKRU is disallowed in PV mode.
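For reference, a guest kernel would typically turn on XSAVE management of
PKRU along these lines (a sketch only, assuming CR4.OSXSAVE is already set
and CPUID reports support for XCR0 bit 9):

#include <stdint.h>

#define XCR0_PKRU (1ULL << 9)

static inline uint64_t xgetbv(uint32_t index)
{
    uint32_t lo, hi;
    asm volatile ( "xgetbv" : "=a" (lo), "=d" (hi) : "c" (index) );
    return lo | ((uint64_t)hi << 32);
}

static inline void xsetbv(uint32_t index, uint64_t val)
{
    asm volatile ( "xsetbv" : : "c" (index),
                   "a" ((uint32_t)val), "d" ((uint32_t)(val >> 32)) );
}

static inline void enable_pkru_state(void)
{
    /* Add PKRU to the set of states managed by XSAVE/XRSTOR. */
    xsetbv(0, xgetbv(0) | XCR0_PKRU);
}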
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
xen/arch/x86/xstate.c | 4 ++++
xen/include/asm-x86/xstate.h | 4 +++-
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index 4e87ab3..b79b20b 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -579,6 +579,10 @@ int handle_xsetbv(u32 index, u64 new_bv)
if ( (new_bv & ~xfeature_mask) || !valid_xcr0(new_bv) )
return -EINVAL;
+ /* XCR0.PKRU is disabled on PV mode. */
+ if ( is_pv_vcpu(curr) && (new_bv & XSTATE_PKRU) )
+ return -EINVAL;
+
if ( !set_xcr0(new_bv) )
return -EFAULT;
diff --git a/xen/include/asm-x86/xstate.h b/xen/include/asm-x86/xstate.h
index 12d939b..f7c41ba 100644
--- a/xen/include/asm-x86/xstate.h
+++ b/xen/include/asm-x86/xstate.h
@@ -34,13 +34,15 @@
#define XSTATE_OPMASK (1ULL << 5)
#define XSTATE_ZMM (1ULL << 6)
#define XSTATE_HI_ZMM (1ULL << 7)
+#define XSTATE_PKRU (1ULL << 9)
#define XSTATE_LWP (1ULL << 62) /* AMD lightweight profiling */
#define XSTATE_FP_SSE (XSTATE_FP | XSTATE_SSE)
#define XCNTXT_MASK (XSTATE_FP | XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK | \
XSTATE_ZMM | XSTATE_HI_ZMM | XSTATE_NONLAZY)
#define XSTATE_ALL (~(1ULL << 63))
-#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR)
+#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR | \
+ XSTATE_PKRU)
#define XSTATE_LAZY (XSTATE_ALL & ~XSTATE_NONLAZY)
#define XSTATE_COMPACTION_ENABLED (1ULL << 63)
--
2.4.3
* Re: [PATCH V6 3/5] x86/hvm: pkeys, add xstate support for pkeys
2016-01-19 7:30 ` [PATCH V6 3/5] x86/hvm: pkeys, add xstate support for pkeys Huaitong Han
@ 2016-01-25 15:48 ` Jan Beulich
0 siblings, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2016-01-25 15:48 UTC (permalink / raw)
To: Huaitong Han; +Cc: george.dunlap, andrew.cooper3, keir, wei.liu2, xen-devel
>>> On 19.01.16 at 08:30, <huaitong.han@intel.com> wrote:
> --- a/xen/arch/x86/xstate.c
> +++ b/xen/arch/x86/xstate.c
> @@ -579,6 +579,10 @@ int handle_xsetbv(u32 index, u64 new_bv)
> if ( (new_bv & ~xfeature_mask) || !valid_xcr0(new_bv) )
> return -EINVAL;
>
> + /* XCR0.PKRU is disabled on PV mode. */
> + if ( is_pv_vcpu(curr) && (new_bv & XSTATE_PKRU) )
> + return -EINVAL;
How about making this more distinguishable by using e.g.
-EOPNOTSUPP here?
Jan
* [PATCH V6 4/5] xen/mm: Clean up pfec handling in gva_to_gfn
2016-01-19 7:30 [PATCH V6 0/5] x86/hvm: pkeys, add memory protection-key support Huaitong Han
` (2 preceding siblings ...)
2016-01-19 7:30 ` [PATCH V6 3/5] x86/hvm: pkeys, add xstate support for pkeys Huaitong Han
@ 2016-01-19 7:30 ` Huaitong Han
2016-01-25 15:56 ` Jan Beulich
2016-01-26 14:30 ` Tim Deegan
2016-01-19 7:30 ` [PATCH V6 5/5] x86/hvm: pkeys, add pkeys support for cpuid handling Huaitong Han
2016-01-25 15:25 ` [PATCH V6 0/5] x86/hvm: pkeys, add memory protection-key support Jan Beulich
5 siblings, 2 replies; 19+ messages in thread
From: Huaitong Han @ 2016-01-19 7:30 UTC (permalink / raw)
To: jbeulich, andrew.cooper3, george.dunlap, wei.liu2, keir
Cc: Huaitong Han, George Dunlap, xen-devel
At the moment, the pfec argument to gva_to_gfn has two functions:
* To inform guest_walk what kind of access is happening
* As a value to pass back into the guest in the event of a fault.
Unfortunately this is not quite treated consistently: the hvm_fetch_*
function will "pre-clear" the PFEC_insn_fetch flag before calling
gva_to_gfn; meaning guest_walk doesn't actually know whether a given
access is an instruction fetch or not. This works now, but will cause
issues when pkeys are introduced, since guest_walk will need to know
whether an access is an instruction fetch even if it doesn't return
PFEC_insn_fetch.
Fix this by making a clean separation for in and out functionalities
of the pfec argument:
1. Always pass in the access type to gva_to_gfn
2. Filter out inappropriate access flags before returning from gva_to_gfn.
(The PFEC_insn_fetch flag should only be passed to the guest if either NX or
SMEP is enabled. See Intel 64 Developer's Manual, Volume 3, Section 4.7.)
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
---
xen/arch/x86/hvm/hvm.c | 8 ++------
xen/arch/x86/mm/hap/guest_walk.c | 10 +++++++++-
xen/arch/x86/mm/shadow/multi.c | 6 ++++++
3 files changed, 17 insertions(+), 7 deletions(-)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 21470ec..688d200 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4432,11 +4432,9 @@ enum hvm_copy_result hvm_copy_from_guest_virt(
enum hvm_copy_result hvm_fetch_from_guest_virt(
void *buf, unsigned long vaddr, int size, uint32_t pfec)
{
- if ( hvm_nx_enabled(current) || hvm_smep_enabled(current) )
- pfec |= PFEC_insn_fetch;
return __hvm_copy(buf, vaddr, size,
HVMCOPY_from_guest | HVMCOPY_fault | HVMCOPY_virt,
- PFEC_page_present | pfec);
+ PFEC_page_present | PFEC_insn_fetch | pfec);
}
enum hvm_copy_result hvm_copy_to_guest_virt_nofault(
@@ -4458,11 +4456,9 @@ enum hvm_copy_result hvm_copy_from_guest_virt_nofault(
enum hvm_copy_result hvm_fetch_from_guest_virt_nofault(
void *buf, unsigned long vaddr, int size, uint32_t pfec)
{
- if ( hvm_nx_enabled(current) || hvm_smep_enabled(current) )
- pfec |= PFEC_insn_fetch;
return __hvm_copy(buf, vaddr, size,
HVMCOPY_from_guest | HVMCOPY_no_fault | HVMCOPY_virt,
- PFEC_page_present | pfec);
+ PFEC_page_present | PFEC_insn_fetch | pfec);
}
unsigned long copy_to_user_hvm(void *to, const void *from, unsigned int len)
diff --git a/xen/arch/x86/mm/hap/guest_walk.c b/xen/arch/x86/mm/hap/guest_walk.c
index 49d0328..3eb8597 100644
--- a/xen/arch/x86/mm/hap/guest_walk.c
+++ b/xen/arch/x86/mm/hap/guest_walk.c
@@ -82,7 +82,7 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PAGING_LEVELS)(
if ( !top_page )
{
pfec[0] &= ~PFEC_page_present;
- return INVALID_GFN;
+ goto out_tweak_pfec;
}
top_mfn = _mfn(page_to_mfn(top_page));
@@ -139,6 +139,14 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PAGING_LEVELS)(
if ( missing & _PAGE_SHARED )
pfec[0] = PFEC_page_shared;
+out_tweak_pfec:
+ /*
+ * Intel 64 Volume 3, Section 4.7: The PFEC_insn_fetch flag is set
+ * only when NX or SMEP are enabled.
+ */
+ if ( !hvm_nx_enabled(v) && !hvm_smep_enabled(v) )
+ pfec[0] &= ~PFEC_insn_fetch;
+
return INVALID_GFN;
}
diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index 58f7e72..bbbc706 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3668,6 +3668,12 @@ sh_gva_to_gfn(struct vcpu *v, struct p2m_domain *p2m,
pfec[0] &= ~PFEC_page_present;
if ( missing & _PAGE_INVALID_BITS )
pfec[0] |= PFEC_reserved_bit;
+ /*
+ * Intel 64 Volume 3, Section 4.7: The PFEC_insn_fetch flag is
+ * set only when NX or SMEP are enabled.
+ */
+ if ( !hvm_nx_enabled(v) && !hvm_smep_enabled(v) )
+ pfec[0] &= ~PFEC_insn_fetch;
return INVALID_GFN;
}
gfn = guest_walk_to_gfn(&gw);
--
2.4.3
* Re: [PATCH V6 4/5] xen/mm: Clean up pfec handling in gva_to_gfn
2016-01-19 7:30 ` [PATCH V6 4/5] xen/mm: Clean up pfec handling in gva_to_gfn Huaitong Han
@ 2016-01-25 15:56 ` Jan Beulich
2016-01-26 14:30 ` Tim Deegan
1 sibling, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2016-01-25 15:56 UTC (permalink / raw)
To: George Dunlap, Huaitong Han
Cc: keir, george.dunlap, andrew.cooper3, Tim Deegan, xen-devel,
wei.liu2
>>> On 19.01.16 at 08:30, <huaitong.han@intel.com> wrote:
> At the moment, the pfec argument to gva_to_gfn has two functions:
>
> * To inform guest_walk what kind of access is happening
>
> * As a value to pass back into the guest in the event of a fault.
>
> Unfortunately this is not quite treated consistently: the hvm_fetch_*
> function will "pre-clear" the PFEC_insn_fetch flag before calling
> gva_to_gfn; meaning guest_walk doesn't actually know whether a given
> access is an instruction fetch or not. This works now, but will cause
> issues when pkeys are introduced, since guest_walk will need to know
> whether an access is an instruction fetch even if it doesn't return
> PFEC_insn_fetch.
>
> Fix this by making a clean separation for in and out functionalities
> of the pfec argument:
>
> 1. Always pass in the access type to gva_to_gfn
>
> 2. Filter out inappropriate access flags before returning from gva_to_gfn.
>
> (The PFEC_insn_fetch flag should only be passed to the guest if either NX or
> SMEP is enabled. See Intel 64 Developer's Manual, Volume 3, Section 4.7.)
As mentioned in various other contexts not so long ago - no SDM
section number please (as they tend to change); use section titles
instead (also below in code comments).
> Signed-off-by: George Dunlap <george.dunlap@citrix.com>
> Signed-off-by: Huaitong Han <huaitong.han@intel.com>
> ---
> xen/arch/x86/hvm/hvm.c | 8 ++------
> xen/arch/x86/mm/hap/guest_walk.c | 10 +++++++++-
> xen/arch/x86/mm/shadow/multi.c | 6 ++++++
You failed to Cc the shadow code maintainer (now added).
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -4432,11 +4432,9 @@ enum hvm_copy_result hvm_copy_from_guest_virt(
> enum hvm_copy_result hvm_fetch_from_guest_virt(
> void *buf, unsigned long vaddr, int size, uint32_t pfec)
> {
> - if ( hvm_nx_enabled(current) || hvm_smep_enabled(current) )
> - pfec |= PFEC_insn_fetch;
> return __hvm_copy(buf, vaddr, size,
> HVMCOPY_from_guest | HVMCOPY_fault | HVMCOPY_virt,
> - PFEC_page_present | pfec);
> + PFEC_page_present | PFEC_insn_fetch | pfec);
> }
>
> enum hvm_copy_result hvm_copy_to_guest_virt_nofault(
> @@ -4458,11 +4456,9 @@ enum hvm_copy_result hvm_copy_from_guest_virt_nofault(
> enum hvm_copy_result hvm_fetch_from_guest_virt_nofault(
> void *buf, unsigned long vaddr, int size, uint32_t pfec)
> {
> - if ( hvm_nx_enabled(current) || hvm_smep_enabled(current) )
> - pfec |= PFEC_insn_fetch;
> return __hvm_copy(buf, vaddr, size,
> HVMCOPY_from_guest | HVMCOPY_no_fault | HVMCOPY_virt,
> - PFEC_page_present | pfec);
> + PFEC_page_present | PFEC_insn_fetch | pfec);
> }
>
> unsigned long copy_to_user_hvm(void *to, const void *from, unsigned int len)
This part
Acked-by: Jan Beulich <jbeulich@suse.com>
> --- a/xen/arch/x86/mm/hap/guest_walk.c
> +++ b/xen/arch/x86/mm/hap/guest_walk.c
> @@ -82,7 +82,7 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PAGING_LEVELS)(
> if ( !top_page )
> {
> pfec[0] &= ~PFEC_page_present;
> - return INVALID_GFN;
> + goto out_tweak_pfec;
> }
> top_mfn = _mfn(page_to_mfn(top_page));
>
> @@ -139,6 +139,14 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PAGING_LEVELS)(
> if ( missing & _PAGE_SHARED )
> pfec[0] = PFEC_page_shared;
>
> +out_tweak_pfec:
To avoid corrupting the context in patches, labels should be indented
by at least one space please.
Jan
* Re: [PATCH V6 4/5] xen/mm: Clean up pfec handling in gva_to_gfn
2016-01-19 7:30 ` [PATCH V6 4/5] xen/mm: Clean up pfec handling in gva_to_gfn Huaitong Han
2016-01-25 15:56 ` Jan Beulich
@ 2016-01-26 14:30 ` Tim Deegan
2016-01-27 7:22 ` Han, Huaitong
1 sibling, 1 reply; 19+ messages in thread
From: Tim Deegan @ 2016-01-26 14:30 UTC (permalink / raw)
To: Huaitong Han
Cc: keir, george.dunlap, andrew.cooper3, George Dunlap, xen-devel,
jbeulich, wei.liu2
Hi,
At 15:30 +0800 on 19 Jan (1453217458), Huaitong Han wrote:
> At the moment, the pfec argument to gva_to_gfn has two functions:
>
> * To inform guest_walk what kind of access is happening
>
> * As a value to pass back into the guest in the event of a fault.
>
> Unfortunately this is not quite treated consistently: the hvm_fetch_*
> function will "pre-clear" the PFEC_insn_fetch flag before calling
> gva_to_gfn; meaning guest_walk doesn't actually know whether a given
> access is an instruction fetch or not. This works now, but will cause
> issues when pkeys are introduced, since guest_walk will need to know
> whether an access is an instruction fetch even if it doesn't return
> PFEC_insn_fetch.
>
> Fix this by making a clean separation for in and out functionalities
> of the pfec argument:
>
> 1. Always pass in the access type to gva_to_gfn
>
> 2. Filter out inappropriate access flags before returning from gva_to_gfn.
This seems OK. But can you please:
- Add this new adjustment once, in paging_gva_to_gfn(), instead of
adding it to each implementation; and
- Adjust the comment above the declaration of paging_gva_to_gfn() in
paging.h to describe this new behaviour.
Also:
> diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
> index 58f7e72..bbbc706 100644
> --- a/xen/arch/x86/mm/shadow/multi.c
> +++ b/xen/arch/x86/mm/shadow/multi.c
> @@ -3668,6 +3668,12 @@ sh_gva_to_gfn(struct vcpu *v, struct p2m_domain *p2m,
> pfec[0] &= ~PFEC_page_present;
> if ( missing & _PAGE_INVALID_BITS )
> pfec[0] |= PFEC_reserved_bit;
> + /*
> + * Intel 64 Volume 3, Section 4.7: The PFEC_insn_fetch flag is
> + * set only when NX or SMEP are enabled.
> + */
> + if ( !hvm_nx_enabled(v) && !hvm_smep_enabled(v) )
> + pfec[0] &= ~PFEC_insn_fetch;
This needs to either DTRT for PV guests or assert that it always sees
a HVM guest (I think this is the case but haven't tested).
Cheers,
Tim.
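A minimal sketch of doing the adjustment once in paging_gva_to_gfn(), as
suggested above (the function body here is only an approximation of the real
dispatch code, and the PV question raised above is not addressed):

unsigned long paging_gva_to_gfn(struct vcpu *v, unsigned long va,
                                uint32_t *pfec)
{
    struct p2m_domain *hostp2m = p2m_get_hostp2m(v->domain);
    const struct paging_mode *hostmode = paging_get_hostmode(v);
    unsigned long gfn = hostmode->gva_to_gfn(v, hostp2m, va, pfec);

    /*
     * Filter the outgoing pfec once for every implementation:
     * PFEC_insn_fetch is only reported when NX or SMEP is enabled.
     */
    if ( !hvm_nx_enabled(v) && !hvm_smep_enabled(v) )
        pfec[0] &= ~PFEC_insn_fetch;

    return gfn;
}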
* Re: [PATCH V6 4/5] xen/mm: Clean up pfec handling in gva_to_gfn
2016-01-26 14:30 ` Tim Deegan
@ 2016-01-27 7:22 ` Han, Huaitong
2016-01-27 9:34 ` Tim Deegan
0 siblings, 1 reply; 19+ messages in thread
From: Han, Huaitong @ 2016-01-27 7:22 UTC (permalink / raw)
To: tjd@phlegethon.org
Cc: wei.liu2@citrix.com, george.dunlap@eu.citrix.com,
andrew.cooper3@citrix.com, george.dunlap@citrix.com,
xen-devel@lists.xen.org, jbeulich@suse.com, keir@xen.org
On Tue, 2016-01-26 at 14:30 +0000, Tim Deegan wrote:
> Hi,
>
> At 15:30 +0800 on 19 Jan (1453217458), Huaitong Han wrote:
> > At the moment, the pfec argument to gva_to_gfn has two functions:
> >
> > * To inform guest_walk what kind of access is happening
> >
> > * As a value to pass back into the guest in the event of a fault.
> >
> > Unfortunately this is not quite treated consistently: the
> > hvm_fetch_*
> > function will "pre-clear" the PFEC_insn_fetch flag before calling
> > gva_to_gfn; meaning guest_walk doesn't actually know whether a
> > given
> > access is an instruction fetch or not. This works now, but will
> > cause
> > issues when pkeys are introduced, since guest_walk will need to
> > know
> > whether an access is an instruction fetch even if it doesn't return
> > PFEC_insn_fetch.
> >
> > Fix this by making a clean separation for in and out
> > functionalities
> > of the pfec argument:
> >
> > 1. Always pass in the access type to gva_to_gfn
> >
> > 2. Filter out inappropriate access flags before returning from
> > gva_to_gfn.
>
> This seems OK. But can you please:
> - Add this new adjustment once, in paging_gva_to_gfn(), instead of
> adding it to each implementation; and
> - Adjust the comment above the declaration of paging_gva_to_gfn() in
> paging.h to describe this new behaviour.
Although adding the adjustment in paging_gva_to_gfn can reduce code
duplication, adding it to each implementation is more readable, because
other sections of pfec are handled in each implementation.
>
> Cheers,
>
> Tim.
* Re: [PATCH V6 4/5] xen/mm: Clean up pfec handling in gva_to_gfn
2016-01-27 7:22 ` Han, Huaitong
@ 2016-01-27 9:34 ` Tim Deegan
2016-01-27 10:13 ` Han, Huaitong
0 siblings, 1 reply; 19+ messages in thread
From: Tim Deegan @ 2016-01-27 9:34 UTC (permalink / raw)
To: Han, Huaitong
Cc: wei.liu2@citrix.com, george.dunlap@eu.citrix.com,
andrew.cooper3@citrix.com, george.dunlap@citrix.com,
xen-devel@lists.xen.org, jbeulich@suse.com, keir@xen.org
Hi,
At 07:22 +0000 on 27 Jan (1453879344), Han, Huaitong wrote:
> On Tue, 2016-01-26 at 14:30 +0000, Tim Deegan wrote:
> > This seems OK. But can you please:
> > - Add this new adjustment once, in paging_gva_to_gfn(), instead of
> > adding it to each implementation; and
> > - Adjust the comment above the declaration of paging_gva_to_gfn() in
> > paging.h to describe this new behaviour.
> > Although adding the adjustment in paging_gva_to_gfn can reduce code
> > duplication, adding it to each implementation is more readable, because
> > other sections of pfec are handled in each implementation.
True, but since paging_gva_to_gfn() is already non-trivial and this
is a different kind of adjustment, I'd still like it done there.
I'll leave this to George's discretion as x86/mm maintainer.
But in any case, please add the comment describing the new semantics.
Cheers,
Tim.
* Re: [PATCH V6 4/5] xen/mm: Clean up pfec handling in gva_to_gfn
2016-01-27 9:34 ` Tim Deegan
@ 2016-01-27 10:13 ` Han, Huaitong
0 siblings, 0 replies; 19+ messages in thread
From: Han, Huaitong @ 2016-01-27 10:13 UTC (permalink / raw)
To: tim@xen.org, george.dunlap@citrix.com
Cc: andrew.cooper3@citrix.com, keir@xen.org, wei.liu2@citrix.com,
jbeulich@suse.com, xen-devel@lists.xen.org
On Wed, 2016-01-27 at 09:34 +0000, Tim Deegan wrote:
> Hi,
>
> At 07:22 +0000 on 27 Jan (1453879344), Han, Huaitong wrote:
> > On Tue, 2016-01-26 at 14:30 +0000, Tim Deegan wrote:
> > > This seems OK. But can you please:
> > > - Add this new adjustment once, in paging_gva_to_gfn(), instead
> > > of
> > > adding it to each implementation; and
> > > - Adjust the comment above the declaration of
> > > paging_gva_to_gfn() in
> > > paging.h to describe this new behaviour.
> > Although adding adjustment in paging_gva_to_gfn can reduce code
> > duplication, adding it to each implementation is more readable,
> > becasue
> > other sections of pfec are handled in each implementation.
>
> True, but since paging_gva_to_gfn() is already non-trivial and this
> is a different kind of adjustment, I'd still like it done there.
> I'll leave this to George's discretion as x86/mm maintainer.
>
> But in any case, please add the comment describing the new semantics.
To George:
What is your opinion on Tim's comment?
To Tim:
I will update the code and the comment in the V8 patch series.
> Cheers,
>
> Tim.
* [PATCH V6 5/5] x86/hvm: pkeys, add pkeys support for cpuid handling
2016-01-19 7:30 [PATCH V6 0/5] x86/hvm: pkeys, add memory protection-key support Huaitong Han
` (3 preceding siblings ...)
2016-01-19 7:30 ` [PATCH V6 4/5] xen/mm: Clean up pfec handling in gva_to_gfn Huaitong Han
@ 2016-01-19 7:30 ` Huaitong Han
2016-01-19 9:19 ` Wei Liu
2016-01-25 16:04 ` Jan Beulich
2016-01-25 15:25 ` [PATCH V6 0/5] x86/hvm: pkeys, add memory protection-key support Jan Beulich
5 siblings, 2 replies; 19+ messages in thread
From: Huaitong Han @ 2016-01-19 7:30 UTC (permalink / raw)
To: jbeulich, andrew.cooper3, george.dunlap, wei.liu2, keir
Cc: Huaitong Han, xen-devel
This patch adds pkeys support for cpuid handling.
Pkeys hardware support is CPUID.7.0.ECX[3]:PKU. Software support is
CPUID.7.0.ECX[4]:OSPKE, and it reflects the setting of CR4.PKE.
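From the guest's point of view the two bits can be read as follows (an
illustrative sketch using the GCC/Clang cpuid.h helper):

#include <cpuid.h>
#include <stdio.h>

int main(void)
{
    unsigned int eax, ebx, ecx, edx;

    __cpuid_count(7, 0, eax, ebx, ecx, edx);

    printf("PKU:   %u\n", (ecx >> 3) & 1);  /* CPUID.7.0:ECX[3] - hardware support */
    printf("OSPKE: %u\n", (ecx >> 4) & 1);  /* CPUID.7.0:ECX[4] - OS set CR4.PKE   */

    return 0;
}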
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
---
tools/libxc/xc_cpufeature.h | 2 ++
tools/libxc/xc_cpuid_x86.c | 6 ++++--
xen/arch/x86/hvm/hvm.c | 36 +++++++++++++++++++++++-------------
3 files changed, 29 insertions(+), 15 deletions(-)
diff --git a/tools/libxc/xc_cpufeature.h b/tools/libxc/xc_cpufeature.h
index c3ddc80..f6a9778 100644
--- a/tools/libxc/xc_cpufeature.h
+++ b/tools/libxc/xc_cpufeature.h
@@ -141,5 +141,7 @@
#define X86_FEATURE_ADX 19 /* ADCX, ADOX instructions */
#define X86_FEATURE_SMAP 20 /* Supervisor Mode Access Protection */
+/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx) */
+#define X86_FEATURE_PKU 3
#endif /* __LIBXC_CPUFEATURE_H */
diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index 8882c01..1ce979b 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -427,9 +427,11 @@ static void xc_cpuid_hvm_policy(xc_interface *xch,
bitmaskof(X86_FEATURE_ADX) |
bitmaskof(X86_FEATURE_SMAP) |
bitmaskof(X86_FEATURE_FSGSBASE));
+ regs[2] &= bitmaskof(X86_FEATURE_PKU);
} else
- regs[1] = 0;
- regs[0] = regs[2] = regs[3] = 0;
+ regs[1] = regs[2] = 0;
+
+ regs[0] = regs[3] = 0;
break;
case 0x0000000d:
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 688d200..39123fe 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4566,7 +4566,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
__clear_bit(X86_FEATURE_APIC & 31, edx);
/* Fix up OSXSAVE. */
- if ( cpu_has_xsave )
+ if ( *ecx & cpufeat_mask(X86_FEATURE_XSAVE) )
*ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE) ?
cpufeat_mask(X86_FEATURE_OSXSAVE) : 0;
@@ -4579,21 +4579,31 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
*edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
break;
case 0x7:
- if ( (count == 0) && !cpu_has_smep )
- *ebx &= ~cpufeat_mask(X86_FEATURE_SMEP);
+ if ( count == 0 )
+ {
+ if ( !cpu_has_smep )
+ *ebx &= ~cpufeat_mask(X86_FEATURE_SMEP);
- if ( (count == 0) && !cpu_has_smap )
- *ebx &= ~cpufeat_mask(X86_FEATURE_SMAP);
+ if ( !cpu_has_smap )
+ *ebx &= ~cpufeat_mask(X86_FEATURE_SMAP);
- /* Don't expose MPX to hvm when VMX support is not available */
- if ( (count == 0) &&
- (!(vmx_vmexit_control & VM_EXIT_CLEAR_BNDCFGS) ||
- !(vmx_vmentry_control & VM_ENTRY_LOAD_BNDCFGS)) )
- *ebx &= ~cpufeat_mask(X86_FEATURE_MPX);
+ /* Don't expose MPX to hvm when VMX support is not available. */
+ if ( !(vmx_vmexit_control & VM_EXIT_CLEAR_BNDCFGS) ||
+ !(vmx_vmentry_control & VM_ENTRY_LOAD_BNDCFGS) )
+ *ebx &= ~cpufeat_mask(X86_FEATURE_MPX);
- /* Don't expose INVPCID to non-hap hvm. */
- if ( (count == 0) && !hap_enabled(d) )
- *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
+ if ( !hap_enabled(d) )
+ {
+ /* Don't expose INVPCID to non-hap hvm. */
+ *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
+ /* X86_FEATURE_PKU is not yet implemented for shadow paging. */
+ *ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
+ }
+
+ if ( (*ecx & cpufeat_mask(X86_FEATURE_PKU)) &&
+ (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE) )
+ *ecx |= cpufeat_mask(X86_FEATURE_OSPKE);
+ }
break;
case 0xb:
/* Fix the x2APIC identifier. */
--
2.4.3
* Re: [PATCH V6 5/5] x86/hvm: pkeys, add pkeys support for cpuid handling
2016-01-19 7:30 ` [PATCH V6 5/5] x86/hvm: pkeys, add pkeys support for cpuid handling Huaitong Han
@ 2016-01-19 9:19 ` Wei Liu
2016-01-25 16:04 ` Jan Beulich
1 sibling, 0 replies; 19+ messages in thread
From: Wei Liu @ 2016-01-19 9:19 UTC (permalink / raw)
To: Huaitong Han
Cc: keir, george.dunlap, andrew.cooper3, xen-devel, jbeulich,
wei.liu2
On Tue, Jan 19, 2016 at 03:30:59PM +0800, Huaitong Han wrote:
> This patch adds pkeys support for cpuid handling.
>
> Pkeys hardware support is CPUID.7.0.ECX[3]:PKU. Software support is
> CPUID.7.0.ECX[4]:OSPKE, and it reflects the setting of CR4.PKE.
>
I will (again) defer this to x86 maintainers.
Wei.
* Re: [PATCH V6 5/5] x86/hvm: pkeys, add pkeys support for cpuid handling
2016-01-19 7:30 ` [PATCH V6 5/5] x86/hvm: pkeys, add pkeys support for cpuid handling Huaitong Han
2016-01-19 9:19 ` Wei Liu
@ 2016-01-25 16:04 ` Jan Beulich
1 sibling, 0 replies; 19+ messages in thread
From: Jan Beulich @ 2016-01-25 16:04 UTC (permalink / raw)
To: Huaitong Han; +Cc: george.dunlap, andrew.cooper3, keir, wei.liu2, xen-devel
>>> On 19.01.16 at 08:30, <huaitong.han@intel.com> wrote:
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -4566,7 +4566,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
> __clear_bit(X86_FEATURE_APIC & 31, edx);
>
> /* Fix up OSXSAVE. */
> - if ( cpu_has_xsave )
> + if ( *ecx & cpufeat_mask(X86_FEATURE_XSAVE) )
> *ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE) ?
> cpufeat_mask(X86_FEATURE_OSXSAVE) : 0;
Imo this change (aiming at consistency) would better be mentioned
in the commit message, to avoid the impression of an unintended /
unrelated change.
> @@ -4579,21 +4579,31 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
> *edx &= ~cpufeat_mask(X86_FEATURE_PSE36);
> break;
> case 0x7:
> - if ( (count == 0) && !cpu_has_smep )
> - *ebx &= ~cpufeat_mask(X86_FEATURE_SMEP);
> + if ( count == 0 )
> + {
> + if ( !cpu_has_smep )
> + *ebx &= ~cpufeat_mask(X86_FEATURE_SMEP);
>
> - if ( (count == 0) && !cpu_has_smap )
> - *ebx &= ~cpufeat_mask(X86_FEATURE_SMAP);
> + if ( !cpu_has_smap )
> + *ebx &= ~cpufeat_mask(X86_FEATURE_SMAP);
>
> - /* Don't expose MPX to hvm when VMX support is not available */
> - if ( (count == 0) &&
> - (!(vmx_vmexit_control & VM_EXIT_CLEAR_BNDCFGS) ||
> - !(vmx_vmentry_control & VM_ENTRY_LOAD_BNDCFGS)) )
> - *ebx &= ~cpufeat_mask(X86_FEATURE_MPX);
> + /* Don't expose MPX to hvm when VMX support is not available. */
> + if ( !(vmx_vmexit_control & VM_EXIT_CLEAR_BNDCFGS) ||
> + !(vmx_vmentry_control & VM_ENTRY_LOAD_BNDCFGS) )
> + *ebx &= ~cpufeat_mask(X86_FEATURE_MPX);
>
> - /* Don't expose INVPCID to non-hap hvm. */
> - if ( (count == 0) && !hap_enabled(d) )
> - *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
> + if ( !hap_enabled(d) )
> + {
> + /* Don't expose INVPCID to non-hap hvm. */
> + *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
> + /* X86_FEATURE_PKU is not yet implemented for shadow paging. */
> + *ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
> + }
> +
> + if ( (*ecx & cpufeat_mask(X86_FEATURE_PKU)) &&
> + (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE) )
Indentation. But this whole hunk needs re-basing anyway.
Jan
* Re: [PATCH V6 0/5] x86/hvm: pkeys, add memory protection-key support
2016-01-19 7:30 [PATCH V6 0/5] x86/hvm: pkeys, add memory protection-key support Huaitong Han
` (4 preceding siblings ...)
2016-01-19 7:30 ` [PATCH V6 5/5] x86/hvm: pkeys, add pkeys support for cpuid handling Huaitong Han
@ 2016-01-25 15:25 ` Jan Beulich
2016-01-26 8:06 ` Han, Huaitong
5 siblings, 1 reply; 19+ messages in thread
From: Jan Beulich @ 2016-01-25 15:25 UTC (permalink / raw)
To: Huaitong Han; +Cc: george.dunlap, andrew.cooper3, keir, wei.liu2, xen-devel
>>> On 19.01.16 at 08:30, <huaitong.han@intel.com> wrote:
> Changes in v6:
> *2 patches merged are not included.
> *Don't write XSTATE_PKRU to PV's xcr0.
> *Use "if()" instead of "?:" in cpuid handling patch.
> *Update read_pkru function.
> *Use value 4 instead of CONFIG_PAGING_LEVELS.
> *Add George's patch for PFEC_insn_fetch handling.
How does this last item match up with ...
> Huaitong Han (5):
> x86/hvm: pkeys, disable pkeys for guests in non-paging mode
> x86/hvm: pkeys, add pkeys support for guest_walk_tables
> x86/hvm: pkeys, add xstate support for pkeys
> xen/mm: Clean up pfec handling in gva_to_gfn
> x86/hvm: pkeys, add pkeys support for cpuid handling
... all five patches being yours?
Jan
* Re: [PATCH V6 0/5] x86/hvm: pkeys, add memory protection-key support
2016-01-25 15:25 ` [PATCH V6 0/5] x86/hvm: pkeys, add memory protection-key support Jan Beulich
@ 2016-01-26 8:06 ` Han, Huaitong
0 siblings, 0 replies; 19+ messages in thread
From: Han, Huaitong @ 2016-01-26 8:06 UTC (permalink / raw)
To: JBeulich@suse.com
Cc: george.dunlap@eu.citrix.com, andrew.cooper3@citrix.com,
keir@xen.org, wei.liu2@citrix.com, xen-devel@lists.xen.org
On Mon, 2016-01-25 at 08:25 -0700, Jan Beulich wrote:
> > > > On 19.01.16 at 08:30, <huaitong.han@intel.com> wrote:
> > Changes in v6:
> > *2 patches merged are not included.
> > *Don't write XSTATE_PKRU to PV's xcr0.
> > *Use "if()" instead of "?:" in cpuid handling patch.
> > *Update read_pkru function.
> > *Use value 4 instead of CONFIG_PAGING_LEVELS.
> > *Add George's patch for PFEC_insn_fetch handling.
>
> How does this last item match up with ...
"At the moment PFEC_insn_fetch is only set in
hvm_fetch_from_guest_virt() if hvm_nx_enabled() or hvm_smep_enabled()
are true. Which means that if you *don't* have nx or smep enabled,
then the patch series as is will fault on instruction fetches when it
shouldn't. (I don't remember anyone mentioning nx or smep being
enabled as a prerequisite for pkeys.)"
I think realistically the only way to address this is to start making
the clean separation between "pfec in" and "pfec out" I mentioned in
the previous discussion.
I've coded up the attached patch, but only compile-tested it. Can you
give it a look to see if you think it is correct, test it, include it
in your next patch series?
--from George's comments on V5 patches.
>
> > Huaitong Han (5):
> > x86/hvm: pkeys, disable pkeys for guests in non-paging mode
> > x86/hvm: pkeys, add pkeys support for guest_walk_tables
> > x86/hvm: pkeys, add xstate support for pkeys
> > xen/mm: Clean up pfec handling in gva_to_gfn
> > x86/hvm: pkeys, add pkeys support for cpuid handling
>
> ... all five patches being yours?
I will update that patch's author to George.
>
> Jan
>