* (Reluctant) request to revert several changes, due to regressing VM migration @ 2014-06-04 15:45 Andrew Cooper 2014-06-04 16:00 ` Jan Beulich 0 siblings, 1 reply; 6+ messages in thread From: Andrew Cooper @ 2014-06-04 15:45 UTC (permalink / raw) To: Jan Beulich, Wu, Feng; +Cc: Xen-devel List Changeset 31ee951a3 "x86/HVM: correct the SMEP logic for HVM_CR0_GUEST_RESERVED_BITS" breaks migration for VMs using SMEP. For migration, the architectural state is restored before the cpuid policy is written. This appears to be the behaviour in libxl, and is certainly the behaviour in Xapi. As a result, a VM using SMEP will fail the CR4 check in hvm_load_cpu_ctxt(). This is easy to observe by performing a localhost migration of a modern HVM Linux VM which enables SMEP. Changeset 58658992 performs an equivalent action for SMAP, and as such will be equivalently broken on supporting hardware. Specifically, c/s f952f9c7f0e which is the backport of 31ee951a3 into staging-4.4 is the problematic change which is causing regressions in XenServer testing. This is a reluctant request as pragmatically the changeset is correct. However, there is a chicken & egg problem for libxc trying to calculate the cpuid policy, which requires bits from the "magic chunks" in the migration stream to be loaded first. Therefore, simply moving the cpuid policy calculation earlier will result in different problems. ~Andrew ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: (Reluctant) request to revert several changes, due to regressing VM migration 2014-06-04 15:45 (Reluctant) request to revert several changes, due to regressing VM migration Andrew Cooper @ 2014-06-04 16:00 ` Jan Beulich 2014-06-04 16:08 ` Andrew Cooper 0 siblings, 1 reply; 6+ messages in thread From: Jan Beulich @ 2014-06-04 16:00 UTC (permalink / raw) To: Andrew Cooper; +Cc: Feng Wu, Xen-devel List >>> On 04.06.14 at 17:45, <andrew.cooper3@citrix.com> wrote: > Changeset 31ee951a3 "x86/HVM: correct the SMEP logic for > HVM_CR0_GUEST_RESERVED_BITS" breaks migration for VMs using SMEP. > > For migration, the architectural state is restored before the cpuid > policy is written. This appears to be the behaviour in libxl, and is > certainly the behaviour in Xapi. > > As a result, a VM using SMEP will fail the CR4 check in > hvm_load_cpu_ctxt(). This is easy to observe by performing a localhost > migration of a modern HVM Linux VM which enables SMEP. > > Changeset 58658992 performs an equivalent action for SMAP, and as such > will be equivalently broken on supporting hardware. > > > Specifically, c/s f952f9c7f0e which is the backport of 31ee951a3 into > staging-4.4 is the problematic change which is causing regressions in > XenServer testing. > > > This is a reluctant request as pragmatically the changeset is correct. So as already hinted at on irc - what's wrong with using cpu_has_smep as long as the guest's d->arch.cpuid[] is blank? If the incoming guest didn't see SMEP available, all its CR4.SMEP would necessarily be clear (or if they weren't, this would sooner or later result in a guest crash). But then again - isn't there another problem here: hvm_cpuid() assumes to be on the subject vCPU, which hardly can be the case for the hvm_load_cpu_ctxt() code path using the macro in question. So perhaps it even needs to be further relaxed in using cpu_has_smep when not on current != v. Which of course would require care by eventual future users of this macro. Jan ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: (Reluctant) request to revert several changes, due to regressing VM migration 2014-06-04 16:00 ` Jan Beulich @ 2014-06-04 16:08 ` Andrew Cooper 2014-06-05 11:58 ` [PATCH] x86/HVM: refine SMEP/SMAP tests in HVM_CR4_GUEST_RESERVED_BITS() Jan Beulich 0 siblings, 1 reply; 6+ messages in thread From: Andrew Cooper @ 2014-06-04 16:08 UTC (permalink / raw) To: Jan Beulich; +Cc: Feng Wu, Xen-devel List On 04/06/14 17:00, Jan Beulich wrote: >>>> On 04.06.14 at 17:45, <andrew.cooper3@citrix.com> wrote: >> Changeset 31ee951a3 "x86/HVM: correct the SMEP logic for >> HVM_CR0_GUEST_RESERVED_BITS" breaks migration for VMs using SMEP. >> >> For migration, the architectural state is restored before the cpuid >> policy is written. This appears to be the behaviour in libxl, and is >> certainly the behaviour in Xapi. >> >> As a result, a VM using SMEP will fail the CR4 check in >> hvm_load_cpu_ctxt(). This is easy to observe by performing a localhost >> migration of a modern HVM Linux VM which enables SMEP. >> >> Changeset 58658992 performs an equivalent action for SMAP, and as such >> will be equivalently broken on supporting hardware. >> >> >> Specifically, c/s f952f9c7f0e which is the backport of 31ee951a3 into >> staging-4.4 is the problematic change which is causing regressions in >> XenServer testing. >> >> >> This is a reluctant request as pragmatically the changeset is correct. > So as already hinted at on irc - what's wrong with using > cpu_has_smep as long as the guest's d->arch.cpuid[] is blank? > If the incoming guest didn't see SMEP available, all its CR4.SMEP > would necessarily be clear (or if they weren't, this would sooner > or later result in a guest crash). > > But then again - isn't there another problem here: hvm_cpuid() > assumes to be on the subject vCPU, which hardly can be the > case for the hvm_load_cpu_ctxt() code path using the macro > in question. So perhaps it even needs to be further relaxed in > using cpu_has_smep when not on current != v. Which of course > would require care by eventual future users of this macro. > > Jan > As I have said, both previously on the list, and at the hackathon, the cpuid handling and domain cpuid policy infrastructure is a massive stinking swamp which gets worse every time I find a new bit of it. For XenServer and our VM feature levelling support, I am going to have to fix it somehow. Although, fixing the 32/64bit migration issue is still a more important problem. I am tempted to suggest reverting it back to what it was before and leaving it in that state until other bits of the infrastructure are actually working. This certainly isn't the only place where this code is fragile. ~Andrew ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH] x86/HVM: refine SMEP/SMAP tests in HVM_CR4_GUEST_RESERVED_BITS() 2014-06-04 16:08 ` Andrew Cooper @ 2014-06-05 11:58 ` Jan Beulich 2014-06-05 13:07 ` David Vrabel 2014-06-06 1:20 ` Wu, Feng 0 siblings, 2 replies; 6+ messages in thread From: Jan Beulich @ 2014-06-05 11:58 UTC (permalink / raw) To: Andrew Cooper, Xen-devel List; +Cc: Keir Fraser, Feng Wu [-- Attachment #1: Type: text/plain, Size: 3660 bytes --] Andrew validly points out that the use of the macro on the restore path can't rely on the CPUID bits for the guest already being in place (as their setting by the tool stack in turn requires the other restore operations already having taken place). And even worse, using hvm_cpuid() is invalid here because that function assumes to be used in the context of the vCPU in question. Reverting to the behavior prior to the change from checking cpu_has_sm?p to hvm_vcpu_has_sm?p() would break the other (non-restore) use of the macro. So let's revert to the prior behavior only for the restore path, by adding a respective second parameter to the macro. Obviously the two cpu_has_* uses in the macro should really also be converted to hvm_cpuid() based checks at least for the non-restore path. Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -1754,7 +1754,7 @@ static int hvm_load_cpu_ctxt(struct doma return -EINVAL; } - if ( ctxt.cr4 & HVM_CR4_GUEST_RESERVED_BITS(v) ) + if ( ctxt.cr4 & HVM_CR4_GUEST_RESERVED_BITS(v, 1) ) { printk(XENLOG_G_ERR "HVM%d restore: bad CR4 %#" PRIx64 "\n", d->domain_id, ctxt.cr4); @@ -3186,7 +3186,7 @@ int hvm_set_cr4(unsigned long value) struct vcpu *v = current; unsigned long old_cr; - if ( value & HVM_CR4_GUEST_RESERVED_BITS(v) ) + if ( value & HVM_CR4_GUEST_RESERVED_BITS(v, 0) ) { HVM_DBG_LOG(DBG_LEVEL_1, "Guest attempts to set reserved bit in CR4: %lx", --- a/xen/include/asm-x86/hvm/hvm.h +++ b/xen/include/asm-x86/hvm/hvm.h @@ -406,19 +406,27 @@ static inline bool_t hvm_vcpu_has_smep(v (X86_CR4_VMXE | X86_CR4_PAE | X86_CR4_MCE)) /* These bits in CR4 cannot be set by the guest. */ -#define HVM_CR4_GUEST_RESERVED_BITS(_v) \ +#define HVM_CR4_GUEST_RESERVED_BITS(v, restore) ({ \ + const struct vcpu *_v = (v); \ + bool_t _restore = !!(restore); \ + ASSERT((_restore) || _v == current); \ (~((unsigned long) \ (X86_CR4_VME | X86_CR4_PVI | X86_CR4_TSD | \ X86_CR4_DE | X86_CR4_PSE | X86_CR4_PAE | \ X86_CR4_MCE | X86_CR4_PGE | X86_CR4_PCE | \ X86_CR4_OSFXSR | X86_CR4_OSXMMEXCPT | \ - (hvm_vcpu_has_smep() ? X86_CR4_SMEP : 0) | \ - (hvm_vcpu_has_smap() ? X86_CR4_SMAP : 0) | \ + (((_restore) ? cpu_has_smep : \ + hvm_vcpu_has_smep()) ? \ + X86_CR4_SMEP : 0) | \ + (((_restore) ? cpu_has_smap : \ + hvm_vcpu_has_smap()) ? \ + X86_CR4_SMAP : 0) | \ (cpu_has_fsgsbase ? X86_CR4_FSGSBASE : 0) | \ - ((nestedhvm_enabled((_v)->domain) && cpu_has_vmx)\ - ? X86_CR4_VMXE : 0) | \ - (cpu_has_pcid ? X86_CR4_PCIDE : 0) | \ - (cpu_has_xsave ? X86_CR4_OSXSAVE : 0)))) + ((nestedhvm_enabled(_v->domain) && cpu_has_vmx) \ + ? X86_CR4_VMXE : 0) | \ + (cpu_has_pcid ? X86_CR4_PCIDE : 0) | \ + (cpu_has_xsave ? X86_CR4_OSXSAVE : 0)))); \ +}) /* These exceptions must always be intercepted. */ #define HVM_TRAP_MASK ((1U << TRAP_machine_check) | (1U << TRAP_invalid_op)) [-- Attachment #2: x86-HVM-refine-SMxP-tests.patch --] [-- Type: text/plain, Size: 3722 bytes --] x86/HVM: refine SMEP/SMAP tests in HVM_CR4_GUEST_RESERVED_BITS() Andrew validly points out that the use of the macro on the restore path can't rely on the CPUID bits for the guest already being in place (as their setting by the tool stack in turn requires the other restore operations already having taken place). And even worse, using hvm_cpuid() is invalid here because that function assumes to be used in the context of the vCPU in question. Reverting to the behavior prior to the change from checking cpu_has_sm?p to hvm_vcpu_has_sm?p() would break the other (non-restore) use of the macro. So let's revert to the prior behavior only for the restore path, by adding a respective second parameter to the macro. Obviously the two cpu_has_* uses in the macro should really also be converted to hvm_cpuid() based checks at least for the non-restore path. Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -1754,7 +1754,7 @@ static int hvm_load_cpu_ctxt(struct doma return -EINVAL; } - if ( ctxt.cr4 & HVM_CR4_GUEST_RESERVED_BITS(v) ) + if ( ctxt.cr4 & HVM_CR4_GUEST_RESERVED_BITS(v, 1) ) { printk(XENLOG_G_ERR "HVM%d restore: bad CR4 %#" PRIx64 "\n", d->domain_id, ctxt.cr4); @@ -3186,7 +3186,7 @@ int hvm_set_cr4(unsigned long value) struct vcpu *v = current; unsigned long old_cr; - if ( value & HVM_CR4_GUEST_RESERVED_BITS(v) ) + if ( value & HVM_CR4_GUEST_RESERVED_BITS(v, 0) ) { HVM_DBG_LOG(DBG_LEVEL_1, "Guest attempts to set reserved bit in CR4: %lx", --- a/xen/include/asm-x86/hvm/hvm.h +++ b/xen/include/asm-x86/hvm/hvm.h @@ -406,19 +406,27 @@ static inline bool_t hvm_vcpu_has_smep(v (X86_CR4_VMXE | X86_CR4_PAE | X86_CR4_MCE)) /* These bits in CR4 cannot be set by the guest. */ -#define HVM_CR4_GUEST_RESERVED_BITS(_v) \ +#define HVM_CR4_GUEST_RESERVED_BITS(v, restore) ({ \ + const struct vcpu *_v = (v); \ + bool_t _restore = !!(restore); \ + ASSERT((_restore) || _v == current); \ (~((unsigned long) \ (X86_CR4_VME | X86_CR4_PVI | X86_CR4_TSD | \ X86_CR4_DE | X86_CR4_PSE | X86_CR4_PAE | \ X86_CR4_MCE | X86_CR4_PGE | X86_CR4_PCE | \ X86_CR4_OSFXSR | X86_CR4_OSXMMEXCPT | \ - (hvm_vcpu_has_smep() ? X86_CR4_SMEP : 0) | \ - (hvm_vcpu_has_smap() ? X86_CR4_SMAP : 0) | \ + (((_restore) ? cpu_has_smep : \ + hvm_vcpu_has_smep()) ? \ + X86_CR4_SMEP : 0) | \ + (((_restore) ? cpu_has_smap : \ + hvm_vcpu_has_smap()) ? \ + X86_CR4_SMAP : 0) | \ (cpu_has_fsgsbase ? X86_CR4_FSGSBASE : 0) | \ - ((nestedhvm_enabled((_v)->domain) && cpu_has_vmx)\ - ? X86_CR4_VMXE : 0) | \ - (cpu_has_pcid ? X86_CR4_PCIDE : 0) | \ - (cpu_has_xsave ? X86_CR4_OSXSAVE : 0)))) + ((nestedhvm_enabled(_v->domain) && cpu_has_vmx) \ + ? X86_CR4_VMXE : 0) | \ + (cpu_has_pcid ? X86_CR4_PCIDE : 0) | \ + (cpu_has_xsave ? X86_CR4_OSXSAVE : 0)))); \ +}) /* These exceptions must always be intercepted. */ #define HVM_TRAP_MASK ((1U << TRAP_machine_check) | (1U << TRAP_invalid_op)) [-- Attachment #3: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] x86/HVM: refine SMEP/SMAP tests in HVM_CR4_GUEST_RESERVED_BITS() 2014-06-05 11:58 ` [PATCH] x86/HVM: refine SMEP/SMAP tests in HVM_CR4_GUEST_RESERVED_BITS() Jan Beulich @ 2014-06-05 13:07 ` David Vrabel 2014-06-06 1:20 ` Wu, Feng 1 sibling, 0 replies; 6+ messages in thread From: David Vrabel @ 2014-06-05 13:07 UTC (permalink / raw) To: Jan Beulich, Andrew Cooper, Xen-devel List; +Cc: Feng Wu, Keir Fraser On 05/06/14 12:58, Jan Beulich wrote: > Andrew validly points out that the use of the macro on the restore path > can't rely on the CPUID bits for the guest already being in place (as > their setting by the tool stack in turn requires the other restore > operations already having taken place). And even worse, using > hvm_cpuid() is invalid here because that function assumes to be used in > the context of the vCPU in question. > > Reverting to the behavior prior to the change from checking > cpu_has_sm?p to hvm_vcpu_has_sm?p() would break the other (non-restore) > use of the macro. So let's revert to the prior behavior only for the > restore path, by adding a respective second parameter to the macro. > > Obviously the two cpu_has_* uses in the macro should really also be > converted to hvm_cpuid() based checks at least for the non-restore > path. FWIW, Tested-by: David Vrabel <david.vrabel@citrix.com> But on 4.4.x not unstable. David ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] x86/HVM: refine SMEP/SMAP tests in HVM_CR4_GUEST_RESERVED_BITS() 2014-06-05 11:58 ` [PATCH] x86/HVM: refine SMEP/SMAP tests in HVM_CR4_GUEST_RESERVED_BITS() Jan Beulich 2014-06-05 13:07 ` David Vrabel @ 2014-06-06 1:20 ` Wu, Feng 1 sibling, 0 replies; 6+ messages in thread From: Wu, Feng @ 2014-06-06 1:20 UTC (permalink / raw) To: Jan Beulich, Andrew Cooper, Xen-devel List; +Cc: Keir Fraser Thanks Andrew for pointing this out and thanks a lot for Jan's fix! Thanks, Feng > -----Original Message----- > From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: Thursday, June 05, 2014 7:59 PM > To: Andrew Cooper; Xen-devel List > Cc: Wu, Feng; Keir Fraser > Subject: [PATCH] x86/HVM: refine SMEP/SMAP tests in > HVM_CR4_GUEST_RESERVED_BITS() > > Andrew validly points out that the use of the macro on the restore path > can't rely on the CPUID bits for the guest already being in place (as > their setting by the tool stack in turn requires the other restore > operations already having taken place). And even worse, using > hvm_cpuid() is invalid here because that function assumes to be used in > the context of the vCPU in question. > > Reverting to the behavior prior to the change from checking > cpu_has_sm?p to hvm_vcpu_has_sm?p() would break the other (non-restore) > use of the macro. So let's revert to the prior behavior only for the > restore path, by adding a respective second parameter to the macro. > > Obviously the two cpu_has_* uses in the macro should really also be > converted to hvm_cpuid() based checks at least for the non-restore > path. > > Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> > Signed-off-by: Jan Beulich <jbeulich@suse.com> > > --- a/xen/arch/x86/hvm/hvm.c > +++ b/xen/arch/x86/hvm/hvm.c > @@ -1754,7 +1754,7 @@ static int hvm_load_cpu_ctxt(struct doma > return -EINVAL; > } > > - if ( ctxt.cr4 & HVM_CR4_GUEST_RESERVED_BITS(v) ) > + if ( ctxt.cr4 & HVM_CR4_GUEST_RESERVED_BITS(v, 1) ) > { > printk(XENLOG_G_ERR "HVM%d restore: bad CR4 %#" PRIx64 "\n", > d->domain_id, ctxt.cr4); > @@ -3186,7 +3186,7 @@ int hvm_set_cr4(unsigned long value) > struct vcpu *v = current; > unsigned long old_cr; > > - if ( value & HVM_CR4_GUEST_RESERVED_BITS(v) ) > + if ( value & HVM_CR4_GUEST_RESERVED_BITS(v, 0) ) > { > HVM_DBG_LOG(DBG_LEVEL_1, > "Guest attempts to set reserved bit in CR4: %lx", > --- a/xen/include/asm-x86/hvm/hvm.h > +++ b/xen/include/asm-x86/hvm/hvm.h > @@ -406,19 +406,27 @@ static inline bool_t hvm_vcpu_has_smep(v > (X86_CR4_VMXE | X86_CR4_PAE | X86_CR4_MCE)) > > /* These bits in CR4 cannot be set by the guest. */ > -#define HVM_CR4_GUEST_RESERVED_BITS(_v) \ > +#define HVM_CR4_GUEST_RESERVED_BITS(v, restore) ({ \ > + const struct vcpu *_v = (v); \ > + bool_t _restore = !!(restore); \ > + ASSERT((_restore) || _v == current); \ > (~((unsigned long) \ > (X86_CR4_VME | X86_CR4_PVI | X86_CR4_TSD | \ > X86_CR4_DE | X86_CR4_PSE | X86_CR4_PAE | \ > X86_CR4_MCE | X86_CR4_PGE | X86_CR4_PCE | \ > X86_CR4_OSFXSR | X86_CR4_OSXMMEXCPT | \ > - (hvm_vcpu_has_smep() ? X86_CR4_SMEP : 0) | \ > - (hvm_vcpu_has_smap() ? X86_CR4_SMAP : 0) | \ > + (((_restore) ? cpu_has_smep : \ > + hvm_vcpu_has_smep()) ? \ > + X86_CR4_SMEP : 0) | \ > + (((_restore) ? cpu_has_smap : \ > + hvm_vcpu_has_smap()) ? \ > + X86_CR4_SMAP : 0) | \ > (cpu_has_fsgsbase ? X86_CR4_FSGSBASE : 0) | \ > - ((nestedhvm_enabled((_v)->domain) && cpu_has_vmx)\ > - ? X86_CR4_VMXE : 0) | \ > - (cpu_has_pcid ? X86_CR4_PCIDE : 0) | \ > - (cpu_has_xsave ? X86_CR4_OSXSAVE : 0)))) > + ((nestedhvm_enabled(_v->domain) && cpu_has_vmx) \ > + ? X86_CR4_VMXE : 0) | \ > + (cpu_has_pcid ? X86_CR4_PCIDE : 0) | \ > + (cpu_has_xsave ? X86_CR4_OSXSAVE : 0)))); \ > +}) > > /* These exceptions must always be intercepted. */ > #define HVM_TRAP_MASK ((1U << TRAP_machine_check) | (1U << > TRAP_invalid_op)) > > ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2014-06-06 1:20 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-06-04 15:45 (Reluctant) request to revert several changes, due to regressing VM migration Andrew Cooper 2014-06-04 16:00 ` Jan Beulich 2014-06-04 16:08 ` Andrew Cooper 2014-06-05 11:58 ` [PATCH] x86/HVM: refine SMEP/SMAP tests in HVM_CR4_GUEST_RESERVED_BITS() Jan Beulich 2014-06-05 13:07 ` David Vrabel 2014-06-06 1:20 ` Wu, Feng
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).