From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stefan Bader
Subject: Re: Xen HVM regression on certain Intel CPUs
Date: Wed, 03 Apr 2013 13:56:39 +0200
Message-ID: <515C18F7.9000305@canonical.com>
References: <51530F9F.10805@canonical.com> <515315EC.4030803@canonical.com> <20130327160427.GB6688@phenom.dumpdata.com> <5153222B.3030605@canonical.com> <515323D4.2030806@zytor.com> <5153299A.7070108@canonical.com> <51532B2F.60506@zytor.com> <515454E002000078000C947C@nat28.tlf.novell.com> <51545B7C.4060106@canonical.com> <5154722E.7080605@canonical.com>
In-Reply-To: <5154722E.7080605@canonical.com>
Sender: xen-devel-bounces@lists.xen.org
To: Jan Beulich
Cc: "xen-devel@lists.xensource.com", Konrad Rzeszutek Wilk, wei.y.yang@intel.com, haitao.shan@intel.com, xin.li@intel.com, "H. Peter Anvin"
List-Id: xen-devel@lists.xenproject.org

On 28.03.2013 17:39, Stefan Bader wrote:
> On 28.03.2013 16:02, Stefan Bader wrote:
>> On 28.03.2013 14:34, Jan Beulich wrote:
>>>>>> On 27.03.13 at 18:23, "H. Peter Anvin" wrote:
>>>> On 03/27/2013 10:17 AM, Stefan Bader wrote:
>>>>>> What does x86info and /proc/cpuinfo show in HVM?
>>>>>
>>>>> x86info cpuid[7].ebx = 0xbbb and /proc/cpuinfo also shows smep
>>>>> set.
>>>>
>>>> On all CPUs?
>>>>
>>>>>> The inbound %cr4 shouldn't matter at all, we try to not rely on
>>>>>> it.
>>>>>>
>>>>>> If the hypervisor presents SMEP to the guest then the guest is
>>>>>> pretty obviously going to try to use it.
>>>>>
>>>>> To me it looks like when bootstrapping the APs things are not yet
>>>>> ready to use it. If I did not miss something, the only place that
>>>>> the saved contents of cr4 are used is in startup_32 when the cpus
>>>>> are brought up. And then just stop dead. Would need to read more
>>>>> code but a bit weird why the BP is not affected.
>>>>
>>>> This feels like a bug in Xen, but I don't know for sure yet. Either
>>>> which way, it is odd. That write to cr4 should be entirely legitimate.
>>>
>>> And I would guess one that got fixed already.
>>>
>>> Stefan, please try 4.2.2-rc1, or (separately)
>>> http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=485f374230d39e153d7b9786e3d0336bd52ee661
>>> (which I think requires the immediately preceding
>>> http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=1e6275a95d3e35a72939b588f422bb761ba82f6b
>>> too).
>>
>> The backing explanation does make a lot of sense in reasoning what is going
>> wrong. Unfortunately the two patches above on their own do not fix the problem
>> (I will try to make another go with 4.2.2-rc1).
>
> The whole of 4.2.2-rc1 has the same (smep still present in
> trampoline_cr4_features) outcome.
>>
>> For a bit more info I am running a kernel inside the HVM guest which shows the
>> contents of the cr4 shadow used in the trampoline. Out of interest I compared
>> those values to the ones used on a bare metal boot and both are identical
>> (0x1407F0).
>>
>> That somehow gives some explanation for the patch above failing.
>> Looking at the code for cr4 updates in vmx_update_guest_cr() a few lines
>> above the new SMEP handling, there already was code which would clear the
>> PAE flag when paging_mode_hap(v->domain) was true. And that would need to be
>> true if the SMEP flag should get cleared. And the PAE flag was (and has to
>> be) set before.
>>
>
>> Will be looking into this further.
> Going back to gather more info and to find some fix.
>

I added some more debugging output to the hypervisor to verify the state of
HAP. This showed that while HAP is available on the system, it is not used for
the HVM guests. It looks like enabling it would require some flags to be set
when creating the guest domains, and I assume this is not happening because I
have to stay with the xm stack for the libvirt setup for now (moving off it
requires some repackaging which hasn't been done yet).

So the guest isn't using HAP, but Xen does seem to use some form of paging even
while the guest VCPU itself is not using paging. So I changed
vmx_update_guest_cr() to clear SMEP whenever the guest VCPU is not in paging
mode, whether or not HAP is in use, and that seems to prevent the hangs. Does
this look like a reasonable upstream Xen change?

From eccbc4cf0916c6d4388f658965c79770bd0ba10f Mon Sep 17 00:00:00 2001
From: Stefan Bader
Date: Wed, 3 Apr 2013 12:06:24 +0200
Subject: [PATCH] VMX: Always disable SMEP when guest is in non-paging mode

commit e7dda8ec9fc9020e4f53345cdbb18a2e82e54a65
  VMX: disable SMEP feature when guest is in non-paging mode

disabled the SMEP bit if a guest VCPU was using HAP and was not
in paging mode. However I could observe VCPUs getting stuck in
the trampoline after the following patch in the Linux kernel
changed the way CR4 gets set up:
  x86, realmode: read cr4 and EFER from kernel for 64-bit trampoline

The change will set CR4 from already set flags which includes the
SMEP bit. On bare metal this does not matter as the CPU is in
non-paging mode at that time.
But Xen seems to use the emulated non-paging mode regardless of
HAP (I verified that on the guests I was seeing the issue, HAP
was not used).

Therefore it seems right to unset the SMEP bit for a VCPU that is
not in paging mode, regardless of its HAP usage.

Signed-off-by: Stefan Bader
---
 xen/arch/x86/hvm/vmx/vmx.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 04dbefb..a869ed4 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1161,13 +1161,16 @@ static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr)
         if ( paging_mode_hap(v->domain) && !hvm_paging_enabled(v) )
         {
             v->arch.hvm_vcpu.hw_cr[4] |= X86_CR4_PSE;
             v->arch.hvm_vcpu.hw_cr[4] &= ~X86_CR4_PAE;
+        }
+        if ( !hvm_paging_enabled(v) )
+        {
             /*
              * SMEP is disabled if CPU is in non-paging mode in hardware.
              * However Xen always uses paging mode to emulate guest non-paging
-             * mode with HAP. To emulate this behavior, SMEP needs to be
-             * manually disabled when guest switches to non-paging mode.
+             * mode. To emulate this behavior, SMEP needs to be manually
+             * disabled when guest VCPU is in non-paging mode.
              */
             v->arch.hvm_vcpu.hw_cr[4] &= ~X86_CR4_SMEP;
         }
         __vmwrite(GUEST_CR4, v->arch.hvm_vcpu.hw_cr[4]);
-- 
1.7.9.5
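
To make the effect of the patch easier to see outside the hypervisor, here is a minimal standalone C sketch of the CR4 adjustment before and after the change. It is a model only, not Xen code: struct vcpu_model, hw_cr4_before(), hw_cr4_after() and smep_model_selfcheck() are hypothetical names invented for illustration, the two booleans stand in for paging_mode_hap(v->domain) and hvm_paging_enabled(v), and everything else vmx_update_guest_cr() does to hw_cr[4] is left out. The CR4 bit positions are the architectural ones, and 0x1407F0 is the trampoline value reported earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural CR4 bit positions. */
#define CR4_PSE  (UINT32_C(1) << 4)
#define CR4_PAE  (UINT32_C(1) << 5)
#define CR4_SMEP (UINT32_C(1) << 20)

/* Hypothetical stand-in for the state the real function consults. */
struct vcpu_model {
    bool hap_enabled;   /* models paging_mode_hap(v->domain) */
    bool guest_paging;  /* models hvm_paging_enabled(v)      */
};

/* Before the patch: SMEP is only cleared on the HAP, non-paging path. */
static inline uint32_t hw_cr4_before(const struct vcpu_model *v, uint32_t cr4)
{
    if (v->hap_enabled && !v->guest_paging) {
        cr4 |= CR4_PSE;
        cr4 &= ~CR4_PAE;
        cr4 &= ~CR4_SMEP;
    }
    return cr4;
}

/* After the patch: SMEP is cleared whenever the guest is not paging. */
static inline uint32_t hw_cr4_after(const struct vcpu_model *v, uint32_t cr4)
{
    if (v->hap_enabled && !v->guest_paging) {
        cr4 |= CR4_PSE;
        cr4 &= ~CR4_PAE;
    }
    if (!v->guest_paging)
        cr4 &= ~CR4_SMEP;
    return cr4;
}

/* The failing case from the thread: no HAP, guest VCPU not yet paging. */
static inline void smep_model_selfcheck(void)
{
    const struct vcpu_model ap = { .hap_enabled = false, .guest_paging = false };

    assert(hw_cr4_before(&ap, UINT32_C(0x1407F0)) & CR4_SMEP);   /* SMEP leaks through */
    assert(!(hw_cr4_after(&ap, UINT32_C(0x1407F0)) & CR4_SMEP)); /* SMEP masked        */
}
```

In this model the two versions only differ for a VCPU without HAP that has not enabled paging, which is the configuration in which the APs were observed to hang. Once the guest enables paging, both versions pass SMEP through unchanged.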