From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stefan Bader
Subject: Re: Xen HVM regression on certain Intel CPUs
Date: Wed, 27 Mar 2013 17:45:31 +0100
Message-ID: <5153222B.3030605@canonical.com>
References: <51530F9F.10805@canonical.com> <515315EC.4030803@canonical.com> <20130327160427.GB6688@phenom.dumpdata.com>
In-Reply-To: <20130327160427.GB6688@phenom.dumpdata.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Konrad Rzeszutek Wilk
Cc: wei.y.yang@intel.com, "xen-devel@lists.xensource.com", haitao.shan@intel.com, xin.li@intel.com, "H. Peter Anvin"
List-Id: xen-devel@lists.xenproject.org

On 27.03.2013 17:04, Konrad Rzeszutek Wilk wrote:
> On Wed, Mar 27, 2013 at 04:53:16PM +0100, Stefan Bader wrote:
>> On 27.03.2013 16:26, Stefan Bader wrote:
>>> Recently I ran some experiments on newer hardware and realized that when booting
>>> any kernel newer or equal to v3.5 (Xen version 4.2.1) in 64bit mode would fail
>>> to bring up any APs (message about CPU Stuck).
>>> I was able to normally bisect
>>> into a range of realmode changes and then manually drill down to the following
>>> commit:
>>>
>>> commit cda846f101fb1396b6924f1d9b68ac3d42de5403
>>> Author: Jarkko Sakkinen
>>> Date:   Tue May 8 21:22:46 2012 +0300
>>>
>>>     x86, realmode: read cr4 and EFER from kernel for 64-bit trampoline
>>>
>>>     This patch changes 64-bit trampoline so that CR4 and
>>>     EFER are provided by the kernel instead of using fixed
>>>     values.
>>>
>>> From the Xen debugging console it was possible to gather a bit more data which
>>> pointed to a failure very close to setting CR4 in startup_32. On this particular
>>> hardware the saved CR4 (about to be set) was 0x1407f0.
>>>
>>> This would set two flags that somehow feel dangerous: PGE (page global enable)
>>> and SMEP (supervisor mode execution protection). SMEP turns out to be the main
>>> offender and the following change allows the APs to start:
>>>
>>> --- a/arch/x86/realmode/rm/trampoline_64.S
>>> +++ b/arch/x86/realmode/rm/trampoline_64.S
>>> @@ -93,7 +93,9 @@ ENTRY(startup_32)
>>>      movl    %edx, %fs
>>>      movl    %edx, %gs
>>>
>>> -    movl    pa_tr_cr4, %eax
>>> +    movl    $X86_CR4_SMEP, %eax
>>> +    notl    %eax
>>> +    andl    pa_tr_cr4, %eax
>>>      movl    %eax, %cr4        # Enable PAE mode
>>>
>>>      # Setup trampoline 4 level pagetables
>>>
>>> Now I am not completely convinced that this is really the way to go. Likely the
>>> Xen hypervisor should not start up the guest with CR4 on the BP containing those
>>> flags. But maybe it still makes sense to mask some dangerous ones off in the
>>> realmode code (btw, it seemed that masking the assignments in arch_setup or
>>> setup_realmode did not work).
>>>
>>> And finally I am wondering why the SMEP flag in CR4 is set anyway. My
>>> understanding would be that this should only be done if cpuid[7].ebx has bit7
>>> set. And this does not seem to be the case at least on the one box I was doing
>>> the bisection on.
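
For reference, the CR4 value quoted above (0x1407f0) can be decoded bit by
bit. The following is only a sketch: the bit positions are the architectural
CR4 definitions, the X86_CR4_SMEP name mirrors the kernel constant used in
the patch, and the helpers cr4_describe() and cr4_mask_smep() are
hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same bit the quoted patch clears before loading CR4 on the AP. */
#define X86_CR4_SMEP (1u << 20)

struct cr4_flag {
    uint32_t mask;
    const char *name;
};

/* Architectural CR4 bits, lowest bit first. */
static const struct cr4_flag cr4_flags[] = {
    { 1u << 0,  "VME" },        { 1u << 1,  "PVI" },
    { 1u << 2,  "TSD" },        { 1u << 3,  "DE" },
    { 1u << 4,  "PSE" },        { 1u << 5,  "PAE" },
    { 1u << 6,  "MCE" },        { 1u << 7,  "PGE" },
    { 1u << 8,  "PCE" },        { 1u << 9,  "OSFXSR" },
    { 1u << 10, "OSXMMEXCPT" }, { 1u << 13, "VMXE" },
    { 1u << 14, "SMXE" },       { 1u << 16, "FSGSBASE" },
    { 1u << 17, "PCIDE" },      { 1u << 18, "OSXSAVE" },
    { 1u << 20, "SMEP" },
};

/* Hypothetical helper: write the names of all set flags, separated by
 * single spaces, into buf (truncated if buf is too small). */
static void cr4_describe(uint32_t cr4, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(cr4_flags) / sizeof(cr4_flags[0]); i++) {
        if (!(cr4 & cr4_flags[i].mask))
            continue;
        if (buf[0] != '\0')
            strncat(buf, " ", len - strlen(buf) - 1);
        strncat(buf, cr4_flags[i].name, len - strlen(buf) - 1);
    }
}

/* Hypothetical helper: C equivalent of the notl/andl pair in the patch. */
static uint32_t cr4_mask_smep(uint32_t cr4)
{
    return cr4 & ~X86_CR4_SMEP;
}
```

With a 128-byte buffer, cr4_describe(0x1407f0, ...) produces
"PSE PAE MCE PGE PCE OSFXSR OSXMMEXCPT OSXSAVE SMEP", which matches the
observation that both PGE and SMEP are set, and cr4_mask_smep(0x1407f0)
gives 0x0407f0.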
>>
>> Seems that I was relying on the wrong source of information when checking SMEP
>> support. The cpuid command seems to fail. But /proc/cpuinfo reports it. So that
>> at least explains where that comes from... sorry for that.
>
> OK, so if you boot Xen with smep=1 (which disables SMEP, kind of counterintuitive flag)
> that would work fine?

Rebooting with smep=1 as a hypervisor argument does not fix it. I would treat
that result with care, though, since I tried it quickly without checking
whether Xen 4.2.1 already understands the flag.

Second, running x86info --all on bare metal does show the bits set for
cpuid[7], and the /proc/cpuinfo values are consistent across the BP and the
APs. So I am a tool for using the wrong tool there.

So I would say the main issue to look at is why reading CR4 as an HVM guest
returns those flags at boot. Surely the hypervisor has set certain things up
itself, but there are likely some expectations about the initial state on
boot.

>
> CC-ing the Intel folks who added this in.
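
As a cross-check that does not depend on the cpuid or x86info tools, the
SMEP feature bit (cpuid leaf 7, subleaf 0, EBX bit 7, as discussed above)
can be read directly. This is a minimal sketch, assuming GCC or clang with
their <cpuid.h> header. The names ebx_has_smep() and cpu_reports_smep() are
hypothetical, and inside a guest the result reflects whatever the
hypervisor chooses to expose.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC/clang helpers: __get_cpuid_max, __cpuid_count */
#endif

/* CPUID.(EAX=7,ECX=0):EBX bit 7 advertises SMEP. */
#define CPUID7_EBX_SMEP (1u << 7)

/* Hypothetical helper: test the SMEP bit in a leaf-7 EBX value. */
static bool ebx_has_smep(uint32_t leaf7_ebx)
{
    return (leaf7_ebx & CPUID7_EBX_SMEP) != 0;
}

/* Hypothetical helper: returns 1 if the CPU reports SMEP, 0 if it does
 * not, and -1 if leaf 7 cannot be queried (old CPU or non-x86 build). */
static int cpu_reports_smep(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;

    /* Leaf 7 is only valid if the maximum basic leaf is at least 7. */
    if (__get_cpuid_max(0, NULL) < 7)
        return -1;

    __cpuid_count(7, 0, eax, ebx, ecx, edx);
    (void)eax; (void)ecx; (void)edx;
    return ebx_has_smep(ebx) ? 1 : 0;
#else
    return -1;
#endif
}
```

Running cpu_reports_smep() on the BP and on each AP (for example with the
process pinned to one CPU at a time) would be one way to compare the result
against /proc/cpuinfo.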
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel