From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: Xen HVM regression on certain Intel CPUs
Date: Wed, 27 Mar 2013 12:04:27 -0400
Message-ID: <20130327160427.GB6688@phenom.dumpdata.com>
References: <51530F9F.10805@canonical.com> <515315EC.4030803@canonical.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
In-Reply-To: <515315EC.4030803@canonical.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Stefan Bader, wei.y.yang@intel.com, haitao.shan@intel.com, xin.li@intel.com
Cc: "xen-devel@lists.xensource.com", "H. Peter Anvin"
List-Id: xen-devel@lists.xenproject.org

On Wed, Mar 27, 2013 at 04:53:16PM +0100, Stefan Bader wrote:
> On 27.03.2013 16:26, Stefan Bader wrote:
> > Recently I ran some experiments on newer hardware and realized that booting
> > any kernel newer than or equal to v3.5 (Xen version 4.2.1) in 64bit mode would
> > fail to bring up any APs (message about CPU Stuck). I was able to normally
> > bisect into a range of realmode changes and then manually drill down to the
> > following commit:
> >
> > commit cda846f101fb1396b6924f1d9b68ac3d42de5403
> > Author: Jarkko Sakkinen
> > Date: Tue May 8 21:22:46 2012 +0300
> >
> >     x86, realmode: read cr4 and EFER from kernel for 64-bit trampoline
> >
> >     This patch changes 64-bit trampoline so that CR4 and
> >     EFER are provided by the kernel instead of using fixed
> >     values.
> >
> > From the Xen debugging console it was possible to gather a bit more data which
> > pointed to a failure very close to setting CR4 in startup_32. On this particular
> > hardware the saved CR4 (about to be set) was 0x1407f0.
> >
> > This would set two flags that somehow feel dangerous: PGE (page global enable)
> > and SMEP (supervisor mode execution protection).
> > SMEP turns out to be the main offender and the following change allows the
> > APs to start:
> >
> > --- a/arch/x86/realmode/rm/trampoline_64.S
> > +++ b/arch/x86/realmode/rm/trampoline_64.S
> > @@ -93,7 +93,9 @@ ENTRY(startup_32)
> >          movl    %edx, %fs
> >          movl    %edx, %gs
> >
> > -        movl    pa_tr_cr4, %eax
> > +        movl    $X86_CR4_SMEP, %eax
> > +        notl    %eax
> > +        andl    pa_tr_cr4, %eax
> >          movl    %eax, %cr4              # Enable PAE mode
> >
> >          # Setup trampoline 4 level pagetables
> >
> > Now I am not completely convinced that this is really the way to go. Likely the
> > Xen hypervisor should not start up the guest with CR4 on the BP containing those
> > flags. But maybe it still makes sense to mask some dangerous ones off in the
> > realmode code (btw, it seemed that masking the assignments in arch_setup or
> > setup_realmode did not work).
> >
> > And finally I am wondering why the SMEP flag in CR4 is set anyway. My
> > understanding would be that this should only be done if cpuid[7].ebx has bit7
> > set. And this does not seem to be the case at least on the one box I was doing
> > the bisection on.
>
> Seems that I was relying on the wrong source of information when checking SMEP
> support. The cpuid command seems to fail. But /proc/cpuinfo reports it. So that
> at least explains where that comes from... sorry for that.

OK, so if you boot Xen with smep=1 (which disables SMEP, kind of counterintuitive
flag) that would work fine?

CC-ing the Intel folks who added this in.
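
For anyone following along, here is a rough userspace sketch (not part of the patch, and not kernel code) that decodes the saved CR4 value quoted above and applies the same mask as the movl/notl/andl sequence in the trampoline change. The bit positions are the architectural CR4 bits from the Intel SDM. The helper names cr4_without_smep() and cr4_describe() and the flag table are made up for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Architectural CR4 bit positions (Intel SDM). The kernel defines its own
 * X86_CR4_* constants; they are repeated here so the sketch stands alone. */
#define X86_CR4_PGE   (1u << 7)   /* page global enable */
#define X86_CR4_SMEP  (1u << 20)  /* supervisor mode execution protection */

/* Illustrative table of the CR4 bits relevant to the value in this thread. */
struct cr4_flag {
    uint32_t mask;
    const char *name;
};

static const struct cr4_flag cr4_flags[] = {
    { 1u << 4,      "PSE" },
    { 1u << 5,      "PAE" },
    { 1u << 6,      "MCE" },
    { X86_CR4_PGE,  "PGE" },
    { 1u << 8,      "PCE" },
    { 1u << 9,      "OSFXSR" },
    { 1u << 10,     "OSXMMEXCPT" },
    { 1u << 18,     "OSXSAVE" },
    { X86_CR4_SMEP, "SMEP" },
};

/* Hypothetical helper: same effect as
 *   movl $X86_CR4_SMEP, %eax ; notl %eax ; andl pa_tr_cr4, %eax */
uint32_t cr4_without_smep(uint32_t cr4)
{
    return cr4 & ~X86_CR4_SMEP;
}

/* Hypothetical helper: write the space-separated names of the set flags
 * into buf (always NUL-terminated when len > 0). */
void cr4_describe(uint32_t cr4, char *buf, size_t len)
{
    size_t used = 0;
    size_t i;

    if (len == 0)
        return;
    buf[0] = '\0';

    for (i = 0; i < sizeof(cr4_flags) / sizeof(cr4_flags[0]); i++) {
        int n;

        if (!(cr4 & cr4_flags[i].mask))
            continue;
        n = snprintf(buf + used, len - used, "%s%s",
                     used ? " " : "", cr4_flags[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break;  /* output truncated, stop here */
        used += (size_t)n;
    }
}
```

Feeding it 0x1407f0 shows both PGE and SMEP set, and cr4_without_smep(0x1407f0) gives 0x0407f0, i.e. everything else left as the BP had it.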