From mboxrd@z Thu Jan  1 00:00:00 1970
From: Ian Campbell <ian.campbell@citrix.com>
Subject: Re: [xen-unstable test] 65141: regressions - FAIL
Date: Mon, 7 Dec 2015 16:28:36 +0000
Message-ID: <1449505716.29724.66.camel@citrix.com>
References: <osstest-65141-mainreport@xen.org>
	<22104.31917.532615.949661@mariner.uk.xensource.com>
	<9E79D1C9A97CFD4097BCE431828FDD31023BAE97@SHSMSX103.ccr.corp.intel.com>
	<1449052492.4424.30.camel@citrix.com>
	<1449064269.4424.73.camel@citrix.com>
	<1449302981.3451.3.camel@citrix.com>
	<5665BF5202000078000BCC0E@prv-mh.provo.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <5665BF5202000078000BCC0E@prv-mh.provo.novell.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich <JBeulich@suse.com>, Ian Jackson <Ian.Jackson@eu.citrix.com>, Jun Nakajima <jun.nakajima@intel.com>, KevinTian <kevin.tian@intel.com>, Robert Hu <robert.hu@intel.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>, "xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>, osstestservice owner <osstest-admin@xenproject.org>
List-Id: xen-devel@lists.xenproject.org

On Mon, 2015-12-07 at 09:18 -0700, Jan Beulich wrote:
> > > > On 05.12.15 at 09:09, <ian.campbell@citrix.com> wrote:
> > On Wed, 2015-12-02 at 13:51 +0000, Ian Campbell wrote:
> > 
> > > http://osstest.test-lab.xenproject.org/~osstest/pub/logs/65301/ 
> > > 
> > > I think that ought to give a baseline for the bisector to work with.
> > > I'll prod it to do so.
> > 
> > Results are below. TL;DR: d02e84b9d9d "vVMX: use latched VMCS machine
> > address" is somehow at fault.
> > 
> > It appears to be somewhat machine specific, the one this has been
> > failing on is godello* which says "CPU0: Intel(R) Xeon(R) CPU E3-1220
> > v3 @ 3.10GHz stepping 03" in its serial log.
> > 
> > Andy suggested this might be related to cpu_has_vmx_vmcs_shadowing
> > so Haswell and newer vs IvyBridge and older.
> 
> Yeah, but on irc it was also made clear that the regression is on a
> system without that capability.

What I was trying to relay is Andy's suggestion that the difference between
working and broken hosts might fall along the lines of >=Haswell vs
<=IvyBridge.

How that maps onto E3-1220, which is what is exhibiting the issue, I leave
to you guys.

> At this point we certainly need to seriously consider reverting the
> whole change. The reason I continue to be hesitant is that I'm
> afraid this may result in no-one trying to find out what the problem
> here is. While I could certainly try to, I'm sure I won't find time to
> do so within the foreseeable future. And since we didn't get any
> real feedback from Intel so far, I thought I'd ping them to at least
> share some status before we decide. That pinging has happened
> a few minutes ago. I'd therefore like to give it, say, another day,
> and if by then we don't have an estimate for when a fix might
> become available, I'd do the revert. Unless of course somebody
> feels strongly about doing the revert immediately.

I don't mind waiting.

One approach to fixing this might be to split the patch into its separate
changes, so that the actual culprit is a smaller thing to analyse.

Ian.