From mboxrd@z Thu Jan  1 00:00:00 1970
From: Wei Liu <wei.liu2@citrix.com>
Subject: Re: [linux-4.1 test] 63030: regressions - FAIL
Date: Wed, 21 Oct 2015 12:07:01 +0100
Message-ID: <20151021110701.GD5060@zion.uk.xensource.com>
References: <osstest-63030-mainreport@xen.org>
	<20151019135155.GB13286@zion.uk.xensource.com>
	<22054.21022.517755.482055@mariner.uk.xensource.com>
	<20151020152423.GC29090@zion.uk.xensource.com>
	<1445418254.9563.55.camel@citrix.com>
	<20151021092457.GB5060@zion.uk.xensource.com>
	<1445420688.9563.71.camel@citrix.com>
	<20151021103529.GC5060@zion.uk.xensource.com>
	<1445424504.9563.85.camel@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
Content-Disposition: inline
In-Reply-To: <1445424504.9563.85.camel@citrix.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <Ian.Jackson@eu.citrix.com>, xen-devel@lists.xensource.com, Wei Liu <wei.liu2@citrix.com>, osstest service owner <osstest-admin@xenproject.org>
List-Id: xen-devel@lists.xenproject.org

On Wed, Oct 21, 2015 at 11:48:24AM +0100, Ian Campbell wrote:
> On Wed, 2015-10-21 at 11:35 +0100, Wei Liu wrote:
> > On Wed, Oct 21, 2015 at 10:44:48AM +0100, Ian Campbell wrote:
> > > On Wed, 2015-10-21 at 10:24 +0100, Wei Liu wrote:
> > > > On Wed, Oct 21, 2015 at 10:04:14AM +0100, Ian Campbell wrote:
> > > > > On Tue, 2015-10-20 at 16:24 +0100, Wei Liu wrote:
> > > > > > > But this is only code inspection, so I'm not very confident whether
> > > > > > everything does what it says it does.
> > > > > 
> > > > > > Right. I think this one probably needs someone to set up a system in a
> > > > > similar configuration and play with it.
> > > > > 
> > > > 
> > > > > Is there an easy way to do that? Say, give me some runes so that I can
> > > > > lock a machine in the Cambridge instance and run the failing test case.
> > > 
> > > I could[0] but, why can't you just set things up on your existing test
> > > hosts, either using standalone mode or by just installing the guest by
> > > hand?
> > > 
> > > That's what I would do (probably the latter) in the first instance. It's
> > > very likely IME that you are going to need to poke at this interactively
> > > while debugging and to run repeated migrations etc to trigger the issue.
> > > IMHO trying to use osstest for such manual debugging is just going to get
> > > in the way.
> > > 
> > 
> > I could do all of this manually, but not without paying a lot of
> > attention: allocate a new test box (all my test boxes are in use at
> > the moment), run standalone mode, use standalone mode to install the
> > test box, grab various tarballs from the osstest website if I don't
> > want to build them again, put them in a suitable location, use the
> > standalone script to fiddle with the standalone mode database,
> > manually install a guest, and so on. On top of that, the bug we're
> > hunting might not be reproducible on the new test box due to different
> > hardware and external environment (as we've already witnessed in the
> > production osstest system), which would leave me wondering whether I
> > should repeat all (well, part) of this again or just give up.
> > 
> > This looks like an endless list of tedious tasks, and it could go wrong
> > in many places in between. If I could get OSSTest to lock a box and run
> > up to the point where it reproduces the issue, that would be a great help.
> 
> This seems to me to be making a mountain out of a molehill; installing a
> Xen host should be bread and butter for most of us.
> 
> However, since you insist, I recently added some explanation to the README of
> how to make an ad hoc job, including cloning a previous flight and forcing it
> to run on a given machine (useful if you think it might be machine
> specific).
> 
> There is no mechanical way to then lock a host on failure. What I usually
> do is start the mg-allocate run I mentioned in my previous mail after the
> test case has already started. Since mg-allocate has a higher priority than
> regular jobs, but with -U waits for the current job to finish, you are
> basically guaranteed that your mg-allocate will get the host next.
> 

Thanks. I will have a look at the osstest README to determine which way
is better to proceed.

> > Furthermore, I can write down all the runes I use so that other people
> > can do the same to reproduce bugs discovered in osstest. That would
> > certainly help lower the barrier for people who want to help with
> > triaging bugs.
> 
> This sort of thing is of no help with triage. It might be useful for
> debugging and reproducing an issue, but triage does not involve doing such
> things, it is the step before.
> 
> I'm being pedantic here because I don't think it is helpful to overstate
> what triage involves, since that will put people off doing useful triage
> activities.
> 

Right, I actually meant bug fixing.

Wei.

> Ian.