From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: [xen-unstable test] 18851: regressions - FAIL [and 1 more messages]
Date: Fri, 6 Sep 2013 08:57:56 -0400
Message-ID: <20130906125756.GC2590@phenom.dumpdata.com>
References: <21028.43605.857256.880579@mariner.uk.xensource.com>
 <522713A602000078000F042D@nat28.tlf.novell.com>
 <21031.3663.29603.464865@mariner.uk.xensource.com>
 <522869E2.3040106@citrix.com>
 <5228934302000078000F0D15@nat28.tlf.novell.com>
 <52289094.7020204@citrix.com>
 <21033.45234.301594.983000@mariner.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta5.messagelabs.com ([195.245.231.135])
 by lists.xen.org with esmtp (Exim 4.72) (envelope-from )
 id 1VHvby-0004W8-Gt for xen-devel@lists.xenproject.org;
 Fri, 06 Sep 2013 12:58:06 +0000
Content-Disposition: inline
In-Reply-To: <21033.45234.301594.983000@mariner.uk.xensource.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Ian Jackson
Cc: Keir Fraser, David Vrabel, Jan Beulich, xen-devel, Boris Ostrovsky, Malcolm Crossley
List-Id: xen-devel@lists.xenproject.org

On Fri, Sep 06, 2013 at 11:38:42AM +0100, Ian Jackson wrote:
> Jan Beulich writes ("Re: [xen-unstable test] 18851: regressions - FAIL"):
> > This looks like a red herring. Having poked about in woodlouse it looks
> > like something is screwy with interrupts. The tg3 cards aren't using
> > MSI and the USB controller is using edge not level handlers. Another
> > machine with the same chipset is happily using MSIs.
>
> I did the following tests overnight:
>
> * 3.4.60 kernel:
>
>     Pass!
>     [adhoc flight 19081]
>
> * 3.10.10 + patch from Zoltan Kiss to limit SKB_FRAG_PAGE_ORDER
>     Subject: net/core: Order-3 frag allocator causes SWIOTLB bouncing under Xen
>     Date: Wed Sep 04 21:54:01 BST 2013
>     Message-ID: <1378327638-23956-1-git-send-email-zoltan.kiss@citrix.com>
>
>     Fail as before (in this case, timeout in debootstrap trying to
>     install a guest). [adhoc flight 19082]
>
> * 3.10.10, kernel command line "pci=noacpi and pci=nocrs"
>
>     Total boot failure. SATA controller complaining bitterly about
>     lost interrupts. [adhoc flight 19085]

Somebody (Andrew? David?) took a look at the box and found that the
MSIs were all out of whack. I guess with the 'noacpi' parameter the
thinking is that the ACPI _PRT entries are out of whack with the more
modern kernels?

I am not that familiar with oss-test - but is each of the boxes
running a different version of the hypervisor? Meaning you don't
randomly install a new version of the hypervisor from scratch on
different boxes?

Thanks!
>
> I also took woodlouse out of the main test pool, which is how we got a
> push of 4.2. I'm going to put it back now, and make a change to
> switch to Linux 3.4.y for general tests.
>
> I think this gets the 3.10.y problem off the critical path for
> everything else but of course we should still fix it. I will leave
> the 3.10.y push gate in place.

Aye. Is this issue (network incredibly slow) only surfacing on this
box? No - I thought I saw the issue on gall and lice with the upstream
Linux? Are those two machines the same as woodlouse?
>
> Ian.