From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Ian Campbell <Ian.Campbell@citrix.com>, konrad.wilk@oracle.com
Cc: Boris.Ostrovsky@oracle.com, xen-devel@lists.xensource.com,
	"xen.org" <Ian.Jackson@eu.citrix.com>,
	David Vrabel <david.vrabel@citrix.com>
Subject: Re: [linux-linus test] 25478: regressions - FAIL
Date: Mon, 17 Mar 2014 15:36:13 -0400
Message-ID: <20140317193613.GA15772@phenom.dumpdata.com>
In-Reply-To: <1395054505.4122.65.camel@kazak.uk.xensource.com>

On Mon, Mar 17, 2014 at 11:08:25AM +0000, Ian Campbell wrote:
> On Fri, 2014-03-14 at 14:23 -0400, Konrad Rzeszutek Wilk wrote:
> > On Fri, Mar 14, 2014 at 05:07:35PM +0000, Ian Campbell wrote:
> > > On Fri, 2014-03-14 at 16:42 +0000, xen.org wrote:
> > > > flight 25478 linux-linus real [real]
> > > > http://www.chiark.greenend.org.uk/~xensrcts/logs/25478/
> > > > 
> > > > Regressions :-(
> > > > 
> > > > Tests which did not succeed and are blocking,
> > > > including tests which could not be run:
> > > >  test-amd64-i386-pair   17 guest-migrate/src_host/dst_host fail REGR. vs. 12557
> > > 
> > > Is anyone looking at these? Apparently this hasn't passed for 23 months:
> > 
> > I believe I asked to tweak somethings 23 months ago to troubleshoot this
> > but never heard back from you.
> 
> Was that me? I didn't think I had touched osstest at all that long ago
> apart from occasionally pinging folks when things looked to be failing.
> In any case sorry for letting it all through the cracks. Can you
> remember what the tweaks were? (I'm a bit reluctant to play "tweak the
> test case until it passes", but lets see what they are first).
> 
> What's weird is that the linux-3.4 and linux-3.10 stable branch flights
> seem to be passing at least some of the time, although looking at the
> history there they do seem to be hitting a failure in the same test
> cases at least sometimes.
> 
> > > http://xenbits.xen.org/gitweb/?p=linux-pvops.git;a=shortlog;h=refs/heads/tested/linux-linus
> > > 
> > > Looking through the recent failures this migration one seems quite
> > > common but there seem to be a few others, search for "[linux-linux
> > > test]" in http://lists.xen.org/archives/html/xen-devel/2014-03/ for some
> > > examples.
> > > 
> > > The particular failure here is
> > > http://www.chiark.greenend.org.uk/~xensrcts/logs/25478/test-amd64-i386-pair/info.html and the console logs http://www.chiark.greenend.org.uk/~xensrcts/logs/25478/test-amd64-i386-pair/serial-gall-mite.log  are full of
> > 
> > The 'info.html' you alluded - I just saw that happening with Xen 4.5, but
> > I don't see the SWIOTLB issue in my dom0.
> > 
> > >         Mar 14 13:24:27.592641 [ 1010.742462] mptsas 0000:03:00.0: swiotlb buffer is full
> > > 
> > > The migration itself times out after 5 minutes or so (for a 512M guest)
> > 
> > The more recent Linux kernel
> 
> This should be the most recent kernel, it's testing Linus' master. Here
> it is v3.14-rc6+ at changeset c60f7d5a8e7c639de5d9dfe07e1e91d302d506e4.
> 
> FWIW this happened again in 25558 over the weekend and
> http://www.chiark.greenend.org.uk/~xensrcts/logs/25558/test-amd64-i386-pair/serial-gall-mite.log shows some different messages along with the ones quoted above:
>         Mar 16 12:51:30.074927 [  845.982443] swiotlb_tbl_map_single: 269 callbacks suppressed
>         [...]
>         Mar 16 12:51:30.099339 [  845.982464] mptsas 0000:03:00.0: swiotlb buffer is full (sz: 4096 bytes)
> AFAICS all of the latter are 4k sized.
> 
> I had to go back to
> http://www.chiark.greenend.org.uk/~xensrcts/logs/25558/test-amd64-i386-pair/serial-gall-mite.log.0 to find the first such messages, there I see some bnx2 related ones as well e.g.
>         Mar 16 12:46:48.315302 [  579.735844] bnx2 0000:02:00.0: swiotlb buffer is full (sz: 21024 bytes)
> 
> These all appear to start only after the guest is launched, but there is
> no big smoking splat around that time like I was hoping for (i.e. around
> "Mar 16 12:46:20.837409 (d1)").
> 
> The actual dom0 boot looks ok to me. FWIW:
>         Mar 16 12:37:31.756322 [    0.000000] software IO TLB [mem 0x1ba93000-0x1fa93000] (64MB) mapped at [dba93000-dfa92fff]
> 
> Interesting that the issues seem to happen on the 64-bit side (this test
> is a migration from 64- to 32-bit host). It's also strange that
> test-amd64-amd64-xl and other amd64 based tests don't seem to be
> affected.
> 
> >  will also tell you what type of requests it was.
> 
> Do you mean the size? It seems to print that only for certain requests.
> 

Right. They aren't that big - and oddly enough the mptsas one is only 4KB?
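
A rough way to check which request sizes hit the limit is to tally the
"swiotlb buffer is full" messages per driver and size. The sketch below is
only an illustration: the regular expression is an assumption based on the
two message formats quoted in this thread (with and without the "sz:" part),
and tally_full_messages() is a hypothetical helper, not part of osstest.

```python
import re
from collections import Counter

# Assumed format, taken from the console log lines quoted in this thread:
#   "... [  845.982464] mptsas 0000:03:00.0: swiotlb buffer is full (sz: 4096 bytes)"
# Older lines omit the "(sz: N bytes)" suffix, so that group is optional.
FULL_RE = re.compile(
    r"\] (\S+) (\S+): swiotlb buffer is full(?: \(sz: (\d+) bytes\))?"
)


def tally_full_messages(lines):
    """Count 'swiotlb buffer is full' messages per (driver, size).

    The size is None when the log line does not report one.
    """
    counts = Counter()
    for line in lines:
        match = FULL_RE.search(line)
        if match:
            driver, _device, size = match.groups()
            counts[(driver, int(size) if size else None)] += 1
    return counts
```

Feeding it the lines of serial-gall-mite.log would show whether the mptsas
requests are all 4096 bytes and how the bnx2 ones are distributed.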

> > You might also want to try a larger SWIOTLB buffer, swiotlb=26422 for fun.
> 
> Any reason for that particular number?

It would allocate a larger SWIOTLB space - in case you are using it at near
its capacity.
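
For reference, the swiotlb= value can be translated into a buffer size with
a few lines of Python. This is a sketch under one stated assumption: that
the number is a count of I/O TLB slabs of 2 KiB each (IO_TLB_SHIFT = 11).
The helper names are made up for illustration.

```python
# Assumption: swiotlb=<n> counts I/O TLB slabs of 2 KiB (IO_TLB_SHIFT = 11).
IO_TLB_SHIFT = 11
SLAB_BYTES = 1 << IO_TLB_SHIFT
MIB = 1 << 20


def slabs_to_bytes(nslabs):
    """Buffer size in bytes for a given swiotlb=<nslabs> value."""
    return nslabs * SLAB_BYTES


def bytes_to_slabs(nbytes):
    """Smallest slab count that covers nbytes (rounds up)."""
    return -(-nbytes // SLAB_BYTES)


# Range printed in the dom0 boot log quoted in this thread:
#   software IO TLB [mem 0x1ba93000-0x1fa93000] (64MB)
boot_log_bytes = 0x1FA93000 - 0x1BA93000

if __name__ == "__main__":
    print("boot log range: %d MiB" % (boot_log_bytes // MIB))
    print("swiotlb=26422:  %.1f MiB" % (slabs_to_bytes(26422) / MIB))
    print("slabs for 64 MiB: %d" % bytes_to_slabs(64 * MIB))
```

If the 2 KiB slab assumption holds, 26422 slabs come to roughly 51.6 MiB,
which is less than the 64MB shown in the boot log, so a value above 32768
may be needed to actually enlarge the buffer.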
> 
> > I think you are looking at two different issues.
> 
> You mean you think the swiotlb issue is unrelated to the slow migration
> timeouts? I can believe that.
> 
> Ian.
> 
