From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ian Campbell
Subject: Re: [linux-4.1 test] 63030: regressions - FAIL
Date: Thu, 22 Oct 2015 10:50:54 +0100
Message-ID: <1445507454.9563.252.camel@citrix.com>
References: <20151019135155.GB13286@zion.uk.xensource.com>
 <22054.21022.517755.482055@mariner.uk.xensource.com>
 <20151020152423.GC29090@zion.uk.xensource.com>
 <22054.24314.632722.619437@mariner.uk.xensource.com>
 <1445446026.32735.18.camel@citrix.com>
 <20151021173405.GG5060@zion.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <20151021173405.GG5060@zion.uk.xensource.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Wei Liu
Cc: xen-devel@lists.xensource.com, Ian Jackson, osstest service owner
List-Id: xen-devel@lists.xenproject.org

On Wed, 2015-10-21 at 18:34 +0100, Wei Liu wrote:
> On Wed, Oct 21, 2015 at 05:47:06PM +0100, Ian Campbell wrote:
> > On Tue, 2015-10-20 at 16:34 +0100, Ian Jackson wrote:
> > > Wei Liu writes ("Re: [Xen-devel] [linux-4.1 test] 63030: regressions - FAIL"):
> > > > From mere code inspection and document of lwip 1.3.0 I think mini-os
> > > > does send gratuitous ARP.
> > >
> > > The guest is using the PVHVM drivers at this point, with the backend
> > > directly in dom0, so it is the guest's gratuitous arp which is needed,
> > > I think.
> >
> > It would be worth investigating whether mini-os's gratuitous ARP might
> > also be occurring and confusing things, e.g. by coming after and
> > therefore taking precedence over the one coming from the guest.
> >
>
> Several observations:
>
> 1. The guest doesn't always send gratuitous arp -- but this might not be
>    the cause of this failure. Guest works fine when using qemu-trad
>    only.

As in it always sends the arp when using qemu-trad, or that it is fine
irrespective of not always sending it?

> 2. Guest only sends one gratuitous arp at most.

This is as expected, but does the stubdom also send one?

> 3. When using stubdom, guest is a lot less responsive. See two
>    experiments and analysis below.

Less responsive in use or only while migrating, or to ssh after migration,
or to something else?

> Scenario 1:
>   xl shows "Migration successful."
>   ...30s...
>   xenbr0 receives gratuitous arp
>   ...1s...
>   ssh date command comes back
>
> Scenario 2:
>   xenbr0 receives gratuitous arp
>   ...1s...
>   xl shows "Migration successful."
>   ssh date command comes back
>
> When stubdom was not present I never saw scenario 1.

It would be worth looking at the possibility of a delay between "Migration
successful" and the target domain actually running. A 30s delay between the
guest restarting and it sending the ARP would be pretty strange IMHO.

> Note that my machine is relative old (>6 years). It would never pass
> the test in osstest because in osstest the timeout is 10s.
>
> The slowness in osstest seems to be host specific because all failures
> in guest migrate test failed on merlot*. It's not only linux-4.1 is
> failing, other branches fail the same test step on merlot*, too.

This could be a factor in common with the other qemu timeout on merlot which
led to 9acfbe14d726. It might be worth prodding AMD over that issue again.

Ian.