From mboxrd@z Thu Jan 1 00:00:00 1970
From: George Dunlap
Subject: Re: Is: events not being cleared during fast migration over InfiniBand Was: Re: xen 4.3 test report
Date: Thu, 6 Jun 2013 10:25:47 +0100
Message-ID: <51B0559B.1040804@eu.citrix.com>
References: <20130524141150.GA3900@phenom.dumpdata.com> <20130525114058.GH2418@localhost.localdomain> <20130603140837.GP6893@phenom.dumpdata.com> <20130605185005.GA15558@phenom.dumpdata.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
In-Reply-To: <20130605185005.GA15558@phenom.dumpdata.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Konrad Rzeszutek Wilk
Cc: Ian Jackson, Vasiliy Tolstov, "xen-devel@lists.xen.org"
List-Id: xen-devel@lists.xenproject.org

On 05/06/13 19:50, Konrad Rzeszutek Wilk wrote:
> On Tue, Jun 04, 2013 at 04:17:55PM +0400, Vasiliy Tolstov wrote:
>> 2013/6/3 Konrad Rzeszutek Wilk:
>>> The non-debug version tells me it is:
>>>
>>> 289         if ( (port = get_free_port(d)) < 0 )
>>> 290             ERROR_EXIT(port);
>>>
>>> Which gets -EEXIST from get_free_port. But get_free_port only returns
>>> -EINVAL, -ENOMEM, and -ENOSPC in failure modes.
>>>
>>> But we get -EEXIST? Could you re-run git diff and attach output to
>>> this email? I think you tweaked the debug code a bit so I am looking
>>> at something different?
>>
>> Oh sorry. Yes i modify you patch to this version:
> That is OK.
>> -        if ( v->virq_to_evtchn[virq] != 0 )
>> +        if ( v->virq_to_evtchn[virq] != 0 ) {
>> +        gdprintk(XENLOG_WARNING, "d%dv%d [%s:%d], port:%d, rc:%ld\n", d->domain_id,
>> +                 vcpu, __func__,__LINE__, v->virq_to_evtchn[virq], (long)-EEXIST);
>>              ERROR_EXIT(-EEXIST);
> OK, so the value was 3 (event channel), and I am not sure what the virq value
> was. But it looks as if somebody did not clear that and we are
> tripping over it.
>
> George, have you seen issues with events not being cleared during migration?

The other possibility, of course, is that the virq has been cleared, but
that somehow the kernel is requesting the same virq twice.

 -George
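
To make the two possibilities concrete, here is a minimal stand-alone sketch of the check quoted above. It is a toy model, not the Xen source: `struct model_vcpu`, `model_bind_virq`, `model_close_virq`, and the table size `MODEL_NR_VIRQS` are hypothetical names invented for illustration. Only the `virq_to_evtchn[virq] != 0` test and the `-EEXIST` result come from the quoted patch. The sketch suggests why the return code alone cannot tell a stale binding (never cleared) from a guest that binds the same virq twice.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical table size, chosen only for this sketch. */
#define MODEL_NR_VIRQS 24

/* Toy stand-in for the per-vcpu state; 0 means "virq not bound". */
struct model_vcpu {
    int virq_to_evtchn[MODEL_NR_VIRQS];
    int next_port;              /* next event channel port to hand out */
};

void model_vcpu_init(struct model_vcpu *v)
{
    memset(v, 0, sizeof(*v));
    v->next_port = 1;           /* port 0 is reserved as "unbound" here */
}

/* Mirrors the quoted check: an already-bound virq yields -EEXIST. */
int model_bind_virq(struct model_vcpu *v, unsigned int virq)
{
    if (virq >= MODEL_NR_VIRQS)
        return -EINVAL;
    if (v->virq_to_evtchn[virq] != 0)
        return -EEXIST;         /* stale entry or double bind: same result */
    v->virq_to_evtchn[virq] = v->next_port++;
    return v->virq_to_evtchn[virq];
}

/* Clearing the entry is what allows a later bind of the same virq. */
int model_close_virq(struct model_vcpu *v, unsigned int virq)
{
    if (virq >= MODEL_NR_VIRQS || v->virq_to_evtchn[virq] == 0)
        return -EINVAL;
    v->virq_to_evtchn[virq] = 0;
    return 0;
}
```

In this model, bind followed by bind fails with `-EEXIST` whether the first binding is left over from before the migration or was made a moment ago by the same kernel. Bind, close, bind succeeds. Logging the virq number alongside the port, as the debug patch begins to do, would be one way to tell the cases apart.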