From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Michael S. Tsirkin"
Subject: Re: [PATCH RFC untested] kvm_set_irq: report coalesced for clear
Date: Thu, 19 Jul 2012 14:18:32 +0300
Message-ID: <20120719111832.GA8693@redhat.com>
References: <20120718221152.GA14049@redhat.com>
 <20120719075337.GQ26120@redhat.com>
 <20120719091719.GD20120@redhat.com>
 <20120719092107.GS26120@redhat.com>
 <20120719093329.GA10182@redhat.com>
 <20120719094124.GA3459@redhat.com>
 <20120719102648.GA14101@redhat.com>
 <20120719105453.GW26120@redhat.com>
 <20120719111213.GA4364@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Avi Kivity, Marcelo Tosatti, kvm@vger.kernel.org, Alex Williamson
To: Gleb Natapov
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:61182 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1752440Ab2GSLR6 (ORCPT ); Thu, 19 Jul 2012 07:17:58 -0400
Received: from int-mx02.intmail.prod.int.phx2.redhat.com
 (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12])
 by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id q6JBHvSQ030954
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK)
 for ; Thu, 19 Jul 2012 07:17:58 -0400
Content-Disposition: inline
In-Reply-To: <20120719111213.GA4364@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Thu, Jul 19, 2012 at 02:12:13PM +0300, Michael S. Tsirkin wrote:
> On Thu, Jul 19, 2012 at 01:54:53PM +0300, Gleb Natapov wrote:
> > On Thu, Jul 19, 2012 at 01:26:48PM +0300, Michael S. Tsirkin wrote:
> > > On Thu, Jul 19, 2012 at 12:41:24PM +0300, Gleb Natapov wrote:
> > > > On Thu, Jul 19, 2012 at 12:33:29PM +0300, Michael S. Tsirkin wrote:
> > > > > On Thu, Jul 19, 2012 at 12:21:07PM +0300, Gleb Natapov wrote:
> > > > > > On Thu, Jul 19, 2012 at 12:17:19PM +0300, Michael S. Tsirkin wrote:
> > > > > > > On Thu, Jul 19, 2012 at 10:53:37AM +0300, Gleb Natapov wrote:
> > > > > > > > On Thu, Jul 19, 2012 at 01:11:53AM +0300, Michael S. Tsirkin wrote:
> > > > > > > > > This creates a way to detect when kvm_set_irq(...,0) was run
> > > > > > > > > twice with the same source id by returning 0 in this case.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Michael S. Tsirkin
> > > > > > > > > ---
> > > > > > > > >
> > > > > > > > > This is on top of my bugfix patch. Uncompiled and untested. Alex, I
> > > > > > > > > think something like this patch will make it possible for you to simply
> > > > > > > > > do
> > > > > > > > > 	if (kvm_set_irq(...., 0))
> > > > > > > > > 		eventfd_signal()
> > > > > > > > >
> > > > > > > > Why caller can't track line state?
> > > > > > > >
> > > > > > > Why duplicate information? As we are finding it's not trivial to keep
> > > > > > > the two in sync. Think about migration etc ...
> > > > > > >
> > > > > > We do not migrate irq_states. The caller already have to have enough
> > > > > > information to recreate its state and it should migrate the info, so why
> > > > > > should we go all the way down the call chain to find something that is
> > > > > > already known?
> > > > > >
> > > > > Hmm it's an interesting point. Looks like irqfds for level lose state
> > > > > across migration. Of course Alex wants to use them for assignment which
> > > > > currently disables migration, but we are talking about a generic API,
> > > > > so it's a problem that there's no way to retrieve the state.
> > > > >
> > > > There is no any problem. Source knows what the line status is.
> > >
> > > With EOIFD and level IRQFD, it does not.
> > >
> > So this is again eventfd and level interrupts incompatibility problem?
>
> At some level, yes.
>
> > > > Furthermore this is a (benign) bug if device calls irq_set with
> > > > the same level since it results in needless system calls. Qemu guilty
> > > > of it and _that_ should be fixed.
> > >
> > > Fine but we are arguably returning a wrong result in that case:
> > > set_irq twice to 0 return 1 each time. I would expect 0 the
> > > second time.
> > It returns 0 if interrupt was coalesced. It was not.
>
> Not really, if you call it with level 0 you always get 1 back.
> Look at kvm_ioapic_set_irq, see what happens if level is 0.
> It looks like a bug though a harmless one.
>
> >
> > >
> > > >
> > > > > Also migration is only one example. Duplicated state is generally
> > > > > nasty. We would need extra locking too which is not nice.
> > > > >
> > > > I don't know what extra locking you are talking about, but calling
> > > > kvm_set_irq() repeatedly with the same level will do a lot of unnecessary
> > > > locking in ioapic.
> > >
> > > I am talking about Alex's EOIFD. This is what this patch is trying
> > > to help.
> > >
> > Can you point me to exact problem in Alex's patch?
>
> It's very simple. Alex adds an interface to clear the level
> automatically from guest on EOI. So the caller has no way to know the
> current state for a given source ID and can not restore it after
> migration.

That's one problem :). The second problem is that it adds more
spinlocks around kvm_set_irq, so if we want to avoid vcpu scans under a
spinlock, we'll have more work to do. I'm not sure how serious this
second problem is: Avi nacked a patch because of a similar issue in the
past, but that had to deal with MSI.

> > --
> > 			Gleb.
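
To make the return value under discussion concrete, here is a rough
userspace sketch of the idea, not the patch itself and not kernel code.
It assumes a level-triggered line whose level is the logical OR of one
bit per source id, and a clear that reports whether that source's bit
was actually set beforehand. All names (toy_line, toy_assert_irq,
toy_clear_irq, toy_line_level, toy_demo) are made up for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Toy model only: NOT the kvm_set_irq() implementation. */
#define TOY_MAX_SOURCES (sizeof(unsigned long) * CHAR_BIT)

struct toy_line {
    unsigned long asserted;  /* bit n set => hypothetical source n asserts the line */
};

/* Assert the line on behalf of one source id. */
static void toy_assert_irq(struct toy_line *line, unsigned int source_id)
{
    assert(source_id < TOY_MAX_SOURCES);
    line->asserted |= 1UL << source_id;
}

/*
 * Clear the line on behalf of one source id.
 * Returns 1 if this source was asserting the line before the call,
 * 0 if the clear was redundant (the source's bit was already clear).
 */
static int toy_clear_irq(struct toy_line *line, unsigned int source_id)
{
    unsigned long mask;
    int was_asserted;

    assert(source_id < TOY_MAX_SOURCES);
    mask = 1UL << source_id;
    was_asserted = (line->asserted & mask) != 0;
    line->asserted &= ~mask;
    return was_asserted;
}

/* Level seen by the interrupt controller: OR of all sources. */
static int toy_line_level(const struct toy_line *line)
{
    return line->asserted != 0;
}

/*
 * Mimics the "if (clear succeeded) signal" pattern from the thread and
 * counts how many notifications the caller would send.
 */
int toy_demo(void)
{
    struct toy_line line = { 0 };
    int signals = 0;

    toy_assert_irq(&line, 1);
    if (toy_clear_irq(&line, 1))   /* real 1 -> 0 transition for source 1 */
        signals++;
    if (toy_clear_irq(&line, 1))   /* repeated clear: reported as redundant */
        signals++;
    if (toy_line_level(&line))     /* no source left asserting the line */
        signals = -1;
    return signals;
}
```

In this model the caller needs no shadow copy of the line state: the
second clear returns 0, so only one notification is sent. Whether the
real code can report this cheaply is exactly the locking question
raised above.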