From mboxrd@z Thu Jan 1 00:00:00 1970 From: Konrad Rzeszutek Wilk Subject: Re: [PATCH v6 for-xen-4.5 3/3] dpci: Replace tasklet with an softirq (v6) Date: Thu, 25 Sep 2014 12:05:15 -0400 Message-ID: <20140925160515.GA25896@laptop.dumpdata.com> References: <1411438215-8146-1-git-send-email-konrad.wilk@oracle.com> <1411438215-8146-4-git-send-email-konrad.wilk@oracle.com> <5424490002000078000390C5@mail.emea.novell.com> <20140925152722.GA24487@laptop.dumpdata.com> <542454CD02000078000391AE@mail.emea.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Received: from mail6.bemta4.messagelabs.com ([85.158.143.247]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1XXBXp-0002pL-G1 for xen-devel@lists.xenproject.org; Thu, 25 Sep 2014 16:05:25 +0000 Content-Disposition: inline In-Reply-To: <542454CD02000078000391AE@mail.emea.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: xen-devel@lists.xenproject.org, keir@xen.org, ian.jackson@eu.citrix.com, ian.campbell@citrix.com, tim@xen.org List-Id: xen-devel@lists.xenproject.org On Thu, Sep 25, 2014 at 04:45:49PM +0100, Jan Beulich wrote: > >>> On 25.09.14 at 17:27, wrote: > > On Thu, Sep 25, 2014 at 03:55:28PM +0100, Jan Beulich wrote: > >> >>> On 23.09.14 at 04:10, wrote: > >> > +/* > >> > + * If we are racing with softirq_dpci (masked is still set) we return > >> > + * -EAGAIN. Otherwise we return 0. > >> > + * > >> > + * If it is -EAGAIN, it is the callers responsibility to kick the softirq > >> > + * (with the event_lock dropped). > >> > >> But pt_pirq_cleanup_check() doesn't do this - is the comment > >> misleading or that particular call site reacting wrongly? Actually the > >> other call site doesn't kick any softirq either - what am I missing here? > > > > The one call side that does is the 'pt_pirq_create..' which calls > > 'pt_pirq_reset'. The other ones: > > a) domain_kill->domain_relinquish_resources->pci_release_devices->pci_clean_dpci_irq > > b) pt_pirq_cleanup_check > > > > are missing it. It is easy with a)- just add the process_pending_softirq()) in > > when we are not holding the lock. But b) is much harder as we would need to > > alter the whole 'pirq_cleanup_check' to return an error (as the callers of > > 'pirq_cleanup_check' are holding the lock) and perculate that up.. > > Hmm, perhaps I'm misunderstanding "kick" then: If all you want is > for it to be executed, you don't need to do anything on the -EAGAIN > way out of domain_relinquish_resources(). I used the wrong phrase. Meant to say 'need to have softirq' run at some point to flush these stale ones out (if they are there of course). > > > One way to do this is by ignoring the 'pt_pirq_cleanup_check' case as > > the ramifications of that is that we would either re-use the 'pirq' > > in pt_irq_create_bind or pick 'pirq' up at pci_clean_dpci_irq and then > > remove it (and deal with the process_pending_softirq()). > > As long as that's safe to do... > > >> > + if ( pt_pirq_reset(d, pirq_dpci) ) > >> > + { > >> > + spin_unlock(&d->event_lock); > >> > + process_pending_softirqs(); > >> > + if ( ( NOW() - start ) >> 30 ) > >> > + return -EAGAIN; > >> > + goto restart; > >> > + } > >> > >> ... this still looks more like a hack, and I'm still not really certain > >> why between two uses (which is what I understand this is for) the > >> pIRQ (and hence it's softirq instance) won't be fully quiesced. > > > > Just to make it clear - the 'pirq_guest_unbind' (which is called in the > > pt_irq_destroy_bind) will take care of removing the action. So no more > > __do_IRQ calls using the 'pirq' after that. > > > > But we might have a pending softirq after we finished with > > pt_irq_destroy_bind. > > And this loop will take care of waiting it out. This problem had > > existed prior to this patch - this wait loop was done inside the > > 'tasklet_kill'. > > > > I added the 1 second timeout as I am not a fan of unbound loops. But > > I can put it back in to make it simpler (and look less hacky). > > If a softirq doesn't get run in a timely manner we're in bigger trouble > than what would warrant a timeout here. Perhaps simply put a > comment there referring to tasklet_kill() doing effectively the same > thing? Perfect! > > Jan >