From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <4ED4E86C.3080902@domain.hid>
Date: Tue, 29 Nov 2011 15:13:00 +0100
From: Philippe Gerum
MIME-Version: 1.0
References: <4ED38714.2000207@domain.hid>
In-Reply-To: <4ED38714.2000207@domain.hid>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Adeos-main] Fasteoi unmasking issue
List-Id: General discussion about Adeos
To: Wolfgang Mauerer
Cc: "Kiszka, Jan", adeos-main, "Hillier, Gernot"

On 11/28/2011 02:05 PM, Wolfgang Mauerer wrote:
> Dear all,
>
> we are facing some difficulties with GSI interrupt storms
> originating from a PCI card that seem to be caused by
> ipipe: The card is passed through to qemu-kvm (the setup
> is based on the patches sent by Jan some time ago). Once
> the card becomes active, we are hit by a tremendous amount
> of interrupts (> 100000/s) that keep ipipe fully occupied.
> The observed pattern is (excerpt from the ipipe tracer)
>
> :| common_interrupt+0x20 (__ipipe_spin_unlock_irqrestore+0x62)
> :| __ipipe_handle_irq+0x11 (common_interrupt+0x27)
> (...)
> :  handle_irq+0x9 (do_IRQ+0x66)
> :  irq_to_desc+0x4 (handle_irq+0x15)
> :  handle_fasteoi_irq+0x14 (handle_irq+0x22)
> (...)
> :  unmask_ioapic_irq+0x4 (handle_fasteoi_irq+0x94)
> :  unmask_ioapic+0xd (unmask_ioapic_irq+0x14)
> :  __ipipe_spin_lock_irqsave+0x7 (unmask_ioapic+0x23)
> :| __ipipe_spin_lock_irqsave+0x93 (unmask_ioapic+0x23)
> :| __io_apic_modify_irq+0x4 (unmask_ioapic+0x41)
> :| __ipipe_unlock_irq+0x11 (unmask_ioapic+0x66)
> :| __ipipe_spin_unlock_irqrestore+0x9 (unmask_ioapic+0x75)
> :| __ipipe_spin_unlock_irqrestore+0x60 (unmask_ioapic+0x75)
> :| common_interrupt+0x20 (__ipipe_spin_unlock_irqrestore+0x62)
>
> That is, as soon as the IRQ in question is unmasked, the
> next one is immediately received, and the interrupt handler
> in non-RT context never gets a chance to actually service
> the interrupt.
>
> The problem seems to be caused by unmasking the IRQ in
> handle_fasteoi_irq(), and with a hack along the lines of
>
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -586,7 +586,8 @@ handle_fasteoi_irq(unsigned int irq, struct irq_desc *desc)
>  	raw_spin_lock(&desc->lock);
>  	desc->status &= ~IRQ_INPROGRESS;
>  #ifdef CONFIG_IPIPE
> -	desc->irq_data.chip->irq_unmask(&desc->irq_data);
> +	if (irq != WHICHEVER_IRQ_CAUSES_THE_STORM)
> +		desc->irq_data.chip->irq_unmask(&desc->irq_data);
>  out:
>  #else
>  out:
>
> the issue is solved.
>
> So the question is: Why is it okay to unconditionally unmask
> all interrupts in the fasteoi handler? All cards that re-send
> interrupts at high frequencies unless they are properly handled
> by their device driver should cause the same problem.
> I take the early unmasking is an optimisation, or are there any
> further reasons for the unconditional unmasking in
> handle_fasteoi_irq()?

This is not an optimization. The flow this code was designed for is:

hw IRQ receipt
	chip->eoi() must mask the IRQ line
	...
	real-time or Linux handling, clear device interrupt
	...
	handle_fasteoi() unmask previous masking

It does not cope well with the threaded interrupt model recently added
to the vanilla kernel. So it will likely break for any device whose
level-triggered IRQ is handled in a thread, since the line is then
unmasked before the thread has cleared the device interrupt.

>
> Thanks & best regards, Wolfgang
>
> --
> Siemens AG, Open Source Platforms,
> Corporate Competence Centre Embedded Linux
>
>

-- 
Philippe.