From: "Jan Beulich"
Subject: Re: [GIT PULL] Fix lost interrupt race in Xen event channels
Date: Fri, 27 Aug 2010 09:56:43 +0100
Message-ID: <4C7799EB020000780001276F@vpn.id2.novell.com>
References: <4C743B2C.8070208@goop.org> <4C74E7C802000078000120C0@vpn.id2.novell.com> <4C7558E0.1060806@goop.org> <4C7629D10200007800012387@vpn.id2.novell.com> <4C769736.4050409@goop.org>
In-Reply-To: <4C769736.4050409@goop.org>
To: Jeremy Fitzhardinge
Cc: "Xen-devel@lists.xensource.com", Tom Kopec, Daniel Stodden
List-Id: xen-devel@lists.xenproject.org

>>> On 26.08.10 at 18:32, Jeremy Fitzhardinge wrote:
> On 08/25/2010 11:46 PM, Jan Beulich wrote:
>> >>> On 25.08.10 at 19:54, Jeremy Fitzhardinge wrote:
>>> Note that this patch is specifically for upstream Xen, which doesn't
>>> have any pirq support in it at present.
>> I understand that, but saw that you had paralleling changes to the
>> pirq handling in your Dom0 tree.
>>
>>> However, I did consider using fasteoi, but I couldn't see how to make
>>> it work.  The problem is that it only does a single call into the
>>> irq_chip for EOI after calling the interrupt handler, but there is no
>>> call beforehand to ack the interrupt (which means clear the event flag
>>> in our case).  This leads to a race where an event can be lost after the
>>> interrupt handler has returned, but before the event flag has been
>>> cleared (because Xen won't set pending or call the upcall function if
>>> the event is already set).  I guess I could pre-clear the event in the
>>> upcall function, but I'm not sure that's any better.
>> That's precisely what we're doing.
>
> You mean pre-clearing the event?  OK.
>
> But aren't you still subject to the bug the switch to handle_edge_irq fixed?
>
> With handle_fasteoi_irq:
>
>        cpu A                    cpu B
>     get event
>     mask and clear event
>     set INPROGRESS
>     call action
>        :
>        :                        (interrupt arrives)
>        :                        get event

Cannot happen, event is masked (i.e. all that would happen is that
the event occurrence would be logged in evtchn_pending).

>        :                        INPROGRESS set? -> EOI, return
>        :
>     action returns
>     clear INPROGRESS
>     EOI

Here the EOI unmasks the event, checking for whether it got re-bound
in the meantime (and doing the unmask through a hypercall if
necessary), thus re-raising the event in any case - see the sketch
further down.

> The event arriving on B is lost, and there's no record made of it ever
> having arrived.  How do you avoid this?
>
> With handle_edge_irq, the second event will set PENDING in the irq_desc,
> and a loop will keep running on cpu A until PENDING is clear, so nothing
> is ever lost.

Actually, considering that you mask and unmask just like we do, I
cannot even see why you would have run into the above scenario with
handle_level_irq(). Where is the window that I'm missing?
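For concreteness, that unmask path looks roughly like the sketch below.
This is a simplified illustration, not the verbatim code: it assumes the
legacy vcpu_info placement in the shared info page, and cpu_from_evtchn()
here merely stands for "the vCPU the port is currently bound to".

/* Simplified sketch only - not the verbatim implementation.  Assumes the
 * legacy vcpu_info array in the shared info page; cpu_from_evtchn() is an
 * assumed helper returning the vCPU the port is currently bound to. */
#include <linux/smp.h>
#include <linux/bitops.h>
#include <asm/sync_bitops.h>
#include <asm/xen/hypervisor.h>
#include <asm/xen/hypercall.h>
#include <xen/interface/xen.h>
#include <xen/interface/event_channel.h>

extern unsigned int cpu_from_evtchn(unsigned int port);	/* assumed */

static void sketch_unmask_evtchn(unsigned int port)
{
	struct shared_info *s = HYPERVISOR_shared_info;
	unsigned int cpu = get_cpu();

	if (cpu != cpu_from_evtchn(port)) {
		/* Port got (re-)bound to another vCPU: have Xen do the
		 * unmask, so a pending event is delivered there. */
		struct evtchn_unmask unmask = { .port = port };
		HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask);
	} else {
		struct vcpu_info *v = &s->vcpu_info[cpu];

		sync_clear_bit(port, &s->evtchn_mask[0]);

		/* If the event was raised while masked, re-raise the
		 * upcall locally so it cannot be lost. */
		if (sync_test_bit(port, &s->evtchn_pending[0]) &&
		    !sync_test_and_set_bit(port / BITS_PER_LONG,
					   &v->evtchn_pending_sel))
			v->evtchn_upcall_pending = 1;
	}

	put_cpu();
}

So even if a second event arrives while the port is masked, the unmask at
EOI time re-delivers it rather than dropping it.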
>>> In the dom0 kernels, I followed the example of the IOAPIC irq_chip
>>> implementation, which does the hardware EOI in the ack call at the start
>>> of handle_edge_irq, and did the EOI hypercall (when necessary) there.  I
>>> welcome a reviewer's eye on this though.
>> This I didn't actually notice so far.
>>
>> That doesn't look right, at least in combination with ->mask() being
>> wired to disable_pirq(), which is empty (and btw., if the latter was
>> right, you should also wire ->mask_ack() to disable_pirq() to avoid
>> a pointless indirect call).
>>
>> But even with ->mask() actually masking the IRQ I'm not certain
>> this is right. At the very least it'll make Xen see a potential
>> second instance of the same IRQ much earlier than you will
>> really be able to handle it. This is particularly bad for level
>> triggered ones, as Xen will see them right again after you
>> passed it the EOI notification - as a result there'll be twice as
>> many interrupts seen by Xen on the respective lines.
>>
>> The native I/O APIC can validly do this as ->ack() only gets
>> called for edge triggered interrupts (which is why ->eoi() is
>> wired to ack_apic_level()).
>
> Yes.  The code as-is works OK, but I haven't checked to see if Xen is
> taking many spurious interrupts.  It probably helps that my test machine
> has MSI for almost everything.
>
> But does that mean the pirq code needs to have different ack/eoi
> behaviour depending on the triggering of the ioapic interrupt?

If you want to continue to use handle_edge_irq(), I think you will.
With handle_fasteoi_irq(), you would leverage Xen's handling of
edge/level, and wouldn't need to make any distinction (see the
sketch below).
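To make that concrete, here is a rough sketch of how a pirq's irq_chip
could be wired up for handle_fasteoi_irq(). All helpers it declares
(evtchn_from_irq(), pirq_from_irq(), pirq_needs_eoi(), mask_evtchn(),
unmask_evtchn()) are assumed names for illustration only; the point is
merely that the sole chip callback after the handler is ->eoi(), which
unmasks and, where needed, issues PHYSDEVOP_eoi.

/* Rough sketch only - the helpers declared below are assumptions modelled
 * on the Dom0 tree being discussed, not interfaces I checked against it. */
#include <linux/irq.h>
#include <linux/types.h>
#include <asm/xen/hypercall.h>
#include <xen/interface/physdev.h>

extern unsigned int evtchn_from_irq(unsigned int irq);	/* assumed */
extern unsigned int pirq_from_irq(unsigned int irq);	/* assumed */
extern bool pirq_needs_eoi(unsigned int irq);		/* assumed */
extern void mask_evtchn(unsigned int port);		/* assumed */
extern void unmask_evtchn(unsigned int port);		/* assumed */

static void pirq_chip_mask(unsigned int irq)
{
	mask_evtchn(evtchn_from_irq(irq));
}

static void pirq_chip_unmask(unsigned int irq)
{
	unmask_evtchn(evtchn_from_irq(irq));
}

static void pirq_chip_eoi(unsigned int irq)
{
	/* Unmask the event channel (re-raising a pending event, as in
	 * the earlier sketch)... */
	unmask_evtchn(evtchn_from_irq(irq));

	/* ...and notify Xen for pirqs wanting an explicit EOI.  Xen
	 * knows whether the line is edge or level triggered, so nothing
	 * here needs to distinguish the two cases. */
	if (pirq_needs_eoi(irq)) {
		struct physdev_eoi eoi = { .irq = pirq_from_irq(irq) };
		HYPERVISOR_physdev_op(PHYSDEVOP_eoi, &eoi);
	}
}

static struct irq_chip xen_pirq_fasteoi_chip = {
	.name	= "xen-pirq",
	.mask	= pirq_chip_mask,
	.unmask	= pirq_chip_unmask,
	.eoi	= pirq_chip_eoi,
};

/* At bind time, e.g.:
 *	set_irq_chip_and_handler_name(irq, &xen_pirq_fasteoi_chip,
 *				      handle_fasteoi_irq, "fasteoi");
 * so that there is no ->ack() for Xen to see ahead of the handler.
 */

Jan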