From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S964987AbVKIBvg (ORCPT ); Tue, 8 Nov 2005 20:51:36 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S965156AbVKIBvg (ORCPT ); Tue, 8 Nov 2005 20:51:36 -0500 Received: from gate.crashing.org ([63.228.1.57]:42909 "EHLO gate.crashing.org") by vger.kernel.org with ESMTP id S964987AbVKIBvg (ORCPT ); Tue, 8 Nov 2005 20:51:36 -0500 Subject: Re: Posssible bug in kernel/irq/handle.c From: Benjamin Herrenschmidt To: Linus Torvalds Cc: Ingo Molnar , Paul Mackerras , Linux Kernel list In-Reply-To: References: <1131496739.24637.12.camel@gaston> Content-Type: text/plain Date: Wed, 09 Nov 2005 12:50:21 +1100 Message-Id: <1131501021.24637.18.camel@gaston> Mime-Version: 1.0 X-Mailer: Evolution 2.2.3 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 2005-11-08 at 17:04 -0800, Linus Torvalds wrote: > > 1) The interrupt will be stuck IN_PROGRESS. I don't see how IN_PROGRESS > > can ever be cleared afterward > > If desc->action is NULL, the flags are supposed to be cleared when we get > an action. See kernel/irq/manage.c: setup_irq(), and in particular the > case where we had no handler before (ie the "!shared" case). Yes, but in the meantime, the CPU will be stuck taking the same irq for ever without ever going through note_interrupt(). > > 2) We won't go through the code in note_interrupt() that protects us > > against a stuck interrupt, so if the interrupt is indeed stuck, we'll > > just lockup the processor taking the same IRQ for ever (and not being > > able to handle it, even if an action magically gets registered, due to > > 1) > > If the irq is stuck, you have serious problems with your interrupt > controller. By definition, the irq should be disabled since there are no > handlers for that interrupt. Hrm... yah, it should... still, I find that a bit fragile. > So I think the code is correct. It has certainly worked for years on x86 > (and it got serious debugging, since we had some rather nasty and subtle > issues with edge-triggered APIC interrupts that just get lost if they are > disabled at the controller). Well, we have a "funny" case with some pSeries... the firmware may enable interrupts behind our back, and expects us to call a firmware "try to handle that interrupt" kind of call when we get one we don't handle. That is, either all the handlers returned IRQ_NONE or there is no action. I'm not sure how to do that with the current code without having our own __do_IRQ() which I'd rather avoid... Ben.