From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 27 Feb 2007 14:45:41 +1100
From: David Gibson
To: Segher Boessenkool
Subject: Re: [PATCH] powerpc: document new interrupt-array property
Message-ID: <20070227034541.GD1861@localhost.localdomain>
References: <9696D7A991D0824DBA8DFAC74A9C5FA302A592C7@az33exm25.fsl.freescale.net>
	<259dc2545888e6588a8a0707ad2e84b0@kernel.crashing.org>
	<9696D7A991D0824DBA8DFAC74A9C5FA302A59732@az33exm25.fsl.freescale.net>
	<1172299259.1902.22.camel@localhost.localdomain>
	<20070226041646.GC29826@localhost.localdomain>
	<4540139ce9bb2426dbcc3822e6c1a63a@kernel.crashing.org>
	<20070226130837.GA32080@localhost.localdomain>
	<20070227023243.GC1861@localhost.localdomain>
	<0bb86e9c2642f033697bfb44a4f59ff8@kernel.crashing.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
In-Reply-To: <0bb86e9c2642f033697bfb44a4f59ff8@kernel.crashing.org>
Cc: linuxppc-dev@ozlabs.org, paulus@samba.org, Yoder Stuart-B08248
List-Id: Linux on PowerPC Developers Mail List

On Tue, Feb 27, 2007 at 03:52:41AM +0100, Segher Boessenkool wrote:
> >> And if a program parsing the device tree sees no valid
> >> "interrupts" property, it can validly assume the device
> >> doesn't have interrupts.
> >>
> >> Same problem.
> >
> > Sort of. But the probable consequences of mistakenly believing a
> > device has no interrupts are substantially less messy than mistakenly
> > believing you understand the node's interrupts when you don't.
>
> "Less messy"... well the device won't work properly
> in either case. The kernel might completely screw
> up programming the interrupts, which would mean it
> doesn't do enough sanity checking; or it could give
> spectacular oopses, where the "less messy" case would
> simply be a device driver not running for your device.
>
> If the one case gives you more information to track
> down the problem than the other case, I argue that's
> a shortcoming of the kernel, not of the OF binding.

Segher, think for a moment instead of just arguing. There just isn't
enough information available for the kernel to do sanity checking when
there is an apparently valid 'interrupts' property. Consider:

Interrupt controllers are generally initialized with all interrupts
masked (yes, not always, but usually). So, if a client mistakenly
believes a device has no interrupts, those interrupts will never be
configured, and the CPU will never see those interrupts. This is only
going to cause a problem if there is an active driver which is
expecting interrupts. But if there's a driver expecting interrupts,
it must at some point earlier have attempted to configure the
interrupts (if the client is the kernel, that's a request_irq()). In
order to configure the interrupt, it would have parsed the device tree
to find data about the interrupt. In doing so it would have run into
the lack of an 'interrupts' property. There's a good chance at this
point it will just print an error saying "Huh? Where's my interrupt?"
and abort driver initialization. If it doesn't do that, it's very
likely it will immediately crash attempting to dereference or parse
the nonexistent property. Either way, the problem shows up at the
point we're attempting to parse the interrupt tree, and will be pretty
easy to debug.

Now, a different case. Suppose we're using the 'interrupts' /
'interrupt-parents' approach. We have a board with two identical
interrupt controllers, cascaded. It has a network device with two
interrupts: the first is end-of-packet and is routed to the top-level
IC, the second signals a fairly rare error condition and is routed to
the cascaded IC. The network device sits under a bridge which has a
single interrupt routed to the primary IC (and thus has an
'interrupt-parent' property).
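
To make that board concrete, here's a rough Python sketch of it. This
is a toy model, not kernel or Open Firmware code: the node names, the
interrupt numbers, the dict layout and the two resolver functions
(old_style, new_style) are all made up for illustration. The only
thing it assumes about an old-style parser is that it finds a single
controller by walking up the tree to the nearest 'interrupt-parent'
and sends every entry in 'interrupts' there, ignoring the per-interrupt
'interrupt-parents' list.

```python
# Toy model of the board described above. All names and numbers are
# hypothetical; only the shape of the problem is taken from the text.

class MissingInterrupts(Exception):
    """The node has no 'interrupts' property at all."""

TREE = {
    "primary-ic":  {"parent": None,
                    "props": {"interrupt-controller": True}},
    "cascaded-ic": {"parent": None,
                    "props": {"interrupt-controller": True,
                              "interrupt-parent": "primary-ic",
                              "interrupts": [5]}},
    "bridge":      {"parent": None,
                    "props": {"interrupt-parent": "primary-ic",
                              "interrupts": [3]}},
    "net":         {"parent": "bridge",
                    "props": {"interrupts": [7, 2],
                              # new-style: one controller per interrupt
                              "interrupt-parents": ["primary-ic",
                                                    "cascaded-ic"]}},
    "no-irq-dev":  {"parent": "bridge", "props": {}},
}

def inherited_parent(tree, name):
    """Walk up the device tree to the nearest 'interrupt-parent'."""
    node = name
    while node is not None:
        props = tree[node]["props"]
        if "interrupt-parent" in props:
            return props["interrupt-parent"]
        node = tree[node]["parent"]
    raise LookupError("no interrupt-parent found for " + name)

def old_style(tree, name):
    """Resolve interrupts the old way: one controller for all of them."""
    props = tree[name]["props"]
    if "interrupts" not in props:
        # Fails loudly, right where the interrupt tree is being parsed.
        raise MissingInterrupts(name)
    ctrl = inherited_parent(tree, name)
    return [(ctrl, irq) for irq in props["interrupts"]]

def new_style(tree, name):
    """Resolve interrupts honouring a per-interrupt controller list."""
    props = tree[name]["props"]
    if "interrupts" not in props:
        raise MissingInterrupts(name)
    parents = props.get("interrupt-parents")
    if parents is None:
        return old_style(tree, name)
    return list(zip(parents, props["interrupts"]))
```

Calling old_style(TREE, "net") puts both interrupts on "primary-ic"
without any complaint, whereas new_style(TREE, "net") sends the second
one to "cascaded-ic". Calling old_style(TREE, "no-irq-dev") raises
MissingInterrupts on the spot, which is the easy-to-debug case from
earlier.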

So, to an old-style parser it looks like the network device has two
interrupts on the primary controller, routed via the bridge. When the
network driver initializes, it requests its irqs, correctly configures
the first, and misconfigures the second (because it follows the
interrupt tree old-style and assumes they're all routed to the primary
IC). It sends and receives packets fine, then the error condition
happens, but the recovery ISR is never called and the network suddenly
stops at some random time after startup. Programmer, baffled, tries
half-a-dozen theories before noticing the error status bit and going
"but why didn't we get an interrupt?".

Or suppose the second interrupt signals a (fairly unimportant) status
change, level-sensitive. The network driver works just fine. Then
along comes another driver that shares an interrupt with the second
network driver interrupt. It crashes with an unhandled interrupt on
startup if-and-only-if the network driver has had a status change
event before the second driver started. This is common on some
networks and rare on others. Bafflement all around...

Or for that matter, the network driver could crash with an unhandled
interrupt when the device which is really using what the network
driver thinks is its second irq generates an interrupt. When that
happens could depend on that other device, its driver, the board
configuration, the network or other external environment...

And those are just the first 3 recipes for utter confusion I can come
up with.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you. NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson