From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Morton
Subject: Re: 2.6.21-rc7-mm1 BUG at kernel/sched-clock.c:175 init_sched_clock()
Date: Wed, 25 Apr 2007 01:55:59 -0700
Message-ID: <20070425015559.e6ed537c.akpm@linux-foundation.org>
References: <462E4C4D.9020806@gmail.com>
	<20070425011654.222a2b4b.akpm@linux-foundation.org>
	<462F1481.8010807@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Return-path:
Received: from smtp1.linux-foundation.org ([65.172.181.25]:41920 "EHLO
	smtp1.linux-foundation.org" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1423323AbXDYI4G (ORCPT );
	Wed, 25 Apr 2007 04:56:06 -0400
In-Reply-To: <462F1481.8010807@gmail.com>
Sender: linux-ide-owner@vger.kernel.org
List-Id: linux-ide@vger.kernel.org
To: Tejun Heo
Cc: "Berck E. Nash" , "linux-kernel@vger.kernel.org" , Jeff Garzik , linux-ide@vger.kernel.org

On Wed, 25 Apr 2007 17:42:41 +0900 Tejun Heo wrote:

> Andrew Morton wrote:
> >> [ 2.953862] RIP: 0010:[] [] scsi_schedule_eh+0xa/0x57
> >> [ 3.058672] [] ata_port_schedule_eh+0x4c/0x50
> >> [ 3.064725] [] ata_port_abort+0xa2/0xae
> >> [ 3.070248] [] ata_port_freeze+0x46/0x57
> >> [ 3.075853] [] ahci_interrupt+0x300/0x47a
> >> [ 3.081552] [] handle_IRQ_event+0x27/0x57
> >> [ 3.087253] [] handle_edge_irq+0xee/0x133
> >> [ 3.092960] [] do_IRQ+0x6d/0xd5
> >> [ 3.097793] [] ret_from_intr+0x0/0xa
> >> [ 3.103059] [] mwait_idle+0x46/0x4b
> >> [ 3.108231] [] cpu_idle+0x87/0xaa
> >> [ 3.113227] [] rest_init+0x49/0x4b
> >> [ 3.118322] [] start_kernel+0x291/0x29c
> >> [ 3.123837] [] _sinittext+0x13a/0x141
> >>
> >
> > So we took an AHCI interrupt when ata_port.scsi_host was still NULL.
> >
> > It appears that ATA is presently requesting its IRQ before allocating all
> > the resources which are needed to handle an interrupt.  Does this
> > (resource-leaky) patch fix things?
>
> No, that would break host probing.
> The port is in frozen (controller
> initialized and IRQs masked off) state, so it's not allowed to take
> interrupt.

Sounds dodgy.  What happens if the IRQ line is shared with some other
device?  We'll enter ahci_interrupt().  We've already set up ->port_map
and ->n_ports so we're wholly dependent upon that read from HOST_IRQ_STAT
returning zeroes.

	/* sigh.  0xffffffff is a valid return from h/w */

If that happens we're dead, aren't we?

> If interrupt triggers at this point, it's low level driver
> bug.  Berck, how reliably can you reproduce this problem?  Can you post
> the result of 'lspci -nn'?

Berck doesn't appear to be sharing that IRQ, so it'll be something else.

However, do note that if CONFIG_DEBUG_SHIRQ is enabled, request_irq() will
deliberately run the handler, to catch buggy drivers.
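
To make the shared-IRQ worry concrete, here is a rough userspace model of
the situation. It is only a sketch and not the ahci code: every name in it
(fake_host, fake_port, fake_interrupt, simulate, the FAKE_IRQ_* values) is
made up for illustration, and it assumes the handler's only guard is that
the status word reads as zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the real libata/ahci types. */
struct fake_port {
	void *scsi_host;	/* NULL until the SCSI host has been allocated */
	int eh_scheduled;	/* set when the handler "schedules EH" */
};

struct fake_host {
	uint32_t irq_stat;	/* models the value read from HOST_IRQ_STAT */
	uint32_t port_map;	/* already set up before the IRQ is requested */
	unsigned int n_ports;	/* likewise */
	struct fake_port ports[2];
};

enum fake_irqreturn {
	FAKE_IRQ_NONE,		/* not our interrupt */
	FAKE_IRQ_HANDLED,	/* handled normally */
	FAKE_IRQ_WOULD_OOPS,	/* a real handler would dereference NULL here */
};

/*
 * Models a handler on a shared line: it is entered whenever any device on
 * the line interrupts, so the status word is all it has to go on.
 */
static enum fake_irqreturn fake_interrupt(struct fake_host *host)
{
	uint32_t stat = host->irq_stat;
	enum fake_irqreturn ret = FAKE_IRQ_NONE;
	unsigned int i;

	assert(host->n_ports <= 2);

	/* The only line of defence: the status word reads as zero. */
	if (!stat)
		return FAKE_IRQ_NONE;

	stat &= host->port_map;
	for (i = 0; i < host->n_ports; i++) {
		struct fake_port *p = &host->ports[i];

		if (!(stat & (1u << i)))
			continue;
		if (p->scsi_host == NULL)
			return FAKE_IRQ_WOULD_OOPS;
		p->eh_scheduled = 1;
		ret = FAKE_IRQ_HANDLED;
	}
	return ret;
}

/* Run the handler once with the given status word and host pointer. */
static enum fake_irqreturn simulate(uint32_t irq_stat, void *scsi_host)
{
	struct fake_host host = {
		.irq_stat = irq_stat,
		.port_map = 0x3,
		.n_ports = 2,
		.ports = { { scsi_host, 0 }, { scsi_host, 0 } },
	};

	return fake_interrupt(&host);
}
```

In this model simulate(0, NULL) stands for another device on the line
interrupting while our status reads zero, and the handler backs out.
simulate(0xffffffff, NULL) stands for an all-ones read arriving before
scsi_host is set, which is the case that ends in the NULL dereference.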