From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S261370AbVAGSKd (ORCPT );
	Fri, 7 Jan 2005 13:10:33 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S261365AbVAGSKd (ORCPT );
	Fri, 7 Jan 2005 13:10:33 -0500
Received: from ylpvm01-ext.prodigy.net ([207.115.57.32]:27873 "EHLO
	ylpvm01.prodigy.net") by vger.kernel.org with ESMTP
	id S261381AbVAGSFv (ORCPT );
	Fri, 7 Jan 2005 13:05:51 -0500
From: David Brownell
To: Greg KH
Subject: Re: [patch 2.6.10] ehci "hc died" on startup (chip bug workaround)
Date: Fri, 7 Jan 2005 10:05:43 -0800
User-Agent: KMail/1.7.1
Cc: linux-usb-devel@lists.sourceforge.net, Linux Kernel list
References: <200501051435.42666.david-b@pacbell.net> <20050107174328.GB28878@kroah.com>
In-Reply-To: <20050107174328.GB28878@kroah.com>
MIME-Version: 1.0
Content-Type: Multipart/Mixed;
	boundary="Boundary-00=_39s3BM2enNUrK8V"
Message-Id: <200501071005.43520.david-b@pacbell.net>
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

--Boundary-00=_39s3BM2enNUrK8V
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

On Friday 07 January 2005 9:43 am, Greg KH wrote:
> On Wed, Jan 05, 2005 at 02:35:42PM -0800, David Brownell wrote:
> > We seem to have tracked some annoying board-coupled EHCI startup
> > problems to a chip bug, with a simple workaround.  Please merge.
>
> Hm, I get a reject from this:
> 	...
>
> What kernel tree is it against?

Probably my gadget-2.6 tree; here's one that applies against
current 2.5 BK or your USB integration tree.  Sorry!

- Dave

--Boundary-00=_39s3BM2enNUrK8V
Content-Type: text/x-diff;
	charset="us-ascii";
	name="e0107.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
	filename="e0107.patch"

This fixes OSDL bugid #3056 for at least some users, where the EHCI
driver gets a "fatal error" IRQ on startup ... only on certain boards,
starting with the 2.6.6 or 2.6.7 kernels.

These IRQs normally indicate that an invalid DMA address got passed to
the controller, or something equally nasty and unrecoverable.

But it turns out that some of these controllers (at least ALI and Intel)
are lying.  They're issuing these IRQs without stopping, contrary to the
EHCI spec ... so these IRQs can be recovered from.  Thanks to Christian
Iversen for noticing that his ALI controller would continue operating,
which was the first real break in this annoying case.

This patch tests for these bogus IRQs, and ignores them ... working
around what's clearly a chip bug.  It's not clear why we started
triggering that bug, but at least EHCI is now usable on boards
exhibiting this problem.

Signed-off-by: David Brownell

--- xu26/drivers/usb/host/ehci-hcd.c	2004-12-20 15:07:23.000000000 -0800
+++ gadget-2.6/drivers/usb/host/ehci-hcd.c	2005-01-04 12:01:46.000000000 -0800
@@ -883,13 +903,20 @@
 
 	/* PCI errors [4.15.2.4] */
 	if (unlikely ((status & STS_FATAL) != 0)) {
-		ehci_err (ehci, "fatal error\n");
+		/* bogus "fatal" IRQs appear on some chips... why?  */
+		status = readl (&ehci->regs->status);
+		dbg_cmd (ehci, "fatal", readl (&ehci->regs->command));
+		dbg_status (ehci, "fatal", status);
+		if (status & STS_HALT) {
+			ehci_err (ehci, "fatal error\n");
 dead:
-		ehci_reset (ehci);
-		/* generic layer kills/unlinks all urbs, then
-		 * uses ehci_stop to clean up the rest
-		 */
-		bh = 1;
+			ehci_reset (ehci);
+			writel (0, &ehci->regs->configured_flag);
+			/* generic layer kills/unlinks all urbs, then
+			 * uses ehci_stop to clean up the rest
+			 */
+			bh = 1;
+		}
 	}
 
 	if (bh)

--Boundary-00=_39s3BM2enNUrK8V--
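
For readers who want the decision in isolation: the check the patch adds can be modelled as a small stand-alone predicate. This is only a sketch, not driver code. The helper name fatal_irq_is_real() and its two-argument shape are hypothetical; the bit positions are the USBSTS "Host System Error" and "HCHalted" bits from the EHCI spec, which the driver names STS_FATAL and STS_HALT. The patch itself re-reads the status register with readl(); here that second reading is simply passed in.

```c
#include <stdbool.h>
#include <stdint.h>

/* USBSTS bits as laid out in the EHCI specification. */
#define STS_FATAL (1u << 4)   /* Host System Error */
#define STS_HALT  (1u << 12)  /* HCHalted: the controller has stopped */

/*
 * Hypothetical helper (not part of ehci-hcd.c): decide whether a
 * "fatal error" IRQ should really be treated as a dead controller.
 *
 * irq_status:    status value seen when the IRQ was taken
 * reread_status: status register value read again afterwards
 */
bool fatal_irq_is_real(uint32_t irq_status, uint32_t reread_status)
{
	if ((irq_status & STS_FATAL) == 0)
		return false;	/* no fatal error was reported at all */

	/*
	 * Per the EHCI spec a real host system error halts the
	 * controller.  If it is still running, the IRQ is bogus
	 * and can be ignored.
	 */
	return (reread_status & STS_HALT) != 0;
}
```

With this model, a fatal IRQ whose re-read status lacks STS_HALT is ignored, and only a halted controller takes the reset path.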