* EHCI Regression in 2.6.23-rc2
@ 2007-08-10 8:45 Daniel Exner
2007-08-10 9:27 ` Jiri Kosina
0 siblings, 1 reply; 17+ messages in thread
From: Daniel Exner @ 2007-08-10 8:45 UTC (permalink / raw)
To: linux-kernel
Hi!
Please CC me, as I'm currently not subscribed to this list, thx.
After some serious hangs with 2.6.23-rc2 I did some bisects and this was the
result:
196705c9bbc03540429b0f7cf9ee35c2f928a534 is first bad commit
commit 196705c9bbc03540429b0f7cf9ee35c2f928a534
Author: Stuart_Hayes@Dell.com <Stuart_Hayes@Dell.com>
Date: Thu May 3 08:58:49 2007 -0700
USB: EHCI cpufreq fix
EHCI controllers that don't cache enough microframes can get MMF errors
when CPU frequency changes occur between the start and completion of
split interrupt transactions, due to delays in reading main memory
(caused by CPU cache snoop delays).
This patch adds a cpufreq notifier to the EHCI driver that will
inactivate split interrupt transactions during frequency transitions.
It was tested on Intel ICH7 and Serverworks/Broadcom HT1000 EHCI
controllers.
Signed-off-by: Stuart Hayes <stuart_hayes@dell.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
:040000 040000 0e6d518de17cf18155c7529f7b044a4660ca24e9 736bbcc7d3fb138138ee1840d8a6b83b959c07fc M drivers
As expected, my system only hangs when the cpufreq, powernow-k8 and ehci modules
are loaded and a frequency transition occurs
(simulated by using the userspace governor and changing the frequency manually).
The corresponding EHCI Controller is:
00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86) (prog-if 20
[EHCI])
Subsystem: ASUSTeK Computer Inc. A7V600/K8V-X/A8V Deluxe motherboard
I could not capture any output while the hang occurs; it seems the
CPU is completely locked up.
Greetings
Daniel Exner
^ permalink raw reply [flat|nested] 17+ messages in thread

* Re: EHCI Regression in 2.6.23-rc2
2007-08-10 8:45 EHCI Regression in 2.6.23-rc2 Daniel Exner
@ 2007-08-10 9:27 ` Jiri Kosina
2007-08-10 13:08 ` Daniel Exner
2007-08-10 15:14 ` Stuart_Hayes
0 siblings, 2 replies; 17+ messages in thread
From: Jiri Kosina @ 2007-08-10 9:27 UTC (permalink / raw)
To: webmaster; +Cc: linux-kernel, Stuart_Hayes, linux-usb-devel

On Fri, 10 Aug 2007, Daniel Exner wrote:

> Please CC me, as I'm currently not subscribed to this list, thx.

Please also don't forget to CC relevant people/lists when reporting bugs,
thanks.

> After some serious hangs with 2.6.23-rc2 I did some bisects and this was the
> result:
> 196705c9bbc03540429b0f7cf9ee35c2f928a534 is first bad commit
> commit 196705c9bbc03540429b0f7cf9ee35c2f928a534
> Author: Stuart_Hayes@Dell.com <Stuart_Hayes@Dell.com>
> Date: Thu May 3 08:58:49 2007 -0700

I guess that the patch attached to bug 8535 in kernel.org bugzilla --
http://bugzilla.kernel.org/attachment.cgi?id=12228&action=view -- solves
your issues, right?

Stuart, did you submit this fix for upstream already please?

--
Jiri Kosina
^ permalink raw reply [flat|nested] 17+ messages in thread

* Re: EHCI Regression in 2.6.23-rc2
2007-08-10 9:27 ` Jiri Kosina
@ 2007-08-10 13:08 ` Daniel Exner
2007-08-13 20:48 ` Stuart_Hayes
2007-08-10 15:14 ` Stuart_Hayes
1 sibling, 1 reply; 17+ messages in thread
From: Daniel Exner @ 2007-08-10 13:08 UTC (permalink / raw)
To: Jiri Kosina; +Cc: linux-kernel, Stuart_Hayes, linux-usb-devel

Jiri Kosina wrote:
> On Fri, 10 Aug 2007, Daniel Exner wrote:
> > Please CC me, as I'm currently not subscribed to this list, thx.
>
> Please also don't forget to CC relevant people/lists when reporting bugs,
> thanks.

Guess it's ok, now? Thanks anyway :)

> > After some serious hangs with 2.6.23-rc2 I did some bisects and this was
> > the result:
> > 196705c9bbc03540429b0f7cf9ee35c2f928a534 is first bad commit
> > commit 196705c9bbc03540429b0f7cf9ee35c2f928a534
> > Author: Stuart_Hayes@Dell.com <Stuart_Hayes@Dell.com>
> > Date: Thu May 3 08:58:49 2007 -0700
>
> I guess that the patch attached to bug 8535 in kernel.org bugzilla --
> http://bugzilla.kernel.org/attachment.cgi?id=12228&action=view -- solves
> your issues, right?

Nope, this does _not_ fix my issue.

Anything else I could try, or some files you need?
I tried finding some clue in my logs, but without any results so far.

Greetings
Daniel Exner
^ permalink raw reply [flat|nested] 17+ messages in thread

* RE: EHCI Regression in 2.6.23-rc2
2007-08-10 13:08 ` Daniel Exner
@ 2007-08-13 20:48 ` Stuart_Hayes
2007-08-13 22:16 ` David Brownell
0 siblings, 1 reply; 17+ messages in thread
From: Stuart_Hayes @ 2007-08-13 20:48 UTC (permalink / raw)
To: webmaster, greg; +Cc: linux-kernel, linux-usb-devel, david-b, jikos

Daniel Exner wrote:
> Jiri Kosina wrote:
>> On Fri, 10 Aug 2007, Daniel Exner wrote:
>>> Please CC me, as I'm currently not subscribed to this list, thx.
>>
>> Please also don't forget to CC relevant people/lists when reporting
>> bugs, thanks.
> Guess it's ok, now? Thanks anyway :)
>
>>> After some serious hangs with 2.6.23-rc2 I did some bisects and
>>> this was the result: 196705c9bbc03540429b0f7cf9ee35c2f928a534 is
>>> first bad commit commit 196705c9bbc03540429b0f7cf9ee35c2f928a534
>>> Author: Stuart_Hayes@Dell.com <Stuart_Hayes@Dell.com>
>>> Date: Thu May 3 08:58:49 2007 -0700
>>
>> I guess that the patch attached to bug 8535 in kernel.org bugzilla --
>> http://bugzilla.kernel.org/attachment.cgi?id=12228&action=view --
>> solves your issues, right?
> Nope, this does _not_ fix my issue.
>
> Anything else I could try, or some files you need?
> I tried finding some clue in my logs, but without any results so far.
>
> Greetings
> Daniel Exner

It appears that the VIA controllers just ignore the "inactivate" bit
completely.

Normally, when I set the "inactivate" bit in the QH and then watch the
QH & overlay, I eventually see the controller clear the "active" bit in
the overlay token, and, of course, it doesn't do the transaction.

With the VIA controller I have, after I set the "inactivate" bit, I
eventually see the controller set bit 1 in the overlay token
(SplitXstate), indicating that it's running the transaction, and, a
couple microframes later, it clears that bit again. The transaction is
not inactivated.

The problem occurs if a transaction completes when the "inactivate" bit
is set... qh_completions will ignore the transaction until the
"inactivate" bit is cleared, and then, when the transaction should be
re-activated, my patch will set the "active" bit back to 1 in the
overlay & qtd token, even though the transaction was already completed
by the controller...

To work around this, I'd have to re-write my patch so that it didn't
depend on the "inactivate" bit at all... I suppose it could possibly be
done just by directly manipulating the "active" bit in the overlay
token, since the code already doesn't mess with the overlay if there's
any chance that the transaction is already cached or in progress, but
that would be tricky.

Perhaps for now the best thing would just be to bypass the EHCI CPU
frequency notifier code (i.e., my patch) for VIA EHCI controllers, since
they are broken. Would a hard-coded blacklist (just an "if
(manufacturer==VIA)..." type thing) be OK?

I've also acquired a card with an NEC EHCI controller on it, which I'm
going to look at while I'm into it...

Thanks
Stuart
^ permalink raw reply [flat|nested] 17+ messages in thread
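
The failure mode described in the message above can be walked through with a toy model. This is a sketch, not driver code: `Transaction`, `controller_visit`, `reactivate` and `simulate` are hypothetical names invented for illustration, and the QH overlay is reduced to a single "active" flag plus a run counter. It only shows why setting the "active" bit back to 1 is harmless on a controller that honours the "inactivate" bit and harmful on one that ignores it.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    """Toy stand-in for one split interrupt transaction (hypothetical model)."""
    active: bool = True   # models the "active" bit in the overlay token
    times_run: int = 0    # how often the controller actually ran the transaction


def controller_visit(txn: Transaction, inactivate: bool, honours_inactivate: bool) -> None:
    """One pass of the modelled controller over the transaction."""
    if not txn.active:
        return
    if inactivate and honours_inactivate:
        # Behaviour the patch relies on: "active" is cleared, nothing is run.
        txn.active = False
        return
    # "inactivate" is unset or ignored: the transaction runs and completes.
    txn.times_run += 1
    txn.active = False


def reactivate(txn: Transaction) -> None:
    """Models the patch setting "active" back to 1 after the frequency change."""
    txn.active = True


def simulate(honours_inactivate: bool) -> int:
    """Return how many times the transaction ran across one frequency change."""
    txn = Transaction()
    controller_visit(txn, inactivate=True, honours_inactivate=honours_inactivate)
    reactivate(txn)  # the frequency transition is over
    controller_visit(txn, inactivate=False, honours_inactivate=honours_inactivate)
    return txn.times_run


print("controller honours the bit:", simulate(True))
print("controller ignores the bit:", simulate(False))
```

In this model the controller that ignores the bit ends up with an already completed transaction armed a second time. The thread does not spell out how that state leads to the observed hang, so the model stops at showing the inconsistency.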

* Re: EHCI Regression in 2.6.23-rc2
2007-08-13 20:48 ` Stuart_Hayes
@ 2007-08-13 22:16 ` David Brownell
2007-08-14 6:43 ` Daniel Exner
0 siblings, 1 reply; 17+ messages in thread
From: David Brownell @ 2007-08-13 22:16 UTC (permalink / raw)
To: Stuart_Hayes; +Cc: webmaster, greg, linux-kernel, linux-usb-devel, jikos

On Monday 13 August 2007, Stuart_Hayes@dell.com wrote:
> With the VIA controller I have,

Which kind is that? The VT6202 is buggy as all get-out, and
they sold a *LOT* of those discrete chips for use in add-on PCI
cards. We generally warn people away from those. A more current
version is the VT6212, which was much more usable. (If it says
EHCI 0.95, it's a VT6202... their EHCI 1.0 chips were much better.)

> after I set the "inactivate" bit, I
> eventually see the controller set bit 1 in the overlay token
> (SplitXstate), indicating that it's running the transaction, and, a
> couple microframes later, it clears that bit again. The transaction is
> not inactivated.
> ...
> Perhaps for now the best thing would just be to bypass the EHCI CPU
> frequency notifier code (i.e., my patch) for VIA EHCI controllers, since
> they are broken. Would a hard-coded blacklist (just an "if
> (manufacturer==VIA)..." type thing) be OK?

Yes ... although if you don't need to blacklist their EHCI 1.0 chips
don't do it. (Any VIA EHCI integrated into a southbridge is going
to follow spec rev 1.0 pretty well, modulo idiosyncratic timings.)

> I've also acquired a card with an NEC EHCI controller on it, which I'm
> going to look at while I'm into it...

Another case where there are a lot of add-on "EHCI 0.95" cards; but
in this case the quirks were less significant.

- Dave
^ permalink raw reply [flat|nested] 17+ messages in thread

* Re: EHCI Regression in 2.6.23-rc2
2007-08-13 22:16 ` David Brownell
@ 2007-08-14 6:43 ` Daniel Exner
2007-08-14 8:01 ` David Brownell
0 siblings, 1 reply; 17+ messages in thread
From: Daniel Exner @ 2007-08-14 6:43 UTC (permalink / raw)
To: David Brownell; +Cc: Stuart_Hayes, greg, linux-kernel, linux-usb-devel, jikos

David Brownell wrote:
> On Monday 13 August 2007, Stuart_Hayes@dell.com wrote:
> > With the VIA controller I have,
>
> Which kind is that? The VT6202 is buggy as all get-out, and
> they sold a *LOT* of those discrete chips for use in add-on PCI
> cards. We generally warn people away from those. A more current
> version is the VT6212, which was much more usable. (If it says
> EHCI 0.95, it's a VT6202... their EHCI 1.0 chips were much better.)

Where exactly should I search for this? Neither lspci nor lsusb showed any
hint on the EHCI rev. the chip conforms to..

[..]
> > Perhaps for now the best thing would just be to bypass the EHCI CPU
> > frequency notifier code (i.e., my patch) for VIA EHCI controllers, since
> > they are broken. Would a hard-coded blacklist (just an "if
> > (manufacturer==VIA)..." type thing) be OK?
>
> Yes ... although if you don't need to blacklist their EHCI 1.0 chips
> don't do it. (Any VIA EHCI integrated into a southbridge is going
> to follow spec rev 1.0 pretty well, modulo idiosyncratic timings.)

I guess it's needed to blacklist even the EHCI 1.0 chips, since my problem is
with exactly one of those ;)

I'm not really into USB protocol specs, but perhaps it's possible to test
whether the problem Stuart's patch addressed can actually happen on VIA EHCI
chips? Perhaps those guys solved the problem in Hard/Firmware..

> > I've also acquired a card with an NEC EHCI controller on it, which I'm
> > going to look at while I'm into it...
>
> Another case where there are a lot of add-on "EHCI 0.95" cards; but
> in this case the quirks were less significant.

Some guy donated me a PCMCIA card with one of those, because it won't work in
his Windows only Notebook :)

Greetings
Daniel Exner
^ permalink raw reply [flat|nested] 17+ messages in thread

* Re: EHCI Regression in 2.6.23-rc2
2007-08-14 6:43 ` Daniel Exner
@ 2007-08-14 8:01 ` David Brownell
2007-08-14 9:46 ` Daniel Exner
0 siblings, 1 reply; 17+ messages in thread
From: David Brownell @ 2007-08-14 8:01 UTC (permalink / raw)
To: webmaster; +Cc: Stuart_Hayes, greg, linux-kernel, linux-usb-devel, jikos

On Monday 13 August 2007, Daniel Exner wrote:
> David Brownell wrote:
> > On Monday 13 August 2007, Stuart_Hayes@dell.com wrote:
> > > With the VIA controller I have,
> >
> > Which kind is that? The VT6202 is buggy as all get-out, and
> > they sold a *LOT* of those discrete chips for use in add-on PCI
> > cards. We generally warn people away from those. A more current
> > version is the VT6212, which was much more usable. (If it says
> > EHCI 0.95, it's a VT6202... their EHCI 1.0 chips were much better.)
>
> Where exactly should I search for this? Neither lspci nor lsusb showed any
> hint on the EHCI rev. the chip conforms to..

The driver logs that information as it starts; on this system:

ehci_hcd 0000:00:02.2: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004

vs "EHCI 0.95".

> [..]
> > > Perhaps for now the best thing would just be to bypass the EHCI CPU
> > > frequency notifier code (i.e., my patch) for VIA EHCI controllers, since
> > > they are broken. Would a hard-coded blacklist (just an "if
> > > (manufacturer==VIA)..." type thing) be OK?
> >
> > Yes ... although if you don't need to blacklist their EHCI 1.0 chips
> > don't do it. (Any VIA EHCI integrated into a southbridge is going
> > to follow spec rev 1.0 pretty well, modulo idiosyncratic timings.)
>
> I guess it's needed to blacklist even the EHCI 1.0 chips, since my problem is
> with exactly one of those ;)

Something doesn't add up then ... above you ask where to find that
info, but here you say you already got it from somewhere ... ?

> I'm not really into USB protocol specs, but perhaps it's possible to test
> whether the problem Stuart's patch addressed can actually happen on VIA EHCI
> chips? Perhaps those guys solved the problem in Hard/Firmware..

Theoretically possible, and I've certainly seen hardware made
to do stranger things than that.

> > > I've also acquired a card with an NEC EHCI controller on it, which I'm
> > > going to look at while I'm into it...
> >
> > Another case where there are a lot of add-on "EHCI 0.95" cards; but
> > in this case the quirks were less significant.
>
> Some guy donated me a PCMCIA card with one of those, because it won't work in
> his Windows only Notebook :)

A NEC 0.95 ?? Should be fine with Linux. Assuming no bugs have
crept in.

- Dave
^ permalink raw reply [flat|nested] 17+ messages in thread
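
For anyone checking several machines, the start-up line quoted in the message above can be picked out of saved kernel log text mechanically. The following is a small sketch in Python; the function name `ehci_revisions` and the regular expression are illustrative assumptions built only around the single log line shown in the thread, and the 0.95 sample line is made up to exercise the other case.

```python
import re
from typing import Dict

# Matches the start-up line quoted in the thread, e.g.
# "ehci_hcd 0000:00:02.2: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004"
EHCI_STARTED = re.compile(
    r"ehci_hcd (?P<device>[0-9a-fA-F:.]+): USB 2\.0 started, EHCI (?P<rev>\d+\.\d+)"
)


def ehci_revisions(log_text: str) -> Dict[str, str]:
    """Map each PCI address found in the log text to its reported EHCI revision."""
    return {m["device"]: m["rev"] for m in EHCI_STARTED.finditer(log_text)}


sample = (
    "ehci_hcd 0000:00:02.2: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004\n"
    # Hypothetical line for an older add-on card, not taken from the thread:
    "ehci_hcd 0000:01:05.2: USB 2.0 started, EHCI 0.95, driver 10 Dec 2004\n"
)
print(ehci_revisions(sample))
```

The same function can be given the saved output of `dmesg`; it returns an empty dictionary when no EHCI controller logged a start-up line in that format.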

* Re: EHCI Regression in 2.6.23-rc2
2007-08-14 8:01 ` David Brownell
@ 2007-08-14 9:46 ` Daniel Exner
2007-08-14 15:13 ` Stuart_Hayes
2007-08-14 15:42 ` David Brownell
0 siblings, 2 replies; 17+ messages in thread
From: Daniel Exner @ 2007-08-14 9:46 UTC (permalink / raw)
To: David Brownell; +Cc: Stuart_Hayes, greg, linux-kernel, linux-usb-devel, jikos

David Brownell wrote:
> On Monday 13 August 2007, Daniel Exner wrote:
[..]
> > Where exactly should I search for this? Neither lspci nor lsusb showed
> > any hint on the EHCI rev. the chip conforms to..
>
> The driver logs that information as it starts; on this system:
>
> ehci_hcd 0000:00:02.2: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
>
> vs "EHCI 0.95".

ehci_hcd 0000:00:10.4: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004

Built into:
00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI bridge [K8T800/K8T890
South]

> > > > I've also acquired a card with an NEC EHCI controller on it, which
> > > > I'm going to look at while I'm into it...
> > >
> > > Another case where there are a lot of add-on "EHCI 0.95" cards; but
> > > in this case the quirks were less significant.
> >
> > Some guy donated me a PCMCIA card with one of those, because it won't
> > work in his Windows only Notebook :)
>
> A NEC 0.95 ?? Should be fine with Linux. Assuming no bugs have
> crept in.

Didn't test it yet with 2.6.23-rc2 or rc3, but up to 2.6.22 it was fine :)

Regarding the option to blacklist VIA in the module:
I would prefer blacklisting VIA by default but giving the module some
parameter like "honours inactive bit" to override this.

Perhaps there are newer VIA Chips out there, that indeed do this and some
users trigger happy enough to test this. :)

Greetings
Daniel Exner
^ permalink raw reply [flat|nested] 17+ messages in thread
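
The policy suggested in the message above (blacklist VIA by default, let a parameter override it) can be stated in a few lines. This is a sketch of the decision only, written in Python for brevity rather than as kernel C, and it is not the code that was eventually submitted. The function name `use_cpufreq_workaround` and the `honours_inactivate` parameter are hypothetical; 0x1106 is VIA's PCI vendor ID and 0x8086 is Intel's, used here only as a non-VIA example.

```python
PCI_VENDOR_ID_VIA = 0x1106    # VIA Technologies' PCI vendor ID
PCI_VENDOR_ID_INTEL = 0x8086  # used below only as a non-VIA example


def use_cpufreq_workaround(vendor_id: int, honours_inactivate: bool = False) -> bool:
    """Decide whether to inactivate split transactions around frequency changes.

    `honours_inactivate` stands in for the override parameter suggested in
    the thread; its name and semantics here are hypothetical.
    """
    if vendor_id == PCI_VENDOR_ID_VIA:
        # Blacklisted by default; only enabled when the user asserts that
        # this particular chip handles the "inactivate" bit correctly.
        return honours_inactivate
    return True


print("VIA, default:     ", use_cpufreq_workaround(PCI_VENDOR_ID_VIA))
print("VIA, override set:", use_cpufreq_workaround(PCI_VENDOR_ID_VIA, honours_inactivate=True))
print("Intel:            ", use_cpufreq_workaround(PCI_VENDOR_ID_INTEL))
```

Defaulting the override to off keeps the safe behaviour for every VIA chip while still letting someone with newer hardware test whether the workaround is usable there.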

* RE: EHCI Regression in 2.6.23-rc2
2007-08-14 9:46 ` Daniel Exner
@ 2007-08-14 15:13 ` Stuart_Hayes
2007-08-14 15:49 ` David Brownell
2007-08-14 15:42 ` David Brownell
1 sibling, 1 reply; 17+ messages in thread
From: Stuart_Hayes @ 2007-08-14 15:13 UTC (permalink / raw)
To: webmaster, david-b; +Cc: greg, linux-kernel, linux-usb-devel, jikos

Daniel Exner wrote:
> David Brownell wrote:
>> On Monday 13 August 2007, Daniel Exner wrote:
> [..]
>>> Where exactly should I search for this? Neither lspci nor lsusb
>>> showed any hint on the EHCI rev. the chip conforms to..
>>
>> The driver logs that information as it starts; on this system:
>>
>> ehci_hcd 0000:00:02.2: USB 2.0 started, EHCI 1.00, driver 10 Dec
>> 2004
>>
>> vs "EHCI 0.95".
> ehci_hcd 0000:00:10.4: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
>
> Built into:
> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI bridge
> [K8T800/K8T890 South]
>

Hm... I've got a 0.95. I'll try to get a Via EHCI 1.00 controller and
make sure it's the same problem.

>>>>> I've also acquired a card with an NEC EHCI controller on it,
>>>>> which I'm going to look at while I'm into it...
>>>>
>>>> Another case where there are a lot of add-on "EHCI 0.95" cards;
>>>> but in this case the quirks were less significant.
>>>
>>> Some guy donated me a PCMCIA card with one of those, because it
>>> won't work in his Windows only Notebook :)
>>
>> A NEC 0.95 ?? Should be fine with Linux. Assuming no bugs have
>> crept in.
> Didn't test it yet with 2.6.23-rc2 or rc3, but up to 2.6.22 it was
> fine :)
>
> Regarding the option to blacklist VIA in the module:
> I would prefer blacklisting VIA by default but giving the module some
> parameter like "honours inactive bit" to override this.
>
> Perhaps there are newer VIA Chips out there, that indeed do this and
> some users trigger happy enough to test this. :)

That kernel parameter sounds like a reasonable idea to me.

The problem that the patch is trying to work around is that, while the
CPUs are changing frequency, the EHCI controller gets delayed trying to
read main memory (because CPU cache snoops have to wait until the CPU is
finished)... if this happens in the middle of a split transaction to a
low/full speed device, the transaction won't complete in time, and you
get an error and possible data loss.

If the EHCI controller caches ahead enough, it shouldn't need to read
main memory to be able to complete the split transaction... but, while
the controller does say how much ahead it may cache, it isn't clear to
me that it will always be able to cache that much, so I thought it would
be safe to go ahead and inactivate split transactions during CPU
frequency transitions regardless.
^ permalink raw reply [flat|nested] 17+ messages in thread
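
The caching argument in the message above comes down to simple arithmetic: a USB 2.0 high-speed microframe lasts 125 microseconds, so a controller that has N microframes of the periodic schedule cached can, at best, ride out a memory stall of N x 125 microseconds without touching main memory. The sketch below makes that explicit. It is a simplified upper bound under the assumption that the controller really has cached as much as it advertises, which is exactly the point the message is unsure about; the function names and the stall values are illustrative, not measurements.

```python
MICROFRAME_US = 125  # one USB 2.0 high-speed microframe lasts 125 microseconds


def covered_stall_us(cached_uframes: int) -> int:
    """Longest memory stall (in microseconds) the cached schedule could span."""
    return cached_uframes * MICROFRAME_US


def stall_is_covered(cached_uframes: int, stall_us: int) -> bool:
    """True if a stall of `stall_us` fits inside the cached part of the schedule."""
    return stall_us <= covered_stall_us(cached_uframes)


# Illustrative cache depths only; real controllers report their own value.
for cached in (1, 2, 8):
    print(cached, "uframes cached ->", covered_stall_us(cached), "us of stall covered")
```

Whether a given frequency transition stalls memory reads for longer than this bound depends on the platform, which the thread does not quantify.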

* Re: EHCI Regression in 2.6.23-rc2
2007-08-14 15:13 ` Stuart_Hayes
@ 2007-08-14 15:49 ` David Brownell
2007-08-14 21:33 ` Stuart_Hayes
0 siblings, 1 reply; 17+ messages in thread
From: David Brownell @ 2007-08-14 15:49 UTC (permalink / raw)
To: webmaster, Stuart_Hayes; +Cc: linux-usb-devel, linux-kernel, jikos, greg

> Hm... I've got a 0.95. I'll try to get a Via EHCI 1.00 controller and
> make sure it's the same problem.

Yeah, for some reason way too many of the add-on PCI cards with
VIA chips use that pretty-broken VT6202 chip. Ones with VT6212
are also available, and work a lot better.

> > Regarding the option to blacklist VIA in the module:
> > I would prefer blacklisting VIA by default but giving the module some
> > parameter like "honours inactive bit" to override this.
> >
> > Perhaps there are newer VIA Chips out there, that indeed do this and
> > some users trigger happy enough to test this. :)
>
> That kernel parameter sounds like a reasonable idea to me.

Yes, IFF we know that the bug shows up in EHCI 1.00 chips rather
than just the already-known-to-be-buggy VT6202 chips. (I think
part of the deal was that until the parts went through some
conformance testing, nobody could use the "1.0" label. There
were also a few small feature updates and spec clarifications.
If anyone else shipped silicon in volume that was as buggy as
a VT6202, I didn't see any.)

I'd be happy to see a warning come out whenever a VT6202 is found,
since its problems are NOT limited to this I-bit bug.

> The problem
> that the patch is trying to work around is that, while the CPUs are
> changing frequency, the EHCI controller gets delayed trying to read main
> memory (because CPU cache snoops have to wait until the CPU is
> finished)... if this happens in the middle of a split transaction to a
> low/full speed device, the transaction won't complete in time, and you
> get an error and possible data loss.
>
> If the EHCI controller caches ahead enough, it shouldn't need to read
> main memory to be able to complete the split transaction... but, while
> the controller does say how much ahead it may cache, it isn't clear to
> me that it will always be able to cache that much, so I thought it would
> be safe to go ahead and inactivate split transactions during CPU
> frequency transitions regardless.

Right.
^ permalink raw reply [flat|nested] 17+ messages in thread

* RE: EHCI Regression in 2.6.23-rc2
2007-08-14 15:49 ` David Brownell
@ 2007-08-14 21:33 ` Stuart_Hayes
2007-08-15 18:38 ` Stuart_Hayes
0 siblings, 1 reply; 17+ messages in thread
From: Stuart_Hayes @ 2007-08-14 21:33 UTC (permalink / raw)
To: david-b, webmaster; +Cc: linux-usb-devel, linux-kernel, jikos, greg

David Brownell wrote:
>> Hm... I've got a 0.95. I'll try to get a Via EHCI 1.00 controller
>> and make sure it's the same problem.
>
> Yeah, for some reason way too many of the add-on PCI cards with VIA
> chips use that pretty-broken VT6202 chip. Ones with VT6212 are also
> available, and work a lot better.
>
>
>>> Regarding the option to blacklist VIA in the module:
>>> I would prefer blacklisting VIA by default but giving the module
>>> some parameter like "honours inactive bit" to override this.
>>>
>>> Perhaps there are newer VIA Chips out there, that indeed do this and
>>> some users trigger happy enough to test this. :)
>>
>> That kernel parameter sounds like a reasonable idea to me.
>
> Yes, IFF we know that the bug shows up in EHCI 1.00 chips rather than
> just the already-known-to-be-buggy VT6202 chips. (I think part of
> the deal was that until the parts went through some conformance
> testing, nobody could use the "1.0" label. There were also a few
> small feature updates and spec clarifications. If anyone else
> shipped silicon in volume that was as buggy as a VT6202, I didn't see
> any.)
>
> I'd be happy to see a warning come out whenever a VT6202 is found,
> since its problems are NOT limited to this I-bit bug.
>

OK, I've got a VIA VT6212, and it's definitely not the same as the
6202--it's locking up my system, too, with my patch, and it is
definitely not just ignoring the inactivate bit. I'm still trying to
figure out what's going on.

The NEC controller (EHCI 1.00) seems to work fine, though.
^ permalink raw reply [flat|nested] 17+ messages in thread

* RE: EHCI Regression in 2.6.23-rc2
2007-08-14 21:33 ` Stuart_Hayes
@ 2007-08-15 18:38 ` Stuart_Hayes
2007-08-15 19:12 ` [linux-usb-devel] " David Brownell
0 siblings, 1 reply; 17+ messages in thread
From: Stuart_Hayes @ 2007-08-15 18:38 UTC (permalink / raw)
To: david-b, webmaster; +Cc: linux-usb-devel, linux-kernel, jikos, greg

Hayes, Stuart wrote:
> David Brownell wrote:
>>> Hm... I've got a 0.95. I'll try to get a Via EHCI 1.00 controller
>>> and make sure it's the same problem.
>>
>> Yeah, for some reason way too many of the add-on PCI cards with VIA
>> chips use that pretty-broken VT6202 chip. Ones with VT6212 are also
>> available, and work a lot better.
>>
>>
>>>> Regarding the option to blacklist VIA in the module:
>>>> I would prefer blacklisting VIA by default but giving the module
>>>> some parameter like "honours inactive bit" to override this.
>>>>
>>>> Perhaps there are newer VIA Chips out there, that indeed do this
>>>> and some users trigger happy enough to test this. :)
>>>
>>> That kernel parameter sounds like a reasonable idea to me.
>>
>> Yes, IFF we know that the bug shows up in EHCI 1.00 chips rather than
>> just the already-known-to-be-buggy VT6202 chips. (I think part of
>> the deal was that until the parts went through some conformance
>> testing, nobody could use the "1.0" label. There were also a few
>> small feature updates and spec clarifications. If anyone else
>> shipped silicon in volume that was as buggy as a VT6202, I didn't see
>> any.)
>>
>> I'd be happy to see a warning come out whenever a VT6202 is found,
>> since its problems are NOT limited to this I-bit bug.
>>
>
> OK, I've got a VIA VT6212, and it's definitely not the same as the
> 6202--it's locking up my system, too, with my patch, and it is
> definitely not just ignoring the inactivate bit. I'm still trying to
> figure out what's going on.
>
> The NEC controller (EHCI 1.00) seems to work fine, though.

OK... I see what's happening. When the VIA VT6212 sees the "inactivate"
bit set, it will START the split transaction, but it doesn't finish it.

When I set the "I" bit--even if I set it like 50 uframes before the
transaction should start--the controller will set bit 1
(splitXstate--this means it's started the transaction) and clear bit 7
(active bit) in the token, when it comes time for the transaction to be
run.

This is a violation of EHCI 1.0 spec section 4.12.2.5, second bullet:
"If the Active bit is a one and the SplitXState is DoStart (regardless
of the value of S-mask), the host controller will simply set Active bit
to a zero... the host controller must not issue the start-split bus
transaction."

With an analyzer, I've observed that the controller is indeed issuing
"start split" without issuing "complete split", and I'm losing
keystrokes if I type when this is happening, as expected.

So... I still think blacklisting the VIA controllers from this CPU
frequency stuff is the best option. It is unlikely that any real issues
will be seen during CPU frequency transitions with these controllers
anyway, because they claim to cache 8 uframes of the periodic schedule.
^ permalink raw reply [flat|nested] 17+ messages in thread
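
The rule quoted from section 4.12.2.5 and the behaviour observed on the VT6212 can be compared on the token bits named in the message above (bit 7 for Active, bit 1 for SplitXstate). The sketch below is a model for illustration, assuming the "I" bit has already been set in the QH; the function names are hypothetical, and states other than "Active with SplitXstate = DoStart" are deliberately left out of scope.

```python
TOKEN_ACTIVE = 1 << 7        # "active" bit in the token (bit 7, as in the thread)
TOKEN_SPLIT_XSTATE = 1 << 1  # SplitXstate (bit 1): 0 = DoStart, 1 = start-split was issued


def expected_token(token: int) -> int:
    """Token after a conforming controller processes a QH whose I-bit is set."""
    if (token & TOKEN_ACTIVE) and not (token & TOKEN_SPLIT_XSTATE):
        # Active and still in DoStart: clear Active, issue no start-split.
        return token & ~TOKEN_ACTIVE
    return token  # other states are outside this sketch


def started_split_despite_inactivate(before: int, after: int) -> bool:
    """Detect the reported VT6212 behaviour: the start-split was issued anyway."""
    was_do_start = bool(before & TOKEN_ACTIVE) and not (before & TOKEN_SPLIT_XSTATE)
    return was_do_start and bool(after & TOKEN_SPLIT_XSTATE)


before = TOKEN_ACTIVE                 # active, SplitXstate = DoStart
conforming = expected_token(before)   # Active cleared, SplitXstate untouched
observed = TOKEN_SPLIT_XSTATE         # as reported: bit 1 set, bit 7 cleared

print("conforming token:", hex(conforming),
      "violation:", started_split_despite_inactivate(before, conforming))
print("observed token:  ", hex(observed),
      "violation:", started_split_despite_inactivate(before, observed))
```

A check of this shape only sees the token; the missing "complete split" reported in the message was observed on the bus with an analyzer and cannot be inferred from these two bits alone.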

* Re: [linux-usb-devel] EHCI Regression in 2.6.23-rc2
2007-08-15 18:38 ` Stuart_Hayes
@ 2007-08-15 19:12 ` David Brownell
0 siblings, 0 replies; 17+ messages in thread
From: David Brownell @ 2007-08-15 19:12 UTC (permalink / raw)
To: Stuart_Hayes; +Cc: linux-usb-devel, webmaster, greg, linux-kernel

On Wednesday 15 August 2007, Stuart_Hayes@dell.com wrote:
> So... I still think blacklisting the VIA controllers from this CPU
> frequency stuff is the best option.

Sadly, yes. My negative impression of VIA quality is
confirmed, yet again...

Please make sure the comments in your blacklist code describe
both of the chip bugs you've observed.

> It is unlikely that any real issues
> will be seen during CPU frequency transitions with these controllers
> anyway, because they claim to cache 8 uframes of the periodic schedule.

I'd not be so sure. But if there are such issues, we can
wait for problem reports.

- Dave
^ permalink raw reply [flat|nested] 17+ messages in thread

* Re: EHCI Regression in 2.6.23-rc2
2007-08-14 9:46 ` Daniel Exner
2007-08-14 15:13 ` Stuart_Hayes
@ 2007-08-14 15:42 ` David Brownell
2007-08-14 15:57 ` Daniel Exner
1 sibling, 1 reply; 17+ messages in thread
From: David Brownell @ 2007-08-14 15:42 UTC (permalink / raw)
To: webmaster; +Cc: Stuart_Hayes, linux-usb-devel, linux-kernel, jikos, greg

> ehci_hcd 0000:00:10.4: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
>
> Built into:
> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI bridge [K8T800/K8T890
> South]

Yeah, VT8235 was their first southbridge with integrated EHCI.
ISTR that the VT8237 worked a bit more smoothly.

I think 8237 was about where they forked off the VT6212 as
their first discrete EHCI (for addon cards) claiming EHCI 1.0
conformance. Too bad they didn't recycle all the VT6202 chips
in their inventory at that time...

Now ... are you reporting that this worked with Stuart's patch?
Or that it didn't? Or that you couldn't say?

- Dave
^ permalink raw reply [flat|nested] 17+ messages in thread

* Re: EHCI Regression in 2.6.23-rc2
2007-08-14 15:42 ` David Brownell
@ 2007-08-14 15:57 ` Daniel Exner
2007-08-14 16:12 ` David Brownell
0 siblings, 1 reply; 17+ messages in thread
From: Daniel Exner @ 2007-08-14 15:57 UTC (permalink / raw)
To: David Brownell; +Cc: Stuart_Hayes, linux-usb-devel, linux-kernel, jikos, greg

David Brownell wrote:
> > ehci_hcd 0000:00:10.4: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
> >
> > Built into:
> > 00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI bridge
> > [K8T800/K8T890 South]
>
> Yeah, VT8235 was their first southbridge with integrated EHCI.
> ISTR that the VT8237 worked a bit more smoothly.
>
> I think 8237 was about where they forked off the VT6212 as
> their first discrete EHCI (for addon cards) claiming EHCI 1.0
> conformance. Too bad they didn't recycle all the VT6202 chips
> in their inventory at that time...

No hardware producer would have done that ;)

> Now ... are you reporting that this worked with Stuart's patch?
> Or that it didn't? Or that you couldn't say?

As I started this thread because Stuart's patch freezes my whole system (at
least my bisect did blame him), I therefore report that it doesn't work for
me.

Greetings
Daniel Exner
^ permalink raw reply [flat|nested] 17+ messages in thread

* Re: EHCI Regression in 2.6.23-rc2
2007-08-14 15:57 ` Daniel Exner
@ 2007-08-14 16:12 ` David Brownell
0 siblings, 0 replies; 17+ messages in thread
From: David Brownell @ 2007-08-14 16:12 UTC (permalink / raw)
To: webmaster; +Cc: Stuart_Hayes, linux-usb-devel, linux-kernel, jikos, greg

> > I think 8237 was about where they forked off the VT6212 as
> > their first discrete EHCI (for addon cards) claiming EHCI 1.0
> > conformance. Too bad they didn't recycle all the VT6202 chips
> > in their inventory at that time...
>
> No hardware producer would have done that ;)

As I said: too bad. ;)

> > Now ... are you reporting that this worked with Stuart's patch?
> > Or that it didn't? Or that you couldn't say?
>
> As I started this thread because Stuart's patch freezes my whole system (at
> least my bisect did blame him), I therefore report that it doesn't work for
> me.

The original patch, yes. ISTR seeing an update come
around though. Maybe I was imagining things.
^ permalink raw reply [flat|nested] 17+ messages in thread

* RE: EHCI Regression in 2.6.23-rc2
2007-08-10 9:27 ` Jiri Kosina
2007-08-10 13:08 ` Daniel Exner
@ 2007-08-10 15:14 ` Stuart_Hayes
1 sibling, 0 replies; 17+ messages in thread
From: Stuart_Hayes @ 2007-08-10 15:14 UTC (permalink / raw)
To: jikos, webmaster; +Cc: linux-kernel, linux-usb-devel

Jiri Kosina wrote:
> On Fri, 10 Aug 2007, Daniel Exner wrote:
>
>> After some serious hangs with 2.6.23-rc2 I did some bisects and this
>> was the result:
>> 196705c9bbc03540429b0f7cf9ee35c2f928a534 is first bad commit commit
>> 196705c9bbc03540429b0f7cf9ee35c2f928a534
>> Author: Stuart_Hayes@Dell.com <Stuart_Hayes@Dell.com>
>> Date: Thu May 3 08:58:49 2007 -0700
>
> I guess that the patch attached to bug 8535 in kernel.org bugzilla --
> http://bugzilla.kernel.org/attachment.cgi?id=12228&action=view --
> solves your issues, right?
>
> Stuart, did you submit this fix for upstream already please?

Yes...

http://marc.info/?l=linux-usb-devel&m=118598561010046&w=2

However, I have not tested this with a VIA EHCI controller (though it's
been tested with Intel, Broadcom, and nVidia).

This patch uses the "inactivate" bit in the QH, which wasn't previously
used by the linux kernel, and I found that the different vendors of EHCI
controllers (Intel, Broadcom, nVidia) all handle this a little
differently. There's probably something about the way VIA controllers
respond to seeing this bit set that is breaking things.

I'll try to get my hands on a VIA EHCI controller so I can look at
this... if you happen to know of an add-in card that has one of these,
please let me know! It would be a lot easier for me to debug this
myself here than to try to get someone else to run test kernels for
me...
^ permalink raw reply [flat|nested] 17+ messages in thread