* Re: Possible critical VIA vt82c686a chip bug (private question)
From: bart @ 2000-10-27 12:04 UTC
To: linux-kernel; +Cc: vojtech

On 26 Oct, Vojtech Pavlik wrote:
> On Thu, Oct 26, 2000 at 04:42:31PM +0200, Yoann Vandoorselaere wrote:
>
>> > On Thu, Oct 26, 2000 at 04:20:43PM +0200, Yoann Vandoorselaere wrote:
>> >
>> > > ...
>> > >
>> > > Have you any idea what the relation is between time and this chip?
>> > >
>> > > Also, I've been experiencing the problem for several months on my
>> > > workstation and I never could find where it was coming from...
>> > > How did you track it down?
>> >
>> > Well, it integrates both the i8253 PIT and the vt82c586 IDE controller.
>> >
>> > I first established that the wrong time was coming from gettimeofday()
>> > and not from the other sources of time the kernel provides. And then I was
>> > tracking the problem (which actually is an underflow - the chip bug
>> > causes some time offset variables to go negative - 0xffffffff microseconds
>> > is about 1:12 hours). And this way I got to the spot where the patch
>> > cures the problem.
>>
>> Ok, here is what I experienced:
>>
>> First, what is strange is that:
>> - I'm using SCSI
>> - I just have an IDE disk for mp3.
>> The IDE subsystem is never used heavily...
>>
>> I experienced the problem after some time of
>> heavy SCSI I/O: my screen under X went black (like with DPMS).
>> When I moved the mouse, the image came back
>> for < 1 second, then black screen again...
>>
>> The only fix was to kill X and then reboot.
>>
>> Anyway, thanks for your explanation...
>> I'll give feedback on this patch ASAP.
>
> Interesting. If it's caused by SCSI as well (might be), then it's not
> caused by heavy IDE activity; rather, it could be heavy
> BusMastering activity (the IDE chip does BM as well).
>
> I'm still wondering if it could be a Linux kernel bug (bad/concurrent
> accesses to the i8253 registers), this has to be checked.
>

How sure are you that the chip is actually buggy? I ran into something
similar a while ago, when I mixed up the two arguments to an outb in a
driver, and ended up writing MYPORT into the timer instead of 0x40 into
MYPORT.

Bart

--
Bart Hartgers - TUE Eindhoven
Get my GPG key at http://etpmod.phys.tue.nl/bart/pubkey.gpg
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/
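The argument mix-up Bart describes is easy to reproduce in a toy model. Linux's outb() takes the value first and the port second, and 0x40 happens to be both a plausible data byte and the data port of PIT channel 0. The sketch below is a simulation only: the `ports` dictionary and the MYPORT value of 0x300 are assumptions made up for illustration, not taken from any driver.

```python
# Toy model of x86 port I/O. Nothing here touches real hardware.
PIT_CH0 = 0x40   # data port of i8253 channel 0
MYPORT = 0x300   # hypothetical driver port, chosen only for illustration

ports = {}       # port number -> last byte written to it


def outb(value, port):
    """Mimic the Linux argument order: outb(value, port)."""
    ports[port & 0xFFFF] = value & 0xFF


# Intended call: write the byte 0x40 to the driver's own port.
outb(0x40, MYPORT)
intended = dict(ports)

# Swapped arguments: the low byte of MYPORT is written to port 0x40,
# which is the PIT channel 0 data register.
ports.clear()
outb(MYPORT, 0x40)
swapped = dict(ports)

print("intended:", intended)
print("swapped: ", swapped)
```

With the arguments swapped, the driver's port is never written at all and the timer receives a stray byte, which would disturb its reload value without any chip defect being involved.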
* Re: Possible critical VIA vt82c686a chip bug (private question)
From: Vojtech Pavlik @ 2000-10-27 13:41 UTC
To: bart; +Cc: linux-kernel

On Fri, Oct 27, 2000 at 02:04:58PM +0200, bart@etpmod.phys.tue.nl wrote:

> > Interesting. If it's caused by SCSI as well (might be), then it's not
> > caused by heavy IDE activity; rather, it could be heavy
> > BusMastering activity (the IDE chip does BM as well).
> >
> > I'm still wondering if it could be a Linux kernel bug (bad/concurrent
> > accesses to the i8253 registers), this has to be checked.
> >
>
> How sure are you that the chip is actually buggy? I ran into something
> similar a while ago, when I mixed up the two arguments to an outb in a
> driver, and ended up writing MYPORT into the timer instead of 0x40 into
> MYPORT.

I'm *not* sure. It just looks like a reasonable explanation. It doesn't
happen on Intel chips and older VIA chips, it only happens on new VIA
chips, and the code is the same all the time. Also, it happens both with
2.2 and 2.4 kernels ...

--
Vojtech Pavlik
SuSE Labs
* Re: Possible critical VIA vt82c686a chip bug (private question)
From: TimO @ 2000-10-28 5:39 UTC
To: Vojtech Pavlik; +Cc: linux-kernel

Vojtech Pavlik wrote:
>
> I'm *not* sure. It just looks like a reasonable explanation. It doesn't
> happen on Intel chips and older VIA chips, it only happens on new VIA
> chips, and the code is the same all the time. Also, it happens both with
> 2.2 and 2.4 kernels ...
>
> --
> Vojtech Pavlik
> SuSE Labs
>

Do you have a method guaranteed to reproduce this? I have a newer VIA
chipset and haven't (yet) observed this problem.

Host bridge: VIA Technologies, Inc. VT8371 [KX133] (rev 2).
PCI bridge: VIA Technologies, Inc. VT8371 [KX133 AGP] (rev 0).
ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 34).
IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 16).
Bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 48).
Multimedia audio controller: VIA Technologies, Inc. AC97 Audio Controller (rev 32).

                     ===============
-- TimO
--------------------==============++==============--------------------
[parent not found: <m3d7gnd31m.fsf@test1.mandrakesoft.com>]
[parent not found: <Pine.LNX.3.95.1001026115039.12337A-100000@chaos.analogic.com>]
* Re: Possible critical VIA vt82c686a chip bug (private question)
From: Vojtech Pavlik @ 2000-10-26 17:03 UTC
To: Richard B. Johnson; +Cc: Yoann Vandoorselaere, linux-kernel

On Thu, Oct 26, 2000 at 12:04:21PM -0400, Richard B. Johnson wrote:

> ../drivers/block/ide.c, line 162, on version 2.2.17 does bad things
> to the timer. It writes 0 to the control-word for timer 0. This
> does the following:
>
> o Selects timer 0.
> o Latches the timer.
> o Selects mode 0.
> o Programs it to a 16 bit counter.
>
> The result is a latched (stopped) counter. Bits 5 and 4 should have been
> selected. Then you read bits 0-7 from 0x40, followed by bits 8-15 from
> the same port.
>
> Also, there is no spin-lock protecting access to these ports. If anybody
> else is mucking with the timer, all bets are off.

Well, at least on 2.4.0-test9, the above timing code is #ifed to
DISK_RECOVERY_TIME > 0, which in turn is #defined to 0 in
include/linux/ide.h.

So this is not our problem here. Anyway I guess it's time to hunt for
i8253 accesses in the kernel that lack the necessary spinlock, even though
they're probably not the cause of the problem we see here.

--
Vojtech Pavlik
SuSE Labs
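To make the control-word discussion above easier to follow, here is a small decoder for the byte written to port 0x43. It is a sketch based on the i8253/8254 control-word layout (bits 7-6 counter select, bits 5-4 access, bits 3-1 mode, bit 0 BCD); the function name is made up for this example. As far as that layout goes, a byte with bits 5 and 4 both clear is the counter-latch command, for which the mode and BCD bits are ignored, so a write of 0 may be less harmful than the description above suggests. The value 0x34 is included only for comparison, as a typical "channel 0, LSB then MSB, mode 2" programming byte.

```python
ACCESS = {
    0: "counter latch command",
    1: "LSB only",
    2: "MSB only",
    3: "LSB then MSB",
}


def decode_pit_control(byte):
    """Split an i8253/8254 control word into its four fields."""
    counter = (byte >> 6) & 0x3     # 3 is the read-back command on the 8254
    access = (byte >> 4) & 0x3
    mode = (byte >> 1) & 0x7
    if mode > 5:                    # encodings 6 and 7 alias modes 2 and 3
        mode -= 4
    bcd = bool(byte & 0x1)
    return {
        "counter": counter,
        "access": ACCESS[access],
        # For a latch command the mode and BCD bits are don't-care.
        "mode": None if access == 0 else mode,
        "bcd": None if access == 0 else bcd,
    }


for value in (0x00, 0x34):
    print(hex(value), decode_pit_control(value))
```

Under this reading, writing 0 followed by two reads of port 0x40 is a read sequence rather than a reprogramming, which leaves the missing lock as the more interesting part of the observation.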
* Re: Possible critical VIA vt82c686a chip bug (private question)
From: Richard B. Johnson @ 2000-10-26 17:42 UTC
To: Vojtech Pavlik; +Cc: Yoann Vandoorselaere, linux-kernel

On Thu, 26 Oct 2000, Vojtech Pavlik wrote:

> On Thu, Oct 26, 2000 at 12:04:21PM -0400, Richard B. Johnson wrote:
>
> > ../drivers/block/ide.c, line 162, on version 2.2.17 does bad things
> > to the timer. It writes 0 to the control-word for timer 0. This
> > does the following:
[Snipped...]
>
> Well, at least on 2.4.0-test9, the above timing code is #ifed to
> DISK_RECOVERY_TIME > 0, which in turn is #defined to 0 in
> include/linux/ide.h.
>
> So this is not our problem here. Anyway I guess it's time to hunt for
> i8253 accesses in the kernel that lack the necessary spinlock, even though
> they're probably not the cause of the problem we see here.

Okay, good.

Cheers,
Dick Johnson

Penguin : Linux version 2.2.17 on an i686 machine (801.18 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.
* Re: Possible critical VIA vt82c686a chip bug (private question)
From: Vojtech Pavlik @ 2000-10-26 18:02 UTC
To: Richard B. Johnson; +Cc: Yoann Vandoorselaere, linux-kernel

On Thu, Oct 26, 2000 at 01:42:29PM -0400, Richard B. Johnson wrote:

> > > ../drivers/block/ide.c, line 162, on version 2.2.17 does bad things
> > > to the timer. It writes 0 to the control-word for timer 0. This
> > > does the following:
> [Snipped...]
> >
> > Well, at least on 2.4.0-test9, the above timing code is #ifed to
> > DISK_RECOVERY_TIME > 0, which in turn is #defined to 0 in
> > include/linux/ide.h.
> >
> > So this is not our problem here. Anyway I guess it's time to hunt for
> > i8253 accesses in the kernel that lack the necessary spinlock, even though
> > they're probably not the cause of the problem we see here.
>
> Okay, good.

Ok, here is a list of places within the kernel that access the PIT
timer, plus the method of locking (i386 arch only):

Usage:                                          Lock method:

arch/i386/kernel/time.c:170:                    spin_lock()
arch/i386/kernel/time.c:491:                    spin_lock()
arch/i386/kernel/time.c:575:                    none (init)
arch/i386/kernel/i8259.c:491:                   none (init)
arch/i386/kernel/apm.c:871:                     cli()
arch/i386/kernel/apic.c:398:                    spin_lock_irqsave()
drivers/char/vt.c:121:                          cli()
drivers/char/ftape/lowlevel/ftape-calibr.c:80:  cli()
drivers/char/ftape/lowlevel/ftape-calibr.c:99:  cli()
drivers/char/joystick/analog.c:142:             cli() __cli()
drivers/char/joystick/gameport.c:66:            cli()
drivers/ide/hd.c:137:                           cli()
drivers/ide/ide.c:206:                          __cli()

I guess we'll need to fix this. While races here are not likely (the
most likely one is a beep by vt.c at the wrong moment), they're possible.

However, these don't seem to be the cause of the problem we see here
anyway.

--
Vojtech Pavlik
SuSE Labs
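Why the mixed locking in the list above matters can be shown with a toy model of one PIT counter. Reading the count takes three port accesses (latch, low byte, high byte), and the chip keeps an internal low/high flip-flop, so a second reader that gets in between can leave both readers with mismatched bytes. The `ToyPIT` class and `read_count_locked()` below are hypothetical names for this sketch, and a Python `threading.Lock` stands in for the kernel spinlock; the interleaving is scripted by hand rather than produced by real concurrency.

```python
import threading


class ToyPIT:
    """Toy model of one i8253 counter: latch command plus LSB/MSB reads."""

    def __init__(self, count):
        self.count = count        # live counter value
        self.latched = None       # value held in the output latch, if any
        self.expect_msb = False   # read flip-flop: LSB first, then MSB

    def latch(self):
        # A further latch command is ignored while a value is still latched.
        if self.latched is None:
            self.latched = self.count

    def read_byte(self):
        value = self.count if self.latched is None else self.latched
        if not self.expect_msb:
            self.expect_msb = True
            return value & 0xFF
        self.expect_msb = False
        self.latched = None       # the latch is released after the MSB read
        return (value >> 8) & 0xFF


def read_count_locked(pit, lock):
    """Latch and read both bytes as one critical section."""
    with lock:
        pit.latch()
        lo = pit.read_byte()
        hi = pit.read_byte()
    return (hi << 8) | lo


# Unprotected interleaving: reader B runs between A's two byte reads.
pit = ToyPIT(11932)
pit.latch()               # A: latch
a_lo = pit.read_byte()    # A: low byte of the latched value
pit.count -= 300          # the counter keeps running meanwhile
pit.latch()               # B: ignored, A's latch is still pending
b_lo = pit.read_byte()    # B: actually receives A's high byte
b_hi = pit.read_byte()    # B: low byte of the live counter
a_hi = pit.read_byte()    # A: high byte of the live counter
torn_a = (a_hi << 8) | a_lo
torn_b = (b_hi << 8) | b_lo

# Protected read: the three accesses cannot be split by another reader.
good = read_count_locked(ToyPIT(11932), threading.Lock())

print("torn A:", torn_a, "torn B:", torn_b, "locked:", good)
```

In this model reader B ends up with a value far above the reload value even though the chip behaved correctly, which is the kind of software race the audit above is meant to rule out before blaming the hardware.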
* Re: Possible critical VIA vt82c686a chip bug (private question)
From: Yoann Vandoorselaere @ 2000-10-26 20:11 UTC
To: Vojtech Pavlik; +Cc: Richard B. Johnson, linux-kernel

Vojtech Pavlik <vojtech@suse.cz> writes:

> On Thu, Oct 26, 2000 at 01:42:29PM -0400, Richard B. Johnson wrote:
>
> > > > ../drivers/block/ide.c, line 162, on version 2.2.17 does bad things
> > > > to the timer. It writes 0 to the control-word for timer 0. This
> > > > does the following:
> > [Snipped...]
> > >
> > > Well, at least on 2.4.0-test9, the above timing code is #ifed to
> > > DISK_RECOVERY_TIME > 0, which in turn is #defined to 0 in
> > > include/linux/ide.h.
> > >
> > > So this is not our problem here. Anyway I guess it's time to hunt for
> > > i8253 accesses in the kernel that lack the necessary spinlock, even though
> > > they're probably not the cause of the problem we see here.
> >
> > Okay, good.
>
> Ok, here is a list of places within the kernel that access the PIT
> timer, plus the method of locking (i386 arch only):

[...]

Ok, I just tested whether the problem was still present without
the IDE subsystem...

The answer is it is not... so it isn't an IDE problem.

--
		-- Yoann http://www.mandrakesoft.com/~yoann/
   An engineer from NVidia, when asked to release card specs, said:
	"Actually, we do write our drivers without documentation."
* Re: Possible critical VIA vt82c686a chip bug (private question)
From: Vojtech Pavlik @ 2000-10-26 20:16 UTC
To: Yoann Vandoorselaere; +Cc: Richard B. Johnson, linux-kernel

On Thu, Oct 26, 2000 at 10:11:54PM +0200, Yoann Vandoorselaere wrote:

> > > > > ../drivers/block/ide.c, line 162, on version 2.2.17 does bad things
> > > > > to the timer. It writes 0 to the control-word for timer 0. This
> > > > > does the following:
> > > [Snipped...]
> > > >
> > > > Well, at least on 2.4.0-test9, the above timing code is #ifed to
> > > > DISK_RECOVERY_TIME > 0, which in turn is #defined to 0 in
> > > > include/linux/ide.h.
> > > >
> > > > So this is not our problem here. Anyway I guess it's time to hunt for
> > > > i8253 accesses in the kernel that lack the necessary spinlock, even though
> > > > they're probably not the cause of the problem we see here.
> > >
> > > Okay, good.
> >
> > Ok, here is a list of places within the kernel that access the PIT
> > timer, plus the method of locking (i386 arch only):
>
> [...]
>
> Ok, I just tested whether the problem was still present without
> the IDE subsystem...
>
> The answer is it is not... so it isn't an IDE problem.

Uh, guess too many negations. You wanted to say that the problem was
present even when you disabled the IDE subsystem, right?

So now it seems that possibly enough PCI traffic / busmastering traffic
can cause the problem ...

--
Vojtech Pavlik
SuSE Labs
* Re: Possible critical VIA vt82c686a chip bug (private question)
From: Yoann Vandoorselaere @ 2000-10-26 21:05 UTC
To: Vojtech Pavlik; +Cc: Richard B. Johnson, linux-kernel

Vojtech Pavlik <vojtech@suse.cz> writes:

> On Thu, Oct 26, 2000 at 10:11:54PM +0200, Yoann Vandoorselaere wrote:
>
> > > > > > ../drivers/block/ide.c, line 162, on version 2.2.17 does bad things
> > > > > > to the timer. It writes 0 to the control-word for timer 0. This
> > > > > > does the following:
> > > > [Snipped...]
> > > > >
> > > > > Well, at least on 2.4.0-test9, the above timing code is #ifed to
> > > > > DISK_RECOVERY_TIME > 0, which in turn is #defined to 0 in
> > > > > include/linux/ide.h.
> > > > >
> > > > > So this is not our problem here. Anyway I guess it's time to hunt for
> > > > > i8253 accesses in the kernel that lack the necessary spinlock, even though
> > > > > they're probably not the cause of the problem we see here.
> > > >
> > > > Okay, good.
> > >
> > > Ok, here is a list of places within the kernel that access the PIT
> > > timer, plus the method of locking (i386 arch only):
> >
> > [...]
> >
> > Ok, I just tested whether the problem was still present without
> > the IDE subsystem...
> >
> > The answer is it is not... so it isn't an IDE problem.
>
> Uh, guess too many negations. You wanted to say that the problem was
> present even when you disabled the IDE subsystem, right?

Yep.

>
> So now it seems that possibly enough PCI traffic / busmastering traffic
> can cause the problem ...

Yep, I've done:

make -j10 World

in the xfree tree and simultaneously:

while true; do make dep && make clean && make bzImage; done

in the kernel tree.

--
		-- Yoann http://www.mandrakesoft.com/~yoann/
   An engineer from NVidia, when asked to release card specs, said:
	"Actually, we do write our drivers without documentation."
* Re: Possible critical VIA vt82c686a chip bug (private question)
From: Vojtech Pavlik @ 2000-10-26 21:15 UTC
To: Yoann Vandoorselaere; +Cc: Richard B. Johnson, linux-kernel

On Thu, Oct 26, 2000 at 11:05:04PM +0200, Yoann Vandoorselaere wrote:

> Yep, I've done:
>
> make -j10 World
> in the xfree tree and simultaneously:
>
> while true; do make dep && make clean && make bzImage; done
> in the kernel tree.

Now it'd be nice to verify that the problem also happens when the system
is not running out of memory (which -j10 quite likely causes, I think) ...

--
Vojtech Pavlik
SuSE Labs
* Re: Possible critical VIA vt82c686a chip bug (private question)
From: Yoann Vandoorselaere @ 2000-10-26 21:24 UTC
To: Vojtech Pavlik; +Cc: Richard B. Johnson, linux-kernel

Vojtech Pavlik <vojtech@suse.cz> writes:

> On Thu, Oct 26, 2000 at 11:05:04PM +0200, Yoann Vandoorselaere wrote:
>
> > Yep, I've done:
> >
> > make -j10 World
> > in the xfree tree and simultaneously:
> >
> > while true; do make dep && make clean && make bzImage; done
> > in the kernel tree.
>
> Now it'd be nice to verify that the problem also happens when the system
> is not running out of memory (which -j10 quite likely causes, I think) ...

Nope, my system was loaded, but was usable
(at least until the problem occurred)...

Athlon 750 with 128 MB of RAM and 103 MB of swap.

--
		-- Yoann http://www.mandrakesoft.com/~yoann/
   An engineer from NVidia, when asked to release card specs, said:
	"Actually, we do write our drivers without documentation."
* Re: Possible critical VIA vt82c686a chip bug (private question)
From: Vojtech Pavlik @ 2000-10-26 21:25 UTC
To: Yoann Vandoorselaere; +Cc: Richard B. Johnson, linux-kernel

On Thu, Oct 26, 2000 at 11:24:38PM +0200, Yoann Vandoorselaere wrote:

> Vojtech Pavlik <vojtech@suse.cz> writes:
>
> > On Thu, Oct 26, 2000 at 11:05:04PM +0200, Yoann Vandoorselaere wrote:
> >
> > > Yep, I've done:
> > >
> > > make -j10 World
> > > in the xfree tree and simultaneously:
> > >
> > > while true; do make dep && make clean && make bzImage; done
> > > in the kernel tree.
> >
> > Now it'd be nice to verify that the problem also happens when the system
> > is not running out of memory (which -j10 quite likely causes, I think) ...
>
> Nope, my system was loaded, but was usable
> (at least until the problem occurred)...

Good to know.

--
Vojtech Pavlik
SuSE Labs
* Re: Possible critical VIA vt82c686a chip bug (private question)
From: Martin Mares @ 2000-10-27 10:02 UTC
To: Vojtech Pavlik; +Cc: Richard B. Johnson, Yoann Vandoorselaere, linux-kernel

Hi!

> So this is not our problem here. Anyway I guess it's time to hunt for
> i8253 accesses in the kernel that lack the necessary spinlock, even though
> they're probably not the cause of the problem we see here.

BTW what about trying to modify your work-around code to make it
attempt to read the timer again? This way we could test whether it was
a race condition during timer read or really the timer jumping to a bogus
value.

				Have a nice fortnight
--
Martin `MJ' Mares <mj@ucw.cz> <mj@suse.cz> http://atrey.karlin.mff.cuni.cz/~mj/
"This line is umop apisdn."
* Re: Possible critical VIA vt82c686a chip bug (private question)
From: Vojtech Pavlik @ 2000-10-27 10:49 UTC
To: Martin Mares; +Cc: Richard B. Johnson, Yoann Vandoorselaere, linux-kernel

On Fri, Oct 27, 2000 at 12:02:20PM +0200, Martin Mares wrote:

> > So this is not our problem here. Anyway I guess it's time to hunt for
> > i8253 accesses in the kernel that lack the necessary spinlock, even though
> > they're probably not the cause of the problem we see here.
>
> BTW what about trying to modify your work-around code to make it
> attempt to read the timer again? This way we could test whether it was
> a race condition during timer read or really the timer jumping to a bogus
> value.

Actually if I don't reprogram the timer (and just ignore the value, for
example), the work-around code keeps being called again and again very
often (between 1x/minute and 100x/second) after the first failure, even
when the system is idle.

When reprogramming, the next failure happens only after stressing the
system again.

So it's not just a race; the impact of the failure on the chip is
permanent and stays until it's reprogrammed.

--
Vojtech Pavlik
SuSE Labs
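The behaviour described above (detect a bogus count, reprogram, and the failure stays away until the next stress run) can be sketched as follows. This is not the actual patch, which is not shown in the thread; `FakeTimer`, `read_with_workaround()` and the reload constant of 11932 are assumptions used only to illustrate the check "a healthy down-counter never reads above its reload value".

```python
LATCH = 11932   # assumed reload value for a 100 Hz tick


class FakeTimer:
    """Stand-in for the PIT; real code would do port I/O under a lock."""

    def __init__(self, reload):
        self.reload = reload
        self.count = reload
        self.reprogrammed = 0

    def read_count(self):
        return self.count

    def program(self, reload):
        self.reload = reload
        self.count = reload
        self.reprogrammed += 1


def read_with_workaround(timer):
    """Return a sane count, reprogramming the timer if it looks corrupted."""
    count = timer.read_count()
    if count > LATCH:             # impossible while the reload is LATCH
        timer.program(LATCH)
        count = timer.read_count()
    return count


timer = FakeTimer(LATCH)

# Simulate the reported failure: the chip silently loses its programming.
timer.reload = 65535
timer.count = 40000

first = read_with_workaround(timer)    # detects and repairs
second = read_with_workaround(timer)   # healthy again, nothing to do
print(first, second, timer.reprogrammed)
```

Skipping the `program()` call and merely discarding the bad value would leave `reload` at 65535, so the check would keep firing, which matches the repeated work-around calls reported above.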
* Re: Possible critical VIA vt82c686a chip bug (private question)
From: Yoann Vandoorselaere @ 2000-10-27 10:58 UTC
To: Vojtech Pavlik; +Cc: Martin Mares, Richard B. Johnson, linux-kernel

Vojtech Pavlik <vojtech@suse.cz> writes:

> On Fri, Oct 27, 2000 at 12:02:20PM +0200, Martin Mares wrote:
>
> > > So this is not our problem here. Anyway I guess it's time to hunt for
> > > i8253 accesses in the kernel that lack the necessary spinlock, even though
> > > they're probably not the cause of the problem we see here.
> >
> > BTW what about trying to modify your work-around code to make it
> > attempt to read the timer again? This way we could test whether it was
> > a race condition during timer read or really the timer jumping to a bogus
> > value.
>
> Actually if I don't reprogram the timer (and just ignore the value, for
> example), the work-around code keeps being called again and again very
> often (between 1x/minute and 100x/second) after the first failure, even
> when the system is idle.
>
> When reprogramming, the next failure happens only after stressing the
> system again.
>
> So it's not just a race; the impact of the failure on the chip is
> permanent and stays until it's reprogrammed.

Are you sure there is not an error in the way the
chipset is programmed?

--
		-- Yoann http://www.mandrakesoft.com/~yoann/
   "Programming is a race between programmers, who try and make more and more
   idiot-proof software, and universe, which produces more and more remarkable
   idiots. Until now, universe leads the race" -- R. Cook
* Re: Possible critical VIA vt82c686a chip bug (private question)
From: Vojtech Pavlik @ 2000-10-27 11:01 UTC
To: Yoann Vandoorselaere; +Cc: Martin Mares, Richard B. Johnson, linux-kernel

On Fri, Oct 27, 2000 at 12:58:12PM +0200, Yoann Vandoorselaere wrote:

> > > > So this is not our problem here. Anyway I guess it's time to hunt for
> > > > i8253 accesses in the kernel that lack the necessary spinlock, even though
> > > > they're probably not the cause of the problem we see here.
> > >
> > > BTW what about trying to modify your work-around code to make it
> > > attempt to read the timer again? This way we could test whether it was
> > > a race condition during timer read or really the timer jumping to a bogus
> > > value.
> >
> > Actually if I don't reprogram the timer (and just ignore the value, for
> > example), the work-around code keeps being called again and again very
> > often (between 1x/minute and 100x/second) after the first failure, even
> > when the system is idle.
> >
> > When reprogramming, the next failure happens only after stressing the
> > system again.
> >
> > So it's not just a race; the impact of the failure on the chip is
> > permanent and stays until it's reprogrammed.
>
> Are you sure there is not an error in the way the
> chipset is programmed?

Which part of the chipset do you mean? The PIT (programmable interval
timer)? That one has been standard since XT times. The rest of the ISA
bridge? Maybe, but that's mostly BIOS work and shouldn't impact the PIT
under sane conditions.

--
Vojtech Pavlik
SuSE Labs
* Re: Possible critical VIA vt82c686a chip bug (private question)
From: Yoann Vandoorselaere @ 2000-10-27 11:16 UTC
To: Vojtech Pavlik; +Cc: Martin Mares, Richard B. Johnson, linux-kernel

Vojtech Pavlik <vojtech@suse.cz> writes:

> On Fri, Oct 27, 2000 at 12:58:12PM +0200, Yoann Vandoorselaere wrote:
>
> > > > > So this is not our problem here. Anyway I guess it's time to hunt for
> > > > > i8253 accesses in the kernel that lack the necessary spinlock, even though
> > > > > they're probably not the cause of the problem we see here.
> > > >
> > > > BTW what about trying to modify your work-around code to make it
> > > > attempt to read the timer again? This way we could test whether it was
> > > > a race condition during timer read or really the timer jumping to a bogus
> > > > value.
> > >
> > > Actually if I don't reprogram the timer (and just ignore the value, for
> > > example), the work-around code keeps being called again and again very
> > > often (between 1x/minute and 100x/second) after the first failure, even
> > > when the system is idle.
> > >
> > > When reprogramming, the next failure happens only after stressing the
> > > system again.
> > >
> > > So it's not just a race; the impact of the failure on the chip is
> > > permanent and stays until it's reprogrammed.
> >
> > Are you sure there is not an error in the way the
> > chipset is programmed?
>
> Which part of the chipset do you mean? The PIT (programmable interval
> timer)? That one has been standard since XT times. The rest of the ISA
> bridge? Maybe, but that's mostly BIOS work and shouldn't impact the PIT
> under sane conditions.

What is strange is that a number of people seem to be hit by this
problem... And if VIA hasn't corrected it, it's probably because
they are not aware of it...

I think that if such a problem occurred under Windows
(given the Windows user base), VIA would already have been contacted.

--
		-- Yoann http://www.mandrakesoft.com/~yoann/
   Tiniest units of measure:
     - length: millimeter
     - volume: milliliter
     - intelligence: military man
* Re: Possible critical VIA vt82c686a chip bug (private question)
From: Vojtech Pavlik @ 2000-10-27 11:15 UTC
To: Yoann Vandoorselaere; +Cc: Martin Mares, Richard B. Johnson, linux-kernel

On Fri, Oct 27, 2000 at 01:16:34PM +0200, Yoann Vandoorselaere wrote:

> > Which part of the chipset do you mean? The PIT (programmable interval
> > timer)? That one has been standard since XT times. The rest of the ISA
> > bridge? Maybe, but that's mostly BIOS work and shouldn't impact the PIT
> > under sane conditions.
>
> What is strange is that a number of people seem to be hit by this
> problem... And if VIA hasn't corrected it, it's probably because
> they are not aware of it...
>
> I think that if such a problem occurred under Windows
> (given the Windows user base), VIA would already have been contacted.

It can't happen under Windows, because the Windows timer runs at about
18 Hz (timer programmed to 65535), while Linux uses 100 Hz (timer
programmed to approx 11920). So when the timer unprograms itself to
65535 due to the bug, only Linux notices it; Windows can't.

--
Vojtech Pavlik
SuSE Labs
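The two reload values mentioned above follow from the PIT input clock. The short calculation below assumes the standard PC input frequency of 1,193,182 Hz; the variable names are illustrative only.

```python
PIT_HZ = 1_193_182            # standard PC PIT input clock, in Hz

# Reload value needed for a 100 Hz tick (what Linux programs).
linux_reload = round(PIT_HZ / 100)

# Tick rate and period if the counter runs with a reload of 65535.
max_reload = 65535
slow_rate_hz = PIT_HZ / max_reload
slow_period_ms = 1000 * max_reload / PIT_HZ

# How much longer each tick becomes if the reload jumps from one to the other.
stretch = max_reload / linux_reload

print(f"reload for 100 Hz: {linux_reload}")
print(f"rate at reload 65535: {slow_rate_hz:.2f} Hz ({slow_period_ms:.1f} ms per tick)")
print(f"tick stretch factor: {stretch:.2f}")
```

A system already running at the maximum reload sees no change when the counter falls back to it, whereas a 100 Hz system suddenly gets ticks about five and a half times too long, along with counter readings above the value it expects.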
Thread overview: 18+ messages
[not found] <20001026173244.B8290@suse.cz>
2000-10-27 12:04 ` Possible critical VIA vt82c686a chip bug (private question) bart
2000-10-27 13:41 ` Vojtech Pavlik
2000-10-28 5:39 ` TimO
[not found] <m3d7gnd31m.fsf@test1.mandrakesoft.com>
[not found] ` <Pine.LNX.3.95.1001026115039.12337A-100000@chaos.analogic.com>
2000-10-26 17:03 ` Vojtech Pavlik
2000-10-26 17:42 ` Richard B. Johnson
2000-10-26 18:02 ` Vojtech Pavlik
2000-10-26 20:11 ` Yoann Vandoorselaere
2000-10-26 20:16 ` Vojtech Pavlik
2000-10-26 21:05 ` Yoann Vandoorselaere
2000-10-26 21:15 ` Vojtech Pavlik
2000-10-26 21:24 ` Yoann Vandoorselaere
2000-10-26 21:25 ` Vojtech Pavlik
2000-10-27 10:02 ` Martin Mares
2000-10-27 10:49 ` Vojtech Pavlik
2000-10-27 10:58 ` Yoann Vandoorselaere
2000-10-27 11:01 ` Vojtech Pavlik
2000-10-27 11:16 ` Yoann Vandoorselaere
2000-10-27 11:15 ` Vojtech Pavlik