* 2.4.18 is not SMP friendly
From: devik @ 2002-07-18 10:51 UTC
To: linux-kernel
Hello,

Is someone here running 2.4.18 on PII SMP successfully?
My SMP box was happily running 2.4.3, but after the upgrade
to 2.4.18 I got 3 oopses in 4 days.

All were FS related: one during heavy access to SCSI and
IDE in parallel (I posted the ksymoops output recently but nobody
seemed interested), one while cdrecord was running in parallel
with the SCSI HDD (IDE CD writer), and the latest when trying to
mount an IDE ZIP drive with a corrupted ZIP floppy. The latest
resulted in a system panic and freeze, so no output here :(

This is like screaming into the dark, because I rebooted with
maxcpus=1 and it seems to be OK now, and I don't want to
experiment with a production server anymore.

But if someone knows the problem, I'm willing to test some
patches, hacks, etc.

Seems to me like a missing spinlock somewhere ..

thanks,
devik
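
The maxcpus=1 workaround mentioned above is a kernel boot parameter, so it has to reach the kernel through the boot loader. As a rough illustration only, assuming LILO is the boot loader (the thread does not say which one devik uses), a stanza might look like the following; the image path and label are hypothetical:

```
image=/boot/vmlinuz-2.4.18
    label=linux-1cpu
    read-only
    append="maxcpus=1"
```

After rebooting, /proc/cpuinfo should list a single processor if the option took effect.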

* Re: 2.4.18 is not SMP friendly
From: Alan Cox @ 2002-07-18 13:45 UTC
To: devik; +Cc: linux-kernel

On Thu, 2002-07-18 at 11:51, devik wrote:
> Is someone here running 2.4.18 on PII SMP successfully ?

PPro in my case but yes. 2.4.18 ought to be pretty solid except for some
annoying bugs you'll only hit if you use smbfs.

* Re: 2.4.18 is not SMP friendly
From: Tommy Faasen @ 2002-07-18 13:53 UTC
To: devik, linux-kernel

> On Thu, 2002-07-18 at 11:51, devik wrote:
>> Is someone here running 2.4.18 on PII SMP successfully ?

No problems on my side, on 2.4.18 and 2.4.18-wolk-3.5rc3.

> PPro in my case but yes. 2.4.18 ought to be pretty solid except for some
> annoying bugs you'll only hit if you use smbfs.
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel"
> in the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

* Re: 2.4.18 is not SMP friendly
From: mbs @ 2002-07-18 14:13 UTC
To: Tommy Faasen, devik, linux-kernel

I've had problems w/P4 SMP on 2.4.18 and RH2.4.18-3 where after a while
(30-40 min after boot) it would slow to a crawl, and the disk would be
constantly going, but CPU usage would be ~0%, with 2 gigs of RAM and
nothing running (not even X)....

RH2.4.18-5 does not seem to have the problem.

On Thursday 18 July 2002 09:53, Tommy Faasen wrote:
> > On Thu, 2002-07-18 at 11:51, devik wrote:
> >> Is someone here running 2.4.18 on PII SMP successfully ?
>
> No problems on my side, on 2.4.18 and 2.4.18-wolk-3.5rc3.
>
> > PPro in my case but yes. 2.4.18 ought to be pretty solid except for some
> > annoying bugs you'll only hit if you use smbfs.

--
/**************************************************
** Mark Salisbury       ||      mbs@mc.com       **
** If you would like to sponsor me for the       **
** Mass Getaway, a 150 mile bicycle ride for     **
** MS, contact me to donate by cash or check or  **
** click the link below to donate by credit card **
**************************************************/
https://www.nationalmssociety.org/pledge/pledge.asp?participantid=86736

* Re: 2.4.18 is not SMP friendly
From: Chris Ricker @ 2002-07-18 15:36 UTC
To: Alan Cox; +Cc: devik, linux-kernel

On 18 Jul 2002, Alan Cox wrote:
> On Thu, 2002-07-18 at 11:51, devik wrote:
> > Is someone here running 2.4.18 on PII SMP successfully ?
>
> PPro in my case but yes. 2.4.18 ought to be pretty solid except for some
> annoying bugs you'll only hit if you use smbfs.

Or if you use data=journal w/ ext3....

later,
chris

* Re: 2.4.18 is not SMP friendly
From: devik @ 2002-07-18 15:44 UTC
To: Alan Cox; +Cc: linux-kernel

Hi,

Yes, I use smbfs. Regarding my oops report, is there a known bug where
a waitqueue would be corrupted? When I analyzed it I found that the
invalid address 8bd4189c was loaded from the task_list pointer in
wait_queue_head_t (sched.c, __wake_up_common, line "p = curr->task").
The wakeup was called from get_new_inode, and it seems as if the list
of tasks was not initialized or what :(

thanks,
devik

On 18 Jul 2002, Alan Cox wrote:
> On Thu, 2002-07-18 at 11:51, devik wrote:
> > Is someone here running 2.4.18 on PII SMP successfully ?
>
> PPro in my case but yes. 2.4.18 ought to be pretty solid except for some
> annoying bugs you'll only hit if you use smbfs.
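
For readers who do not have sched.c at hand, the code path devik describes can be modelled in userspace. The following is only a minimal sketch, not the kernel source: the struct and function names loosely mirror the 2.4 wait queue types, `struct task` is a hypothetical stand-in for task_struct, and locking, task state and the exclusive-wakeup logic are all left out. It shows why the walk depends on the queue head having been initialized to point at itself.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's intrusive doubly linked list. */
struct list_head { struct list_head *next, *prev; };

/* Hypothetical stand-in for task_struct. */
struct task { int pid; int woken; };

/* Models wait_queue_t: one sleeping task linked into a queue. */
struct wait_queue {
    struct task *task;
    struct list_head task_list;
};

/* Models wait_queue_head_t: the list the wakeup code walks. */
struct wait_queue_head { struct list_head task_list; };

/* Recover the enclosing struct from a pointer to its list member. */
#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* An empty queue is a head whose next and prev point back at itself. */
void init_waitqueue_head(struct wait_queue_head *q)
{
    q->task_list.next = q->task_list.prev = &q->task_list;
}

/* Link a waiter in directly after the head. */
void add_wait_queue(struct wait_queue_head *q, struct wait_queue *w)
{
    w->task_list.next = q->task_list.next;
    w->task_list.prev = &q->task_list;
    q->task_list.next->prev = &w->task_list;
    q->task_list.next = &w->task_list;
}

/* Walk the queue and mark every waiter woken; returns the count. */
int wake_up_all(struct wait_queue_head *q)
{
    struct list_head *tmp;
    int n = 0;

    /* Catches a zeroed head only; a head holding garbage pointers
     * passes this check and the loop below dereferences them. */
    assert(q->task_list.next != NULL && q->task_list.prev != NULL);

    for (tmp = q->task_list.next; tmp != &q->task_list; tmp = tmp->next) {
        struct wait_queue *curr =
            list_entry(tmp, struct wait_queue, task_list);
        struct task *p = curr->task;  /* the load devik saw faulting */
        p->woken = 1;
        n++;
    }
    return n;
}
```

With an initialized head the loop ends as soon as it gets back to the head, so an empty queue wakes nobody. If the head was never initialized, or was overwritten by a race, `tmp` starts from an arbitrary address and `curr->task` yields a bogus pointer, which would be consistent with the invalid address in the oops. Whether that is what actually happened on devik's machine is not established anywhere in this thread.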

* Re: 2.4.18 is not SMP friendly
From: J.A. Magallon @ 2002-07-18 14:10 UTC
To: devik; +Cc: linux-kernel

On 2002.07.18 devik wrote:
>Hello,
>
>Is someone here running 2.4.18 on PII SMP successfully ?
>My SMP box was happily running 2.4.3 but after upgrade
>to 2.4.18 I got 3 oopses in 4 days.

Solid as a rock on dual PII@400. Also on a dual Xeon and on a bunch of
dual PIII boxes. I even run jam kernels built with gcc 3.1.1, but when I
get into trouble 2.4.18 is there.

--
J.A. Magallon             \   Software is like sex: It's better when it's free
mailto:jamagallon@able.es  \                    -- Linus Torvalds, FSF T-shirt
Linux werewolf 2.4.19-rc2-jam1, Mandrake Linux 8.3 (Cooker) for i586
gcc (GCC) 3.1.1 (Mandrake Linux 8.3 3.1.1-0.7mdk)

* SMP & MCE [Was: 2.4.18 is not SMP friendly]
From: Mika Liljeberg @ 2002-07-18 15:10 UTC
To: devik; +Cc: linux-kernel

On Thu, 2002-07-18 at 13:51, devik wrote:
> Hello,
>
> Is someone here running 2.4.18 on PII SMP successfully ?
> My SMP box was happily running 2.4.3 but after upgrade
> to 2.4.18 I got 3 oopses in 4 days.

2 x PII (Deschutes, dA0 core). So far so good, uptime nearly 2 days now.

In fact, I'm starting to have a glimmer of hope that I might finally
have licked (fingers crossed) a really ugly system freeze problem which
has been bugging me ever since I moved on from 2.4.0-test9 [solid freeze
in less than 24 hours, on average]. I have tried numerous kernels after
that, none of them helped. Not one.

Well, a few days ago I got a Machine Check Exception in the log file,
basically complaining about a catastrophic memory system inconsistency.
First time I ever saw this, despite hundreds of lockups. I thought,
whaddaya know, maybe it really is a hardware problem. So how come
2.4.0-test9 and older kernels appear to work ok?

[You might ask why I'm not running a kernel that I know is more stable.
Well, my home system is not that important and I've sort of learned to
live with the lockups. I usually shut it down for the night, so the
average uptime is good enough most days. It really is no worse than
trying to run Win98, and ext3 does help a lot.]

Anyway, I had already resigned to my fate, but now I decided to
investigate again. It turns out that Machine Check Exceptions were, for
the very first time, enabled by default in 2.4.0-test10. Also, it turns
out that the PII has a surprising number of Errata related to SMP and
MCEs. Almost all of them lead to a catastrophic failure and CPU
shutdown. Correct execution of the MCE handler is not guaranteed either.
Exactly the kind of behaviour I have been seeing.

Coincidence? Maybe. It's the only hypothesis I've got, so I'm putting it
to the test.

According to the PII errata, some of the lockups could be eliminated by
simply not enabling MCE at all. Unfortunately, this is not true for all
of them. Besides, there appear to be other SMP related ones that are
really ugly and completely unrelated to MCE. The worst of the errata
could, however, be worked around with a BIOS patch (i.e., microcode
update). Fat chance. It turns out my mobo vendor never bothered to put
most of the IA32 microcode updates into the BIOS (thanks a lot
Giga-Byte!).

Anyway, I'm now running 2.4.18 with the machine check exceptions
disabled. I've also compiled the microcode upgrade driver into the
kernel and upgrade the microcode on both CPUs during Linux boot. Maybe
it helps.

I hope this tirade is useful to someone who is suffering from mysterious
lockups or strange MCEs. Mostly I'm just happy that I have finished it
and my machine is still running.

Cheers,

MikaL
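
Mika does not say how machine check exceptions were disabled. One route that should be available on 2.4 i386 kernels, assuming the `nomce` boot option listed in Documentation/kernel-parameters.txt, is to pass it on the kernel command line. The LILO stanza below is purely illustrative; the image path and label are hypothetical, and other boot loaders take the same option in their own syntax:

```
image=/boot/vmlinuz-2.4.18
    label=linux-nomce
    read-only
    append="nomce"
```

This only stops the kernel from enabling MCE reporting. As the message above notes, it does not address the errata that are unrelated to MCE, which is why the microcode update is applied as well.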