* kernel freeze on 2.4.32, apparently in cached_lookup
From: Chris Lightfoot @ 2006-01-24 17:49 UTC
To: linux-kernel

I have a Pentium 4 machine running stock kernel 2.4.32
with ext3 on LVM on software RAID-1. HIMEM is enabled and
the machine has 3GB of RAM. Various details of the machine
and kernel as here:

    http://ex-parrot.com/~chris/tmp/20060124/caesious-.config
    http://ex-parrot.com/~chris/tmp/20060124/caesious-cpuinfo
    http://ex-parrot.com/~chris/tmp/20060124/caesious-lsmod
    http://ex-parrot.com/~chris/tmp/20060124/caesious-lspci

Occasionally -- often when running updatedb or another
disk-heavy cron job, but sometimes during normal use of
the machine -- the machine freezes up almost entirely
(mouse pointer stops working, ditto VC switching, no
console output if on the text console, SSH sessions
freeze, but network packet forwarding and NAT still work).
There's no output on the VGA console and the machine
doesn't respond to Ctrl-Alt-Sysrq, but does respond to
break+... on the serial console. That gives sysrq-p output
like this, from the most recent freeze:

    SysRq : Show Regs
    Pid: 30641, comm: updatedb
    EIP: 0010:d_lookup+63/110 CPU: 0 EFLAGS: 00000287 Tainted: P
    EAX: c8632710 EBX: c8632700 ECX: 00000012 EDX: 13fe1842
    ESI: d373b000 EDI: 0003ffff EBP: ea93bedc DS: 0018 ES: 0018
    CR0: 8005003b CR2: 080a4094 CR3: 2965b000 CR4: 000006d0
    Call Trace: cached_lookup+11/50 link_path_walk+63b/900 vfs_permission+79/120 path_lookup+1e/30 __user_walk+2b/50 sys_lstat64+17/70 system_call+33/38

-- repeating sysrq+p suggests that the kernel is stuck in
d_lookup:

    http://ex-parrot.com/~chris/tmp/20060124/caesious-regs-symbols

There's no oops or other message logged.

(I'm running a uniprocessor kernel -- the SMP kernel also
freezes under similar circumstances, and I wanted to
eliminate the SMP code as a source of problems.)

Does this look like a known problem? If not, what should I
do next to track down the problem? In particular, what
other information should I try to collect next time it
freezes?

(Please cc replies to me if possible....)

-- 
Q. Can I make copies of the copyright form?
(US Copyright Office FAQ)
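[Editorial sketch, not part of the original message.] The report relies on repeated sysrq-p samples all landing in d_lookup. When many such samples are captured from a serial console log, pulling the symbol out of each "EIP:" line makes that pattern easy to tally. The helper below is a minimal sketch in C; the name parse_eip and the SYM_MAX limit are hypothetical, and it assumes the line has the symbol+offset/size shape shown above, with offset and size in hexadecimal.

```c
#include <assert.h>
#include <stdio.h>

#define SYM_MAX 64  /* hypothetical upper bound on symbol name length */

/*
 * Parse a line such as
 *   "EIP: 0010:d_lookup+63/110 CPU: 0 EFLAGS: 00000287 Tainted: P"
 * into the symbol name, the offset into it, and the function size.
 * Offset and size are read as hexadecimal (an assumption about the
 * symbolised dump format). Returns 1 on success, 0 otherwise.
 */
static int parse_eip(const char *line, char sym[SYM_MAX],
                     unsigned int *offset, unsigned int *size)
{
    assert(line != NULL && sym != NULL && offset != NULL && size != NULL);

    /* %*x skips the code segment selector; %63[^+] stops at the '+'. */
    return sscanf(line, " EIP: %*x:%63[^+]+%x/%x", sym, offset, size) == 3;
}
```

Feeding every "EIP:" line from a capture through this helper and counting the symbol names gives a rough profile of where the CPU is spinning. Lines that do not match the expected shape are rejected rather than guessed at.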
* Re: kernel freeze on 2.4.32, apparently in cached_lookup
From: Willy Tarreau @ 2006-01-24 21:13 UTC
To: Chris Lightfoot; +Cc: linux-kernel

Hi,

On Tue, Jan 24, 2006 at 05:49:28PM +0000, Chris Lightfoot wrote:
> I have a Pentium 4 machine running stock kernel 2.4.32
> with ext3 on LVM on software RAID-1. HIMEM is enabled and
> the machine has 3GB of RAM. Various details of the machine
> and kernel as here:
>
>     http://ex-parrot.com/~chris/tmp/20060124/caesious-.config
>     http://ex-parrot.com/~chris/tmp/20060124/caesious-cpuinfo
>     http://ex-parrot.com/~chris/tmp/20060124/caesious-lsmod
>     http://ex-parrot.com/~chris/tmp/20060124/caesious-lspci
>
> Occasionally -- often when running updatedb or another
> disk-heavy cron job, but sometimes during normal use of
> the machine -- the machine freezes up almost entirely
> (mouse pointer stops working, ditto VC switching, no
> console output if on the text console, SSH sessions
> freeze, but network packet forwarding and NAT still work).
> There's no output on the VGA console and the machine
> doesn't respond to Ctrl-Alt-Sysrq, but does respond to
> break+... on the serial console. That gives sysrq-p output
> like this, from the most recent freeze:
>
>     SysRq : Show Regs
>     Pid: 30641, comm: updatedb
>     EIP: 0010:d_lookup+63/110 CPU: 0 EFLAGS: 00000287 Tainted: P
>     EAX: c8632710 EBX: c8632700 ECX: 00000012 EDX: 13fe1842
>     ESI: d373b000 EDI: 0003ffff EBP: ea93bedc DS: 0018 ES: 0018
>     CR0: 8005003b CR2: 080a4094 CR3: 2965b000 CR4: 000006d0
>     Call Trace: cached_lookup+11/50 link_path_walk+63b/900 vfs_permission+79/120 path_lookup+1e/30 __user_walk+2b/50 sys_lstat64+17/70 system_call+33/38
>
> -- repeating sysrq+p suggests that the kernel is stuck in
> d_lookup:
>
>     http://ex-parrot.com/~chris/tmp/20060124/caesious-regs-symbols
>
> There's no oops or other message logged.
>
> (I'm running a uniprocessor kernel -- the SMP kernel also
> freezes under similar circumstances, and I wanted to
> eliminate the SMP code as a source of problems.)
>
> Does this look like a known problem? If not, what should I
> do next to track down the problem? In particular, what
> other information should I try to collect next time it
> freezes?

It seems a little weird. I've never seen such a case yet, but
found a few ones looking like yours, but there is nothing common
between them (various FS, +/- highmem, ...) and all of them only
report oops or panics. No interesting response anyway.

What seems strange in your report is that the kernel freezes.
The only part in cached_lookup() which could freeze IMHO is
when it calls d_lookup(), but for this, you should have a
closed loop instead of a linked list. It could happen with
some memory corruption, but you would get far more oopses
and panics than freezes. For this reason, I believe you
might have some random problem on your filesystem. Could
you run a full fsck on it ?

If it does not find anything, probably that a night-long memtest
will give us some indications.

> (Please cc replies to me if possible....)

Regards,
Willy
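[Editorial sketch, not part of the original message.] The "closed loop instead of a linked list" point can be illustrated with a simplified model. It assumes the hash chain is a circular list whose walk ends only when the cursor returns to the bucket head. If corruption makes some node point back into the chain without passing through the head, that end condition is never met and the walk spins forever. The types and names below (chain_node, bounded_lookup, max_steps) are hypothetical and this is not the kernel's d_lookup(). The step bound is added purely so that the model can report the condition instead of hanging.

```c
#include <assert.h>
#include <stddef.h>

enum walk_result { WALK_FOUND, WALK_NOT_FOUND, WALK_LOOP_SUSPECTED };

/* Hypothetical, simplified stand-in for one entry on a hash chain. */
struct chain_node {
    struct chain_node *next;
    unsigned int key;
};

/*
 * Walk a circular chain that starts and ends at the sentinel `head`.
 * A healthy chain always leads back to `head`, which ends the walk.
 * If a node's `next` has been corrupted so that the chain closes on
 * itself without passing through `head`, an unbounded walk would
 * never terminate. Here the walk gives up after `max_steps` nodes.
 */
static enum walk_result bounded_lookup(const struct chain_node *head,
                                       unsigned int key, size_t max_steps)
{
    const struct chain_node *cur;
    size_t steps;

    assert(head != NULL);
    cur = head->next;
    for (steps = 0; steps < max_steps; steps++) {
        if (cur == head)
            return WALK_NOT_FOUND;      /* back at the bucket head */
        if (cur->key == key)
            return WALK_FOUND;
        cur = cur->next;
    }
    return WALK_LOOP_SUSPECTED;         /* never got back to head */
}
```

In this model a lookup for a key that is present still succeeds on a corrupted chain. Only a lookup that has to run off the end of the chain gets stuck, which would fit a freeze that shows up occasionally rather than on every path walk.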
* Re: kernel freeze on 2.4.32, apparently in cached_lookup
From: Chris Lightfoot @ 2006-01-25 1:54 UTC
To: Willy Tarreau; +Cc: linux-kernel

On Tue, Jan 24, 2006 at 10:13:12PM +0100, Willy Tarreau wrote:
[...]
> What seems strange in your report is that the kernel freezes.
> The only part in cached_lookup() which could freeze IMHO is
> when it calls d_lookup(), but for this, you should have a
> closed loop instead of a linked list. It could happen with
> some memory corruption, but you would get far more oopses
> and panics than freezes. For this reason, I believe you
> might have some random problem on your filesystem. Could
> you run a full fsck on it ?

fsck finds the filesystem is clean; I ran memtest overnight
when I built the machine and it didn't find anything.

Nick's suggestion that it could be a temperature problem
is also interesting; I've added another fan to the machine
and I'll see if that helps matters; if not I'll try
memtest again.

-- 
``It's not a bomb. It's a device that explodes.''
(possibly-apocryphal statement by French spokesman, before the 1995
nuclear tests)