public inbox for linux-kernel@vger.kernel.org
* 2.4.19 crash
@ 2002-08-07 18:17 Michal Illich
  2002-08-07 22:04 ` Alan Cox
  0 siblings, 1 reply; 6+ messages in thread
From: Michal Illich @ 2002-08-07 18:17 UTC (permalink / raw)
  To: linux-kernel

Hello,

	I want to report multiple crashes while using the latest stable kernel. The 
message it gives is:

--------------------
Unable to handle kernel paging request at virtual address 45ca6234
  printing eip:
c0128c80
*pde = 00000000
Oops: 0000
CPU:    0
EIP:    0010:[<c0128c80>]    Not tainted
EFLAGS: 00010813
eax: 999f7887   ebx: c2217e50   ecx: df4c8000   edx: f762e680
esi: 00000246   edi: cfbc4380   ebp: 000000f0   esp: ddff7de8
ds: 0018   es: 0018   ss: 0018
Process NameOfProccessWhichWasRunning (pid: 2133, stackpage=ddff7000)
Stack: 00000000 ecb0109c 009bb6d0 e55216c0 00000000 00000000 00001000 00000001
        c0132804 c2217e50 000000f0 c01328b6 00000001 df4c8940 04a87241 00000000
        0000a324 c212b320 00000341 0000a325 c212b320 c0132ae6 c212b320 00001000
Call Trace:    [<c0132804>] [<c01328b6>] [<c0132ae6>] [<c0133158>] [<c01239ad>]
   [<c0123a47>] [<c01534f0>] [<c0124035>] [<c0124288>] [<c0126408>] [<c01247fa>]
--------------------
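
For readers decoding the report above: a minimal sketch of what the "Oops: 0000" field means for an "Unable to handle kernel paging request" message. It assumes the standard i386 page-fault error-code bits (bit 0: protection fault vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode). The function name is hypothetical.

```python
# Decode the i386 page-fault error code printed as "Oops: NNNN".
# Assumption: standard x86 page-fault error-code bits 0..2 as described above.

def decode_oops_code(code_hex):
    """Return a dict describing the low three bits of a page-fault error code."""
    code = int(code_hex, 16)
    return {
        "cause": "protection violation" if code & 1 else "page not present",
        "access": "write" if code & 2 else "read",
        "mode": "user" if code & 4 else "kernel",
    }

# The value from the report: a kernel-mode read of a page that is not mapped.
print(decode_oops_code("0000"))
```

Read this way, the oops is a kernel-mode read from the unmapped address 45ca6234, which agrees with the "*pde = 00000000" line (no page directory entry for that address).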

	After that the machine refuses to do anything useful (probably anything that 
needs to fork a new thread), but ping still works and some applications keep 
partially running (e.g. Apache serves static pages, but not CGI). No user can 
log in.

	It prints the message above multiple times, probably at each attempt to do 
something. Sometimes (not the first time) it also writes "Bug: ... twice..." 
(sorry, I don't remember the exact text; it was not written to the messages log).

	The machine uses large blocks of shared memory (hundreds of MB, usually 
larger than the default SHMMAX, which is now set to 256MB). The crash happened 
while processing large amounts of data (a few GB, with hundreds of MB of RAM 
in use).
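
As a rough illustration of the SHMMAX setting mentioned above, here is a small sketch that reads the current limit from /proc/sys/kernel/shmmax and checks whether a segment of a given size fits under it. The helper names and the 200MB/300MB example sizes are hypothetical; only the 256MB figure comes from the report.

```python
MB = 1024 * 1024

def read_shmmax(path="/proc/sys/kernel/shmmax"):
    """Return the current SHMMAX limit in bytes, as exposed by the kernel."""
    with open(path) as f:
        return int(f.read().strip())

def fits_in_one_segment(size_bytes, shmmax_bytes):
    """True if a single shared memory segment of size_bytes is allowed."""
    return size_bytes <= shmmax_bytes

# 256MB is the value quoted in the report; on a live system use read_shmmax().
shmmax = 256 * MB
for size in (200 * MB, 300 * MB):  # hypothetical segment sizes
    print(size // MB, "MB fits:", fits_in_one_segment(size, shmmax))
```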

	Other info: Red Hat Linux 7.3, kernel 2.4.19 from kernel.org, a single Athlon 
XP; the kernel is compiled with the Athlon processor option and the "4GB" 
memory setting.

	If you need more information, feel free to ask me. Do you know what 
happened and whether it can be fixed?

	Thanks,

Michal Illich







* Re: 2.4.19 crash
@ 2002-08-07 19:50 Michal Illich
  0 siblings, 0 replies; 6+ messages in thread
From: Michal Illich @ 2002-08-07 19:50 UTC (permalink / raw)
  To: linux-kernel

output from ksymoops:

-------------------------------------------
ksymoops 2.4.4 on i686 2.4.19.  Options used
      -V (default)
      -k /proc/ksyms (default)
      -l /proc/modules (default)
      -o /lib/modules/2.4.19/ (default)
      -m /boot/System.map-2.4.19 (default)

Warning: You did not tell me where to find symbol information. [...]

No modules in ksyms, skipping objects
Warning (read_lsmod): no symbols in lsmod, is /proc/modules a valid lsmod file?
Aug  7 09:47:16 [...] kernel: Unable to handle kernel paging request at virtual address 45ca6234
Aug  7 09:47:16 [...] kernel: c0128c80
Aug  7 09:47:16 [...] kernel: *pde = 00000000
Aug  7 09:47:16 [...] kernel: Oops: 0000
Aug  7 09:47:16 [...] kernel: CPU:    0
Aug  7 09:47:16 [...] kernel: EIP:    0010:[<c0128c80>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
Aug  7 09:47:16 [...] kernel: EFLAGS: 00010813
Aug  7 09:47:16 [...] kernel: eax: 999f7887   ebx: c2217e50   ecx: df4c8000   edx: f762e680
Aug  7 09:47:16 [...] kernel: esi: 00000246   edi: cfbc4380   ebp: 000000f0   esp: ddff7de8
Aug  7 09:47:16 [...] kernel: ds: 0018   es: 0018   ss: 0018
Aug  7 09:47:16 [...] kernel: Process [...] (pid: 2133, stackpage=ddff7000)
Aug  7 09:47:16 [...] kernel: Stack: 00000000 ecb0109c 009bb6d0 e55216c0 00000000 00000000 00001000 00000001
Aug  7 09:47:16 [...] kernel:        c0132804 c2217e50 000000f0 c01328b6 00000001 df4c8940 04a87241 00000000
Aug  7 09:47:16 [...] kernel:        0000a324 c212b320 00000341 0000a325 c212b320 c0132ae6 c212b320 00001000
Aug  7 09:47:16 [...] kernel: Call Trace:    [<c0132804>] [<c01328b6>] [<c0132ae6>] [<c0133158>] [<c01239ad>]
Aug  7 09:47:16 [...] kernel:   [<c0123a47>] [<c01534f0>] [<c0124035>] [<c0124288>] [<c0126408>] [<c01247fa>]
Warning (Oops_read): Code line not seen, dumping what data is available

 >>EIP; c0128c80 <kfree+30/a0>   <=====
Trace; c0132804 <block_read_full_page+c4/230>
Trace; c01328b6 <block_read_full_page+176/230>
Trace; c0132ae6 <cont_prepare_write+86/240>
Trace; c0133158 <generic_direct_IO+48/160>
Trace; c01239ad <___wait_on_page+2d/b0>
Trace; c0123a47 <unlock_page+17/70>
Trace; c01534f0 <ext3_block_truncate_page+100/360>
Trace; c0124035 <do_generic_file_read+b5/420>
Trace; c0124288 <do_generic_file_read+308/420>
Trace; c0126408 <remove_suid+58/59>
Trace; c01247fa <sys_sendfile+1a/200>

3 warnings issued.  Results may not be reliable.
-----------------------------------------------
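
To make the ">>EIP;" and "Trace;" lines above easier to read: ksymoops prints each address as symbol+offset/size, so the start address of each function can be recovered by subtracting the offset. The sketch below parses that notation; the regular expression and function name are illustrative assumptions, not part of ksymoops.

```python
import re

# Matches lines such as "Trace; c0132804 <block_read_full_page+c4/230>".
LINE_RE = re.compile(
    r"^\s*(?:>>EIP|Trace);\s+([0-9a-f]{8})\s+<([^+>]+)\+([0-9a-f]+)/([0-9a-f]+)>"
)

def parse_ksymoops_line(line):
    """Return symbol, address, function base, offset and size, or None."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    addr = int(m.group(1), 16)
    offset = int(m.group(3), 16)
    size = int(m.group(4), 16)
    return {"symbol": m.group(2), "addr": addr, "base": addr - offset,
            "offset": offset, "size": size}

sample = [
    " >>EIP; c0128c80 <kfree+30/a0>   <=====",
    "Trace; c0132804 <block_read_full_page+c4/230>",
    "Trace; c0126408 <remove_suid+58/59>",
]
for line in sample:
    info = parse_ksymoops_line(line)
    print("%-22s base=%08x offset=0x%x of 0x%x"
          % (info["symbol"], info["base"], info["offset"], info["size"]))
```

The recovered base addresses should match the entries for the same symbols in System.map, which is one way to check that the map file belongs to the running kernel.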

Both lsmod and ksyms produce no output. I think this is OK in the case of 
lsmod, because everything is compiled in. I don't know whether it is OK for 
ksyms; if you need it, I can experiment a little more with that...

The [...] markers are my edits.



* Re: 2.4.19 crash
  2002-08-07 18:17 Michal Illich
@ 2002-08-07 22:04 ` Alan Cox
  0 siblings, 0 replies; 6+ messages in thread
From: Alan Cox @ 2002-08-07 22:04 UTC (permalink / raw)
  To: Michal Illich; +Cc: linux-kernel

On Wed, 2002-08-07 at 19:17, Michal Illich wrote:
> 	I want to report multiple crashes while using the latest stable kernel. The 
> message it gives is:
> 
> --------------------
> Unable to handle kernel paging request at virtual address 45ca6234
>   printing eip:

See REPORTING-BUGS in the kernel source tree, and run the first oops it
logged through the ksymoops tool. That'll make it a lot easier to see
what happened.



* 2.4.19 crash
@ 2002-08-09 10:02 Michal Illich
  2002-08-09 12:01 ` Alan Cox
  0 siblings, 1 reply; 6+ messages in thread
From: Michal Illich @ 2002-08-09 10:02 UTC (permalink / raw)
  To: linux-kernel

Happened again, now with "kernel BUG at buffer.c:510!" message.
Everything same as before, except:
(a) kernel was recompiled without high memory option
(b) machine was running but not responding at all (no ping)
Seems quite serious to me, any ideas?

--------------------------------------------------------------------------

ksymoops 2.4.4 on i686 2.4.19.  Options used
      -V (default)
      -k /proc/ksyms (default)
      -l /proc/modules (default)
      -o /lib/modules/2.4.19/ (default)
      -m /boot/System.map-2.4.19 (default)

Warning: You did not tell me where to find symbol information. [...]

No modules in ksyms, skipping objects
Warning (read_lsmod): no symbols in lsmod, is /proc/modules a valid lsmod file?
Aug  9 10:11:47 [...] kernel: kernel BUG at buffer.c:510!
Aug  9 10:11:47 [...] kernel: invalid operand: 0000
Aug  9 10:11:47 [...] kernel: CPU:    0
Aug  9 10:11:47 [...] kernel: EIP:    0010:[<c01313ee>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
Aug  9 10:11:47 [...] kernel: EFLAGS: 00010286
Aug  9 10:11:47 [...] kernel: eax: df5ea969   ebx: 00000002   ecx: d50c0040   edx: c02d2794
Aug  9 10:11:47 [...] kernel: esi: d50c0040   edi: 00000001   ebp: d50c0040   esp: e5005e68
Aug  9 10:11:47 [...] kernel: ds: 0018   es: 0018   ss: 0018
Aug  9 10:11:47 [...] kernel: Process [...] (pid: 14320, stackpage=e5005000)
Aug  9 10:11:47 [...] kernel: Stack: 00000002 c0131c93 d50c0040 00000002 d757b300 00000000 c0132493 d50c0040
Aug  9 10:11:47 [...] kernel:        d50c0040 00001000 c0131ca9 d50c0040 c01326f6 d50c0040 00001000 00000000
Aug  9 10:11:47 [...] kernel:        00001000 0b2ae000 00000000 d757b300 c0132d32 d757b300 c144fb10 00000000
Aug  9 10:11:47 [...] kernel: Call Trace:    [<c0131c93>] [<c0132493>] [<c0131ca9>] [<c01326f6>] [<c0132d32>]
Aug  9 10:11:47 [...] kernel:   [<c0153045>] [<c01528a0>] [<c01261b8>] [<c011b5dd>] [<c0130175>] [<c011790b>]
Aug  9 10:11:47 [...] kernel:   [<c010878b>]
Aug  9 10:11:48 [...] kernel: Code: 0f 0b fe 01 ba 2f 24 c0 8b 02 85 c0 75 07 89 0a 89 49 24 8b

 >>EIP; c01313ee <__insert_into_lru_list+1e/70>   <=====
Trace; c0131c93 <__refile_buffer+53/60>
Trace; c0132493 <__block_prepare_write+103/2e0>
Trace; c0131ca9 <refile_buffer+9/10>
Trace; c01326f6 <__block_commit_write+86/d0>
Trace; c0132d32 <generic_commit_write+32/60>
Trace; c0153045 <ext3_commit_write+135/1c0>
Trace; c01528a0 <ext3_get_block+0/60>
Trace; c01261b8 <generic_file_write+4e8/6e0>
Trace; c011b5dd <do_timer+3d/70>
Trace; c0130175 <sys_write+95/f0>
Trace; c011790b <sys_gettimeofday+1b/a0>
Trace; c010878b <system_call+33/38>
Code;  c01313ee <__insert_into_lru_list+1e/70>
00000000 <_EIP>:
Code;  c01313ee <__insert_into_lru_list+1e/70>   <=====
    0:   0f 0b                     ud2a      <=====
Code;  c01313f0 <__insert_into_lru_list+20/70>
    2:   fe 01                     incb   (%ecx)
Code;  c01313f2 <__insert_into_lru_list+22/70>
    4:   ba 2f 24 c0 8b            mov    $0x8bc0242f,%edx
Code;  c01313f7 <__insert_into_lru_list+27/70>
    9:   02 85 c0 75 07 89         add    0x890775c0(%ebp),%al
Code;  c01313fd <__insert_into_lru_list+2d/70>
    f:   0a 89 49 24 8b 00         or     0x8b2449(%ecx),%cl

Aug  9 10:11:48 [...] kernel:  <1>Unable to handle kernel paging request at virtual address 546a53b8

2 warnings issued.  Results may not be reliable.
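
A note on the "Code:" bytes above, as a hedged sketch: assuming the 2.4 i386 BUG() layout (a ud2 instruction, 0f 0b, followed by a 16-bit line number and a 32-bit pointer to the file name), the bytes after ud2 are data rather than instructions, so the "incb" and "mov" lines in the disassembly are likely an artifact. The function name below is hypothetical.

```python
import struct

def decode_bug_site(code_bytes):
    """Split an assumed 2.4 i386 BUG() site into (line number, file-name pointer)."""
    # Little-endian: 2 opcode bytes, 16-bit line, 32-bit pointer (no padding).
    opcode, line, file_ptr = struct.unpack_from("<2sHI", code_bytes)
    if opcode != b"\x0f\x0b":
        raise ValueError("code does not start with ud2 (0f 0b)")
    return line, file_ptr

# First eight bytes of the "Code:" line from the oops.
code = bytes.fromhex("0f 0b fe 01 ba 2f 24 c0")
line, file_ptr = decode_bug_site(code)
print("line %d, file-name pointer 0x%08x" % (line, file_ptr))
```

The decoded line number is 510, which agrees with the "kernel BUG at buffer.c:510!" message and supports the assumed layout.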



* Re: 2.4.19 crash
  2002-08-09 10:02 Michal Illich
@ 2002-08-09 12:01 ` Alan Cox
  2002-08-09 12:13   ` Michal Illich
  0 siblings, 1 reply; 6+ messages in thread
From: Alan Cox @ 2002-08-09 12:01 UTC (permalink / raw)
  To: Michal Illich; +Cc: linux-kernel

On Fri, 2002-08-09 at 11:02, Michal Illich wrote:
> Happened again, now with "kernel BUG at buffer.c:510!" message.
> Everything same as before, except:
> (a) kernel was recompiled without high memory option
> (b) machine was running but not responding at all (no ping)
> Seems quite serious to me, any ideas?

Looks very much like random memory corruption. Could be many things.
Does the box pass memtest86 ?



* Re: 2.4.19 crash
  2002-08-09 12:01 ` Alan Cox
@ 2002-08-09 12:13   ` Michal Illich
  0 siblings, 0 replies; 6+ messages in thread
From: Michal Illich @ 2002-08-09 12:13 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel

>>Happened again, now with "kernel BUG at buffer.c:510!" message.
> Looks very much like random memory corruption. Could be many things.
> Does the box pass memtest86 ?

It just passed memtest-2.93.1 (I don't have physical access to the machine to 
run a standalone test). The BIOS memory check also runs OK. The machine usually 
crashes after 1-5 days; in the meantime all programs run perfectly.

---------------------------------------------------- memtest.log:

Testing 469684224 bytes at 0x42136000 (4088 bytes lost to page alignment).

Run    1:
   Test  1:         Stuck Address:  Setting...Passed.
   Test  2:          Random value:  Setting...Testing...Passed.
   Test  3:        XOR comparison:  Setting...Testing...Passed.
   Test  4:        SUB comparison:  Setting...Testing...Passed.
   Test  5:        MUL comparison:  Setting...Testing...Passed.
   Test  6:        DIV comparison:  Setting...Testing...Passed.
   Test  7:         OR comparison:  Setting...Testing...Passed.
   Test  8:        AND comparison:  Setting...Testing...Passed.
   Test  9:  Sequential Increment:  Setting...Testing...Passed.
   Test 10:            Solid Bits:  Setting...Passed.
   Test 11:      Block Sequential:  Setting...Passed.
   Test 12:          Checkerboard:  Setting...Passed.
   Test 13:            Bit Spread:  Setting...Passed.
   Test 14:              Bit Flip:  Setting...Passed.
   Test 15:          Walking Ones:  Setting...Passed.
   Test 16:        Walking Zeroes:  Setting...Passed.
Run    1 completed in 1476 seconds (0 tests showed errors).
munlock'ed memory.
1 runs completed.  0 errors detected.  Total runtime:  1476 seconds.





