Random crashes (was Re: Crashes and lockups in XFS filesystem (2.6.8-rc4).)

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: David Martinez Moreno <ender@debian.org>
To: linux-kernel@vger.kernel.org, Nathan Scott <nathans@sgi.com>
Cc: ender@debian.org
Subject: Random crashes (was Re: Crashes and lockups in XFS filesystem (2.6.8-rc4).)
Date: Thu, 19 Aug 2004 19:35:09 +0200	[thread overview]
Message-ID: <200408191935.09717.ender@debian.org> (raw)
In-Reply-To: <20040819084410.GA14750@frodo>

[-- Attachment #1: Type: text/plain, Size: 5205 bytes --]

El Jueves, 19 de Agosto de 2004 10:44, Nathan Scott escribió:
> On Wed, Aug 18, 2004 at 06:16:57PM +0200, David Martinez Moreno wrote:
> > 	Hello, I am getting persistent lockups that could be IMHO XFS-related. I
> > created a fresh XFS filesystem in a SCSI disk, with xfsprogs version
> > 2.6.18.
> >
> > 	Mounted /dev/sda1 under /mnt, after that, I have been copying lots of
> > files from /dev/md0, then run a find blabla -exec rm \{\{ \; over /mnt
> > and then voilà! the lockup:
>
> Did /mnt run out of space while doing that?  Or nearly?  There's
> a known issue with that area of the XFS code, in conjunction with
> 4K stacks at the moment - was that enabled in your .config?
>
> Looks like something stamped on parts of the xfs_mount structure
> for the filesystem mounted at /mnt, a stack overrun would explain
> that and your subsequent oopsen.

	Hello, Nathan. I am here at the University now, and yes, the kernel were 
compiled with 4K stacks. I rebuilt it with 8K and rebooted:

ulises:~# zcat /proc/config.gz |grep STACKS
# CONFIG_4KSTACKS is not set

	I ran a disk test and waited.

	After an hour, the system crashed. I forgot to plug in a console, so rebooted 
and waited again. And here it is:

------------[ cut here ]------------
kernel BUG at mm/vmscan.c:565!
invalid operand: 0000 [#1]
CPU:    0
EIP:    0060:[<c01307f1>]    Not tainted
EFLAGS: 00010046   (2.6.8.1)
EIP is at shrink_cache+0x7c/0x2c7
eax: 00000000   ebx: c0415d24   ecx: c12485f8   edx: c12485f8
esi: 0000001d   edi: c0415d48   ebp: 0000001c   esp: c15bfeb0
ds: 007b   es: 007b   ss: 0068
Process kswapd0 (pid: 10, threadinfo=c15be000 task=c15720b0)
Stack: c15ab080 00000002 c15be000 00000020 c10444f8 c10f8b18 00000000 00000001
       00000286 c15c0130 c15bfee4 c012ddd8 00000000 00000286 00000000 00000000
       00000000 00000000 c15bfef8 c15bfef8 00000000 00000000 00000000 dff28180
Call Trace:
 [<c012ddd8>] drain_cpu_caches+0x40/0x42
 [<c0130085>] shrink_slab+0x7b/0x186
 [<c0130f19>] shrink_zone+0x9e/0xb8
 [<c01312d6>] balance_pgdat+0x1c9/0x22d
 [<c0131401>] kswapd+0xc7/0xd7
 [<c0112dac>] autoremove_wake_function+0x0/0x57
 [<c0112dac>] autoremove_wake_function+0x0/0x57
 [<c013133a>] kswapd+0x0/0xd7
 [<c0101fdd>] kernel_thread_helper+0x5/0xb
Code: 0f 0b 35 02 02 81 3c c0 8b 51 04 8b 01 89 50 04 89 02 c7 41
 <1>Unable to handle kernel NULL pointer dereference at virtual address 
00000004
 printing eip:
c012e3c7
*pde = 00000000
Oops: 0000 [#2]
CPU:    0
EIP:    0060:[<c012e3c7>]    Not tainted
EFLAGS: 00010006   (2.6.8.1)
EIP is at free_block+0x43/0xcb
eax: 00800000   ebx: 00000000   ecx: 00000000   edx: c1000000
esi: c0415d48   edi: 00000000   ebp: c0415d54   esp: c15bfb20
ds: 007b   es: 007b   ss: 0068
Process kswapd0 (pid: 10, threadinfo=c15be000 task=c15720b0)
Stack: c022b4c2 dffe0090 c0415d64 2002002c d242f800 00000286 c1401480 c012e499
       c0415d48 c1161948 2002002c c1161948 c1161938 d242f800 00000286 0000062a
       c012e7a6 c0415d48 c1161938 c1a4ce80 dcdab000 00000004 c0338237 d242f800
Call Trace:
 [<c022b4c2>] memmove+0x4d/0x4f
 [<c012e499>] cache_flusharray+0x4a/0xb6
 [<c012e7a6>] kfree+0x5e/0x62
 [<c0338237>] kfree_skbmem+0x13/0x2c
 [<c03382b8>] __kfree_skb+0x68/0xdd
 [<c03591aa>] tcp_clean_rtx_queue+0x12f/0x3a0
 [<c0359a53>] tcp_ack+0xb4/0x560
 [<c035bc4d>] __tcp_data_snd_check+0xd3/0xe2
 [<c035c454>] tcp_rcv_established+0x419/0x84a
 [<c036470d>] tcp_v4_do_rcv+0x117/0x11c
 [<c0364d51>] tcp_v4_rcv+0x63f/0x884
 [<c025b3a5>] scrup+0xe3/0xf7
 [<c034badf>] ip_local_deliver+0xa3/0x1a2
 [<c034bec4>] ip_rcv+0x2e6/0x3fe
 [<c033d335>] netif_receive_skb+0x14b/0x17e
 [<c033d3dd>] process_backlog+0x75/0xf6
 [<c033d4c8>] net_rx_action+0x6a/0xe2
 [<c0117d52>] __do_softirq+0x7e/0x80
 [<c0117d7a>] do_softirq+0x26/0x28
 [<c0105e79>] do_IRQ+0xc4/0xdf
 [<c0104e82>] do_invalid_op+0x0/0xcb
 [<c0104460>] common_interrupt+0x18/0x20
 [<c0104e82>] do_invalid_op+0x0/0xcb
 [<c0104b61>] die+0x7a/0xcd
 [<c0104f4b>] do_invalid_op+0xc9/0xcb
 [<c01307f1>] shrink_cache+0x7c/0x2c7
 [<c02217da>] vn_purge+0x108/0x116
 [<c010451d>] error_code+0x2d/0x38
 [<c01307f1>] shrink_cache+0x7c/0x2c7
 [<c012ddd8>] drain_cpu_caches+0x40/0x42
 [<c0130085>] shrink_slab+0x7b/0x186
 [<c0130f19>] shrink_zone+0x9e/0xb8
 [<c01312d6>] balance_pgdat+0x1c9/0x22d
 [<c0131401>] kswapd+0xc7/0xd7
 [<c0112dac>] autoremove_wake_function+0x0/0x57
 [<c0112dac>] autoremove_wake_function+0x0/0x57
 [<c013133a>] kswapd+0x0/0xd7
 [<c0101fdd>] kernel_thread_helper+0x5/0xb
Code: 8b 53 04 8b 03 89 50 04 89 02 c7 43 04 00 02 20 00 2b 4b 0c
 <0>Kernel panic: Fatal exception in interrupt
In interrupt handler - not syncing

	Last time I seem to remember some function related to ext3, so this is more a 
data/stack corruption than others, what do you think?

	I attach current dmesg and config, if it helps.

	Is there anything else I could do? Patches, other trees, special 
configurations? This machine seems impossible to stabilize.

	The machine is using SiI libata driver, if it could have some problem 
related...

	Many thanks in advance,


		Ender.
-- 
 Why is a cow? Mu. (Ommmmmmmmmm)

[-- Attachment #2: dmesg.gz --]
[-- Type: application/x-gzip, Size: 5228 bytes --]

[-- Attachment #3: config.gz --]
[-- Type: application/x-gzip, Size: 6233 bytes --]

next prev parent reply	other threads:[~2004-08-19 17:40 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-08-18 16:16 Crashes and lockups in XFS filesystem (2.6.8-rc4) David Martinez Moreno
2004-08-19  8:44 ` Nathan Scott
2004-08-19  8:39   ` David Martínez Moreno
2004-08-19 17:35   ` David Martinez Moreno [this message]
     [not found]   ` <200408192127.58996.ender@debian.org>
2004-08-20  8:47     ` Crashes and lockups in XFS filesystem (2.6.8-rc4 and 2.6.8.1-mm2) Nathan Scott

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200408191935.09717.ender@debian.org \
    --to=ender@debian.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nathans@sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox