Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Chris Boot <bootc@bootc.net>
To: linux-kernel@vger.kernel.org
Subject: Re: Panic with XFS on RHEL5 (2.6.18-8.1.8.el5)
Date: Sat, 18 Aug 2007 17:28:11 +0100	[thread overview]
Message-ID: <46C71E1B.40702@bootc.net> (raw)
In-Reply-To: <yw1xmywokfik.fsf@thrashbarg.mansr.com>

Måns Rullgård wrote:
> Chris Boot <bootc@bootc.net> writes:
>
>   
>> Måns Rullgård wrote:
>>     
>>> Chris Boot <chris.boot@abacustree.com> writes:
>>>
>>>
>>>       
>>>> All,
>>>>
>>>> I've got a box running RHEL5 and haven't been impressed by ext3
>>>> performance on it (running of a 1.5TB HP MSA20 using the cciss
>>>> driver). I compiled XFS as a module and tried it out since I'm used to
>>>> using it on Debian, which runs much more efficiently. However, every
>>>> so often the kernel panics as below. Apologies for the tainted kernel,
>>>> but we run VMware Server on the box as well.
>>>>
>>>> Does anyone have any hits/tips for using XFS on Red Hat? What's
>>>> causing the panic below, and is there a way around this?
>>>>
>>>> BUG: unable to handle kernel paging request at virtual address b8af9d60
>>>> printing eip:
>>>> c0415974
>>>> *pde = 00000000
>>>> Oops: 0000 [#1]
>>>> SMP last sysfs file: /block/loop7/dev
>>>>         
> [...]
>   
>>>> [<f936884e>] xfsbufd_wakeup+0x28/0x49 [xfs]
>>>> [<c04572f9>] shrink_slab+0x56/0x13c
>>>> [<c0457c0c>] try_to_free_pages+0x162/0x23e
>>>> [<c0454064>] __alloc_pages+0x18d/0x27e
>>>> [<c045214e>] find_or_create_page+0x53/0x8c
>>>> [<c046c7b1>] __getblk+0x162/0x270
>>>> [<c0475be0>] do_lookup+0x53/0x157
>>>> [<f889138f>] ext3_getblk+0x7c/0x233 [ext3]
>>>> [<f88913fe>] ext3_getblk+0xeb/0x233 [ext3]
>>>> [<c048215c>] mntput_no_expire+0x11/0x6a
>>>> [<f889226e>] ext3_bread+0x13/0x69 [ext3]
>>>> [<f8895606>] htree_dirblock_to_tree+0x22/0x113 [ext3]
>>>> [<f889574f>] ext3_htree_fill_tree+0x58/0x1a0 [ext3]
>>>> [<c047828b>] do_path_lookup+0x20e/0x25f
>>>> [<c046b987>] get_empty_filp+0x99/0x15e
>>>> [<f889d611>] ext3_permission+0x0/0xa [ext3]
>>>> [<f888eaa3>] ext3_readdir+0x1ce/0x59b [ext3]
>>>> [<c047a0dd>] filldir+0x0/0xb9
>>>> [<c0472973>] sys_fstat64+0x1e/0x23
>>>> [<c047a1f9>] vfs_readdir+0x63/0x8d
>>>> [<c047a0dd>] filldir+0x0/0xb9
>>>> [<c047a447>] sys_getdents+0x5f/0x9c
>>>> [<c0403eff>] syscall_call+0x7/0xb
>>>> =======================
>>>>
>>>>         
>>> Your Redhat kernel is probably built with 4k stacks and XFS+loop+ext3
>>> seems to be enough to overflow it.
>>>
>>>       
>> Thanks, that explains a lot. However, I don't have any XFS filesystems
>> mounted over loop devices on ext3. Earlier in the day I had iso9660 on
>> loop on xfs, could that have caused the issue? It was unmounted and
>> deleted when this panic occurred.
>>     
>
> The mention of /block/loop7/dev and the presence both XFS and ext3
> function in the call stack suggested to me that you might have an ext3
> filesystem in a loop device on XFS.  I see no other explanation for
> that call stack other than a stack overflow, but then we're still back
> at the same root cause.
>
> Are you using device-mapper and/or md?  They too are known to blow 4k
> stacks when used with XFS.
>   

I am. The situation was earlier on was iso9660 on loop on xfs on lvm on 
cciss. I guess that might have smashed the stack undetectably and 
induced corruption encountered later on? When I experienced this panic 
the machine would have probably been performing a backup, which was 
simply a load of ext3/xfs filesystems on lvm on the HP cciss controller. 
None of the loop devices would have been mounted.

I have a few machines now with 4k stacks and using lvm + md + xfs and 
have no trouble at all, but none are Red Hat (all Debian) and none use 
cciss either. Maybe it's a deadly combination.

>> I'll probably just try and recompile the kernel with 8k stacks and see
>> how it goes. Screw the support, we're unlikely to get it anyway. :-P
>>     
>
> Please report how this works out.
>   

I will. This will probably be on Monday now, since the machine isn't 
accepting SysRq requests over the serial console. :-(

Many thanks,
Chris

next prev parent reply	other threads:[~2007-08-18 16:28 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-08-18  9:52 Panic with XFS on RHEL5 (2.6.18-8.1.8.el5) Chris Boot
2007-08-18 12:31 ` Måns Rullgård
2007-08-18 12:53   ` Jan Engelhardt
2007-08-18 14:17   ` Chris Boot
2007-08-18 15:51     ` Måns Rullgård
2007-08-18 16:28       ` Chris Boot [this message]
2007-08-18 16:40         ` Jan Engelhardt
2007-08-20 16:53         ` Chris Boot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=46C71E1B.40702@bootc.net \
    --to=bootc@bootc.net \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.