public inbox for linux-kernel@vger.kernel.org
* 2.6.9 Oops: Major problems with XFS and ext3 (VFS related?)
@ 2004-11-24 17:12 Anders Saaby
  2004-11-25 22:35 ` Nathan Scott
From: Anders Saaby @ 2004-11-24 17:12 UTC
  To: linux-kernel; +Cc: linux-xfs

Hi Lists, (XFS list CC'ed)

We are encountering what looks like a race on both ext3 and XFS on a high-load 
mailserver.

Here is the situation:
We have a high-load mailserver serving IMAP from Maildirs. We originally had
the Maildirs on ext3, but the kernel Oopsed roughly every 20 hours (Oops
included below). We then moved the Maildirs to XFS, thinking the problems were
history, but now we get a somewhat similar error from XFS (also included). They
both look like a race to me, but I am not able to get more out of it.

System: IBM Dual Xeon P4 - IBM ips raidcontroller (raid 0+1) ~150G.
Kernel: Linux 2.6.9 SMP

So, bottom line: both ext3 and XFS cause crashes. Comments, anyone? We are
desperate.

Here is what XFS says:
<SNIP>
Filesystem "sdb1": xfs_trans_delete_ail: attempting to delete a log item that 
is not in the AIL
xfs_force_shutdown(sdb1,0x8) called from line 382 of file 
fs/xfs/xfs_trans_ail.c.  Return address = 0xc0216a56
@Linux version 2.6.9 (root@mail1.domain.tld) (gcc version 2.96 20000731 (Red 
Hat Linux 7.3 2.96-113)) #1 SMP Tue Oct 19 16:04:55 CEST 2004
</SNIP>

Here is what ext3 says:
<SNIP>
Unable to handle kernel NULL pointer dereference at virtual address 0000000c
printing eip:
c018b2f5
*pde = 00000000
Oops: 0002 [#1]
SMP
Modules linked in: nfs e1000 iptable_nat rtc
CPU:    2
EIP:    0060:[<c018b2f5>]    Not tainted VLI
EFLAGS: 00010286   (2.6.9)
EIP is at journal_commit_transaction+0x545/0x11b0
eax: d971826c   ebx: 00000000   ecx: e489eefc   edx: 00000014
esi: d971826c   edi: f7406000   ebp: ea0a6f80   esp: f7407d8c
ds: 007b   es: 007b   ss: 0068
Process kjournald (pid: 177, threadinfo=f7406000 task=f7df63b0)
Stack: 03afe6b2 c2157478 f7407e40 f7406000 c2157414 00000000 00000000 00000000
       00000000 00000000 e489ebfc cd61056c 000010e8 01c2bf60 c040e020 00000000
       f7406000 0000001e f7407e1c c0412f80 00000008 f7407e5c c01134e3 f7407e1c
Call Trace:
 [<c01134e3>] find_busiest_group+0xf3/0x300
 [<c0113799>] find_busiest_queue+0xa9/0xd0
 [<c0115620>] autoremove_wake_function+0x0/0x40
 [<c0115620>] autoremove_wake_function+0x0/0x40
 [<c018e0e1>] kjournald+0xc1/0x230
 [<c0115620>] autoremove_wake_function+0x0/0x40
 [<c0112ba3>] finish_task_switch+0x33/0x70
 [<c0115620>] autoremove_wake_function+0x0/0x40
 [<c0103ff6>] ret_from_fork+0x6/0x14
 [<c018e000>] commit_timeout+0x0/0x10
 [<c018e020>] kjournald+0x0/0x230
 [<c010253d>] kernel_thread_helper+0x5/0x18
Code: 00 89 f0 e8 5e e1 17 00 83 c4 14 8b 45 18 85 c0 0f 84 49 01 00 00 bf 00 
e0 ff ff 21 e7 89 f6 8d bc 27 00 00 00 00 8b 70 20 8b 1e <f0> ff 43 0c 8b 03 
83 e0 04 74 4e 8b 94 24
 e8 01 00 00 8d 82 c0
</SNIP>

I will be happy to supply any info and do some testing if anyone is
interested! :-)

-- 
Med venlig hilsen - Best regards - Meilleures salutations

Anders Saaby
Systems Engineer
------------------------------------------------
Cohaesio A/S - Maglebjergvej 5D - DK-2800 Lyngby
Phone: +45 45 880 888 - Fax: +45 45 880 777
Mail: as@cohaesio.com - http://www.cohaesio.com
------------------------------------------------


* Re: 2.6.9 Oops: Major problems with XFS and ext3 (VFS related?)
  2004-11-24 17:12 2.6.9 Oops: Major problems with XFS and ext3 (VFS related?) Anders Saaby
@ 2004-11-25 22:35 ` Nathan Scott
From: Nathan Scott @ 2004-11-25 22:35 UTC
  To: Anders Saaby; +Cc: linux-kernel, linux-xfs

On Wed, Nov 24, 2004 at 06:12:33PM +0100, Anders Saaby wrote:
> Hi Lists, (XFS list CC'ed)

Hi there,

> Here is the situation:
> We have a high-load mailserver serving IMAP from Maildirs. We originally had
> the Maildirs on ext3, but the kernel Oopsed roughly every 20 hours (Oops
> included below). We then moved the Maildirs to XFS, thinking the problems were
> history, but now we get a somewhat similar error from XFS (also included). They
> both look like a race to me, but I am not able to get more out of it.
> ...
> Here is what XFS says:
> <SNIP>
> Filesystem "sdb1": xfs_trans_delete_ail: attempting to delete a log item that 
> is not in the AIL
> xfs_force_shutdown(sdb1,0x8) called from line 382 of file 
> fs/xfs/xfs_trans_ail.c.  Return address = 0xc0216a56
> @Linux version 2.6.9 (root@mail1.domain.tld) (gcc version 2.96 20000731 (Red 
> Hat Linux 7.3 2.96-113)) #1 SMP Tue Oct 19 16:04:55 CEST 2004
> ...
> I will be happy to supply any info and do some testing if anyone is
> interested! :-)

Yep, very interested.  So, "serving IMAP from Maildirs" - from
the filesystem's perspective, can you describe that in detail for
me?  I would guess that means a shallow directory tree with quite
large directories (how large?) and many (how many?) small files
(how small on average?).  How frequently are files added/removed?

Is this easily reproducible for you?  If so, can you send me
enough details that I can try to reproduce it locally?
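
The layout numbers asked about above (tree depth, directory sizes, file
counts, average file size) could be collected with something like the
rough sketch below. It is only an illustration: the names TreeStats and
summarize_tree are made up for this example, and it reports a static
snapshot, so it says nothing about how often files are added or removed.

```python
import os
from dataclasses import dataclass


@dataclass
class TreeStats:
    """Snapshot of a directory tree's shape (hypothetical helper type)."""
    directories: int = 0
    files: int = 0
    total_bytes: int = 0
    largest_dir_entries: int = 0
    max_depth: int = 0

    @property
    def mean_file_bytes(self) -> float:
        # Average file size; 0.0 for a tree with no files.
        return self.total_bytes / self.files if self.files else 0.0


def summarize_tree(root: str) -> TreeStats:
    """Walk `root` and summarize depth, directory sizes and file sizes."""
    stats = TreeStats()
    root = os.path.abspath(root)
    for dirpath, dirnames, filenames in os.walk(root):
        stats.directories += 1
        # Entries directly inside this directory (subdirectories + files).
        entries = len(dirnames) + len(filenames)
        stats.largest_dir_entries = max(stats.largest_dir_entries, entries)
        # Depth relative to the root: the root itself counts as 0.
        if dirpath != root:
            depth = len(os.path.relpath(dirpath, root).split(os.sep))
            stats.max_depth = max(stats.max_depth, depth)
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                # On a live mail spool a message may vanish mid-scan.
                continue
            stats.files += 1
            stats.total_bytes += st.st_size
    return stats
```

Calling summarize_tree() on the top of the Maildir tree (the actual
path is not given in this thread) and printing the returned fields
would answer the "how large / how many / how small" questions in one go.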

thanks.

-- 
Nathan

