From: Jakob Oestergaard <jakob@unthought.net>
To: Christoph Hellwig <hch@infradead.org>,
David Greaves <david@dgreaves.com>, Jan Kasprzak <kas@fi.muni.cz>,
linux-kernel@vger.kernel.org, kruty@fi.muni.cz
Subject: Re: XFS: inode with st_mode == 0
Date: Mon, 17 Jan 2005 11:07:46 +0100
Message-ID: <20050117100746.GI347@unthought.net>
In-Reply-To: <20050116135112.GA24814@infradead.org>
On Sun, Jan 16, 2005 at 01:51:12PM +0000, Christoph Hellwig wrote:
> On Fri, Jan 14, 2005 at 07:23:09PM +0100, Jakob Oestergaard wrote:
> > So apart from the general well known instability problems that will
> > occur when you actually start *using* the system, there should be no
>
> What known instabilities?
Where should I begin? ;)
Most of the following has already been posted to LKML - primarily by
Anders (as@cohaesio.com) - and it seems that no one cares, but I'll repost
a summary that Anders sent me below:
-------
Scenario 1: Mailservers:
2.6.10 (~24-40 hours uptime):
Running ext3 on mailqueue:
<SNIP>
Unable to handle kernel NULL pointer dereference at virtual address 00000004
printing eip:
c018a095
*pde = 00000000
Oops: 0002 [#1]
SMP
Modules linked in: nfs e1000 iptable_nat ipt_connlimit rtc
CPU: 2
EIP: 0060:[<c018a095>] Not tainted
EFLAGS: 00010286 (2.6.8.1)
EIP is at journal_commit_transaction+0x535/0x10e5
eax: cac1e26c ebx: 00000000 ecx: f7cec400 edx: f7cec400
esi: f65f3000 edi: cac1e26c ebp: f65f3000 esp: f65f3dc0
ds: 007b es: 007b ss: 0068
Process kjournald (pid: 174, threadinfo=f65f3000 task=c2308b70)
Stack: f65f3e64 00000000 00000000 00000000 00000000 00000000 f7cec400 cda565fc
0000149a 00000004 f65f3e48 c01132d8 00000002 c202ad20 00000001 f65f3e5c
c202ad20 c202ad20 00000002 00000001 0000001e 01c1af60 f65f3e68 c0407dc0
Call Trace:
[<c01132d8>] scheduler_tick+0x468/0x470
[<c01127b5>] find_busiest_group+0x105/0x310
[<c011db8e>] del_timer_sync+0x7e/0xa0
[<c018cd4d>] kjournald+0xbd/0x230
[<c0114b10>] autoremove_wake_function+0x0/0x40
[<c0114b10>] autoremove_wake_function+0x0/0x40
[<c0103f16>] ret_from_fork+0x6/0x14
[<c018cc70>] commit_timeout+0x0/0x10
[<c018cc90>] kjournald+0x0/0x230
[<c01024bd>] kernel_thread_helper+0x5/0x18
Code: f0 ff 43 04 8b 03 83 e0 04 74 4c 8b 8c 24 b8 01 00 00 c6 81
<2>SoftDog: Initiating system reboot
</SNIP>
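As a reading aid for the trace above: the number on the "Oops:" line is the
x86 page-fault error code, printed in hex. A minimal sketch of decoding it
follows, assuming the standard x86 bit layout (bit 0: protection violation
vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode).
The helper name decode_oops_error_code and the returned dictionary layout
are illustrative only and do not come from the original report.

```python
# Decode the x86 page-fault error code shown on an "Oops: NNNN" line.
# Bit meanings follow the x86 architecture definition; the function name
# and return layout are hypothetical, chosen only for this illustration.

def decode_oops_error_code(code_hex: str) -> dict:
    """Return a readable breakdown of the low three error-code bits."""
    code = int(code_hex, 16)  # the kernel prints the code in hexadecimal
    return {
        "cause": "protection violation" if code & 0x1 else "page not present",
        "access": "write" if code & 0x2 else "read",
        "mode": "user" if code & 0x4 else "kernel",
    }


if __name__ == "__main__":
    # Value taken from the "Oops: 0002 [#1]" line in the trace above.
    print(decode_oops_error_code("0002"))
```

For the value 0002 this reports a kernel-mode write to a page that is not
present, which matches the "NULL pointer dereference at virtual address
00000004" message at the top of the trace.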
-------
Scenario 2: Mailservers:
Running XFS on mailqueue:
<SNIP>
Filesystem "sdb1": xfs_trans_delete_ail: attempting to delete a log item that
is not in the AIL
xfs_force_shutdown(sdb1,0x8) called from line 382 of file
fs/xfs/xfs_trans_ail.c. Return address = 0xc0216a56
@Linux version 2.6.9 (root@mail1.domain.tld) (gcc version 2.96 20000731 (Red
Hat Linux 7.3 2.96-113)) #1 SMP Tue Oct 19 16:04:55 CEST 2004
</SNIP>
=======
Resolution to the mailserver problem:
2.4.28 is perfectly stable on these machines.
-------
Scenario 3: Webservers:
2.6.10, 2.6.10-ac8 (~3-12 hours uptime):
<SNIP>
Unable to handle kernel paging request
<2>SoftDog: Initiating system reboot.
</SNIP>
(No more...) :(
=======
Resolution to the webserver problem:
2.4.28/2.4.29-rc2 are stable here
-------
Scenario 4: Storageservers:
2.6.8.1:
Oopses after ~5-10 hours with SMP on. Cannot find the actual oopses
anymore, and 2.6.8+ hasn't been tested as we cannot afford any more
downtime on these servers.
=======
Resolution to the storage server problem:
2.6.8.1 UP is stable (but oopses regularly after memory allocation
failures)
Hardware on all servers: IBM x335 and x345.
Mentioned errors seen on a total of 17 servers.
--
/ jakob