From: David Chinner <dgc@sgi.com>
To: Denys Fedoryshchenko <denys@visp.net.lb>
Cc: linux-kernel@vger.kernel.org, xfs@oss.sgi.com
Subject: Re: 2.6.25 released with bug, which leads to XFS crash?
Date: Fri, 18 Apr 2008 10:06:57 +1000
Message-ID: <20080418000657.GC108924158@sgi.com>
In-Reply-To: <200804170949.36864.denys@visp.net.lb>

On Thu, Apr 17, 2008 at 09:49:36AM +0300, Denys Fedoryshchenko wrote:
> Hi again
>
> I reported about http://bugzilla.kernel.org/show_bug.cgi?id=10421 , and it
> is triggerable on different loaded servers with XFS (squid with aufs),
> just it is happening even on heavy load after 1-2 days. IMHO such bugs is
> critical (same as getting kernel panic, and etc),

Well, yes, and we treat shutdown bugs as such. A filesystem shutdown
is effectively a filesystem panic and is indicative of either a
corruption or a bug. The reality is that it takes time to triage
such a problem that only occurs on one workload on one set of
identical machines once every day or two. This does not make the
problem a release blocker, though.

The other side of it is that problems like this in Linux are often
the result of a bug in a lower layer and not XFS itself. Given this
particular problem seems to be memory corruption it could be anything
that is causing it....

> cause they are unrecoverable, causing minor filesystem corruption, and only
> way to fix them - wakeup sysadmin. Worst thing, it is happening at night,
> when i restart squid, and probably it is doing aggressive unlinking stale
> cache entries. It doesn't do panic, or even oops, but filesystem will be
> disconnected, and squid will remain in loop trying to restart. Sure it is
> easy to restart it, but maybe it has to be OOPS? so at least i can do
> sysctl -w kernel.panic_on_oops=1, and FS will be recovered on reboot.

Rather than fearmongering, perhaps you should ask on the XFS list
(xfs@oss.sgi.com) whether anything like this can be done. Then you
might have learnt about Documentation/filesystems/xfs.txt and
/proc/sys/fs/xfs/panic_mask:

  fs.xfs.panic_mask             (Min: 0  Default: 0  Max: 127)
        Causes certain error conditions to call BUG(). Value is a bitmask;
        AND together the tags which represent errors which should cause panics:
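As a rough illustration of how such a mask might be built: the tag list is not reproduced in this message, so the names and values below are assumptions recalled from the XFS error headers of that era and should be checked against Documentation/filesystems/xfs.txt for the kernel in use. Despite the "AND together" wording, the individual bits are combined with a bitwise OR. The helper names panic_mask() and sysctl_line() are hypothetical and exist only for this sketch.

```python
# ASSUMPTION: tag names and bit values are recalled from 2.6-era XFS sources,
# not quoted in this thread. Verify them against your kernel's
# Documentation/filesystems/xfs.txt before relying on them.
XFS_PTAGS = {
    "XFS_PTAG_IFLUSH": 0x01,
    "XFS_PTAG_LOGRES": 0x02,
    "XFS_PTAG_AILDELETE": 0x04,
    "XFS_PTAG_ERROR_REPORT": 0x08,
    "XFS_PTAG_SHUTDOWN_CORRUPT": 0x10,
    "XFS_PTAG_SHUTDOWN_IOERROR": 0x20,
    "XFS_PTAG_SHUTDOWN_LOGERROR": 0x40,
}


def panic_mask(tags):
    """Combine the named tags into a single bitmask using bitwise OR."""
    mask = 0
    for tag in tags:
        mask |= XFS_PTAGS[tag]
    return mask


def sysctl_line(tags):
    """Format the mask as a key=value setting for fs.xfs.panic_mask."""
    return f"fs.xfs.panic_mask={panic_mask(tags)}"


if __name__ == "__main__":
    # Hypothetical choice: call BUG() on every kind of filesystem shutdown.
    shutdown_tags = [
        "XFS_PTAG_SHUTDOWN_CORRUPT",
        "XFS_PTAG_SHUTDOWN_IOERROR",
        "XFS_PTAG_SHUTDOWN_LOGERROR",
    ]
    print(sysctl_line(shutdown_tags))
```

The printed setting could then be passed to sysctl -w, or the number written to /proc/sys/fs/xfs/panic_mask. With all seven assumed tags set the mask is 127, which matches the documented maximum.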
> Just want to warn people who is using XFS on loaded servers to keep
> attention while using 2.6.25, and if you face same bug, report to bugzilla.

Actually, I'd much prefer XFS bug reports to go to xfs@oss.sgi.com
rather than the kernel bugzilla - that way most of the XFS community
will see the bug report and the triage being done and then there's
no need for spamming lkml like this....

Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group