From: Dave Chinner <david@fromorbit.com>
To: Juerg Haefliger <juergh@gmail.com>
Cc: xfs@oss.sgi.com
Subject: Re: Daily crash in xfs_cmn_err
Date: Mon, 29 Oct 2012 23:53:30 +1100
Message-ID: <20121029125330.GS29378@dastard>
In-Reply-To: <CADLDEKtkwaitKCsU21JnsesM70H5AwkEFQFcdmOHE0JU9Oa8nw@mail.gmail.com>

On Mon, Oct 29, 2012 at 11:55:15AM +0100, Juerg Haefliger wrote:
> Hi,
> 
> I have a node that used to crash every day at 6:25am in xfs_cmn_err
> (Null pointer dereference).

Stack trace, please.

> 1) I was under the impression that during the mounting of an XFS
> volume some sort of check/repair is performed.  How does that differ
> from running xfs_check and/or xfs_repair?

Journal recovery is performed at mount time, not a consistency
check.

http://en.wikipedia.org/wiki/Filesystem_journaling

> 2) Any ideas how the filesystem might have gotten into this state? I
> don't have the history of that node but it's possible that it crashed
> previously due to an unrelated problem. Could this have left the
> filesystem in this state?

<shrug>

How long is a piece of string?

> 3) What exactly does the output of the xfs_check mean? How serious is
> it? Are those warnings or errors? Will some of them get cleaned up
> during the mounting of the filesystem?

xfs_check is deprecated.  The output of xfs_repair indicates
cross-linked extent indexes. These will only be properly detected and
fixed by xfs_repair. And "fixed" may mean corrupt files are removed
from the filesystem - repair does not guarantee that your data is
preserved or consistent after it runs, just that the filesystem is
consistent and error free.

> 4) We have a whole bunch of production nodes running the same kernel.
> I'm more than a little concerned that we might have a ticking timebomb
> with some filesystems being in a state that might trigger a crash
> eventually. Is there any way to perform a live check on a mounted
> filesystem so that I can get an idea of how big of a problem we have
> (if any)?

Read the xfs_repair man page?

-n     No modify mode. Specifies that xfs_repair should not
       modify the filesystem but should only scan the  filesystem
       and indicate what repairs would have been made.
.....

-d     Repair dangerously. Allow xfs_repair to repair an XFS
       filesystem mounted read only. This is typically done on a
       root filesystem from single user mode, immediately followed by
       a reboot.

So, remount the filesystem read only and run xfs_repair -d -n, which
will check the filesystem as best as can be done online. If there are
any problems, then you can repair them and immediately reboot.
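
The remount-then-check sequence described above could be scripted
roughly as follows, which may help when the same check has to be run
on many nodes. This is only a sketch under stated assumptions: the
mount point /srv/data and device /dev/sdb1 are hypothetical
placeholders, the function names are made up for illustration, and it
relies on the exit status documented in xfs_repair(8), where a run with
-n exits 1 if corruption was detected and 0 otherwise. It defaults to a
dry run that only prints the commands.

```python
import subprocess
from typing import List


def readonly_check_commands(mountpoint: str, device: str) -> List[List[str]]:
    """Build the two commands: remount read only, then a no-modify check."""
    return [
        # Remount the filesystem read only so xfs_repair -d will accept it.
        ["mount", "-o", "remount,ro", mountpoint],
        # -d: allow running on a read-only mounted filesystem.
        # -n: no modify mode, only report what would be repaired.
        ["xfs_repair", "-d", "-n", device],
    ]


def run_readonly_check(mountpoint: str, device: str, dry_run: bool = True) -> int:
    """Run (or, by default, just print) the check; return xfs_repair's status."""
    remount, check = readonly_check_commands(mountpoint, device)
    if dry_run:
        print(" ".join(remount))
        print(" ".join(check))
        return 0
    # Abort if the remount fails; checking a read-write mount is not intended.
    subprocess.run(remount, check=True)
    # No check=True here: a non-zero status is the signal we want to return.
    return subprocess.run(check).returncode


if __name__ == "__main__":
    # Hypothetical mount point and device; substitute the real ones.
    status = run_readonly_check("/srv/data", "/dev/sdb1", dry_run=True)
    print("exit status:", status)
```

A non-zero status from the real run would be the cue to schedule an
actual repair followed by a reboot, as described above.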

> I don't claim to know exactly what I'm doing but I picked a
> node, froze the filesystem and then ran a modified xfs_check (which
> bypasses the is_mounted check and ignores non-committed metadata) and
> it did report some issues. At this point I believe those are false
> positive. Do you have any suggestions short of rebooting the nodes and
> running xfs_check on the unmounted filesystem?

Don't bother with xfs_check. xfs_repair will detect all the same
errors (and more) and can fix them at the same time.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
