From: Dave Chinner <david@fromorbit.com>
To: Anisse Astier <anisse@astier.eu>
Cc: xfs@oss.sgi.com
Subject: Re: xfs_repair crashing (versions 3.1.4 and 3.1.5)
Date: Tue, 19 Apr 2011 18:27:05 +1000
Message-ID: <20110419082705.GI23985@dastard>
In-Reply-To: <BANLkTikh1i3aYgXNZut+AGT-1kz=aqv-Eg@mail.gmail.com>

On Mon, Apr 18, 2011 at 09:24:22PM +0200, Anisse Astier wrote:
> Hi,
> 
> (first of all, I'm not subscribed to the list, Please cc-me on all replies)
> 
> On an ARM NAS, using kernel 2.6.36.2 I managed to crash my root xfs partition.
> 
> xfs_repair cannot then repair this partition and is crashing itself.
> 
> # xfs_info  /dev/sda2
> meta-data=/dev/sda2              isize=256    agcount=32, agsize=7615249 blks
>          =                       sectsz=512   attr=1
> data     =                       bsize=4096   blocks=243687968, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096   ascii-ci=0
> log      =internal               bsize=4096   blocks=32768, version=1
>          =                       sectsz=512   sunit=0 blks, lazy-count=0
> realtime =none                   extsz=65536  blocks=0, rtextents=0
> 
> 
> 
> I did a SMART test to ensure the disk didn't have any bad block:
> SMART Error Log Version: 1
> No Errors Logged
> 
> SMART Self-test log structure revision number 1
> Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
> # 1  Extended offline    Completed without error       00%      8327         -
> 
> The dmesg log (on another recovery system with kernel 2.6.36-rc2) ; I
> tried to mount the system :
> [ 1003.257446] XFS mounting filesystem sda2
> [ 1003.301519] Starting XFS recovery on filesystem: sda2 (logdev: internal)
> [ 1003.303068] XFS: bad number of regions (28024) in inode log format
> [ 1003.303142] XFS: log mount/recovery failed: error 5
> [ 1003.303419] XFS: log mount failed

Something has corrupted the log....

> I then had no other choice than suppressing the log with xfs_repair -L.

Yup.

> xfs_repair crashed, but I was able to mount the filesystem(ro), but
> once I tried accessing the corrupt files, xfs would go mad:
> [13717.138896] UDF-fs: No partition found (1)
> [13717.202112] XFS mounting filesystem sda2
> [13717.274885] Ending clean XFS mount for filesystem: sda2
> [43969.970648] sshd (1039): /proc/1039/oom_adj is deprecated, please
> use /proc/1039/oom_score_adj instead.
> [107180.252602] Filesystem "sda2": corrupt dinode 805341224, (btree
> extents).  Unmount and run xfs_repair.

Quite likely, zeroing the log effectively corrupts the filesystem.

.....
> directory flags set on non-directory inode 2283178100, would fix bad flags.
> bad key in bmbt root (is 73434, would reset to 74194) in inode
> 2283178100 data fork
> bad fwd (right) sibling pointer (saw 145202888 should be NULLDFSBNO)
> Segmentation fault

Hmmm. The very next line doesn't appear before the segfault, making
me think that it's the printf that is causing it to crash.

        if (check_dups == 0 &&
                cursor.level[0].right_fsbno != NULLDFSBNO)  {
                do_warn(
        _("bad fwd (right) sibling pointer (saw %llu should be NULLDFSBNO)\n"),
                        cursor.level[0].right_fsbno);

We get this line of output.

                do_warn(
        _("\tin inode %u (%s fork) bmap btree block %llu\n"),
                        XFS_AGINO_TO_INO(mp, agno, ino), forkname,
                        cursor.level[0].fsbno);

But not this one. I wonder if passing a 64bit number to a %u format
string (should be %llu) causes problems on ARM? All the variables
are valid as they are printed or accessed elsewhere in the function,
so that's the only thing I can think of without a stack trace to
tell me otherwise....
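
A minimal sketch of a format-safe version of that second message, not
the actual xfs_repair fix. It assumes the inode number and the block
number are both 64-bit values and prints them with PRIu64 from
<inttypes.h>, so each conversion consumes exactly the argument it was
given. The helper name format_bmap_location and the use of snprintf in
place of do_warn are hypothetical, chosen only so the snippet stands
alone. On 32-bit ARM a mismatched %u may leave the following %s reading
the wrong argument slot, which would fit the crash above, but that
remains a guess without a stack trace.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of xfs_repair: formats the
 * "in inode ... bmap btree block ..." line into buf.
 *
 * Both numbers are taken as uint64_t and printed with PRIu64, so the
 * conversion specifiers always match the width of the arguments,
 * whatever the platform's int and long sizes are.
 *
 * Returns what snprintf returns: the length the full line needs,
 * excluding the terminating NUL.
 */
int format_bmap_location(char *buf, size_t len, uint64_t ino,
                         const char *forkname, uint64_t fsbno)
{
        assert(buf != NULL && forkname != NULL);

        return snprintf(buf, len,
                        "\tin inode %" PRIu64 " (%s fork) bmap btree block %" PRIu64 "\n",
                        ino, forkname, fsbno);
}
```

With illustrative values, format_bmap_location(buf, sizeof(buf),
2283178100ULL, "data", 145202888ULL) fills buf with the whole line.
Casting the argument to unsigned long long and using %llu would work
equally well.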

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
