public inbox for linux-xfs@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: Anisse Astier <anisse@astier.eu>
Cc: xfs@oss.sgi.com
Subject: Re: xfs_repair crashing (versions 3.1.4 and 3.1.5)
Date: Tue, 19 Apr 2011 18:27:05 +1000
Message-ID: <20110419082705.GI23985@dastard>
In-Reply-To: <BANLkTikh1i3aYgXNZut+AGT-1kz=aqv-Eg@mail.gmail.com>

On Mon, Apr 18, 2011 at 09:24:22PM +0200, Anisse Astier wrote:
> Hi,
> 
> (first of all, I'm not subscribed to the list, Please cc-me on all replies)
> 
> On an ARM NAS, using kernel 2.6.36.2 I managed to crash my root xfs partition.
> 
> xfs_repair cannot then repair this partition and is crashing itself.
> 
> # xfs_info  /dev/sda2
> meta-data=/dev/sda2              isize=256    agcount=32, agsize=7615249 blks
>          =                       sectsz=512   attr=1
> data     =                       bsize=4096   blocks=243687968, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096   ascii-ci=0
> log      =internal               bsize=4096   blocks=32768, version=1
>          =                       sectsz=512   sunit=0 blks, lazy-count=0
> realtime =none                   extsz=65536  blocks=0, rtextents=0
> 
> 
> 
> I did a SMART test to ensure the disk didn't have any bad block:
> SMART Error Log Version: 1
> No Errors Logged
> 
> SMART Self-test log structure revision number 1
> Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
> # 1  Extended offline    Completed without error       00%      8327         -
> 
> The dmesg log (on another recovery system with kernel 2.6.36-rc2) ; I
> tried to mount the system :
> [ 1003.257446] XFS mounting filesystem sda2
> [ 1003.301519] Starting XFS recovery on filesystem: sda2 (logdev: internal)
> [ 1003.303068] XFS: bad number of regions (28024) in inode log format
> [ 1003.303142] XFS: log mount/recovery failed: error 5
> [ 1003.303419] XFS: log mount failed

Something has corrupted the log....

> I then had no other choice than suppressing the log with xfs_repair -L.

Yup.

> xfs_repair crashed, but I was able to mount the filesystem(ro), but
> once I tried accessing the corrupt files, xfs would go mad:
> [13717.138896] UDF-fs: No partition found (1)
> [13717.202112] XFS mounting filesystem sda2
> [13717.274885] Ending clean XFS mount for filesystem: sda2
> [43969.970648] sshd (1039): /proc/1039/oom_adj is deprecated, please
> use /proc/1039/oom_score_adj instead.
> [107180.252602] Filesystem "sda2": corrupt dinode 805341224, (btree
> extents).  Unmount and run xfs_repair.

Quite likely: zeroing the log throws away any metadata changes it
still held, which effectively corrupts the filesystem.

.....
> directory flags set on non-directory inode 2283178100, would fix bad flags.
> bad key in bmbt root (is 73434, would reset to 74194) in inode
> 2283178100 data fork
> bad fwd (right) sibling pointer (saw 145202888 should be NULLDFSBNO)
> Segmentation fault

Hmmm. The very next line doesn't appear before the segfault, making
me think that it's the printf that is causing it to crash.

        if (check_dups == 0 &&
                cursor.level[0].right_fsbno != NULLDFSBNO)  {
                do_warn(
        _("bad fwd (right) sibling pointer (saw %llu should be NULLDFSBNO)\n"),
                        cursor.level[0].right_fsbno);

We get this line of output.

                do_warn(
        _("\tin inode %u (%s fork) bmap btree block %llu\n"),
                        XFS_AGINO_TO_INO(mp, agno, ino), forkname,
                        cursor.level[0].fsbno);

But not this one. I wonder if passing a 64bit number to a %u format
string (should be %llu) causes problems on ARM? All the variables
are valid as they are printed or accessed elsewhere in the function,
so that's the only thing I can think of without a stack trace to
tell me otherwise....
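
To make the suspected mismatch concrete, here is a minimal sketch of
the corrected format string. It is an illustration only, not the
actual xfs_repair patch: format_bmap_warning() is a hypothetical
stand-in for the do_warn() call, and it writes into a buffer so the
result can be checked. One plausible mechanism for the crash, which
remains an assumption without a stack trace, is that the ARM EABI
places 64-bit variadic arguments on 8-byte boundaries, so a %u that
consumes only 32 bits can leave the following %s reading a
non-pointer word.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of xfs_repair: formats the same
 * message as the second do_warn() call into a caller-supplied buffer.
 *
 * The inode number is 64 bits wide, so it is printed with %llu.
 * Printing it with %u is undefined behaviour in C, because the
 * conversion specifier no longer matches the argument type.
 */
int format_bmap_warning(char *buf, size_t len,
                        unsigned long long ino,
                        const char *forkname,
                        unsigned long long fsbno)
{
        /*
         * Mismatched form, for comparison only (do not use):
         *   "\tin inode %u (%s fork) bmap btree block %llu\n"
         * With a 64-bit ino, %s may then pick up part of the inode
         * number, or padding, instead of the forkname pointer.
         */
        return snprintf(buf, len,
                        "\tin inode %llu (%s fork) bmap btree block %llu\n",
                        ino, forkname, fsbno);
}
```

In real code the arguments would typically be cast explicitly, e.g.
(unsigned long long)XFS_AGINO_TO_INO(mp, agno, ino), so that the
format stays correct whatever the underlying typedef is on a given
architecture.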

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
