Re: XFS on RBD crash - Dave Chinner

public inbox for linux-xfs@vger.kernel.org

From: Dave Chinner <david@fromorbit.com>
To: Alex Gorbachev <ag@iss-integration.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: XFS on RBD crash
Date: Tue, 12 Dec 2017 08:06:07 +1100
Message-ID: <20171211210607.GS5858@dastard>
In-Reply-To: <CADb94503vRdng4T_sTGimZPWhmZE4D9BAC0_OKCGDBAc06PMVw@mail.gmail.com>

On Sat, Dec 09, 2017 at 04:01:34PM -0500, Alex Gorbachev wrote:
> I have experienced a crash today (in a sense of filesystem going
> offline) of a 25TB XFS filesystem.  Tried searching the list and
> google, and not much specific info I can use, so very much appreciate
> any insight:

Freespace tree corruption. No idea what the cause might have been.

> System: Ubuntu 16.04, kernel 4.10.17-041017-generic
> 
> Mount info:
> 
> /dev/rbd0 on /srv/exports/sclun63 type xfs
> (rw,relatime,attr2,inode64,logbsize=256k,sunit=8192,swidth=8192,noquota)
                                           ^^^^^^^^^^^^^^^^^^^^^^
Why the stripe unit/width settings?
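
For reference, the marked values can be converted to bytes. This is a minimal sketch, assuming the sunit/swidth numbers in the mount options are counts of 512-byte units; the helper name `stripe_bytes` is made up for illustration.

```python
# Assumption: sunit/swidth in the mount options are counts of 512-byte units.
SECTOR_BYTES = 512


def stripe_bytes(units: int, unit_bytes: int = SECTOR_BYTES) -> int:
    """Convert a stripe value from the mount options into bytes."""
    return units * unit_bytes


sunit_bytes = stripe_bytes(8192)   # sunit=8192 from the mount line
swidth_bytes = stripe_bytes(8192)  # swidth=8192 from the mount line

print(f"sunit  = {sunit_bytes // 2**20} MiB")
print(f"swidth = {swidth_bytes // 2**20} MiB")
```

Under that assumption both values come to 4 MiB, which happens to equal the default RBD object size. The thread does not confirm that this is where the settings came from.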

> xfs_repair (had to do -L):

Because the corrupted metadata was in the log, causing mount to
fail, and that's why you zeroed the log?

> root@roc01r-scd224:~# xfs_repair -L /dev/rbd0
> Phase 1 - find and verify superblock...
>         - reporting progress in intervals of 15 minutes
> Phase 2 - using internal log
>         - zero log...
> ALERT: The filesystem has valuable metadata changes in a log which is being
> destroyed because the -L option was used.
>         - scan filesystem freespace and inode maps...
> freeblk count 3 != flcount 4 in ag 47

Those are from trashing the log, I think.

> sb_ifree 667, counted 615
> sb_fdblocks 3930698117, counted 1111093367

That's a major discrepancy - the superblock said ~16TB free, but
repair counted only ~4.5TB free (assuming 4k blocks). And the free
inode count is off, too - had you been removing files recently?
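
The free space figures can be reproduced from the block counts in the repair output. This is a sketch only: the filesystem block size is not stated in the thread, so 4 KiB is an assumption, and `blocks_to_tb` is an illustrative helper rather than part of any XFS tool.

```python
# Assumption: 4 KiB filesystem blocks (not stated in the thread).
BLOCK_BYTES = 4096


def blocks_to_tb(blocks: int, block_bytes: int = BLOCK_BYTES) -> float:
    """Convert a free-block count to decimal terabytes."""
    return blocks * block_bytes / 1e12


sb_fdblocks = 3930698117   # free blocks recorded in the superblock
counted = 1111093367       # free blocks counted by xfs_repair

sb_tb = blocks_to_tb(sb_fdblocks)
counted_tb = blocks_to_tb(counted)

print(f"superblock: {sb_tb:.2f} TB free")
print(f"counted:    {counted_tb:.2f} TB free")
print(f"difference: {sb_tb - counted_tb:.2f} TB")
print(f"free inode difference: {667 - 615}")
```

With 4 KiB blocks this works out to roughly 16.1 TB recorded versus 4.55 TB counted, a gap of about 11.5 TB. A different block size would scale all three numbers proportionally.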

>         - 15:06:10: scanning filesystem freespace - 193 of 193

Lots of AGs for a small filesystem - has this filesystem been grown
several times?
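
A rough calculation shows why 193 AGs stands out. This sketch assumes the reported "25TB" means decimal terabytes and uses the 1 TiB upper bound on XFS allocation group size. The real AG size is not given in the thread, so only an average can be estimated.

```python
import math

# Assumption: "25TB" as reported means 25 * 10**12 bytes.
FS_BYTES = 25 * 10**12
AG_COUNT = 193            # from "scanning filesystem freespace - 193 of 193"
MAX_AG_BYTES = 2**40      # XFS allocation groups are at most 1 TiB

# Average size per AG (the last AG may be smaller than the rest).
avg_ag_bytes = FS_BYTES / AG_COUNT

# Fewest AGs that could cover the filesystem at the maximum AG size.
min_ags = math.ceil(FS_BYTES / MAX_AG_BYTES)

print(f"average AG size: {avg_ag_bytes / 1e9:.0f} GB")
print(f"minimum AG count at 1 TiB per AG: {min_ags}")
```

Under these assumptions the average AG is about 130 GB, while a filesystem of this size could be covered by as few as 23 AGs. Growing a filesystem adds AGs of the original size, which is consistent with the suspicion that this one started out much smaller.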

>         - 15:10:22: check for inodes claiming duplicate blocks - 896 of 896 inodes done

Only ~900 files in the filesystem? 

> No other errors in logs, Ceph or hardware.

And no error reported from xfs_repair, either. So the corruption
occurred in memory and was captured by the log, which you zeroed
to run xfs_repair.

So there's really nothing left for us to analyse and debug.....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
