From: "Arkadiusz Miśkiewicz" <a.miskiewicz@gmail.com>
To: "Brian Foster" <bfoster@redhat.com>,
	"Arkadiusz Miśkiewicz" <a.miskiewicz@gmail.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: xfs_repair doesn't handle: br_startoff 8388608 br_startblock -2 br_blockcount 1 br_state 0 corruption
Date: Thu, 4 Mar 2021 09:54:25 +0100
Message-ID: <b3d66e9b-2223-9413-7d66-d348b63660c5@gmail.com>
In-Reply-To: <20200715114058.GB51908@bfoster>

On 15.07.2020 at 13:40, Brian Foster wrote:
> On Wed, Jul 15, 2020 at 09:05:47AM +0200, Arkadiusz Miśkiewicz wrote:
>>
>> Hello.
>>
>> xfs_repair (from for-next from about 2-3 weeks ago) doesn't seem to
>> handle this kind of corruption. Repair (run a few times) finishes just
>> fine, but the filesystem ends up with the same trace again.
>>
> 
> Are you saying that xfs_repair eventually resolves the corruption but it
> takes multiple tries, and then the corruption reoccurs at runtime? Or
> that xfs_repair doesn't ever resolve the corruption?
> 
> Either way, what does xfs_repair report?

http://ixion.pld-linux.org/~arekm/xfs/xfs-repair.txt

This is the repair I ran back in 2020 on the metadumped image (linked below).


But I also ran repair recently with xfsprogs 5.10.0

http://ixion.pld-linux.org/~arekm/xfs/xfs-repair-sdd1-20210228.txt

on the actual fs, and today it crashed:

[ 3580.278435] XFS (sdd1): xfs_dabuf_map: bno 8388608 dir: inode 36509341678
[ 3580.278436] XFS (sdd1): [00] br_startoff 8388608 br_startblock -2 br_blockcount 1 br_state 0
[ 3580.278452] XFS (sdd1): Internal error xfs_da_do_buf(1) at line 2557 of file fs/xfs/libxfs/xfs_da_btree.c.  Caller xfs_da_read_buf+0x7c/0x130 [xfs]

So 5.10.0 repair also doesn't fix it.
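
For anyone decoding those two log lines: br_startblock -2 is
HOLESTARTBLOCK, i.e. the extent is a hole, and bno 8388608 is exactly
the first block of a directory's leaf segment on a 4 KiB-block fs. A
minimal standalone sketch (my own illustration, not kernel code; the
constants mirror fs/xfs/libxfs/xfs_dir2.h, and 4 KiB blocks are an
assumption):

  #include <stdio.h>
  #include <stdint.h>

  /* Directory address space is split into 32 GiB segments:
   * data, leaf, free (see fs/xfs/libxfs/xfs_dir2.h). */
  #define XFS_DIR2_DATA_ALIGN_LOG 3
  #define XFS_DIR2_SPACE_SIZE  (1ULL << (32 + XFS_DIR2_DATA_ALIGN_LOG))
  #define XFS_DIR2_LEAF_OFFSET (1 * XFS_DIR2_SPACE_SIZE)  /* 32 GiB */
  #define HOLESTARTBLOCK       ((int64_t)-2)  /* "no block mapped here" */

  int main(void)
  {
          unsigned int blklog = 12;  /* assumed: log2 of 4096-byte blocks */

          /* 2^35 bytes / 2^12 bytes per block = 8388608 */
          printf("leaf bno   = %llu\n",
                 (unsigned long long)(XFS_DIR2_LEAF_OFFSET >> blklog));
          printf("startblock = %lld (hole)\n", (long long)HOLESTARTBLOCK);
          return 0;
  }

So the kernel asked for the first leaf/node block of the directory and
got a hole back.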

> 
>> Metadump is possible but problematic (will be huge).
>>
> 
> How huge? Will it compress?

53GB

http://ixion.pld-linux.org/~arekm/xfs/sdd1.metadump.gz


> 
>>
>> Jul  9 14:35:51 x kernel: XFS (sdd1): xfs_dabuf_map: bno 8388608 dir: inode 21698340263
>> Jul  9 14:35:51 x kernel: XFS (sdd1): [00] br_startoff 8388608 br_startblock -2 br_blockcount 1 br_state 0
> 
> It looks like we found a hole at the leaf offset of a directory. We'd
> expect to find a leaf or node block there depending on the directory
> format (which appears to be node format based on the stack below) that
> contains hashval lookup information for the dir.
> 
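
For the record, the validation that trips here is the hole/delalloc
check in xfs_dabuf_map(). A simplified userspace rendering of it (my
paraphrase of the logic in fs/xfs/libxfs/xfs_da_btree.c from that era,
not the kernel code itself):

  #include <stdio.h>
  #include <stdint.h>

  #define HOLESTARTBLOCK  ((int64_t)-2)   /* extent is a hole */
  #define DELAYSTARTBLOCK ((int64_t)-1)   /* delalloc, not yet mapped */
  #define EFSCORRUPTED    117             /* EUCLEAN on Linux */

  struct bmbt_irec { int64_t br_startoff, br_startblock, br_blockcount; };

  /* Simplified stand-in for the per-extent check in xfs_dabuf_map(). */
  static int dabuf_map_check(const struct bmbt_irec *irec)
  {
          /* A directory must never map a hole or delalloc extent in
           * its leaf/node address space; both count as corruption. */
          if (irec->br_startblock == HOLESTARTBLOCK ||
              irec->br_startblock == DELAYSTARTBLOCK) {
                  fprintf(stderr, "[00] br_startoff %lld br_startblock "
                          "%lld br_blockcount %lld br_state 0\n",
                          (long long)irec->br_startoff,
                          (long long)irec->br_startblock,
                          (long long)irec->br_blockcount);
                  return -EFSCORRUPTED;
          }
          return 0;
  }

  int main(void)
  {
          /* The exact values from the log above. */
          struct bmbt_irec bad = { 8388608, HOLESTARTBLOCK, 1 };
          return dabuf_map_check(&bad) ? 1 : 0;
  }

With the values from the log (br_startoff 8388608, br_startblock -2)
that check fails and the "Internal error xfs_da_do_buf(1)" report
follows.
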
> It's not clear how we'd get into this state. Had this system experienced
> any crash/recovery sequences or storage issues before the first
> occurrence?

Yes, more than once; that's my "famous" server, which has seen a lot of fs damage.

Anyway, it would be nice if repair could fix such a mangled startblock,
because the kernel crashes on it so easily (or at least I assume that's
the cause).
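
What I have in mind is roughly this (purely illustrative, not
xfs_repair code; dir_bmap_lookup() and queue_dir_rebuild() are made-up
names for whatever repair uses internally):

  #include <stdint.h>
  #include <stdbool.h>

  #define HOLESTARTBLOCK      ((int64_t)-2)
  #define DIR_LEAF_OFFSET_FSB 8388608ULL  /* 32 GiB / 4 KiB blocks, assumed */

  struct dir_ctx;  /* stand-in for repair's per-directory state */

  /* Hypothetical: fetch the mapping covering file offset block 'fsb'. */
  bool dir_bmap_lookup(struct dir_ctx *dir, uint64_t fsb, int64_t *sb);
  /* Hypothetical: have a later phase rebuild the leaf/node index. */
  void queue_dir_rebuild(struct dir_ctx *dir);

  bool leaf_segment_ok(struct dir_ctx *dir)
  {
          int64_t startblock;

          /* A leaf/node-format dir must have a real block mapped at
           * the start of its leaf segment; a hole there means the
           * hash index is gone and the directory needs rebuilding. */
          if (!dir_bmap_lookup(dir, DIR_LEAF_OFFSET_FSB, &startblock) ||
              startblock == HOLESTARTBLOCK) {
                  queue_dir_rebuild(dir);
                  return false;
          }
          return true;
  }

That is, treat a hole at the leaf offset the same way as a broken leaf
block: rebuild the index instead of leaving it for the kernel to trip
over.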

> 
> Brian
> 
>> Jul  9 14:35:51 x kernel: XFS (sdd1): Internal error xfs_da_do_buf(1) at line 2557 of file fs/xfs/libxfs/xfs_da_btree.c.  Caller xfs_da_read_buf+0x6a/0x120 [xfs]
>> Jul  9 14:35:51 x kernel: CPU: 3 PID: 2928 Comm: cp Tainted: G   E     5.0.0-1-03515-g3478588b5136 #10
>> Jul  9 14:35:51 x kernel: Hardware name: Supermicro X10DRi/X10DRi, BIOS 3.0a 02/06/2018
>> Jul  9 14:35:51 x kernel: Call Trace:
>> Jul  9 14:35:51 x kernel:  dump_stack+0x5c/0x80
>> Jul  9 14:35:51 x kernel:  xfs_dabuf_map.constprop.0+0x1dc/0x390 [xfs]
>> Jul  9 14:35:51 x kernel:  xfs_da_read_buf+0x6a/0x120 [xfs]
>> Jul  9 14:35:51 x kernel:  xfs_da3_node_read+0x17/0xd0 [xfs]
>> Jul  9 14:35:51 x kernel:  xfs_da3_node_lookup_int+0x6c/0x370 [xfs]
>> Jul  9 14:35:51 x kernel:  ? kmem_cache_alloc+0x14e/0x1b0
>> Jul  9 14:35:51 x kernel:  xfs_dir2_node_lookup+0x4b/0x170 [xfs]
>> Jul  9 14:35:51 x kernel:  xfs_dir_lookup+0x1b5/0x1c0 [xfs]
>> Jul  9 14:35:51 x kernel:  xfs_lookup+0x57/0x120 [xfs]
>> Jul  9 14:35:51 x kernel:  xfs_vn_lookup+0x70/0xa0 [xfs]
>> Jul  9 14:35:51 x kernel:  __lookup_hash+0x6c/0xa0
>> Jul  9 14:35:51 x kernel:  ? _cond_resched+0x15/0x30
>> Jul  9 14:35:51 x kernel:  filename_create+0x91/0x160
>> Jul  9 14:35:51 x kernel:  do_linkat+0xa5/0x360
>> Jul  9 14:35:51 x kernel:  __x64_sys_linkat+0x21/0x30
>> Jul  9 14:35:51 x kernel:  do_syscall_64+0x55/0x100
>> Jul  9 14:35:51 x kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>
>>
>> Longer log:
>> http://ixion.pld-linux.org/~arekm/xfs-10.txt
>>
>>
>> -- 
>> Arkadiusz Miśkiewicz, arekm / ( maven.pl | pld-linux.org )
>>
> 

(Resent because vger still blocks my primary maven domain; most likely
nothing has changed with the postmasters' attitude, so I didn't try... :/ )

-- 
Arkadiusz Miśkiewicz, arekm / ( maven.pl | pld-linux.org )
