From: Nikola Ciprich <nikola.ciprich@linuxbox.cz>
To: linux-xfs@vger.kernel.org
Cc: nikola.ciprich@linuxbox.cz
Subject: XFS / xfs_repair - problem reading very large sparse files on very large filesystem
Date: Thu, 4 Nov 2021 10:09:15 +0100
Message-ID: <20211104090915.GW32555@pcnci.linuxbox.cz>
Hello fellow XFS users and developers,

we've stumbled upon a strange problem which I think might lie somewhere
in the XFS code.
We have a very large Ceph-based storage system, on top of which there is a
1.5PiB volume with an XFS filesystem. It contains very large (i.e. 500TB)
sparse files, partially filled with data.
The problem is that trying to read those files leaves processes blocked in
D state with very poor performance: ~200KiB/s, 50 IOPS.
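
To illustrate, this is roughly how it shows up (the file name below is just
an example, not the real path):

  # sequential read of one of the big sparse files; throughput settles at ~200KiB/s
  dd if=/mnt/bigvol/sparse_image.raw of=/dev/null bs=1M status=progress

  # meanwhile the dd process sits in uninterruptible sleep (D state)
  ps -o pid,stat,wchan,cmd -C dd

  # and iostat shows only ~50 read IOPS on the underlying device
  iostat -x 1 /dev/sdk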

I tried running xfs_repair on the volume, but it behaves in a very similar
way: it very quickly gets into an almost stalled state and makes barely any
progress:

[root@spbstdnas ~]# xfs_repair -P -t 60 -v -v -v -v /dev/sdk
Phase 1 - find and verify superblock...
- max_mem = 154604838, icount = 9664, imem = 37, dblock = 382464425984, dmem = 186750208
Memory available for repair (150981MB) may not be sufficient.
At least 182422MB is needed to repair this filesystem efficiently
If repair fails due to lack of memory, please
increase system RAM and/or swap space to at least 364844MB.
- block cache size set to 4096 entries
Phase 2 - using internal log
- zero log...
zero_log: head block 1454674 tail block 1454674
- scan filesystem freespace and inode maps...
- found root inode chunk
libxfs_bcache: 0x26aa3a0
Max supported entries = 4096
Max supported entries = 4096
Max utilized entries = 4096
Active entries = 4048
Hash table size = 512
Hits = 0
Misses = 76653
Hit ratio = 0.00
MRU 0 entries = 4048 (100%)
MRU 1 entries = 0 ( 0%)
MRU 2 entries = 0 ( 0%)
MRU 3 entries = 0 ( 0%)
MRU 4 entries = 0 ( 0%)
MRU 5 entries = 0 ( 0%)
MRU 6 entries = 0 ( 0%)
MRU 7 entries = 0 ( 0%)
MRU 8 entries = 0 ( 0%)
MRU 9 entries = 0 ( 0%)
MRU 10 entries = 0 ( 0%)
MRU 11 entries = 0 ( 0%)
MRU 12 entries = 0 ( 0%)
MRU 13 entries = 0 ( 0%)
MRU 14 entries = 0 ( 0%)
MRU 15 entries = 0 ( 0%)
Dirty MRU 16 entries = 0 ( 0%)
Hash buckets with 2 entries 5 ( 0%)
Hash buckets with 3 entries 11 ( 0%)
Hash buckets with 4 entries 30 ( 2%)
Hash buckets with 5 entries 36 ( 4%)
Hash buckets with 6 entries 57 ( 8%)
Hash buckets with 7 entries 90 ( 15%)
Hash buckets with 8 entries 80 ( 15%)
Hash buckets with 9 entries 74 ( 16%)
Hash buckets with 10 entries 62 ( 15%)
Hash buckets with 11 entries 31 ( 8%)
Hash buckets with 12 entries 16 ( 4%)
Hash buckets with 13 entries 10 ( 3%)
Hash buckets with 14 entries 7 ( 2%)
Hash buckets with 15 entries 2 ( 0%)
Hash buckets with 16 entries 1 ( 0%)
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
- agno = 3

The VM has 200GB of RAM, but xfs_repair does not use more than 1GB and the
CPU is idle; it just keeps reading at the same slow rate, ~200KiB/s, 50 IOPS.
I've carefully checked that the storage itself is much faster: with blktrace
I identified which areas of the volume xfs_repair is currently reading, and
running fio / dd against those same areas shows they can be read much faster
(the same holds for random reads of arbitrary areas of the volume and for
random-read or sequential-read fio benchmarks).
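
In case it helps, the check looked roughly like this (the offset is only an
example standing in for values taken from the trace):

  # watch which sectors xfs_repair is currently reading
  blktrace -d /dev/sdk -o - | blkparse -i -

  # read the same region of the device directly; this runs far faster than 200KiB/s
  dd if=/dev/sdk of=/dev/null bs=1M count=1024 skip=123456 iflag=direct

  # random-read fio against the raw device also shows the expected performance
  fio --name=randread --filename=/dev/sdk --rw=randread --bs=4k --iodepth=32 \
      --ioengine=libaio --direct=1 --runtime=30 --time_based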

I've found one very old report that pretty much resembles my problem:
https://www.spinics.net/lists/xfs/msg06585.html
but it is 10 years old and didn't lead to any conclusion.

Is it possible that there is still some bug common to the XFS kernel module
and xfs_repair? I tried kernels 5.4.135 and 5.10.31, and xfsprogs 4.5.0 and
5.13.0 (the OS is x86_64 CentOS 7).

Any hints on how I could debug this further? I'd be very grateful for any
help.

With best regards,
Nikola Ciprich
--
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava
tel.: +420 591 166 214
fax: +420 596 621 273
mobil: +420 777 093 799
www.linuxbox.cz
mobil servis: +420 737 238 656
email servis: servis@linuxbox.cz
-------------------------------------