public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Chris Dunlop <chris@onthe.net.au>
To: linux-xfs@vger.kernel.org
Subject: fstrim and strace considered harmful?
Date: Wed, 18 May 2022 16:59:49 +1000	[thread overview]
Message-ID: <20220518065949.GA1237408@onthe.net.au> (raw)

Hi,

I have an fstrim that's been running for over 48 hours on a 256T thin 
provisioned XFS fs containing around 55T of actual data on a slow 
subsystem (ceph 8,3 erasure-encoded rbd). I don't think there would be an 
an enourmous amount of data to trim, maybe a few T, but I've no idea how 
long how long it might be expected to take. In an attempt to see what the 
what the fstrim was doing, I ran an strace on it. The strace has been 
sitting there without output and unkillable since then, now 5+ hours ago.  
Since the strace, on that same filesystem I now have 123 df processes and 
615 rm processes -- and growing -- that are blocked in xfs_inodegc_flush, 
e.g.:

May 18 15:31:52 d5 kernel: task:df              state:D stack:    0 pid:31741 ppid:     1 flags:0x00004004
May 18 15:31:52 d5 kernel: Call Trace:
May 18 15:31:52 d5 kernel:  <TASK>
May 18 15:31:52 d5 kernel:  __schedule+0x241/0x740
May 18 15:31:52 d5 kernel:  ? lock_is_held_type+0x97/0x100
May 18 15:31:52 d5 kernel:  schedule+0x3a/0xa0
May 18 15:31:52 d5 kernel:  schedule_timeout+0x271/0x310
May 18 15:31:52 d5 kernel:  ? find_held_lock+0x2d/0x90
May 18 15:31:52 d5 kernel:  ? sched_clock_cpu+0x9/0xa0
May 18 15:31:52 d5 kernel:  ? lock_release+0x214/0x350
May 18 15:31:52 d5 kernel:  wait_for_completion+0x7b/0xc0
May 18 15:31:52 d5 kernel:  __flush_work+0x217/0x350
May 18 15:31:52 d5 kernel:  ? flush_workqueue_prep_pwqs+0x120/0x120
May 18 15:31:52 d5 kernel:  ? wait_for_completion+0x1c/0xc0
May 18 15:31:52 d5 kernel:  xfs_inodegc_flush.part.24+0x62/0xc0 [xfs]
May 18 15:31:52 d5 kernel:  xfs_fs_statfs+0x37/0x1a0 [xfs]
May 18 15:31:52 d5 kernel:  statfs_by_dentry+0x3c/0x60
May 18 15:31:52 d5 kernel:  vfs_statfs+0x16/0xd0
May 18 15:31:52 d5 kernel:  user_statfs+0x44/0x80
May 18 15:31:52 d5 kernel:  __do_sys_statfs+0x10/0x30
May 18 15:31:52 d5 kernel:  do_syscall_64+0x34/0x80
May 18 15:31:52 d5 kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
May 18 15:31:52 d5 kernel: RIP: 0033:0x7fe9e9db3c07
May 18 15:31:52 d5 kernel: RSP: 002b:00007ffe08f50178 EFLAGS: 00000246 ORIG_RAX: 0000000000000089
May 18 15:31:52 d5 kernel: RAX: ffffffffffffffda RBX: 0000555963fcae40 RCX: 00007fe9e9db3c07
May 18 15:31:52 d5 kernel: RDX: 00007ffe08f50400 RSI: 00007ffe08f50180 RDI: 0000555963fcae40
May 18 15:31:52 d5 kernel: RBP: 00007ffe08f50180 R08: 0000555963fcae80 R09: 0000000000000000
May 18 15:31:52 d5 kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe08f50220
May 18 15:31:52 d5 kernel: R13: 0000000000000000 R14: 0000555963fcae80 R15: 0000555963fcae40
May 18 15:31:52 d5 kernel:  </TASK>

Full 1.5M sysrq output at: https://file.io/bWOL8F7mzKI6

That stack trace is uncomfortably familiar:

Subject: Highly reflinked and fragmented considered harmful?
https://lore.kernel.org/linux-xfs/20220509024659.GA62606@onthe.net.au/

FYI:

# xfs_info /vol
meta-data=/dev/vg01/vol          isize=512    agcount=257, agsize=268434432 blks
          =                       sectsz=4096  attr=2, projid32bit=1
          =                       crc=1        finobt=1, sparse=1, rmapbt=1
          =                       reflink=1    bigtime=1 inobtcount=1
data     =                       bsize=4096   blocks=68719475712, imaxpct=1
          =                       sunit=1024   swidth=8192 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=521728, version=2
          =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

Is there something I can do to "unstick" things, or is it time to hit the 
reset, and hope the recovery on mount isn't onerous?

Aside from that immediate issue, what has gone wrong here?

Cheers,

Chris

             reply	other threads:[~2022-05-18  6:59 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-18  6:59 Chris Dunlop [this message]
2022-05-18  7:07 ` fstrim and strace considered harmful? Chris Dunlop
2022-05-18 15:59   ` Darrick J. Wong
2022-05-18 22:36     ` Chris Dunlop
2022-05-19  0:50       ` Dave Chinner
2022-05-19  2:33         ` Chris Dunlop
2022-05-19  6:33           ` Dave Chinner
2022-05-19 15:25         ` Chris Murphy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220518065949.GA1237408@onthe.net.au \
    --to=chris@onthe.net.au \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox