From: Dave Chinner <david@fromorbit.com>
To: Alex Lyakas <alex@zadara.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: RCU stall in xfs_reclaim_inodes_ag
Date: Tue, 8 Dec 2020 19:07:28 +1100 [thread overview]
Message-ID: <20201208080728.GX3913616@dread.disaster.area> (raw)
In-Reply-To: <6117EC6AA8F04ECA90EAACF20C4A2A7C@alyakaslap>
On Mon, Dec 07, 2020 at 12:18:13PM +0200, Alex Lyakas wrote:
> Hi Dave,
>
> Thank you for your response.
>
> We did some more investigation of the issue, and we have the following
> findings:
>
> 1) We tracked the maximum number of inodes per AG radix tree. In our
> tests, the maximum number of inodes per AG radix tree was about 1.5M:
> [xfs_reclaim_inodes_ag:1285] XFS(dm-79): AG[1368]: count=1384662
> reclaimable=58
> [xfs_reclaim_inodes_ag:1285] XFS(dm-79): AG[1368]: count=1384630
> reclaimable=46
> [xfs_reclaim_inodes_ag:1285] XFS(dm-79): AG[1368]: count=1384600
> reclaimable=16
> [xfs_reclaim_inodes_ag:1285] XFS(dm-79): AG[1370]: count=1594500
> reclaimable=75
> [xfs_reclaim_inodes_ag:1285] XFS(dm-79): AG[1370]: count=1594468
> reclaimable=55
> [xfs_reclaim_inodes_ag:1285] XFS(dm-79): AG[1370]: count=1594436
> reclaimable=46
> [xfs_reclaim_inodes_ag:1285] XFS(dm-79): AG[1370]: count=1594421
> reclaimable=42
> (but the number of reclaimable inodes is very small, as you can see).
>
> Do you think this number is reasonable per radix tree?
That's fine. I regularly run tests that push 10M+ inodes into a
single radix tree, and that generally doesn't even show up on the
profiles....
> 2) This particular XFS instance is 500TB in total. However, the AG size
> in this case is 100GB.
Ok. I run scalability tests on 500TB filesystems with 500AGs that
hammer inode reclaim, but it should be trivial to run them with
5000AGs. Hell, let's just run 50,000AGs to see if there's actually
an iteration problem in the shrinker...
(we really need async IO in mkfs for doing things like this!)
Yup, that hurts a bit on 5.10-rc7, but not significantly. Profile
with 16 CPUs turning over 250,000 inodes/s through the cache:
- 3.17% xfs_fs_free_cached_objects
   - xfs_reclaim_inodes_nr
      - 3.09% xfs_reclaim_inodes_ag
         - 0.91% _raw_spin_lock
              0.87% do_raw_spin_lock
         - 0.71% _raw_spin_unlock
            - 0.67% do_raw_spin_unlock
                 __raw_callee_save___pv_queued_spin_unlock
This indicates that spinlocks are the largest CPU user in that path.
That's likely the radix tree spinlocks taken when removing inodes
from the AG, because the upstream code now allows multiple
reclaimers to operate on the same AG.
But even with that, there isn't any sign of holdoff latencies,
scanning delays, etc. occurring inside the RCU critical section.
IOWs, bumping up the number of AGs massively shouldn't impact the
RCU code here, as the RCU critical region is inside the loop over
the AGs, not spanning the loop.
I don't know how old your kernel is, but maybe something is getting
stuck on a spinlock (per-inode or per-ag) inside the RCU section?
i.e. maybe you see an RCU stall because the code has livelocked or
has severe contention on a per-AG or per-inode spinlock inside the
RCU section?
I suspect you are going to need to profile the code when it is
running to get some idea of what it is actually doing when the
stalls occur...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
2020-11-16 17:45 RCU stall in xfs_reclaim_inodes_ag Alex Lyakas
2020-11-16 21:30 ` Dave Chinner
2020-12-07 10:18 ` Alex Lyakas
2020-12-07 14:15 ` Brian Foster
2020-12-08 8:07 ` Dave Chinner [this message]