From: Gao Xiang <hsiangkao@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [RFC PATCH 1/4] xfs: support deactivating AGs
Date: Thu, 15 Apr 2021 15:08:33 +0800 [thread overview]
Message-ID: <20210415070833.GD1864610@xiangao.remote.csb> (raw)
In-Reply-To: <20210415062824.GN63242@dread.disaster.area>
Hi Dave,
On Thu, Apr 15, 2021 at 04:28:24PM +1000, Dave Chinner wrote:
> On Thu, Apr 15, 2021 at 12:28:37PM +0800, Gao Xiang wrote:
> > Hi Dave,
> >
> > On Thu, Apr 15, 2021 at 01:42:55PM +1000, Dave Chinner wrote:
> > > On Thu, Apr 15, 2021 at 03:52:37AM +0800, Gao Xiang wrote:
> > > > To get rid of paralleled requests related to AGs which are pending
> > > > for shrinking, mark these perags as inactive rather than playing
> > > > with per-ag structures theirselves.
> > > >
> > > > Since in that way, a per-ag lock can be used to stablize the inactive
> > > > status together with agi/agf buffer lock (which is much easier than
> > > > adding more complicated perag_{get, put} pairs..) Also, Such per-ags
> > > > can be released / reused when unmountfs / growfs.
> > > >
> > > > On the read side, pag_inactive_rwsem can be unlocked immediately after
> > > > the agf or agi buffer lock is acquired. However, pag_inactive_rwsem
> > > > can only be unlocked after the agf/agi buffer locks are all acquired
> > > > with the inactive status on the write side.
> > > >
> > > > XXX: maybe there are some missing cases.
> > > >
> > > > Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
> > > > ---
> > > > fs/xfs/libxfs/xfs_ag.c | 16 +++++++++++++---
> > > > fs/xfs/libxfs/xfs_alloc.c | 12 +++++++++++-
> > > > fs/xfs/libxfs/xfs_ialloc.c | 26 +++++++++++++++++++++++++-
> > > > fs/xfs/xfs_mount.c | 2 ++
> > > > fs/xfs/xfs_mount.h | 6 ++++++
> > > > 5 files changed, 57 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c
> > > > index c68a36688474..ba5702e5c9ad 100644
> > > > --- a/fs/xfs/libxfs/xfs_ag.c
> > > > +++ b/fs/xfs/libxfs/xfs_ag.c
> > > > @@ -676,16 +676,24 @@ xfs_ag_get_geometry(
> > > > if (agno >= mp->m_sb.sb_agcount)
> > > > return -EINVAL;
> > > >
> > > > + pag = xfs_perag_get(mp, agno);
> > > > + down_read(&pag->pag_inactive_rwsem);
> > >
> > > No need to encode the lock type in the lock name. We know it's a
> > > rwsem from the lock API functions...
> > >
> > > > +
> > > > + if (pag->pag_inactive) {
> > > > + error = -EBUSY;
> > > > + up_read(&pag->pag_inactive_rwsem);
> > > > + goto out;
> > > > + }
> > >
> > > This looks kinda heavyweight. Having to take a rwsem whenever we do
> > > a perag lookup to determine if we can access the perag completely
> > > defeats the purpose of xfs_perag_get() being a lightweight, lockless
> > > operation.
> >
> > I'm not sure if it has some regression since write lock will be only
> > taken when shrinking (shrinking is a rare operation), for most cases
> > which is much similiar to perag radix root I think.
>
> It's still an extra pair of atomic operation per xfs_perag_get/put()
> call pairs. pag_inactive being true is the slow/rare path, so we
> should be trying to keep the overhead of detecting that path out of
> all our fast paths...
>
> Indeed, xfs_perag_get() already shows up on profiles because of the
> number of calls we make to it, so adding an extra atomic operation
> for this operation is going to be noticable in terms of CPU usage,
> if nothing else.
>
> > The locking logic is that, when pag->pag_inactive = false -> true,
> > the write lock, AGF/AGI locks all have to be taken in advance.
> >
> > >
> > > I suspect what we really want here is active/passive references like
> > > are used for the superblock, and an API that hides the
> > > implementation from all the callers.
> >
> > If my understanding is correct, my own observation these months is
> > that the current XFS codebase is not well suitable to accept !pag
> > (due to many logic assumes pag structure won't go away, since some
> > are indexed/passed by agno rather than some pag reference count).
>
> Maybe so, but that's exactly what this patch is addressing - it's
> adding a way to detect that perag has "gone away"i and should not be
> referenced any more.
>
> It wasn't until I saw this patch that I realised that there is a way
> that we can, in fact, safely handle perags being freed by ensuring
> that the RCU lookup fails for these "active" references....
>
> > Even I think we could introduce some active references, but handle
> > the cover range is still a big project. The current approach assumes
> > pag won't go away except for umounting and blocks allocation / imap/
> > ... paths to access that.
>
> The way I described should work just fine - nobody should be
> accessing the per-ag without a reference gained through a lookup in
> some way. Buffers carry a passive reference, because the per-ag
> teardown will do the teardown of those references before the perag
> is freed. Everything else carries an active reference and so
> teardown cannot begin until all active references go away.
>
> > My current thought is that we could implement it in that way as the
> > first step (in order to land the shrinking functionality to let
> > end-users benefit of this), and by the codebase evolves, it can be
> > transformed to a more gentle way.
>
> I think converting this patchset to active/passive references ias
> I've described solves the problem entirely - there's no "evolving"
> needed as we can solve it with this one structural change...
Since currently even xfs_perag_put() reaches zero, it won't free
the per-ag anyway (it may just use to mark the pointer is no longer
used in the context? not sure what's the exact use of the such pairs),
so in practice I think after active/passive references are introduced,
there is still the only one real reference count that works for the
per-ag lifetime management and currently it doesn't manage whole
lifetime at all...
So (my own understanding is) I think in practice, that approachs
would be somewhat equal to relocate/rearrange xfs_perag_get()/put()
pair to manage the perag lifetime instead. and maybe use xfs_perag_ptr()
to access perag when some reference count is available.
Thanks,
Gao Xiang
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
next prev parent reply other threads:[~2021-04-15 7:08 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-04-14 19:52 [RFC PATCH 0/4] xfs: support shrinking empty AGs Gao Xiang
2021-04-14 19:52 ` [RFC PATCH 1/4] xfs: support deactivating AGs Gao Xiang
2021-04-15 3:42 ` Dave Chinner
2021-04-15 4:28 ` Gao Xiang
2021-04-15 6:28 ` Dave Chinner
2021-04-15 7:08 ` Gao Xiang [this message]
2021-04-15 8:44 ` Dave Chinner
2021-04-14 19:52 ` [RFC PATCH 2/4] xfs: check ag is empty Gao Xiang
2021-04-15 3:52 ` Dave Chinner
2021-04-15 4:34 ` Gao Xiang
2021-04-14 19:52 ` [RFC PATCH 3/4] xfs: introduce max_agcount Gao Xiang
2021-04-15 3:59 ` Dave Chinner
2021-04-14 19:52 ` [RFC PATCH 4/4] xfs: support shrinking empty AGs Gao Xiang
2021-04-15 4:25 ` Dave Chinner
2021-04-15 5:22 ` Gao Xiang
2021-04-15 8:33 ` Dave Chinner
2021-04-15 17:00 ` Darrick J. Wong
2021-04-15 21:24 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210415070833.GD1864610@xiangao.remote.csb \
--to=hsiangkao@redhat.com \
--cc=david@fromorbit.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox