From: Gao Xiang <hsiangkao@redhat.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-xfs@vger.kernel.org, Dave Chinner <david@fromorbit.com>
Subject: Re: [PATCH v2 2/2] xfs: turn on lazysbcount unconditionally
Date: Wed, 21 Apr 2021 04:00:29 +0800 [thread overview]
Message-ID: <20210420200029.GA3028214@xiangao.remote.csb> (raw)
In-Reply-To: <20210420162250.GE3122264@magnolia>
Hi Darrick,
On Tue, Apr 20, 2021 at 09:22:50AM -0700, Darrick J. Wong wrote:
> On Tue, Apr 20, 2021 at 07:08:55PM +0800, Gao Xiang wrote:
> > As Dave mentioned [1], "/me is now wondering why we even bother
> > with !lazy-count anymore.
> >
> > We've updated the agr btree block accounting unconditionally since
> > lazy-count was added, and scrub will always report a mismatch in
> > counts if they exist regardless of lazy-count. So why don't we just
> > start ignoring the on-disk value and always use lazy-count based
> > updates? "
> >
> > Therefore, turn on lazy sb counters if it's still disabled at the
> > mount time, or at remount_rw if fs was mounted as read-only.
> > xfs_initialize_perag_data() is reused here since no need to scan
> > agf/agi once more again.
> >
> > After this patch, we could get rid of this whole set of subtle
> > conditional behaviours in the codebase.
> >
> > [1] https://lore.kernel.org/r/20210417223201.GU63242@dread.disaster.area
> > Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
> > ---
> > Enabling lazysbcount is only addressed in this patch, I'll send
> > out a seperated following patch to cleanup all unused conditions
> > later.
> >
> > Also tr_sb is reused here since only agf is modified for each ag,
> > and before lazysbcount sb feature is enabled (m_update_sb = true),
> > agf_btreeblks field shouldn't matter for such AGs.
> >
> > fs/xfs/libxfs/xfs_format.h | 6 +++
> > fs/xfs/libxfs/xfs_sb.c | 93 +++++++++++++++++++++++++++++++++++---
> > fs/xfs/xfs_mount.c | 2 +-
> > fs/xfs/xfs_super.c | 5 ++
> > 4 files changed, 98 insertions(+), 8 deletions(-)
> >
> > diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
> > index 76e2461b9e66..9081d7876d66 100644
> > --- a/fs/xfs/libxfs/xfs_format.h
> > +++ b/fs/xfs/libxfs/xfs_format.h
> > @@ -385,6 +385,12 @@ static inline bool xfs_sb_version_haslazysbcount(struct xfs_sb *sbp)
> > (sbp->sb_features2 & XFS_SB_VERSION2_LAZYSBCOUNTBIT));
> > }
> >
> > +static inline void xfs_sb_version_addlazysbcount(struct xfs_sb *sbp)
> > +{
> > + sbp->sb_versionnum |= XFS_SB_VERSION_MOREBITSBIT;
> > + sbp->sb_features2 |= XFS_SB_VERSION2_LAZYSBCOUNTBIT;
> > +}
> > +
> > static inline bool xfs_sb_version_hasattr2(struct xfs_sb *sbp)
> > {
> > return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5) ||
> > diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
> > index 423dada3f64c..6353e0d4cab1 100644
> > --- a/fs/xfs/libxfs/xfs_sb.c
> > +++ b/fs/xfs/libxfs/xfs_sb.c
> > @@ -18,6 +18,7 @@
> > #include "xfs_trace.h"
> > #include "xfs_trans.h"
> > #include "xfs_buf_item.h"
> > +#include "xfs_btree.h"
> > #include "xfs_bmap_btree.h"
> > #include "xfs_alloc_btree.h"
> > #include "xfs_log.h"
> > @@ -841,6 +842,55 @@ xfs_sb_mount_common(
> > mp->m_ag_max_usable = xfs_alloc_ag_max_usable(mp);
> > }
> >
> > +static int
> > +xfs_fixup_agf_btreeblks(
> > + struct xfs_mount *mp,
> > + struct xfs_trans *tp,
> > + struct xfs_buf *agfbp,
> > + xfs_agnumber_t agno)
> > +{
> > + struct xfs_btree_cur *cur;
> > + struct xfs_perag *pag = agfbp->b_pag;
> > + struct xfs_agf *agf = agfbp->b_addr;
> > + xfs_agblock_t btreeblks, blocks;
> > + int error;
> > +
> > + cur = xfs_allocbt_init_cursor(mp, tp, agfbp, agno, XFS_BTNUM_BNO);
> > + error = xfs_btree_count_blocks(cur, &blocks);
> > + if (error)
> > + goto err;
> > + xfs_btree_del_cursor(cur, error);
> > + btreeblks = blocks - 1;
> > +
> > + cur = xfs_allocbt_init_cursor(mp, tp, agfbp, agno, XFS_BTNUM_CNT);
> > + error = xfs_btree_count_blocks(cur, &blocks);
> > + if (error)
> > + goto err;
> > + xfs_btree_del_cursor(cur, error);
> > + btreeblks += blocks - 1;
> > +
> > + /*
> > + * although rmapbt doesn't exist in v4 fses, but it'd be better
> > + * to turn it as a generic helper.
> > + */
> > + if (xfs_sb_version_hasrmapbt(&mp->m_sb)) {
> > + cur = xfs_rmapbt_init_cursor(mp, tp, agfbp, agno);
> > + error = xfs_btree_count_blocks(cur, &blocks);
> > + if (error)
> > + goto err;
> > + xfs_btree_del_cursor(cur, error);
> > + btreeblks += blocks - 1;
> > + }
> > +
> > + agf->agf_btreeblks = cpu_to_be32(btreeblks);
> > + pag->pagf_btreeblks = btreeblks;
> > + xfs_alloc_log_agf(tp, agfbp, XFS_AGF_BTREEBLKS);
> > + return 0;
> > +err:
> > + xfs_btree_del_cursor(cur, error);
> > + return error;
> > +}
> > +
> > /*
> > * xfs_initialize_perag_data
> > *
> > @@ -864,27 +914,51 @@ xfs_initialize_perag_data(
> > uint64_t btree = 0;
> > uint64_t fdblocks;
> > int error = 0;
> > + bool conv = !(mp->m_flags & XFS_MOUNT_RDONLY) &&
> > + !xfs_sb_version_haslazysbcount(sbp);
> > +
> > + if (conv)
> > + xfs_warn(mp, "enabling lazy-counters...");
> >
> > for (index = 0; index < agcount; index++) {
> > + struct xfs_trans *tp = NULL;
> > + struct xfs_buf *agfbp;
> > +
> > + if (conv) {
> > + error = xfs_trans_alloc(mp, &M_RES(mp)->tr_sb,
> > + 0, 0, 0, &tp);
> > + if (error)
> > + return error;
> > + }
> > +
> > /*
> > - * read the agf, then the agi. This gets us
> > + * read the agi, then the agf. This gets us
> > * all the information we need and populates the
> > * per-ag structures for us.
> > */
> > - error = xfs_alloc_pagf_init(mp, NULL, index, 0);
> > - if (error)
> > + error = xfs_ialloc_pagi_init(mp, tp, index);
> > + if (error) {
> > +err_out:
> > + if (tp)
> > + xfs_trans_cancel(tp);
> > return error;
> > + }
> >
> > - error = xfs_ialloc_pagi_init(mp, NULL, index);
> > + error = xfs_alloc_read_agf(mp, tp, index, 0, &agfbp);
> > if (error)
> > - return error;
> > - pag = xfs_perag_get(mp, index);
> > + goto err_out;
> > + pag = agfbp->b_pag;
> > ifree += pag->pagi_freecount;
> > ialloc += pag->pagi_count;
> > bfree += pag->pagf_freeblks;
> > bfreelst += pag->pagf_flcount;
> > + if (tp) {
> > + error = xfs_fixup_agf_btreeblks(mp, tp, agfbp, index);
>
> Lazysbcount upgrades should be done from a separate function, not mixed
> in with perag initialization.
I've seen some previous discussion about multiple AG total scan time cost.
Yeah, if another extra scan really accepts here, I could update instead.
> Also, why is it necessary to walk all the space btrees to set agf_btreeblks?
If my understanding is correct, I think because without lazysbcount,
even pagf_btreeblks is updated unconditionally now, but that counter
is unreliable for quite ancient kernels which don't have such update
logic.
Kindly correct me if I'm wrong here.
>
> > + xfs_trans_commit(tp);
> > + } else {
> > + xfs_buf_relse(agfbp);
> > + }
> > btree += pag->pagf_btreeblks;
> > - xfs_perag_put(pag);
> > }
> > fdblocks = bfree + bfreelst + btree;
> >
> > @@ -900,6 +974,11 @@ xfs_initialize_perag_data(
> > goto out;
> > }
> >
> > + if (conv) {
> > + xfs_sb_version_addlazysbcount(sbp);
> > + mp->m_update_sb = true;
> > + xfs_warn(mp, "lazy-counters has been enabled.");
>
> But we don't log the sb update?
>
> As far as the feature upgrade goes, is it necessary to bwrite the
> primary super to disk (and then log the change)[1] to prevent a truly
> ancient kernel that doesn't support lazysbcount from trying to recover
> the log and ending up with an unsupported feature set?
Not quite sure if it does harm to ancient kernels with such
unsupported feature. may I ask for more details? :)
but yeah, if any issues here, I should follow
1) bwrite sb block first;
2) log sb
>
> [1] https://lore.kernel.org/linux-xfs/161723934343.3149451.16679733325094950568.stgit@magnolia/
>
> > + }
> > /* Overwrite incore superblock counters with just-read data */
> > spin_lock(&mp->m_sb_lock);
> > sbp->sb_ifree = ifree;
> > diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> > index cb1e2c4702c3..b3b13acd45d6 100644
> > --- a/fs/xfs/xfs_mount.c
> > +++ b/fs/xfs/xfs_mount.c
> > @@ -626,7 +626,7 @@ xfs_check_summary_counts(
> > * superblock to be correct and we don't need to do anything here.
> > * Otherwise, recalculate the summary counters.
> > */
> > - if ((!xfs_sb_version_haslazysbcount(&mp->m_sb) ||
> > + if ((xfs_sb_version_haslazysbcount(&mp->m_sb) &&
>
> Not clear why the logic here inverts?
.. thus xfs_initialize_perag_data() below can be called then.
Thanks,
Gao Xiang
>
> --D
>
> > XFS_LAST_UNMOUNT_WAS_CLEAN(mp)) &&
> > !xfs_fs_has_sickness(mp, XFS_SICK_FS_COUNTERS))
> > return 0;
> > diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> > index a2dab05332ac..16197a890c15 100644
> > --- a/fs/xfs/xfs_super.c
> > +++ b/fs/xfs/xfs_super.c
> > @@ -1678,6 +1678,11 @@ xfs_remount_rw(
> > }
> >
> > mp->m_flags &= ~XFS_MOUNT_RDONLY;
> > + if (!xfs_sb_version_haslazysbcount(sbp)) {
> > + error = xfs_initialize_perag_data(mp, sbp->sb_agcount);
> > + if (error)
> > + return error;
> > + }
> >
> > /*
> > * If this is the first remount to writeable state we might have some
> > --
> > 2.27.0
> >
>
next prev parent reply other threads:[~2021-04-20 20:00 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-04-20 11:08 [PATCH v2 1/2] xfs: don't use in-core per-cpu fdblocks for !lazysbcount Gao Xiang
2021-04-20 11:08 ` [PATCH v2 2/2] xfs: turn on lazysbcount unconditionally Gao Xiang
2021-04-20 16:22 ` Darrick J. Wong
2021-04-20 20:00 ` Gao Xiang [this message]
2021-04-22 0:01 ` Darrick J. Wong
2021-04-22 1:51 ` Gao Xiang
2021-04-22 5:11 ` Zorro Lang
2021-04-20 17:42 ` [PATCH v2 1/2] xfs: don't use in-core per-cpu fdblocks for !lazysbcount Darrick J. Wong
2021-04-20 21:25 ` Dave Chinner
2021-04-20 21:54 ` Gao Xiang
2021-04-21 1:45 ` Dave Chinner
2021-04-21 3:01 ` Gao Xiang
2021-04-22 1:44 ` Dave Chinner
2021-04-22 2:06 ` Gao Xiang
2021-04-22 3:01 ` Dave Chinner
2021-04-22 3:12 ` Gao Xiang
2021-04-22 15:58 ` Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210420200029.GA3028214@xiangao.remote.csb \
--to=hsiangkao@redhat.com \
--cc=david@fromorbit.com \
--cc=djwong@kernel.org \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox