From: Gao Xiang <hsiangkao@redhat.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-xfs@vger.kernel.org, Zorro Lang <zlang@redhat.com>
Subject: Re: [PATCH] xfs: don't use in-core per-cpu fdblocks for !lazysbcount
Date: Sat, 17 Apr 2021 05:36:25 +0800
Message-ID: <20210416213625.GC2224153@xiangao.remote.csb>
In-Reply-To: <20210416211320.GB2224153@xiangao.remote.csb>
On Sat, Apr 17, 2021 at 05:13:20AM +0800, Gao Xiang wrote:
> Hi Darrick,
>
> On Fri, Apr 16, 2021 at 09:00:13AM -0700, Darrick J. Wong wrote:
> > On Fri, Apr 16, 2021 at 05:10:23PM +0800, Gao Xiang wrote:
> > > > There are many paths which could trigger xfs_log_sb(), e.g.
> > > >   xfs_bmap_add_attrfork()
> > > >     -> xfs_log_sb()
> > > > , which overwrites on-disk fdblocks with in-core per-CPU fdblocks.
> > >
> > > However, for !lazysbcount cases, on-disk fdblocks is actually updated
> > > by xfs_trans_apply_sb_deltas(), and generally it isn't equal to
> > > in-core fdblocks due to xfs_reserve_block() or whatever, see the
> > > comment in xfs_unmountfs().
> > >
> > > It could be observed by the following steps reported by Zorro [1]:
> > >
> > > 1. mkfs.xfs -f -l lazy-count=0 -m crc=0 $dev
> > > 2. mount $dev $mnt
> > > 3. fsstress -d $mnt -p 100 -n 1000 (maybe need more or less io load)
> > > 4. umount $mnt
> > > 5. xfs_repair -n $dev
> > >
> > > yet due to commit f46e5a174655("xfs: fold sbcount quiesce logging
> > > into log covering"), xfs_sync_sb() will be triggered even !lazysbcount
> > > but xfs_log_need_covered() case when xfs_unmountfs(), so hard to
> > > reproduce on kernel 5.12+.
> >
> > Um, I can't understand this(?), possibly because I can't get to RHBZ and
> > therefore have very little context to start from. :(
>
> Very sorry about that.. I realized it can't be accessed at all without
> some permission only after sending out the patch. :(
>
> >
> > Are you saying that because the f46e commit removed the xfs_sync_sb
> > calls from unmountfs for !lazysb filesystems, we no longer log the
> > summary counters at unmount? Which means that we no longer write the
> > incore percpu fdblocks count to disk at unmount after we've torn down
> > all the incore space reservations (when sb_fdblocks == m_fdblocks)?
>
> Er.. I think it's the other way around: before commit f46e, we didn't
> log the summary counters at unmount, due to
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/xfs/xfs_mount.c?h=v5.11#n1177
>   xfs_unmountfs
>     -> xfs_log_sbcount
>       -> !xfs_sb_version_haslazysbcount
>         -> return 0 (xfs_sync_sb bypassed).
>
> So the only time we updated the on-disk fdblocks was during transactions,
> but xfs_log_sb() corrupted it (since there was no summary counter logging
> at unmount).
>
> After f46e, it became
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/xfs/xfs_log.c?h=v5.12-rc2#n982
>   xfs_unmountfs
>     -> xfs_log_unmount
>       -> xfs_log_clean
>         -> xfs_log_cover
>
> So if xfs_log_need_covered(mp) == true and
> !xfs_sb_version_haslazysbcount(&mp->m_sb),
> xfs_sync_sb() will be triggered to cover the log, so
> it's hard to reproduce on the current kernel (at least on my side.)
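
To spell out the two unmount paths above, here is a rough userspace model
(not kernel code). The model_* names are made up for illustration, other
early-return conditions are ignored, and the lazysbcount branch of the
post-f46e path is only my reading of xfs_log_cover(), so please treat it
as an assumption:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the pre-f46e path:
 *   xfs_unmountfs() -> xfs_log_sbcount()
 * Returns true if the superblock would be synced at unmount.
 */
static bool model_unmount_syncs_sb_pre(bool lazysbcount)
{
	if (!lazysbcount)
		return false;	/* "return 0 (xfs_sync_sb bypassed)" */
	return true;
}

/*
 * Hypothetical model of the post-f46e path:
 *   xfs_unmountfs() -> ... -> xfs_log_cover()
 * need_covered stands in for xfs_log_need_covered(mp).
 */
static bool model_unmount_syncs_sb_post(bool lazysbcount, bool need_covered)
{
	/* Assumption: nothing to do only if neither condition holds. */
	if (!need_covered && !lazysbcount)
		return false;
	return true;
}
```

With lazysbcount == false, the first helper never syncs the superblock,
while the second one does whenever need_covered is true, which is why the
bad counter tends to get papered over at unmount on newer kernels.
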
>
> But I have no idea whether xfs_log_need_covered(mp) is always true at
> that point, and that patchset seems a bit large and (possibly) hard to
> backport...
>
btw, after checking xfs_check_summary_counts(), I think that currently
sb_fdblocks won't be recalculated for !xfs_sb_version_haslazysbcount
filesystems after a sudden power outage (i.e. with a dirty log).

So for !xfs_sb_version_haslazysbcount cases, we really do rely on the
on-disk sb_fdblocks being correct all the time...

Anyway, I also think we should now warn at runtime that !lazysb
filesystems are deprecated (even though we already have the
XFS_SUPPORT_V4 build config.)
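
The mount-time decision I mean can be sketched roughly as below. This is
a simplified userspace model, not the kernel function: struct model_mount
and model_recalc_counters() are hypothetical names, and any other
conditions xfs_check_summary_counts() may look at are deliberately left
out:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the bits of mount state that matter here. */
struct model_mount {
	bool has_lazysbcount;		/* xfs_sb_version_haslazysbcount() */
	bool last_unmount_clean;	/* false means the log was dirty */
};

/*
 * Returns true if the summary counters (including sb_fdblocks) would be
 * rebuilt from per-AG data at mount time in this simplified model.
 */
static bool model_recalc_counters(const struct model_mount *m)
{
	/* !lazysbcount: the on-disk counters are trusted as-is. */
	if (!m->has_lazysbcount)
		return false;
	/* lazysbcount with a clean log: nothing to rebuild either. */
	if (m->last_unmount_clean)
		return false;
	return true;
}
```

So in this model a !lazysbcount filesystem with a dirty log keeps
whatever sb_fdblocks was last logged, right or wrong.
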
> >
> > So that means that for !lazysb fses, the only time we log the sb
> > counters is during transactions, and when we do log the counters we
> > actually log the wrong value, since the incore reservations should never
> > escape to disk? Hence the fix below?
>
> Yes
>
> >
> > And then by extension, is the reason that nobody noticed before is that
> > we always used to log the correct value at unmount, so fses with clean
> > logs always have the correct value, and fses with dirty logs will
> > recompute fdblocks after log recovery by summing the AGF free blocks
> > counts?
>
> Nope, prior to 5.12-rc1, I think it was broken for a very long time...
>
> >
> > (Or possibly nobody uses !lazysb filesystems anymore?)
> >
>
> Zorro found this a few days ago on the RHEL 8 kernel (4.18; maybe he's
> adding some new testcases to cover this), and I think it has been broken
> for a very long time (I don't know which version first broke it). Maybe
> it has had little impact since it's just a free space block counter.
>
> So I think it should be backported to many stable kernel versions (?)
> But I have no idea when it was broken...
>
> > I /think/ the code change looks ok, but as you might surmise from the
> > large quantity of questions, I'm not ready to RVB this yet. The commit
> > message seems like a good place to answer those questions.
> >
> > > > After this patch, I've seen nothing strange so far on older kernels
> > > > for the testcase above without lazysbcount.
> > >
> > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1949515
> >
> > This strangely <cough> doesn't seem to be accessible to the public at
> > large, since <cough> someone at RedHat decided to block all Oracle IPs
> > <cough>.
>
> <cough> will get rid of it the next time...
>
> Thanks,
> Gao Xiang
>
> >
> > --D
> >
> > >
> > > Reported-by: Zorro Lang <zlang@redhat.com>
> > > Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
> > > ---
> > > >  fs/xfs/libxfs/xfs_sb.c | 8 +++++++-
> > > >  1 file changed, 7 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
> > > index 60e6d255e5e2..423dada3f64c 100644
> > > --- a/fs/xfs/libxfs/xfs_sb.c
> > > +++ b/fs/xfs/libxfs/xfs_sb.c
> > > @@ -928,7 +928,13 @@ xfs_log_sb(
> > >
> > > >  	mp->m_sb.sb_icount = percpu_counter_sum(&mp->m_icount);
> > > >  	mp->m_sb.sb_ifree = percpu_counter_sum(&mp->m_ifree);
> > > > -	mp->m_sb.sb_fdblocks = percpu_counter_sum(&mp->m_fdblocks);
> > > > +	if (!xfs_sb_version_haslazysbcount(&mp->m_sb)) {
> > > > +		struct xfs_dsb *dsb = bp->b_addr;
> > > > +
> > > > +		mp->m_sb.sb_fdblocks = be64_to_cpu(dsb->sb_fdblocks);
> > > > +	} else {
> > > > +		mp->m_sb.sb_fdblocks = percpu_counter_sum(&mp->m_fdblocks);
> > > > +	}
> > > >
> > > >  	xfs_sb_to_disk(bp->b_addr, &mp->m_sb);
> > > >  	xfs_trans_buf_set_type(tp, bp, XFS_BLFT_SB_BUF);
> > > --
> > > 2.27.0
> > >
> >