Re: [PATCH] xfs: don't use in-core per-cpu fdblocks for !lazysbcount

public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed

From: Gao Xiang <hsiangkao@redhat.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-xfs@vger.kernel.org, Zorro Lang <zlang@redhat.com>
Subject: Re: [PATCH] xfs: don't use in-core per-cpu fdblocks for !lazysbcount
Date: Sat, 17 Apr 2021 05:36:25 +0800	[thread overview]
Message-ID: <20210416213625.GC2224153@xiangao.remote.csb> (raw)
In-Reply-To: <20210416211320.GB2224153@xiangao.remote.csb>

On Sat, Apr 17, 2021 at 05:13:20AM +0800, Gao Xiang wrote:
> Hi Darrick,
> 
> On Fri, Apr 16, 2021 at 09:00:13AM -0700, Darrick J. Wong wrote:
> > On Fri, Apr 16, 2021 at 05:10:23PM +0800, Gao Xiang wrote:
> > > There are many paths which could trigger xfs_log_sb(), e.g.
> > >   xfs_bmap_add_attrfork()
> > >     -> xfs_log_sb()
> > > , which overrided on-disk fdblocks by in-core per-CPU fdblocks.
> > > 
> > > However, for !lazysbcount cases, on-disk fdblocks is actually updated
> > > by xfs_trans_apply_sb_deltas(), and generally it isn't equal to
> > > in-core fdblocks due to xfs_reserve_block() or whatever, see the
> > > comment in xfs_unmountfs().
> > > 
> > > It could be observed by the following steps reported by Zorro [1]:
> > > 
> > > 1. mkfs.xfs -f -l lazy-count=0 -m crc=0 $dev
> > > 2. mount $dev $mnt
> > > 3. fsstress -d $mnt -p 100 -n 1000 (maybe need more or less io load)
> > > 4. umount $mnt
> > > 5. xfs_repair -n $dev
> > > 
> > > yet due to commit f46e5a174655("xfs: fold sbcount quiesce logging
> > > into log covering"), xfs_sync_sb() will be triggered even !lazysbcount
> > > but xfs_log_need_covered() case when xfs_unmountfs(), so hard to
> > > reproduce on kernel 5.12+.
> > 
> > Um, I can't understand this(?), possibly because I can't get to RHBZ and
> > therefore have very little context to start from. :(
> 
> Very sorry about that.. I realized it doesn't access at all without some
> permission after sending out the patch. :(
> 
> > 
> > Are you saying that because the f46e commit removed the xfs_sync_sb
> > calls from unmountfs for !lazysb filesystems, we no longer log the
> > summary counters at unmount?  Which means that we no longer write the
> > incore percpu fdblocks count to disk at unmount after we've torn down
> > all the incore space reservations (when sb_fdblocks == m_fdblocks)?
> 
> Er.. I think that is by reverse, before commit f46e, we no longer logged
> the summary counters at unmount, due to 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/xfs/xfs_mount.c?h=v5.11#n1177
>   xfs_unmountfs
>     -> xfs_log_sbcount
>       -> !xfs_sb_version_haslazysbcount
>         -> return 0 (xfs_sync_sb bypassed).
> 
> So the only time we update the ondisk fdblocks was during transactions,
> but xfs_log_sb() corrupted this (due to no summary counters logging at
> unmount).
> 
> After f46e, it became
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/xfs/xfs_log.c?h=v5.12-rc2#n982
>   xfs_unmountfs
>     -> xfs_log_unmount
>       -> xfs_log_clean
>         -> xfs_log_cover
> 
> So if xfs_log_need_covered(mp) == true and
> !xfs_sb_version_haslazysbcount(&mp->m_sb),
> xfs_sync_sb() will be triggered to cover the log, So
> it's hard to reproduce on the current kernel (at least on my side.)
> 
> But I have no idea xfs_log_need_covered(mp) is always true at that time,
> and the patchset seems a bit large and (possibly) hard to backport...
> 

btw, after checking xfs_check_summary_counts(), I think by now,
sb_fdblocks won't be recalculated if !xfs_sb_version_haslazysbcount
when suddenly power outages (with dirty log...)

So for !xfs_sb_version_haslazysbcount cases, we really rely on the
correct on-disk sb_fdblocks all the time...

Anyway, I also think we should warn !lazysb fs deprecated runtimely
by now (even we have XFS_SUPPORT_V4 build config.)

> > 
> > So that means that for !lazysb fses, the only time we log the sb
> > counters is during transactions, and when we do log the counters we
> > actually log the wrong value, since the incore reservations should never
> > escape to disk?  Hence the fix below?
> 
> Yes
> 
> > 
> > And then by extension, is the reason that nobody noticed before is that
> > we always used to log the correct value at unmount, so fses with clean
> > logs always have the correct value, and fses with dirty logs will
> > recompute fdblocks after log recovery by summing the AGF free blocks
> > counts?
> 
> Nope, prior to 5.12-rc1, I think it was broken for a very long time...
> 
> > 
> > (Or possibly nobody uses !lazysb filesystems anymore?)
> > 
> 
> Zorro found this days ago on rhel 8 kernel (4.18, maybe he's doing
> some new testcases to cover this), and I think it was broken for much
> much long time (I don't know which version it was broken first), maybe
> it has little impact since it's just a freespace block counter.
> 
> So I think it should be backported to many stable kernel versions (?)
> But I have no idea when it was broken...
> 
> > I /think/ the code change looks ok, but as you might surmise from the
> > large quantity of questions, I'm not ready to RVB this yet.  The commit
> > message seems like a good place to answer those questions.
> > 
> > > After this patch, I've seen no strange so far on older kernels
> > > for the testcase above without lazysbcount.
> > > 
> > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1949515
> > 
> > This strangely <cough> doesn't seem to be accessible to the public at
> > large, since <cough> someone at RedHat decided to block all Oracle IPs
> > <cough>.
> 
> <cough> will get rid of it the next time...
> 
> Thanks,
> Gao Xiang
> 
> > 
> > --D
> > 
> > > 
> > > Reported-by: Zorro Lang <zlang@redhat.com>
> > > Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
> > > ---
> > >  fs/xfs/libxfs/xfs_sb.c | 8 +++++++-
> > >  1 file changed, 7 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
> > > index 60e6d255e5e2..423dada3f64c 100644
> > > --- a/fs/xfs/libxfs/xfs_sb.c
> > > +++ b/fs/xfs/libxfs/xfs_sb.c
> > > @@ -928,7 +928,13 @@ xfs_log_sb(
> > >  
> > >  	mp->m_sb.sb_icount = percpu_counter_sum(&mp->m_icount);
> > >  	mp->m_sb.sb_ifree = percpu_counter_sum(&mp->m_ifree);
> > > -	mp->m_sb.sb_fdblocks = percpu_counter_sum(&mp->m_fdblocks);
> > > +	if (!xfs_sb_version_haslazysbcount(&mp->m_sb)) {
> > > +		struct xfs_dsb	*dsb = bp->b_addr;
> > > +
> > > +		mp->m_sb.sb_fdblocks = be64_to_cpu(dsb->sb_fdblocks);
> > > +	} else {
> > > +		mp->m_sb.sb_fdblocks = percpu_counter_sum(&mp->m_fdblocks);
> > > +	}
> > >  
> > >  	xfs_sb_to_disk(bp->b_addr, &mp->m_sb);
> > >  	xfs_trans_buf_set_type(tp, bp, XFS_BLFT_SB_BUF);
> > > -- 
> > > 2.27.0
> > > 
> >

next prev parent reply	other threads:[~2021-04-16 21:36 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-16  9:10 [PATCH] xfs: don't use in-core per-cpu fdblocks for !lazysbcount Gao Xiang
2021-04-16 14:10 ` Carlos Maiolino
2021-04-16 20:45   ` Gao Xiang
2021-04-16 16:00 ` Darrick J. Wong
2021-04-16 21:13   ` Gao Xiang
2021-04-16 21:36     ` Gao Xiang [this message]
2021-04-17  0:19     ` Darrick J. Wong
2021-04-17  1:57       ` Dave Chinner
2021-04-17  2:20         ` Gao Xiang
2021-04-17 22:32           ` Dave Chinner
2021-04-17 23:59             ` Gao Xiang
2021-04-18 22:08               ` Dave Chinner
2021-04-19  0:38                 ` Gao Xiang
2021-04-20 17:17             ` Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210416213625.GC2224153@xiangao.remote.csb \
    --to=hsiangkao@redhat.com \
    --cc=djwong@kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=zlang@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox