From: Dave Chinner <david@fromorbit.com>
To: Gao Xiang <hsiangkao@redhat.com>
Cc: linux-xfs@vger.kernel.org, "Darrick J. Wong" <djwong@kernel.org>,
	Zorro Lang <zlang@redhat.com>,
	Carlos Maiolino <cmaiolino@redhat.com>
Subject: Re: [PATCH v2 1/2] xfs: don't use in-core per-cpu fdblocks for !lazysbcount
Date: Wed, 21 Apr 2021 11:45:26 +1000
Message-ID: <20210421014526.GY63242@dread.disaster.area>
In-Reply-To: <20210420215443.GA3047037@xiangao.remote.csb>

On Wed, Apr 21, 2021 at 05:54:43AM +0800, Gao Xiang wrote:
> Hi Dave,
> 
> On Wed, Apr 21, 2021 at 07:25:06AM +1000, Dave Chinner wrote:
> > On Tue, Apr 20, 2021 at 07:08:54PM +0800, Gao Xiang wrote:
> > > There are many paths which could trigger xfs_log_sb(), e.g.
> > >   xfs_bmap_add_attrfork()
> > >     -> xfs_log_sb()
> > > , which overrides on-disk fdblocks by in-core per-CPU fdblocks.
> > > 
> > > However, for !lazysbcount cases, on-disk fdblocks is actually updated
> > > by xfs_trans_apply_sb_deltas(), and generally it isn't equal to
> > > in-core per-CPU fdblocks due to xfs_reserve_blocks() or whatever,
> > > see the comment in xfs_unmountfs().
> > > 
> > > It could be observed by the following steps reported by Zorro:
> > > 
> > > 1. mkfs.xfs -f -l lazy-count=0 -m crc=0 $dev
> > > 2. mount $dev $mnt
> > > 3. fsstress -d $mnt -p 100 -n 1000 (maybe need more or less io load)
> > > 4. umount $mnt
> > > 5. xfs_repair -n $dev
> > > 
> > > yet due to commit f46e5a174655 ("xfs: fold sbcount quiesce logging
> > > into log covering"), xfs_sync_sb() will also be triggered if log
> > > covering is needed and !lazysbcount when xfs_unmountfs(), so hard
> > > to reproduce on kernel 5.12+ for clean unmount.
> > > 
> > > on-disk sb_icount and sb_ifree are also updated in
> > > xfs_trans_apply_sb_deltas() for !lazysbcount cases, however, which
> > > are always equal to per-CPU counters, so only fdblocks matters.
> > > 
> > > After this patch, I've seen nothing strange so far on older kernels
> > > for the testcase above without lazysbcount.
> > > 
> > > Reported-by: Zorro Lang <zlang@redhat.com>
> > > Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
> > > Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
> > > ---
> > > changes since v1:
> > >  - update commit message.
> > > 
> > >  fs/xfs/libxfs/xfs_sb.c | 8 +++++++-
> > >  1 file changed, 7 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
> > > index 60e6d255e5e2..423dada3f64c 100644
> > > --- a/fs/xfs/libxfs/xfs_sb.c
> > > +++ b/fs/xfs/libxfs/xfs_sb.c
> > > @@ -928,7 +928,13 @@ xfs_log_sb(
> > >  
> > >  	mp->m_sb.sb_icount = percpu_counter_sum(&mp->m_icount);
> > >  	mp->m_sb.sb_ifree = percpu_counter_sum(&mp->m_ifree);
> > > -	mp->m_sb.sb_fdblocks = percpu_counter_sum(&mp->m_fdblocks);
> > > +	if (!xfs_sb_version_haslazysbcount(&mp->m_sb)) {
> > > +		struct xfs_dsb	*dsb = bp->b_addr;
> > > +
> > > +		mp->m_sb.sb_fdblocks = be64_to_cpu(dsb->sb_fdblocks);
> > > +	} else {
> > > +		mp->m_sb.sb_fdblocks = percpu_counter_sum(&mp->m_fdblocks);
> > > +	}
> > 
> > This really needs a comment explaining why this is done this way.
> > It's not obvious from reading the code why we pull the fdblock
> > count off disk and then, in xfs_sb_to_disk(), we write it straight
> > back to disk.
> > 
> > It's also not clear to me that summing the inode counters is correct
> > in the case of the !lazysbcount for the similar reasons - the percpu
> > counter is not guaranteed to be absolutely accurate here, yet the
> > values in the disk buffer are. Perhaps we should be updating the
> > m_sb values in xfs_trans_apply_sb_deltas() for the !lazycount case,
> > and only summing them here for the lazycount case...
> 
> But if updating m_sb values in xfs_trans_apply_sb_deltas(), we
> should also update on-disk sb counters in xfs_trans_apply_sb_deltas()
> and log sb for !lazysbcount (since for such cases, sb counter update
> should be considered immediately.)

I don't follow - xfs_trans_apply_sb_deltas() already logs the
changes to the superblock made in the transaction.

xfs_trans_unreserve_and_mod_sb() does the in-memory counter updates
after xfs_trans_apply_sb_deltas() applies them to the on-disk
superblock in the buffer and logs them.

But nowhere on a !lazysbcount setup are the
mp->m_sb.sb_fdblocks/sb_ifree/sb_icount values being updated, and
hence they are not valid at any time except during log quiesce,
where all the in-memory reservations have been removed and the
per-cpu counters are synced to mp->m_sb.

I'm suggesting that xfs_trans_unreserve_and_mod_sb() also updates
the mp->m_sb.sb_fdblocks/sb_ifree/sb_icount values for !lazysbcount,
as we currently do not do this, and this will keep them up to date
for any caller of xfs_sb_to_disk().
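
To make the incoherence concrete, here is a rough userspace sketch. It
is a toy model and not kernel code: struct toy_mount and the toy_*()
helpers are hypothetical names, and each one only stands in for the
kernel function named in its comment. It assumes a !lazysbcount
filesystem with one transaction still holding a block reservation while
another commits.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: every name here is illustrative, not a kernel symbol. */
struct toy_mount {
	uint64_t disk_fdblocks;   /* counter in the superblock buffer */
	uint64_t incore_fdblocks; /* stands in for the per-cpu m_fdblocks */
	uint64_t m_sb_fdblocks;   /* stands in for mp->m_sb.sb_fdblocks */
};

/* A block reservation only touches the in-core counter. */
static void toy_reserve(struct toy_mount *mp, uint64_t blocks)
{
	mp->incore_fdblocks -= blocks;
}

/* Commit on !lazysbcount: 'used' of the 'reserved' blocks were consumed. */
static void toy_commit(struct toy_mount *mp, uint64_t reserved, uint64_t used)
{
	/* models xfs_trans_apply_sb_deltas(): delta goes to the sb buffer */
	mp->disk_fdblocks -= used;
	/* models xfs_trans_unreserve_and_mod_sb(): return unused reservation */
	mp->incore_fdblocks += reserved - used;
	/* m_sb_fdblocks is not touched by either step */
}

/* Walk through the scenario: one reservation in flight, one commit. */
static void toy_demo(void)
{
	struct toy_mount mp = { 1000, 1000, 1000 };

	toy_reserve(&mp, 10);      /* transaction A, still in flight */
	toy_reserve(&mp, 5);       /* transaction B */
	toy_commit(&mp, 5, 3);     /* B commits, having used 3 blocks */

	assert(mp.disk_fdblocks == 997);    /* accurate on-disk value */
	assert(mp.incore_fdblocks == 987);  /* skewed by A's reservation */
	assert(mp.m_sb_fdblocks == 1000);   /* stale since mount */
}
```

In this model all three values disagree after a single commit, so
copying either the in-core sum or the stale m_sb value into the buffer
would overwrite the one number that was correct.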

i.e. we have three choices:

1. avoid writing the counters in xfs_sb_to_disk() for !lazysbcount.
2. read them from the buffer before writing them back to the buffer.
3. keep them up to date correctly via xfs_trans_unreserve_and_mod_sb().

#1 is bad because there are cases where we want to write the
counters even for !lazysbcount filesystems (e.g. mkfs, repair, etc).

#2 is essentially a hack around the fact that mp->m_sb is not kept
up to date in the in-memory superblock for !lazysbcount filesystems.

#3 keeps the in-memory superblock counters up to date for the
!lazysbcount case so they are coherent with the on-disk values, and
hence we only need to update the in-memory superblock counts for
lazysbcount filesystems before calling xfs_sb_to_disk().

#3 is my preferred solution.
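
A minimal sketch of what option #3 amounts to, again as a userspace toy
model and not as a proposed patch: struct opt3_mount and the opt3_*()
helpers are hypothetical, and the real change would live in
xfs_trans_unreserve_and_mod_sb() and xfs_log_sb(). The assumption is
that the commit path mirrors each logged delta into the in-memory
superblock for !lazysbcount, so the logging path only has to sum the
in-core counter for lazysbcount.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: every name here is illustrative, not a kernel symbol. */
struct opt3_mount {
	bool     lazysbcount;
	uint64_t disk_fdblocks;   /* counter in the superblock buffer */
	uint64_t incore_fdblocks; /* stands in for the per-cpu m_fdblocks */
	uint64_t m_sb_fdblocks;   /* stands in for mp->m_sb.sb_fdblocks */
};

static void opt3_reserve(struct opt3_mount *mp, uint64_t blocks)
{
	mp->incore_fdblocks -= blocks;
}

static void opt3_commit(struct opt3_mount *mp, uint64_t reserved,
			uint64_t used)
{
	if (!mp->lazysbcount) {
		mp->disk_fdblocks -= used; /* delta logged to the sb buffer */
		mp->m_sb_fdblocks -= used; /* option #3: keep m_sb coherent */
	}
	mp->incore_fdblocks += reserved - used;
}

/* Models xfs_log_sb() followed by xfs_sb_to_disk(). */
static void opt3_log_sb(struct opt3_mount *mp)
{
	if (mp->lazysbcount)
		mp->m_sb_fdblocks = mp->incore_fdblocks; /* sum only here */
	mp->disk_fdblocks = mp->m_sb_fdblocks;
}

/* Same scenario as before: one reservation in flight, one commit. */
static void opt3_demo(void)
{
	struct opt3_mount mp = { false, 1000, 1000, 1000 };

	opt3_reserve(&mp, 10);     /* transaction A, still in flight */
	opt3_reserve(&mp, 5);      /* transaction B */
	opt3_commit(&mp, 5, 3);    /* B commits, having used 3 blocks */
	opt3_log_sb(&mp);

	/* Writing m_sb back no longer disturbs the accurate on-disk value. */
	assert(mp.m_sb_fdblocks == 997);
	assert(mp.disk_fdblocks == 997);
}
```

With this shape, any caller of the write-back step gets coherent
counters on !lazysbcount without first reading them out of the buffer,
which is the hack that option #2 relies on.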

> That will indeed cause more modification, I'm not quite sure if it's
> quite ok honestly. But if you assume that's more clear, I could submit
> an alternative instead later.

I think the version you posted doesn't fix the entire problem. It
merely slaps a band-aid over the symptom that is being seen, and
doesn't address all the non-coherent data that can be written to the
superblock here.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
