From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: Wengang Wang <wen.gang.wang@oracle.com>,
"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>
Subject: Re: [PATCH] xfs: make sure sb_fdblocks is non-negative
Date: Tue, 4 Jun 2024 14:06:12 +1000 [thread overview]
Message-ID: <Zl6StNJmfnCInAdS@dread.disaster.area> (raw)
In-Reply-To: <20240603184327.GC52987@frogsfrogsfrogs>
On Mon, Jun 03, 2024 at 11:43:27AM -0700, Darrick J. Wong wrote:
> On Mon, Jun 03, 2024 at 05:23:40PM +0000, Wengang Wang wrote:
> > > What about the sb_icount and sb_ifree counters? They are also percpu
> > > counters, and they can return transient negative numbers, too,
> > > right? If they end up in the log, the same as this transient
> > > negative sb_fdblocks count, won't that also cause exactly the same
> > > issue?
> > >
> >
> > Yes, sb_icount and sb_ifree are also percpu counters. They have been addressed by
> > commit 59f6ab40fd8735c9a1a15401610a31cc06a0bbd6, right?
>
> icount and ifree don't go through xfs_mod_freecounter, which means that
> they never do the "subtract and see if we went negative" dance that
> fdblocks/frextents does. percpu_counter_sum_positive isn't necessary.
I'm pretty sure that what xfs_mod_freecounter() does has no bearing
on whether percpu_counter_sum() can return a negative number.
percpu_counter_sum() is intentionally designed to be racy w.r.t.
sub-batch percpu counter updates.
That is, percpu_counter_sum() takes the fbc->lock while summing only
to bound the maximum sum error to +/-(nr_cpus * batch size). It does
not serialise against sub-batch percpu count modifications - add/sub
only serialise against a sum when they go over the batch size and
need folding. Hence the sum will serialise against folding
operations, thereby bounding the maximum error across the sum to
+/-batch size on every CPU.
For example, if we have a counter with a batch of 128, 4 CPUs,
fbc->count = 0 and the current percpu count values are:
CPU value
0 32
1 0
2 -32
3 0
We have a sum of zero when everything is static and unchanging.
However, when we look at the dynamic nature of the sum, we can have
percpu counters change during the summing iteration. For example,
say between the sum starting and finishing on CPU 3, we have a
racing subtraction on CPU 0 of 64, and and add of 64 on CPU 1.
Neither trigger batch overruns so don't serialise against a sum in
progress.
If the running sum samples CPU 0 after it updates, and CPU 1 before
it updates, the sum will see:
fbc->count = 0
CPU value
0 -32
1 0
2 -32
3 0
and the sum will be -64. If we then rerun the sum and it doesn't
race with any updates, it will see:
fbc->count = 0
CPU value
0 -32
1 64
2 -32
3 0
and the sum will be zero.
The above example could actually happen with ifree/icount.
Modifications to ifree and icount in
xfs_trans_unreserve_and_mod_sb() aren't serialised against each
other (i.e. multiple alloc/free transactions can be committing and
updating the ifree/icount counters concurrently) and they can also
be running concurrently with the superblock update summing in
xfs_log_sb(). Hence the inherent raciness in percpu_counter_sum() vs
percpu_counter_add() is not mitigated in any way, and so ifree and
icount could have a transient sum that is less than zero....
So AFAICT, then need the _sum_positive() treatment as well.
-Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2024-06-04 4:06 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-11 0:34 [PATCH] xfs: make sure sb_fdblocks is non-negative Wengang Wang
2024-05-11 1:17 ` Dave Chinner
2024-05-13 17:06 ` Wengang Wang
2024-05-31 15:44 ` Wengang Wang
2024-06-02 23:37 ` Dave Chinner
2024-06-03 17:23 ` Wengang Wang
2024-06-03 18:43 ` Darrick J. Wong
2024-06-04 4:06 ` Dave Chinner [this message]
2024-05-31 15:52 ` Darrick J. Wong
2024-05-31 18:03 ` Wengang Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zl6StNJmfnCInAdS@dread.disaster.area \
--to=david@fromorbit.com \
--cc=djwong@kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=wen.gang.wang@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox