* [PATCH] xfs: mark the i_delayed_blks access in xfs_file_release as racy
@ 2025-05-13 5:26 ` Christoph Hellwig
2025-05-13 8:56 ` Carlos Maiolino
2025-05-13 21:37 ` Dave Chinner
0 siblings, 2 replies; 8+ messages in thread
From: Christoph Hellwig @ 2025-05-13 5:26 UTC (permalink / raw)
To: cem; +Cc: linux-xfs, cen zhang
We don't bother with the ILOCK as this is best-effort and thus a racy
access is ok. Add a data_race() annotation to make that clear to
memory model verifiers.
Reported-by: cen zhang <zzzccc427@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_file.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 84f08c976ac4..a4a2109cb281 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1549,7 +1549,11 @@ xfs_file_release(
*/
if (xfs_iflags_test_and_clear(ip, XFS_ITRUNCATED)) {
xfs_iflags_clear(ip, XFS_EOFBLOCKS_RELEASED);
- if (ip->i_delayed_blks > 0)
+ /*
+ * Don't bother with the ILOCK as this is best-effort and thus
+ * a racy access is ok.
+ */
+ if (data_race(ip->i_delayed_blks) > 0)
filemap_flush(inode->i_mapping);
}
--
2.47.2
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] xfs: mark the i_delayed_blks access in xfs_file_release as racy
2025-05-13 5:26 ` [PATCH] xfs: mark the i_delayed_blks access in xfs_file_release as racy Christoph Hellwig
@ 2025-05-13 8:56 ` Carlos Maiolino
2025-05-13 21:37 ` Dave Chinner
1 sibling, 0 replies; 8+ messages in thread
From: Carlos Maiolino @ 2025-05-13 8:56 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-xfs, cen zhang
On Tue, May 13, 2025 at 07:26:14AM +0200, Christoph Hellwig wrote:
> We don't bother with the ILOCK as this is best-effort and thus a racy
> access is ok. Add a data_race() annotation to make that clear to
> memory model verifiers.
>
> Reported-by: cen zhang <zzzccc427@gmail.com>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Looks okay
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
> ---
> fs/xfs/xfs_file.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 84f08c976ac4..a4a2109cb281 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -1549,7 +1549,11 @@ xfs_file_release(
> */
> if (xfs_iflags_test_and_clear(ip, XFS_ITRUNCATED)) {
> xfs_iflags_clear(ip, XFS_EOFBLOCKS_RELEASED);
> - if (ip->i_delayed_blks > 0)
> + /*
> + * Don't bother with the ILOCK as this is best-effort and thus
> + * a racy access is ok.
> + */
> + if (data_race(ip->i_delayed_blks) > 0)
> filemap_flush(inode->i_mapping);
> }
>
> --
> 2.47.2
>
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] xfs: mark the i_delayed_blks access in xfs_file_release as racy
2025-05-13 5:26 ` [PATCH] xfs: mark the i_delayed_blks access in xfs_file_release as racy Christoph Hellwig
2025-05-13 8:56 ` Carlos Maiolino
@ 2025-05-13 21:37 ` Dave Chinner
2025-05-14 4:29 ` Christoph Hellwig
1 sibling, 1 reply; 8+ messages in thread
From: Dave Chinner @ 2025-05-13 21:37 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: cem, linux-xfs, cen zhang
On Tue, May 13, 2025 at 07:26:14AM +0200, Christoph Hellwig wrote:
> We don't bother with the ILOCK as this is best-effort and thus a racy
> access is ok. Add a data_race() annotation to make that clear to
> memory model verifiers.
IMO, that's the thin edge of a wedge. There are dozens of places in
XFS where we check variable values without holding the lock needed
to serialise the read against modification.
For example, i_delayed_blks updates are protected by the ILOCK, so
there's a data race if we read it without the ILOCK held. We do
this in:
xfs_file_release() - the one this patch addresses
xfs_getbmap() - unlocked access for data fork
xfs_can_free_eofblocks() - checked twice without locking
xfs_inodegc_set_reclaimable() - unlocked, but debug
xfs_inode_has_filedata() - unlocked via xfs_inactive()
xfs_vn_getattr() - unlocked
xfs_qm_dqusage_adjust() - unlocked ASSERT
And that's just this one variable in the inode - there are lots of
others we check without bothering to lock the inode.
e.g. pretty much everythign that xfs_vn_getattr() reads is a data
race because "unlocked racy access" is the way stat is designed to
work on Linux.
Hence my question - are we now going to make it policy that every
possible racy access must be marked with data_race() because there
is some new bot that someone is running that will complain if we
don't? Are you committing to playing whack-a-mole with the memory
model verifiers to silence all the false positives from these
known-to-be-safe access patterns?
-Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] xfs: mark the i_delayed_blks access in xfs_file_release as racy
2025-05-13 21:37 ` Dave Chinner
@ 2025-05-14 4:29 ` Christoph Hellwig
2025-05-14 8:00 ` Carlos Maiolino
0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2025-05-14 4:29 UTC (permalink / raw)
To: Dave Chinner
Cc: Christoph Hellwig, cem, linux-xfs, cen zhang, lkmm, linux-kernel
On Wed, May 14, 2025 at 07:37:14AM +1000, Dave Chinner wrote:
> On Tue, May 13, 2025 at 07:26:14AM +0200, Christoph Hellwig wrote:
> > We don't bother with the ILOCK as this is best-effort and thus a racy
> > access is ok. Add a data_race() annotation to make that clear to
> > memory model verifiers.
>
> IMO, that's the thin edge of a wedge. There are dozens of places in
> XFS where we check variable values without holding the lock needed
> to serialise the read against modification.
Yes. And the linux kernel memory consistency model ask us to mark them,
see tools/memory-model/Documentation/access-marking.txt.
This fails painful at first, but I'd actually wish we'd have tools
enforcing this as strongly as possible as developers (well me at least)
seem to think a racy access is just fine more often than they should, and
needing an annotation and a comment is a pretty good way to sure that.
> Hence my question - are we now going to make it policy that every
> possible racy access must be marked with data_race() because there
> is some new bot that someone is running that will complain if we
> don't? Are you committing to playing whack-a-mole with the memory
> model verifiers to silence all the false positives from these
> known-to-be-safe access patterns?
It's not really a "new bot". It has been official memory consistency
policy for a while, but it just hasn't been well enforced. For new code
asking if the review is racy and needs a marking or use READ_ONCE() and
WRITE_ONCE() has been part of the usual review protocol. Reviewing old
code and fixing things we got wrong will take a while, but I'm actually
glad about more bots for that.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] xfs: mark the i_delayed_blks access in xfs_file_release as racy
2025-05-14 4:29 ` Christoph Hellwig
@ 2025-05-14 8:00 ` Carlos Maiolino
2025-05-14 13:04 ` Christoph Hellwig
0 siblings, 1 reply; 8+ messages in thread
From: Carlos Maiolino @ 2025-05-14 8:00 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Dave Chinner, linux-xfs, cen zhang, lkmm, linux-kernel
Hi.
On Wed, May 14, 2025 at 06:29:46AM +0200, Christoph Hellwig wrote:
> On Wed, May 14, 2025 at 07:37:14AM +1000, Dave Chinner wrote:
> > On Tue, May 13, 2025 at 07:26:14AM +0200, Christoph Hellwig wrote:
> > > We don't bother with the ILOCK as this is best-effort and thus a racy
> > > access is ok. Add a data_race() annotation to make that clear to
> > > memory model verifiers.
> >
> > IMO, that's the thin edge of a wedge. There are dozens of places in
> > XFS where we check variable values without holding the lock needed
> > to serialise the read against modification.
>
> Yes. And the linux kernel memory consistency model ask us to mark them,
> see tools/memory-model/Documentation/access-marking.txt.
>
> This fails painful at first, but I'd actually wish we'd have tools
> enforcing this as strongly as possible as developers (well me at least)
> seem to think a racy access is just fine more often than they should, and
> needing an annotation and a comment is a pretty good way to sure that.
>
> > Hence my question - are we now going to make it policy that every
> > possible racy access must be marked with data_race() because there
> > is some new bot that someone is running that will complain if we
> > don't? Are you committing to playing whack-a-mole with the memory
> > model verifiers to silence all the false positives from these
> > known-to-be-safe access patterns?
>
> It's not really a "new bot". It has been official memory consistency
> policy for a while, but it just hasn't been well enforced. For new code
> asking if the review is racy and needs a marking or use READ_ONCE() and
> WRITE_ONCE() has been part of the usual review protocol. Reviewing old
> code and fixing things we got wrong will take a while, but I'm actually
> glad about more bots for that.
>
I agree with you here, and we could slowly start marking those shared accesses
as racy, but bots spitting false-positivies all the time doesn't help much,
other than taking somebody's else time to look into the report.
Taking as example one case in the previous report, where the report complained
about concurrent bp->b_addr access during the buffer instantiation.
This seems to be covered by the access-marking.txt doc:
"
...
Churning the code base with ill-considered additions of data_race(),
READ_ONCE(), and WRITE_ONCE() is unhelpful.
...
"
Then:
"
...
Use of Plain C-Language Accesses
2. Initialization-time and cleanup-time accesses. This covers
a wide variety of situations, including the uniprocessor
phase of system boot, variables to be used by not-yet-spawned
kthreads, structures not yet published to reference-counted
or RCU-protected data structures, and the cleanup side of any
of these situations.
...
"
So, I think Dave has a point too. Like what happens with syzkaller
and random people reporting random syzkaller warnings.
While I appreciate the reports too, I think it would be fair for the reporters
to spend some time to at least craft a RFC patch fixing the warning.
In this case for example, the reporter explicitly mentioned data_race()
annotations, so, why not just send a patch attempting fix the warning instead of
asking somebody else to do that? IMHO, somebody fidgeting with KCSAN has enough
skills to craft such patch. Of course I might be wrong here, but tweaking debug
knobs over the kernel and asking somebody else to look at the results doesn't
seem scalable.
Cheers.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] xfs: mark the i_delayed_blks access in xfs_file_release as racy
2025-05-14 8:00 ` Carlos Maiolino
@ 2025-05-14 13:04 ` Christoph Hellwig
2025-05-14 23:21 ` Dave Chinner
0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2025-05-14 13:04 UTC (permalink / raw)
To: Carlos Maiolino
Cc: Christoph Hellwig, Dave Chinner, linux-xfs, cen zhang, lkmm,
linux-kernel
On Wed, May 14, 2025 at 10:00:28AM +0200, Carlos Maiolino wrote:
> I agree with you here, and we could slowly start marking those shared accesses
> as racy, but bots spitting false-positivies all the time doesn't help much,
> other than taking somebody's else time to look into the report.
>
> Taking as example one case in the previous report, where the report complained
> about concurrent bp->b_addr access during the buffer instantiation.
I'd like to understand that one a bit more. It might be because the
validator doesn't understand a semaphore used as lock is a lock, but
I'll follow up there.
> So, I think Dave has a point too. Like what happens with syzkaller
> and random people reporting random syzkaller warnings.
>
> While I appreciate the reports too, I think it would be fair for the reporters
> to spend some time to at least craft a RFC patch fixing the warning.
Well, it was polite mails about their finding, which I find useful.
If we got a huge amount of spam that might be different.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] xfs: mark the i_delayed_blks access in xfs_file_release as racy
2025-05-14 13:04 ` Christoph Hellwig
@ 2025-05-14 23:21 ` Dave Chinner
2025-05-16 6:34 ` Christoph Hellwig
0 siblings, 1 reply; 8+ messages in thread
From: Dave Chinner @ 2025-05-14 23:21 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Carlos Maiolino, linux-xfs, cen zhang, lkmm, linux-kernel
On Wed, May 14, 2025 at 03:04:17PM +0200, Christoph Hellwig wrote:
> On Wed, May 14, 2025 at 10:00:28AM +0200, Carlos Maiolino wrote:
> > I agree with you here, and we could slowly start marking those shared accesses
> > as racy, but bots spitting false-positivies all the time doesn't help much,
> > other than taking somebody's else time to look into the report.
> >
> > Taking as example one case in the previous report, where the report complained
> > about concurrent bp->b_addr access during the buffer instantiation.
>
> I'd like to understand that one a bit more. It might be because the
> validator doesn't understand a semaphore used as lock is a lock, but
> I'll follow up there.
Even so, it's not a race because at that point in time the object
cannot be seen by anything other than the process that is
initialising it.
We initialise objects without holding the object locked all over the
place - if this is something that the sanitiser cannot determine
automatically, then we're also going to need annotations to indicate
that the initialisation function(s) contain valid data races.
>
> > So, I think Dave has a point too. Like what happens with syzkaller
> > and random people reporting random syzkaller warnings.
> >
> > While I appreciate the reports too, I think it would be fair for the reporters
> > to spend some time to at least craft a RFC patch fixing the warning.
>
> Well, it was polite mails about their finding, which I find useful.
> If we got a huge amount of spam that might be different.
That's kinda of my point about it being the "thin edge of the
wedge".
Once we indicate that this is something we want reported so we can
address, then the floodgates open.
I'm wary of this, because at this point I suspect that there aren't
a lot of people with sufficient time and knowledge to adequately
address these issues.
Asking the reporter to "send a patch to data_race() that instance"
isn't a good solution to the problem because it doesn't address the
wider issue indicated by the specific report. It just kicks the can
down the road and introduces technical debt that we will eventually
have to address.
We should have learnt this lesson from lockdep - false positive
whack-a-mole shut up individual reports but introduced technical
debt that had to be addressed later because whack-a-mole didn't
address the underlying issues. When the stack of cards fell, someone
(i.e. me) had to work out how to do lockdep annotations properly
(e.g. via the complex inode locking subclass stuff) to make the
issues go away and require minimal long term maintenance.
I don't want to see this pattern repeated.
We need a sane policy for addressing the entire classes of issuesi
underlying individual reports, not just apply a band-aid that
silences the indivual report.
So, let's look at what kcsan provides us with.
We need functions like xfs_vn_getattr(), the allocation AG selection
alogrithms and object initialisation functions to be marked as
inherently racy, because they intentionally don't hold locks for any
of the accesses they make. kcsan provides:
/* comment on why __no_kcsan is needed */
__no_kcsan void
xfs_foo(
struct xfs_bar *bar)
{
...
the __no_kcsan attribute for function declarations to mark the
entire function as inherently racy and so should be ignored.
For variables like ip->i_delayed_blks, where we intentionally
serialise modifications but do not serialise readers, we have:
- uint64_t i_delayed_blks; /* count of delay alloc blks */
+ uint64_t __data_racy i_delayed_blks; /* count of delay alloc blks */
This means all accesses to the variable are considered to be racy
and kcsan will ignore it and not report it. We can do the same for
lip->li_lsn and other variables.
IOWs, we do not need to spew data_race() wrappers or random
READ_ONCE/WRITE_ONCE macros all over the code to silence known false
positives. If we mark the variables and functions we know have racy
accesses, these one-line changes will avoid -all- false positives on
those variables/from those functions.
This, IMO, is a much better solution than "send a patch marking that
access as data_race()". We fix the issue for all accesses and entire
algorithms with simple changes to variable and/or function
declarations.
To reiterate my point: if we are goign to make XFS KCSAN friendly,
we need to work out how to do it quickly and efficiently before we
start changing code. Engaging in random whack-a-mole shootdown games
will not lead to a viable long term solution, so let's not waste
time on playing whack-a-mole.
-Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] xfs: mark the i_delayed_blks access in xfs_file_release as racy
2025-05-14 23:21 ` Dave Chinner
@ 2025-05-16 6:34 ` Christoph Hellwig
0 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2025-05-16 6:34 UTC (permalink / raw)
To: Dave Chinner
Cc: Christoph Hellwig, Carlos Maiolino, linux-xfs, cen zhang, lkmm,
linux-kernel
On Thu, May 15, 2025 at 09:21:01AM +1000, Dave Chinner wrote:
> > I'd like to understand that one a bit more. It might be because the
> > validator doesn't understand a semaphore used as lock is a lock, but
> > I'll follow up there.
>
> Even so, it's not a race because at that point in time the object
> cannot be seen by anything other than the process that is
> initialising it.
In the current tree after the folio/vmalloc series b_addr is only
assigned during buffer allocation. But I suspect they tested before that
where b_addr can be set at runtime. Either way it always happens under
b_sema because that is initialized to locked just after allocating
the memory for the buffer.
> I'm wary of this, because at this point I suspect that there aren't
> a lot of people with sufficient time and knowledge to adequately
> address these issues.
I'm more than happy to address these, because proper documentation
of concurrency helps fixing a huge number of bugs, and also really
helps documenting the code. I hate having to spend hours trying to
figure out why something can be safely used lockless or not.
> We should have learnt this lesson from lockdep - false positive
> whack-a-mole shut up individual reports but introduced technical
> debt that had to be addressed later because whack-a-mole didn't
> address the underlying issues.
I'm not sure who "we" is, but I've always pushed back to hacks just
to shut up lockdep. And at the same time I'm immensively grateful
for having lockdep and can't think of working without it anymore.
> We need functions like xfs_vn_getattr(), the allocation AG selection
> alogrithms and object initialisation functions to be marked as
> inherently racy, because they intentionally don't hold locks for any
> of the accesses they make. kcsan provides:
Functions never are racy, specific data access are. So a function wide
assignment is just the dumbest thing ever, this already badly failed
for things like early Java object-level synchronization.
>
> For variables like ip->i_delayed_blks, where we intentionally
> serialise modifications but do not serialise readers, we have:
>
> - uint64_t i_delayed_blks; /* count of delay alloc blks */
> + uint64_t __data_racy i_delayed_blks; /* count of delay alloc blks */
>
> This means all accesses to the variable are considered to be racy
> and kcsan will ignore it and not report it. We can do the same for
> lip->li_lsn and other variables.
But not all access are racy. We rely on proper synchronized accesses
for accounting. Now for something that has a lot of unsynchronized
access, adding a wrapper for them might be fine, but for i_delayed_blks
I don't think we actually have enough for them to bother.
> IOWs, we do not need to spew data_race() wrappers or random
> READ_ONCE/WRITE_ONCE macros all over the code to silence known false
> positives. If we mark the variables and functions we know have racy
> accesses, these one-line changes will avoid -all- false positives on
> those variables/from those functions.
And also drop a lot of the actually useful checks. That's exaxctly
the kind of hack you rant about above.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2025-05-16 6:34 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <b0B7vqX5e__5JIDhxMANuyma5_RcE4wAhgR1-x-U6YY3qUy_mE220aCGAqsZ-LpEDc06cNfseqOGIgVw4BKmVg==@protonmail.internalid>
2025-05-13 5:26 ` [PATCH] xfs: mark the i_delayed_blks access in xfs_file_release as racy Christoph Hellwig
2025-05-13 8:56 ` Carlos Maiolino
2025-05-13 21:37 ` Dave Chinner
2025-05-14 4:29 ` Christoph Hellwig
2025-05-14 8:00 ` Carlos Maiolino
2025-05-14 13:04 ` Christoph Hellwig
2025-05-14 23:21 ` Dave Chinner
2025-05-16 6:34 ` Christoph Hellwig
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).