From: Christoph Hellwig <hch@lst.de>
To: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>, Carlos Maiolino <cem@kernel.org>,
linux-xfs@vger.kernel.org, cen zhang <zzzccc427@gmail.com>,
lkmm@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] xfs: mark the i_delayed_blks access in xfs_file_release as racy
Date: Fri, 16 May 2025 08:34:47 +0200 [thread overview]
Message-ID: <20250516063447.GA14632@lst.de> (raw)
In-Reply-To: <aCUlXbEg9wuyPEB6@dread.disaster.area>
On Thu, May 15, 2025 at 09:21:01AM +1000, Dave Chinner wrote:
> > I'd like to understand that one a bit more. It might be because the
> > validator doesn't understand a semaphore used as lock is a lock, but
> > I'll follow up there.
>
> Even so, it's not a race because at that point in time the object
> cannot be seen by anything other than the process that is
> initialising it.
In the current tree after the folio/vmalloc series b_addr is only
assigned during buffer allocation. But I suspect they tested before that
where b_addr can be set at runtime. Either way it always happens under
b_sema because that is initialized to locked just after allocating
the memory for the buffer.
> I'm wary of this, because at this point I suspect that there aren't
> a lot of people with sufficient time and knowledge to adequately
> address these issues.
I'm more than happy to address these, because proper documentation
of concurrency helps fixing a huge number of bugs, and also really
helps documenting the code. I hate having to spend hours trying to
figure out why something can be safely used lockless or not.
> We should have learnt this lesson from lockdep - false positive
> whack-a-mole shut up individual reports but introduced technical
> debt that had to be addressed later because whack-a-mole didn't
> address the underlying issues.
I'm not sure who "we" is, but I've always pushed back to hacks just
to shut up lockdep. And at the same time I'm immensively grateful
for having lockdep and can't think of working without it anymore.
> We need functions like xfs_vn_getattr(), the allocation AG selection
> alogrithms and object initialisation functions to be marked as
> inherently racy, because they intentionally don't hold locks for any
> of the accesses they make. kcsan provides:
Functions never are racy, specific data access are. So a function wide
assignment is just the dumbest thing ever, this already badly failed
for things like early Java object-level synchronization.
>
> For variables like ip->i_delayed_blks, where we intentionally
> serialise modifications but do not serialise readers, we have:
>
> - uint64_t i_delayed_blks; /* count of delay alloc blks */
> + uint64_t __data_racy i_delayed_blks; /* count of delay alloc blks */
>
> This means all accesses to the variable are considered to be racy
> and kcsan will ignore it and not report it. We can do the same for
> lip->li_lsn and other variables.
But not all access are racy. We rely on proper synchronized accesses
for accounting. Now for something that has a lot of unsynchronized
access, adding a wrapper for them might be fine, but for i_delayed_blks
I don't think we actually have enough for them to bother.
> IOWs, we do not need to spew data_race() wrappers or random
> READ_ONCE/WRITE_ONCE macros all over the code to silence known false
> positives. If we mark the variables and functions we know have racy
> accesses, these one-line changes will avoid -all- false positives on
> those variables/from those functions.
And also drop a lot of the actually useful checks. That's exaxctly
the kind of hack you rant about above.
prev parent reply other threads:[~2025-05-16 6:34 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <b0B7vqX5e__5JIDhxMANuyma5_RcE4wAhgR1-x-U6YY3qUy_mE220aCGAqsZ-LpEDc06cNfseqOGIgVw4BKmVg==@protonmail.internalid>
2025-05-13 5:26 ` [PATCH] xfs: mark the i_delayed_blks access in xfs_file_release as racy Christoph Hellwig
2025-05-13 8:56 ` Carlos Maiolino
2025-05-13 21:37 ` Dave Chinner
2025-05-14 4:29 ` Christoph Hellwig
2025-05-14 8:00 ` Carlos Maiolino
2025-05-14 13:04 ` Christoph Hellwig
2025-05-14 23:21 ` Dave Chinner
2025-05-16 6:34 ` Christoph Hellwig [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250516063447.GA14632@lst.de \
--to=hch@lst.de \
--cc=cem@kernel.org \
--cc=david@fromorbit.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=lkmm@lists.linux.dev \
--cc=zzzccc427@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).