* [PATCH] xfs: mark the i_delayed_blks access in xfs_file_release as racy @ 2025-05-13 5:26 ` Christoph Hellwig 2025-05-13 8:56 ` Carlos Maiolino 2025-05-13 21:37 ` Dave Chinner 0 siblings, 2 replies; 8+ messages in thread From: Christoph Hellwig @ 2025-05-13 5:26 UTC (permalink / raw) To: cem; +Cc: linux-xfs, cen zhang We don't bother with the ILOCK as this is best-effort and thus a racy access is ok. Add a data_race() annotation to make that clear to memory model verifiers. Reported-by: cen zhang <zzzccc427@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/xfs_file.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 84f08c976ac4..a4a2109cb281 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -1549,7 +1549,11 @@ xfs_file_release( */ if (xfs_iflags_test_and_clear(ip, XFS_ITRUNCATED)) { xfs_iflags_clear(ip, XFS_EOFBLOCKS_RELEASED); - if (ip->i_delayed_blks > 0) + /* + * Don't bother with the ILOCK as this is best-effort and thus + * a racy access is ok. + */ + if (data_race(ip->i_delayed_blks) > 0) filemap_flush(inode->i_mapping); } -- 2.47.2 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] xfs: mark the i_delayed_blks access in xfs_file_release as racy 2025-05-13 5:26 ` [PATCH] xfs: mark the i_delayed_blks access in xfs_file_release as racy Christoph Hellwig @ 2025-05-13 8:56 ` Carlos Maiolino 2025-05-13 21:37 ` Dave Chinner 1 sibling, 0 replies; 8+ messages in thread From: Carlos Maiolino @ 2025-05-13 8:56 UTC (permalink / raw) To: Christoph Hellwig; +Cc: linux-xfs, cen zhang On Tue, May 13, 2025 at 07:26:14AM +0200, Christoph Hellwig wrote: > We don't bother with the ILOCK as this is best-effort and thus a racy > access is ok. Add a data_race() annotation to make that clear to > memory model verifiers. > > Reported-by: cen zhang <zzzccc427@gmail.com> > Signed-off-by: Christoph Hellwig <hch@lst.de> Looks okay Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> > --- > fs/xfs/xfs_file.c | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c > index 84f08c976ac4..a4a2109cb281 100644 > --- a/fs/xfs/xfs_file.c > +++ b/fs/xfs/xfs_file.c > @@ -1549,7 +1549,11 @@ xfs_file_release( > */ > if (xfs_iflags_test_and_clear(ip, XFS_ITRUNCATED)) { > xfs_iflags_clear(ip, XFS_EOFBLOCKS_RELEASED); > - if (ip->i_delayed_blks > 0) > + /* > + * Don't bother with the ILOCK as this is best-effort and thus > + * a racy access is ok. > + */ > + if (data_race(ip->i_delayed_blks) > 0) > filemap_flush(inode->i_mapping); > } > > -- > 2.47.2 > > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] xfs: mark the i_delayed_blks access in xfs_file_release as racy 2025-05-13 5:26 ` [PATCH] xfs: mark the i_delayed_blks access in xfs_file_release as racy Christoph Hellwig 2025-05-13 8:56 ` Carlos Maiolino @ 2025-05-13 21:37 ` Dave Chinner 2025-05-14 4:29 ` Christoph Hellwig 1 sibling, 1 reply; 8+ messages in thread From: Dave Chinner @ 2025-05-13 21:37 UTC (permalink / raw) To: Christoph Hellwig; +Cc: cem, linux-xfs, cen zhang On Tue, May 13, 2025 at 07:26:14AM +0200, Christoph Hellwig wrote: > We don't bother with the ILOCK as this is best-effort and thus a racy > access is ok. Add a data_race() annotation to make that clear to > memory model verifiers. IMO, that's the thin edge of a wedge. There are dozens of places in XFS where we check variable values without holding the lock needed to serialise the read against modification. For example, i_delayed_blks updates are protected by the ILOCK, so there's a data race if we read it without the ILOCK held. We do this in: xfs_file_release() - the one this patch addresses xfs_getbmap() - unlocked access for data fork xfs_can_free_eofblocks() - checked twice without locking xfs_inodegc_set_reclaimable() - unlocked, but debug xfs_inode_has_filedata() - unlocked via xfs_inactive() xfs_vn_getattr() - unlocked xfs_qm_dqusage_adjust() - unlocked ASSERT And that's just this one variable in the inode - there are lots of others we check without bothering to lock the inode. e.g. pretty much everythign that xfs_vn_getattr() reads is a data race because "unlocked racy access" is the way stat is designed to work on Linux. Hence my question - are we now going to make it policy that every possible racy access must be marked with data_race() because there is some new bot that someone is running that will complain if we don't? Are you committing to playing whack-a-mole with the memory model verifiers to silence all the false positives from these known-to-be-safe access patterns? -Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] xfs: mark the i_delayed_blks access in xfs_file_release as racy 2025-05-13 21:37 ` Dave Chinner @ 2025-05-14 4:29 ` Christoph Hellwig 2025-05-14 8:00 ` Carlos Maiolino 0 siblings, 1 reply; 8+ messages in thread From: Christoph Hellwig @ 2025-05-14 4:29 UTC (permalink / raw) To: Dave Chinner Cc: Christoph Hellwig, cem, linux-xfs, cen zhang, lkmm, linux-kernel On Wed, May 14, 2025 at 07:37:14AM +1000, Dave Chinner wrote: > On Tue, May 13, 2025 at 07:26:14AM +0200, Christoph Hellwig wrote: > > We don't bother with the ILOCK as this is best-effort and thus a racy > > access is ok. Add a data_race() annotation to make that clear to > > memory model verifiers. > > IMO, that's the thin edge of a wedge. There are dozens of places in > XFS where we check variable values without holding the lock needed > to serialise the read against modification. Yes. And the linux kernel memory consistency model ask us to mark them, see tools/memory-model/Documentation/access-marking.txt. This fails painful at first, but I'd actually wish we'd have tools enforcing this as strongly as possible as developers (well me at least) seem to think a racy access is just fine more often than they should, and needing an annotation and a comment is a pretty good way to sure that. > Hence my question - are we now going to make it policy that every > possible racy access must be marked with data_race() because there > is some new bot that someone is running that will complain if we > don't? Are you committing to playing whack-a-mole with the memory > model verifiers to silence all the false positives from these > known-to-be-safe access patterns? It's not really a "new bot". It has been official memory consistency policy for a while, but it just hasn't been well enforced. For new code asking if the review is racy and needs a marking or use READ_ONCE() and WRITE_ONCE() has been part of the usual review protocol. Reviewing old code and fixing things we got wrong will take a while, but I'm actually glad about more bots for that. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] xfs: mark the i_delayed_blks access in xfs_file_release as racy 2025-05-14 4:29 ` Christoph Hellwig @ 2025-05-14 8:00 ` Carlos Maiolino 2025-05-14 13:04 ` Christoph Hellwig 0 siblings, 1 reply; 8+ messages in thread From: Carlos Maiolino @ 2025-05-14 8:00 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Dave Chinner, linux-xfs, cen zhang, lkmm, linux-kernel Hi. On Wed, May 14, 2025 at 06:29:46AM +0200, Christoph Hellwig wrote: > On Wed, May 14, 2025 at 07:37:14AM +1000, Dave Chinner wrote: > > On Tue, May 13, 2025 at 07:26:14AM +0200, Christoph Hellwig wrote: > > > We don't bother with the ILOCK as this is best-effort and thus a racy > > > access is ok. Add a data_race() annotation to make that clear to > > > memory model verifiers. > > > > IMO, that's the thin edge of a wedge. There are dozens of places in > > XFS where we check variable values without holding the lock needed > > to serialise the read against modification. > > Yes. And the linux kernel memory consistency model ask us to mark them, > see tools/memory-model/Documentation/access-marking.txt. > > This fails painful at first, but I'd actually wish we'd have tools > enforcing this as strongly as possible as developers (well me at least) > seem to think a racy access is just fine more often than they should, and > needing an annotation and a comment is a pretty good way to sure that. > > > Hence my question - are we now going to make it policy that every > > possible racy access must be marked with data_race() because there > > is some new bot that someone is running that will complain if we > > don't? Are you committing to playing whack-a-mole with the memory > > model verifiers to silence all the false positives from these > > known-to-be-safe access patterns? > > It's not really a "new bot". It has been official memory consistency > policy for a while, but it just hasn't been well enforced. For new code > asking if the review is racy and needs a marking or use READ_ONCE() and > WRITE_ONCE() has been part of the usual review protocol. Reviewing old > code and fixing things we got wrong will take a while, but I'm actually > glad about more bots for that. > I agree with you here, and we could slowly start marking those shared accesses as racy, but bots spitting false-positivies all the time doesn't help much, other than taking somebody's else time to look into the report. Taking as example one case in the previous report, where the report complained about concurrent bp->b_addr access during the buffer instantiation. This seems to be covered by the access-marking.txt doc: " ... Churning the code base with ill-considered additions of data_race(), READ_ONCE(), and WRITE_ONCE() is unhelpful. ... " Then: " ... Use of Plain C-Language Accesses 2. Initialization-time and cleanup-time accesses. This covers a wide variety of situations, including the uniprocessor phase of system boot, variables to be used by not-yet-spawned kthreads, structures not yet published to reference-counted or RCU-protected data structures, and the cleanup side of any of these situations. ... " So, I think Dave has a point too. Like what happens with syzkaller and random people reporting random syzkaller warnings. While I appreciate the reports too, I think it would be fair for the reporters to spend some time to at least craft a RFC patch fixing the warning. In this case for example, the reporter explicitly mentioned data_race() annotations, so, why not just send a patch attempting fix the warning instead of asking somebody else to do that? IMHO, somebody fidgeting with KCSAN has enough skills to craft such patch. Of course I might be wrong here, but tweaking debug knobs over the kernel and asking somebody else to look at the results doesn't seem scalable. Cheers. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] xfs: mark the i_delayed_blks access in xfs_file_release as racy 2025-05-14 8:00 ` Carlos Maiolino @ 2025-05-14 13:04 ` Christoph Hellwig 2025-05-14 23:21 ` Dave Chinner 0 siblings, 1 reply; 8+ messages in thread From: Christoph Hellwig @ 2025-05-14 13:04 UTC (permalink / raw) To: Carlos Maiolino Cc: Christoph Hellwig, Dave Chinner, linux-xfs, cen zhang, lkmm, linux-kernel On Wed, May 14, 2025 at 10:00:28AM +0200, Carlos Maiolino wrote: > I agree with you here, and we could slowly start marking those shared accesses > as racy, but bots spitting false-positivies all the time doesn't help much, > other than taking somebody's else time to look into the report. > > Taking as example one case in the previous report, where the report complained > about concurrent bp->b_addr access during the buffer instantiation. I'd like to understand that one a bit more. It might be because the validator doesn't understand a semaphore used as lock is a lock, but I'll follow up there. > So, I think Dave has a point too. Like what happens with syzkaller > and random people reporting random syzkaller warnings. > > While I appreciate the reports too, I think it would be fair for the reporters > to spend some time to at least craft a RFC patch fixing the warning. Well, it was polite mails about their finding, which I find useful. If we got a huge amount of spam that might be different. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] xfs: mark the i_delayed_blks access in xfs_file_release as racy 2025-05-14 13:04 ` Christoph Hellwig @ 2025-05-14 23:21 ` Dave Chinner 2025-05-16 6:34 ` Christoph Hellwig 0 siblings, 1 reply; 8+ messages in thread From: Dave Chinner @ 2025-05-14 23:21 UTC (permalink / raw) To: Christoph Hellwig Cc: Carlos Maiolino, linux-xfs, cen zhang, lkmm, linux-kernel On Wed, May 14, 2025 at 03:04:17PM +0200, Christoph Hellwig wrote: > On Wed, May 14, 2025 at 10:00:28AM +0200, Carlos Maiolino wrote: > > I agree with you here, and we could slowly start marking those shared accesses > > as racy, but bots spitting false-positivies all the time doesn't help much, > > other than taking somebody's else time to look into the report. > > > > Taking as example one case in the previous report, where the report complained > > about concurrent bp->b_addr access during the buffer instantiation. > > I'd like to understand that one a bit more. It might be because the > validator doesn't understand a semaphore used as lock is a lock, but > I'll follow up there. Even so, it's not a race because at that point in time the object cannot be seen by anything other than the process that is initialising it. We initialise objects without holding the object locked all over the place - if this is something that the sanitiser cannot determine automatically, then we're also going to need annotations to indicate that the initialisation function(s) contain valid data races. > > > So, I think Dave has a point too. Like what happens with syzkaller > > and random people reporting random syzkaller warnings. > > > > While I appreciate the reports too, I think it would be fair for the reporters > > to spend some time to at least craft a RFC patch fixing the warning. > > Well, it was polite mails about their finding, which I find useful. > If we got a huge amount of spam that might be different. That's kinda of my point about it being the "thin edge of the wedge". Once we indicate that this is something we want reported so we can address, then the floodgates open. I'm wary of this, because at this point I suspect that there aren't a lot of people with sufficient time and knowledge to adequately address these issues. Asking the reporter to "send a patch to data_race() that instance" isn't a good solution to the problem because it doesn't address the wider issue indicated by the specific report. It just kicks the can down the road and introduces technical debt that we will eventually have to address. We should have learnt this lesson from lockdep - false positive whack-a-mole shut up individual reports but introduced technical debt that had to be addressed later because whack-a-mole didn't address the underlying issues. When the stack of cards fell, someone (i.e. me) had to work out how to do lockdep annotations properly (e.g. via the complex inode locking subclass stuff) to make the issues go away and require minimal long term maintenance. I don't want to see this pattern repeated. We need a sane policy for addressing the entire classes of issuesi underlying individual reports, not just apply a band-aid that silences the indivual report. So, let's look at what kcsan provides us with. We need functions like xfs_vn_getattr(), the allocation AG selection alogrithms and object initialisation functions to be marked as inherently racy, because they intentionally don't hold locks for any of the accesses they make. kcsan provides: /* comment on why __no_kcsan is needed */ __no_kcsan void xfs_foo( struct xfs_bar *bar) { ... the __no_kcsan attribute for function declarations to mark the entire function as inherently racy and so should be ignored. For variables like ip->i_delayed_blks, where we intentionally serialise modifications but do not serialise readers, we have: - uint64_t i_delayed_blks; /* count of delay alloc blks */ + uint64_t __data_racy i_delayed_blks; /* count of delay alloc blks */ This means all accesses to the variable are considered to be racy and kcsan will ignore it and not report it. We can do the same for lip->li_lsn and other variables. IOWs, we do not need to spew data_race() wrappers or random READ_ONCE/WRITE_ONCE macros all over the code to silence known false positives. If we mark the variables and functions we know have racy accesses, these one-line changes will avoid -all- false positives on those variables/from those functions. This, IMO, is a much better solution than "send a patch marking that access as data_race()". We fix the issue for all accesses and entire algorithms with simple changes to variable and/or function declarations. To reiterate my point: if we are goign to make XFS KCSAN friendly, we need to work out how to do it quickly and efficiently before we start changing code. Engaging in random whack-a-mole shootdown games will not lead to a viable long term solution, so let's not waste time on playing whack-a-mole. -Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] xfs: mark the i_delayed_blks access in xfs_file_release as racy 2025-05-14 23:21 ` Dave Chinner @ 2025-05-16 6:34 ` Christoph Hellwig 0 siblings, 0 replies; 8+ messages in thread From: Christoph Hellwig @ 2025-05-16 6:34 UTC (permalink / raw) To: Dave Chinner Cc: Christoph Hellwig, Carlos Maiolino, linux-xfs, cen zhang, lkmm, linux-kernel On Thu, May 15, 2025 at 09:21:01AM +1000, Dave Chinner wrote: > > I'd like to understand that one a bit more. It might be because the > > validator doesn't understand a semaphore used as lock is a lock, but > > I'll follow up there. > > Even so, it's not a race because at that point in time the object > cannot be seen by anything other than the process that is > initialising it. In the current tree after the folio/vmalloc series b_addr is only assigned during buffer allocation. But I suspect they tested before that where b_addr can be set at runtime. Either way it always happens under b_sema because that is initialized to locked just after allocating the memory for the buffer. > I'm wary of this, because at this point I suspect that there aren't > a lot of people with sufficient time and knowledge to adequately > address these issues. I'm more than happy to address these, because proper documentation of concurrency helps fixing a huge number of bugs, and also really helps documenting the code. I hate having to spend hours trying to figure out why something can be safely used lockless or not. > We should have learnt this lesson from lockdep - false positive > whack-a-mole shut up individual reports but introduced technical > debt that had to be addressed later because whack-a-mole didn't > address the underlying issues. I'm not sure who "we" is, but I've always pushed back to hacks just to shut up lockdep. And at the same time I'm immensively grateful for having lockdep and can't think of working without it anymore. > We need functions like xfs_vn_getattr(), the allocation AG selection > alogrithms and object initialisation functions to be marked as > inherently racy, because they intentionally don't hold locks for any > of the accesses they make. kcsan provides: Functions never are racy, specific data access are. So a function wide assignment is just the dumbest thing ever, this already badly failed for things like early Java object-level synchronization. > > For variables like ip->i_delayed_blks, where we intentionally > serialise modifications but do not serialise readers, we have: > > - uint64_t i_delayed_blks; /* count of delay alloc blks */ > + uint64_t __data_racy i_delayed_blks; /* count of delay alloc blks */ > > This means all accesses to the variable are considered to be racy > and kcsan will ignore it and not report it. We can do the same for > lip->li_lsn and other variables. But not all access are racy. We rely on proper synchronized accesses for accounting. Now for something that has a lot of unsynchronized access, adding a wrapper for them might be fine, but for i_delayed_blks I don't think we actually have enough for them to bother. > IOWs, we do not need to spew data_race() wrappers or random > READ_ONCE/WRITE_ONCE macros all over the code to silence known false > positives. If we mark the variables and functions we know have racy > accesses, these one-line changes will avoid -all- false positives on > those variables/from those functions. And also drop a lot of the actually useful checks. That's exaxctly the kind of hack you rant about above. ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2025-05-16 6:34 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <b0B7vqX5e__5JIDhxMANuyma5_RcE4wAhgR1-x-U6YY3qUy_mE220aCGAqsZ-LpEDc06cNfseqOGIgVw4BKmVg==@protonmail.internalid> 2025-05-13 5:26 ` [PATCH] xfs: mark the i_delayed_blks access in xfs_file_release as racy Christoph Hellwig 2025-05-13 8:56 ` Carlos Maiolino 2025-05-13 21:37 ` Dave Chinner 2025-05-14 4:29 ` Christoph Hellwig 2025-05-14 8:00 ` Carlos Maiolino 2025-05-14 13:04 ` Christoph Hellwig 2025-05-14 23:21 ` Dave Chinner 2025-05-16 6:34 ` Christoph Hellwig
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).