public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Matthew Wilcox <willy@infradead.org>, linux-xfs@vger.kernel.org
Subject: Re: fs-next-20251103 reclaim lockdep splat
Date: Tue, 11 Nov 2025 09:07:16 -0500	[thread overview]
Message-ID: <aRNDFNenNDq3icjw@bfoster> (raw)
In-Reply-To: <aRJg-LcA8RGeqOgQ@dread.disaster.area>

On Tue, Nov 11, 2025 at 09:02:32AM +1100, Dave Chinner wrote:
> On Thu, Nov 06, 2025 at 07:20:39AM -0500, Brian Foster wrote:
> > On Thu, Nov 06, 2025 at 08:05:11AM +1100, Dave Chinner wrote:
> > > On Wed, Nov 05, 2025 at 08:21:51PM +0000, Matthew Wilcox wrote:
> > > > In trying to bisect the earlier reported transaction assertion failure,
> > > > I hit this:
> > > > 
> > > > generic/476       run fstests generic/476 at 2025-11-05 20:16:46
> ....
> > > As I said on #xfs: false positive on the inode lock.
> > > 
> > > Reclaim is running in GFP_KERNEL context, so it's allowed to lock
> > > unreferenced inodes.
> > > 
> > > The inodes that the allocation context holds locked are referenced
> > > inodes, so it cannot self-deadlock on the inode locks it holds
> > > because reclaim does not access or lock referenced inodes.
> > > 
> > > That being said, looking at this patch:
> > > 
> > > https://lore.kernel.org/linux-xfs/20251003134642.604736-4-bfoster@redhat.com/
> > > 
> > > I think the allocation that iomap_fill_dirty_folios() should
> > > probably be using mapping_gfp_constraint(mapping, GFP_KERNEL) rather
> > > than a hard coded GFP_KERNEL allocation. This is deep in the
> > > buffered write path and the xfs ILOCK is held when
> > > iomap_fill_dirty_folios() and it does folio lookups in that
> > > context.
> > > 
> > 
> > There's an outstanding patch to nuke this allocation completely:
> > 
> > https://lore.kernel.org/linux-fsdevel/20251016190303.53881-2-bfoster@redhat.com/
> > 
> > This was also problematic for the ext4 on iomap WIP, so combined with
> > the cleanup to use an iomap flag this seemed more elegant overall.
> 
> Ok, that looks like a good way to get rid of the allocation, so
> you can add
> 
> Acked-by: Dave Chinner <dchinner@redhat.com>
> 
> to it.
> 

Thanks. I'll repost that one standalone since the rest of that series
still needs some work.

> > The patch series it's part of still needs work, but this one is just a
> > standalone cleanup. If I can get some acks on it I'm happy to repost it
> > separately to take this issue off the table..
> > 
> > > Huh - that kinda feels like a lock order violation. ILOCK is not
> > > supposed to be held when we do page cache operations as the lock
> > > order defined by writback operations is folio lookup -> folio lock
> > > -> ILOCK.
> > > 
> > > So maybe this is a problem here, but not the one lockdep flagged...
> > 
> > Yeah.. but filemap_get_folios_dirty() is somewhat advisory. It is
> > intended for use in this context, so only trylocks folios and those that
> > it cannot lock, it just assumes are dirty||writeback and includes them
> > in the batch for locking later (where later is defined as after the
> > iomap callback returns where iomap typically does folio lookup/lock for
> > buffered writes).
> 
> I guess I don't understand why this needs to be done under the
> ilock. I've read the patches, and it doesn't explain to me why we
> need to look up the pages under the ILOCK? The only constraint I see
> is trimming the extent map to match the end of a full fbatch array,
> but that doesn't seem like it can't be done by iomap itself.
> 
> Why do we need to do this dirty-folio batch lookup under the
> ILOCK instead of as a loop in iomap_zero_range() that grabs a fbatch
> for the unwritten mapping before calling iomap_zero_iter()
> repeatedly until we either hit the end of the mapping or there are
> no more dirty folios over the mapping?
> 

It doesn't need to be under ilock, just either under ilock or before
iomap_begin to rule out problematic inconsistency (i.e. seeing a clean
folio && unwritten mapping and racing with writeback). This was
discussed at multiple points during review of the original batch series.

The longer term prospects for this are to potentially use it for
multiple different operations, and therefore lift it into core iomap
(not necessarily just zero range). The reason it's done this way to
start was mainly just to manage scope and complexity. The use case is
bounded, so it was easiest to work this in where it is and defer
complexity associated with broader use to later.

That might sound pedantic, but when you consider things like dependent
changes like incremental iter advancement, the impedence mismatch
between some of these changes and the ext4-on-iomap WIP, and the little
issues that fall out of that, this is all much easier to develop,
support and improve with as little unnecessary complexity as possible. I
think when some of the kinks the rest of that series is trying to
address are worked out, that might be a reasonable point to consider
such a lift.

Brian

> -Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 


      reply	other threads:[~2025-11-11 14:02 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-05 20:21 fs-next-20251103 reclaim lockdep splat Matthew Wilcox
2025-11-05 21:05 ` Dave Chinner
2025-11-06 12:20   ` Brian Foster
2025-11-10 22:02     ` Dave Chinner
2025-11-11 14:07       ` Brian Foster [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aRNDFNenNDq3icjw@bfoster \
    --to=bfoster@redhat.com \
    --cc=david@fromorbit.com \
    --cc=linux-xfs@vger.kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox